Course No. 8 "Microbial Genome Analysis"


Problem 2

We have the following goals in analyzing microbial genomic sequences:

  •     Identify organisms with complete genome sequences

  •     Obtain information about microbes such as genome size, natural environment, and pathogenicity

  •     Download information about proteins encoded by a particular genome

  •     Compare genomes of various microbes

  •     Identify genes that are present in two organisms and/or absent in a third organism

  •     Identify neighboring genes in multiple organisms that share a gene of interest

  •     Find summary information and pathway maps for microbial genes

NCBI’s Genome Database

From the main page of the database, click on the Microbes link to get to the Genome Project page for prokaryotes. Click on the Prokaryotic Projects link, then click on the Prokaryotes tab.

To access entries for a particular organism, such as Pseudomonas, type Pseudomonas into the Search by organism box at the top. Click on the Pseudomonas link. Note the different Pseudomonas species listed in the Overview tab. Click on the Pseudomonas aeruginosa link. Click on the “more” link in the description. Note from the description that this bacterium is capable of degrading polycyclic aromatic hydrocarbons and of producing interesting, biologically active secondary metabolites, including quinolones, rhamnolipids, lectins, hydrogen cyanide, and phenazines.

Click on the genome Annotation Report link. The link in the proteins column or the See Protein Details link provides detailed information about the proteins annotated on the genome. Click on the See Protein Details link.

Click on the Return to Genome Overview link. Click on the Genome Project Report link to access information about Pseudomonas aeruginosa genomes, such as their size, GC content, and links to the genome and protein sequences. Note the different genome sizes for different sub-species/strains. Click on the genome Annotation Report link. Scroll down to the entry for PA7. The BioProjects column has links to two projects, Genome sequencing and RefSeq genome. Click on either of them. The resulting page gives links to the genome sequence.
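The genome size and GC content shown in the report are simple properties of the sequence itself. As a minimal sketch, assuming a genome sequence has been downloaded and read into a plain string of bases, the two values could be computed as follows. The short toy_sequence is invented for illustration and does not come from any real genome.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among the unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    # Ignore ambiguity codes such as N when counting bases.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


# Invented toy sequence (assumption, not real data).
toy_sequence = "ATGCGCGGCCTAAT"

genome_size = len(toy_sequence)          # size in base pairs
gc_fraction = gc_content(toy_sequence)   # value between 0 and 1

print(f"Genome size: {genome_size} bp")
print(f"GC content: {gc_fraction:.1%}")
```

For a real genome, the same function would be applied to each replicon (chromosome or plasmid) read from the downloaded sequence file.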

Integrated Microbial Genomes (IMG)

Go to the IMG site. Select the Find Genes tab and then click on the Phylogenetic Profilers > Single Genes option. Select the radio button for Pseudomonas aeruginosa PA7 in the first column (Find Genes In), Pseudomonas aeruginosa PAO1 in the second column (With Homologs In), and Escherichia coli str. K-12 substr. MG1655 in the third column (Without Homologs In). Click on the “Go” button at the bottom of the page. In the Summary Statistics table, click the COG link. On the next page, in the table of COG Categories, choose "Secondary metabolites biosynthesis, transport and catabolism". Then, in the list of genes, choose protocatechuate 3,4-dioxygenase, beta subunit (GeneID 640809285).
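The three-column selection in the Phylogenetic Profiler amounts to a set operation: keep the genes of the first genome that have a homolog in the second genome and no homolog in the third. The sketch below shows only that logic. It assumes each genome has already been reduced to a set of gene-family labels; the labels and the function name phylogenetic_profile are hypothetical, and the homology detection that IMG performs is not reproduced here.

```python
def phylogenetic_profile(find_in, with_homologs_in, without_homologs_in):
    """Families present in the first two genomes and absent from the third."""
    return (find_in & with_homologs_in) - without_homologs_in


# Invented gene-family labels for illustration (assumption, not IMG data).
pa7_families = {"famA", "famB", "famC", "famD"}
pao1_families = {"famA", "famB", "famD", "famE"}
ecoli_families = {"famB", "famF"}

hits = phylogenetic_profile(pa7_families, pao1_families, ecoli_families)
print(sorted(hits))
```

Swapping which genome goes in which argument changes the question being asked, just as moving a radio button between columns does on the IMG page.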

On the Gene Details page, scroll down to the section "Evidence for Function Prediction" and click on "Show ortholog neighborhood regions". Protocatechuate 3,4-dioxygenase, β subunit, is a little red arrow near the middle of the graphic. The page shows its presence in other Pseudomonas species.

Go back to the Gene Details page. From here you can take a shortcut to MetaCyc. Go to the table of Gene Information and find the section called "Pathway Information". There are two MetaCyc pathways listed. Click on the link for "protocatechuate degradation II (ortho-cleavage pathway)".

MetaCyc and BioCyc

The description of this pathway in MetaCyc shows its role in degrading aromatic hydrocarbons. (If the link from IMG isn't working, you can also go directly to MetaCyc and search for "protocatechuate", similar to Problem 1.) You can choose ‘more detail’ (detailed structure and biochemistry of the enzymatic reactions) or ‘less detail’ (overview of the pathway).

Click the ‘Species Comparison’ button to check whether this pathway is present in other species. If necessary, click on the Change Organism button at the bottom of the page. Select Pseudomonas aeruginosa PAO1, Pseudomonas aeruginosa PA7, and Brucella abortus bv. 1 str. 9-941. Click on the OK button at the bottom of the page.

The results page includes a color-coded table with information about the pathway and the genes involved in it. In the first row of the Operons column, click on the pcaH gene arrow to open the page with further details about this gene.

Since our focus is on cross-species comparison, click on ‘Align in Multi-Gene Browser’. Note that pcaH (protocatechuate 3,4-dioxygenase, β subunit) is present in all three species, though its neighbors may be quite different.
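The comparison made by eye in the multi-gene browser can also be expressed as a small calculation: take the genes flanking pcaH in each genome and see which neighbors are shared. The sketch below is an illustration only. Apart from pcaH, all gene names, the genome labels, and the gene orders are invented, and the neighbors helper is hypothetical.

```python
def neighbors(gene_order, gene, window=1):
    """Return the genes within `window` positions on either side of `gene`."""
    idx = gene_order.index(gene)
    upstream = gene_order[max(0, idx - window):idx]
    downstream = gene_order[idx + 1:idx + 1 + window]
    return upstream + downstream


# Invented gene orders for three hypothetical genomes (assumption, not real data).
toy_genomes = {
    "genome_1": ["geneA", "geneB", "pcaH", "geneC", "geneD"],
    "genome_2": ["geneE", "geneB", "pcaH", "geneC", "geneF"],
    "genome_3": ["geneX", "pcaH", "geneY"],
}

neighbor_sets = {
    name: set(neighbors(order, "pcaH"))
    for name, order in toy_genomes.items()
}

# Neighbors of pcaH conserved across all genomes, and across the first two only.
shared_by_all = set.intersection(*neighbor_sets.values())
shared_first_two = neighbor_sets["genome_1"] & neighbor_sets["genome_2"]

print("Shared by all three:", sorted(shared_by_all))
print("Shared by genome_1 and genome_2:", sorted(shared_first_two))
```

In this toy data the first two genomes share the same flanking genes while the third does not, which mirrors the situation where a gene is conserved but its neighborhood differs between distantly related organisms.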


Questions, Comments:  Medha Bhagwat, PhD