Course No. 6 "Genome Browser"

 

Problem 1

We have the following goals in analyzing human genomic sequence:

  • Locate a human gene
  • Download the gene sequence along with its upstream sequence (to analyze promoter regions)
  • Download the cDNA sequence and its corresponding amino acid translation
  • Identify Single Nucleotide Polymorphisms (SNPs) and conserved regions in the gene
  • Identify genes in the neighborhood and homologs in other organisms
  • Find identified alternative splice variants
  • Use browsers to download results in a batch
     

UCSC genome browser

Open the UCSC Genome Bioinformatics web page (http://genome.ucsc.edu/) and click on the “Genome Browser” link.  Click on the “Click here to reset” link to reset the browser user interface settings to their defaults.  Next, click on the “configure tracks and display” button. On the Configure Tracks page, select the “hide all” button and click on the submit button.

To access the OTC gene region and download its genomic sequence: Type OTC in the position/search box and click on the go button.  Click on the first OTC gene link under the RefSeq Genes heading. Under Genes and Gene Prediction Tracks, click on the “UCSC Genes” link.  From Display mode, select “pack” from the pull-down menu. Under Label, check the box for “gene symbol” and click on the Submit button.  The arrows on the gene model show the direction of transcription; here they indicate that the gene lies on the forward strand. From the TrackMap, click on the OTC gene symbol label to the left of the UCSC Genes track.  Under Sequence and Links to Tools and Databases, select the “Genomic Sequence” from assembly link to download the genomic sequence, with upstream and/or downstream sequence if desired.
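The same download can be scripted once the gene coordinates are known. The sketch below is only an illustration: it extends a gene region on its 5' side (the upstream side depends on the strand) and builds a request URL for the UCSC REST API sequence endpoint. The assembly name hg38 and the coordinates in the usage lines are placeholder assumptions, not the real OTC position; read the actual values from the browser display before using it.

```python
UCSC_API = "https://api.genome.ucsc.edu/getData/sequence"


def region_with_upstream(start, end, strand, upstream):
    """Extend a gene region by `upstream` bases on its 5' side.

    Coordinates are 0-based, half-open (BED style). For a forward-strand
    gene the upstream side is before `start`; for a reverse-strand gene
    it is after `end`.
    """
    if strand == "+":
        return max(0, start - upstream), end
    if strand == "-":
        return start, end + upstream
    raise ValueError("strand must be '+' or '-'")


def sequence_url(genome, chrom, start, end):
    """Build a UCSC REST API URL for a genomic sequence request.

    Parameters are joined with ';' as in the examples in the UCSC API help.
    """
    params = [("genome", genome), ("chrom", chrom), ("start", start), ("end", end)]
    return UCSC_API + "?" + ";".join(f"{key}={value}" for key, value in params)


if __name__ == "__main__":
    # Placeholder coordinates for illustration only (NOT the real OTC locus).
    gene_start, gene_end, strand = 1_000_000, 1_070_000, "+"
    start, end = region_with_upstream(gene_start, gene_end, strand, upstream=1000)
    print(sequence_url("hg38", "chrX", start, end))
```

The URL can be opened in a web browser or fetched with any HTTP client; the response is JSON containing the requested DNA.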

To view SNPs annotated on the region: Go back to the Genome Browser OTC display page.  Under Variation, click on the Common SNPs (147) link. At the top, change the Display mode option to “pack.” Under Coloring Options, change “Splice Site” from Red to Blue and click on the Submit button.

To download SNPs in the conserved region: Click on the Table Browser link in the Tools menu at the top of the page. From Group, select “Variation.” From Track, select “Common SNPs (147).” For Region, select the “position” radio button.  Next to Intersection, click on the “create” button. From Group, select “Comparative Genomics.” From Track, select “Conservation.” Click on the submit button.  From Output Format, select “BED - browser extensible data” and click on the “get output” button.  Adjust the upstream/downstream base location, if necessary, and click on the get BED button.
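Conceptually, the intersection step keeps each SNP record that overlaps at least one conserved interval. The sketch below shows that idea on BED-style records (0-based, half-open coordinates). It is not the Table Browser's own code, "any overlap" is an assumed intersection setting, and the SNP names and positions in the usage lines are invented for illustration.

```python
def snps_in_conserved(snps, conserved):
    """Return the SNP records that overlap any conserved interval.

    snps:      list of (chrom, start, end, name) tuples
    conserved: list of (chrom, start, end) tuples
    Coordinates are 0-based, half-open, as in BED files.
    """
    hits = []
    for chrom, s_start, s_end, name in snps:
        for c_chrom, c_start, c_end in conserved:
            # Two half-open intervals overlap when each starts before the other ends.
            if chrom == c_chrom and s_start < c_end and c_start < s_end:
                hits.append((chrom, s_start, s_end, name))
                break  # one overlap is enough; avoid duplicate output rows
    return hits


def to_bed(records):
    """Format records as tab-separated BED lines."""
    return "\n".join("\t".join(str(field) for field in rec) for rec in records)


if __name__ == "__main__":
    # Invented toy data, not real annotation.
    snps = [("chrX", 105, 106, "snpA"), ("chrX", 300, 301, "snpB")]
    conserved = [("chrX", 100, 200)]
    print(to_bed(snps_in_conserved(snps, conserved)))
```

For whole-chromosome data a sorted sweep or an interval tree would be faster; the nested loop is kept here because it mirrors the definition directly.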

 

MapViewer

Access NCBI's MapViewer (http://www.ncbi.nlm.nih.gov/mapview/) resource.  Select Homo sapiens Annotation Release 108.  Type “OTC” in the search box and click on the Find button.  Select Gene from the Quick Filter box on the right and click on the Filter button. You will see search results. The first is the reference assembly. Under Maps, click on the “Genes_seq” link for the reference assembly.

To identify homologs in other organisms: If the map display is not already open, click on the “Genes_seq” link under Maps for the reference assembly. Click on the hm link to access the HomoloGene page for the OTC gene.

 

Ensembl

Access the Ensembl genome browser page (http://www.ensembl.org/index.html). Select Human genome under the Popular genomes heading.  Select Gene from the Search all categories pull down menu.  Type "OTC" in the search box and click on the Go button. Click on the OTC link.  Click on the Show transcript table link if necessary.

To obtain the cDNA sequence with amino acid translation:  Click on the first transcript ID, ENST00000039007. From the Transcript-based displays menu (left side of the screen), select the “cDNA” link under the Sequence heading.  You can use the “Configure this page” link (below the Transcript-based displays menu) to change the display settings.  Close the pop-up window by clicking on the checkmark in the upper right.
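The translation shown alongside the cDNA follows the standard genetic code, reading codons from the start codon to the first stop codon. The sketch below illustrates that reading on a short made-up sequence. It assumes the first ATG is the start codon, which is a simplification: Ensembl uses the annotated coding region, so the two can differ for real transcripts.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cdna(cdna):
    """Translate a cDNA from its first ATG to the first in-frame stop codon.

    Returns the protein as one-letter codes, without the stop symbol.
    Codons containing characters other than A, C, G, T become 'X'.
    """
    seq = cdna.upper()
    start = seq.find("ATG")  # assumption: first ATG is the start codon
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # Made-up sequence: 5' UTR "GG", then ATG GCC AAA TAG, then 3' UTR "CC".
    print(translate_cdna("GGATGGCCAAATAGCC"))
```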

To download the upstream sequences for multiple human genes in batch mode: Select the BioMart link at the top of the page. Select “Ensembl Genes 87” as the database and “Homo sapiens genes (GRCh38.p7)” as the dataset. Click on the Filters menu. Open the GENE criteria by clicking on the + sign. Check the “Input external references ID list” box, select “HGNC symbol” as the ID type, and type the gene symbols, such as HFE, HRAS, OTC, in the box below. Click on the Count button at the top to check how many genes BioMart can find.  Click on the Attributes option and select the Sequences radio button. Open the Sequences menu by clicking on the + sign. Under the Sequences heading, select the Flank (gene) radio button. Select the Upstream flank check box and enter the number of nucleotides desired, such as 1000. Open the Header Information menu by clicking on the + sign.  Select options as desired, such as Associated Gene Name.   Retrieve the sequences by clicking on the “Results” button at the top.
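BioMart can also accept the same selections as an XML query, which is convenient for repeating a batch download. The sketch below builds such a query for the filters and attributes chosen above. The internal names used here (hsapiens_gene_ensembl, hgnc_symbol, upstream_flank, gene_flank, external_gene_name) and the service URL are assumptions; compare them with the query shown by the XML button on the BioMart results page before relying on them.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed service address; check it against the current Ensembl BioMart help.
MART_URL = "http://www.ensembl.org/biomart/martservice"
XML_PROLOG = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'


def build_flank_query(symbols, upstream=1000):
    """Build a BioMart XML query for upstream gene flanks in FASTA format.

    All dataset, filter, and attribute names are assumed internal names.
    """
    query = ET.Element(
        "Query",
        virtualSchemaName="default",
        formatter="FASTA",
        header="0",
        uniqueRows="0",
        datasetConfigVersion="0.6",
    )
    dataset = ET.SubElement(
        query, "Dataset", name="hsapiens_gene_ensembl", interface="default"
    )
    # Filters: the gene list and the upstream flank length.
    ET.SubElement(dataset, "Filter", name="hgnc_symbol", value=",".join(symbols))
    ET.SubElement(dataset, "Filter", name="upstream_flank", value=str(upstream))
    # Attributes: the flank sequence plus the gene name for the FASTA header.
    ET.SubElement(dataset, "Attribute", name="gene_flank")
    ET.SubElement(dataset, "Attribute", name="external_gene_name")
    return ET.tostring(query, encoding="unicode")


def request_url(query_xml):
    """Return a URL that submits the query to the BioMart service."""
    return MART_URL + "?" + urlencode({"query": XML_PROLOG + query_xml})


if __name__ == "__main__":
    print(request_url(build_flank_query(["HFE", "HRAS", "OTC"], upstream=1000)))
```

Opening the printed URL in a web browser should return the FASTA records if the assumed names match the current BioMart configuration.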

Questions, Comments:  Medha Bhagwat, PhD