
Course No. 6 "Identification of Disease Genes"

 

Problem 2

A laboratory has generated an EST library from a sickle cell anemia patient and wants to identify the gene(s) causing the phenotype. Sickle cell anemia is a disease in which the red blood cells are curved in shape, and which causes pain and fever.

Outline:

We will follow these steps to solve the problem:

1. Compare ESTs from a sickle cell anemia patient to the human genome (using BLAST).
2. Identify the gene(s) expressing the ESTs and download their sequences (using Map Viewer).
3. Identify whether the ESTs contain any known nucleotide variations (single nucleotide polymorphisms) (using dbSNP).
4. Determine whether a mutant form of the gene is known to cause a phenotype (using OMIM).

Step 1. Compare ESTs to the human genome (using BLAST):

One way to identify the genes expressing the ESTs is to use BLAST to compare their sequences with the human genome assembly and the genes annotated on it. To access the specialized BLAST page for searching against the human genome assembly, click on the "BLAST (human genome)" link.

Paste the EST sequence provided below into the query box of the BLAST page, select "genome (reference only)" as the database, and start the search by clicking on the "Begin Search" button.

Query EST Sequence:

ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGTGGAGAAGTCTG
CCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCC
TTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGT

How many BLAST hits are obtained? Name the chromosomes and contigs that appear as BLAST hits. Note that the similarity is on the minus strand of the genome. Since there are multiple BLAST hits, it is much easier to view them first in Map Viewer.
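Because the hit is on the minus strand, the genomic plus-strand sequence in that region is the reverse complement of the EST. The following is a minimal Python sketch of that relationship, not part of the BLAST workflow itself; the function name `reverse_complement` is chosen here for illustration, and the EST is the query sequence given above.

```python
# Query EST from Step 1, joined from the three lines shown above.
EST = (
    "ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGTGGAGAAGTCTG"
    "CCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCC"
    "TTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGT"
)

# Translation table that swaps each base for its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # The sequence as it would read along the plus strand of the genome
    # when the EST aligns to the minus strand.
    print(reverse_complement(EST))
```

Taking the reverse complement twice returns the original sequence, which is a quick way to check that the EST was pasted without stray characters.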

Step 2. Identify the gene(s) expressing the ESTs and download their sequences:

To visualize the BLAST hits on the genome using Map Viewer, click on the "Genome View" button at the top of the results page, then on the chromosome "11" or the contig link. Currently, four maps should be displayed (Model, RNA, Gene_seq and Contig). Zoom out several times by clicking on the rightmost map and selecting the appropriate option until the entire HBB and HBD genes are visible.

The BLAST hits are indicated by the red bars, and the best two matches are in the region of two exons of the HBB gene annotated on the human genome. Click on the "Blast hit" link with 99% identity, and note the nucleotide position and the difference in sequence between the EST and the genomic sequence.
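The difference between the EST and the genomic sequence can also be located by comparing the two aligned sequences base by base. The sketch below assumes an ungapped alignment of equal-length fragments. `EST_FRAGMENT` is taken from the query EST, while `REFERENCE_FRAGMENT` is an assumption: it is the same fragment with the single A/T difference described in Step 3 reverted, shown in the EST orientation rather than as the genome plus strand. The function name `find_mismatches` is illustrative.

```python
def find_mismatches(query: str, subject: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, query base, subject base) for each mismatch.

    Both sequences must already be aligned without gaps.
    """
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    return [
        (index + 1, q_base, s_base)
        for index, (q_base, s_base) in enumerate(zip(query, subject))
        if q_base != s_base
    ]


# Fragment of the query EST around the start of the coding region.
EST_FRAGMENT = "CTGACTCCTGTGGAGAAGTCT"
# Assumed reference fragment: the same bases with the A/T variation reverted.
REFERENCE_FRAGMENT = "CTGACTCCTGAGGAGAAGTCT"

if __name__ == "__main__":
    for position, est_base, ref_base in find_mismatches(EST_FRAGMENT, REFERENCE_FRAGMENT):
        print(f"position {position}: EST has {est_base}, reference has {ref_base}")
```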

Go back to the Map Viewer report by clicking the browser's "back" button once. Make the Gene_seq map the master map by clicking on the arrow at the top of the map. Since the gene of interest for further analysis is the HBB gene, remove the HBD gene from the view by clicking on the line near the HBB gene and selecting the "Recenter" option. Note that the gene is annotated on the minus strand (up arrow). To display the entire HBB gene sequence, click on the "dl" link, choose the minus strand from the pull-down menu, click on "Change Region/Strand", and display the sequence by clicking on "Display". Copy the sequence and paste it in the area provided below.
If you want to include upstream or downstream sequence, you can adjust the nucleotide locations to download by using the "adjust by" and "Change Region/Strand" options.
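As an alternative to the "dl" link, a region of a chromosome sequence can be requested programmatically through the NCBI E-utilities `efetch` service, which accepts `seq_start`, `seq_stop` and `strand` parameters (`strand=2` requests the minus strand). The sketch below only builds the request URL. The helper name `build_efetch_url` is hypothetical, and the coordinates are an illustrative 100-base window around position 5248232 on NC_000011.9, not the HBB gene boundaries; substitute the coordinates shown in Map Viewer for the gene.

```python
from urllib.parse import urlencode

EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str, start: int, stop: int, minus_strand: bool = True) -> str:
    """Build an efetch URL that returns a FASTA sub-sequence of `accession`."""
    params = {
        "db": "nuccore",
        "id": accession,
        "seq_start": start,
        "seq_stop": stop,
        "strand": 2 if minus_strand else 1,  # 1 = plus strand, 2 = minus strand
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


# Illustrative window only; these are not the HBB gene coordinates.
EXAMPLE_URL = build_efetch_url("NC_000011.9", 5248232 - 100, 5248232 + 100)

if __name__ == "__main__":
    print(EXAMPLE_URL)
    # To actually download the sequence (requires network access):
    # from urllib.request import urlopen
    # with urlopen(EXAMPLE_URL) as response:
    #     print(response.read().decode())
```

Widening `start` and `stop` plays the same role as the "adjust by" option when upstream or downstream sequence is needed.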

HBB gene sequence

Step 3. Determine whether the ESTs contain known SNPs:

Go back to the Map Viewer report. Click on the Maps and Options link. Remove all the maps except the Gene_seq map by clicking on the – sign next to each map listed under the Tracks Displayed menu. Find the Variation map by scrolling through the Sequence Maps list and add it by clicking on the + sign next to it. If necessary, use Drag to reorder so that the Variation map is the master map. Click on the OK button.

Now two maps are displayed: Variation (the rightmost map and the master map) and Gene_seq. The master map provides detailed information for the map features, in this case SNPs. Zoom in on the BLAST hit area (red bar). There are numerous SNPs in the area; one of them is rs334, around position 5248232. You may need to zoom in a lot, and it may help to return to the Maps and Options window, select Variation, and click Toggle Ruler. An alternative is to get a list of all SNPs by clicking on the Data as Table View link on the left side of the page.

Click on the link for the SNP. There is an A/T SNP at nucleotide position 5248232 on NC_000011.9, as shown in the Integrated Maps and GeneView sections. Is this the same nucleotide variation found in the BLAST result in Step 1? What is the resulting change in the amino acid?
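One way to check the answer to the amino acid question is to translate the start of the EST coding region and compare the affected codon with its A/T counterpart. The sketch below uses the standard genetic code and the first line of the query EST. It assumes that the first ATG in the EST is the start codon and that the SNP falls in the middle base of the seventh codon; the names `translate`, `est_codon` and `ref_codon` are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First line of the query EST from Step 1.
EST_START = "ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGTGGAGAAGTCTG"

# Assumption: the first ATG marks the start of the coding region.
orf = EST_START[EST_START.find("ATG"):]
est_protein = translate(orf)

# Seventh codon of the precursor, i.e. residue 6 of the mature protein,
# which lacks the initiator methionine.
est_codon = orf[18:21]
# Assumption: the A/T SNP is the middle base of this codon.
ref_codon = est_codon[0] + "A" + est_codon[2]

if __name__ == "__main__":
    print("EST peptide:", est_protein)
    print("EST codon:", est_codon, "->", CODON_TABLE[est_codon])
    print("A-allele codon:", ref_codon, "->", CODON_TABLE[ref_codon])
```

Residue numbering in the mature protein is shifted by one relative to the translated peptide, which matters when comparing the result with the variant names used in OMIM.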

Step 4. Determine whether the mutant HBB gene causes a phenotype:

Click on the OMIM link in the SNP report. It leads to the OMIM report for the HBB gene, which details how mutations in the HBB gene are associated with a phenotype, sickle cell anemia. As mentioned in the Note Regarding the Allelic Variants Section, the allelic variants are listed for the mature HBB protein, which lacks the initiator methionine. Note the hemoglobin variant name and the phenotype caused by the Glu6Val variant.

 

Questions, Comments:  Medha Bhagwat, PhD