Making Sense of DNA and Protein Sequences
Notebook 1


Electronic Notebook for Protein Sequence Analysis

Links to glossary terms are marked with a G icon. Definitions appear in a popup window.

Start here with your DNA sequence

Initial DNA Sequence


To identify any exons in the DNA sequence and generate a predicted protein sequence, use GenScan:


Paste your DNA sequence into the GenScan input window and press the "Run Genscan" button. Select the predicted protein whose exons have the highest P-values, and paste its FASTA-formatted output into your notebook. The exon probability, P(E), is defined as the sum of the probabilities, under the model, of all possible "parses" (gene structure descriptions) that contain the exact exon E in the correct reading frame.
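If you want to handle the FASTA-formatted output outside the browser, the sketch below shows one way to split it into header and sequence using only the Python standard library. The function name `read_fasta` and the example record are illustrative assumptions; the header and residues are invented and are not real GenScan output.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record: the header and residues are invented for illustration.
example = """>predicted_peptide_1 (hypothetical header)
MKTAYIAKQR
QISFVKSHFS
"""

records = read_fasta(example)
header, protein = records[0]
print(header, len(protein))
```

The joined sequence string (without the header line) is what a later step would scan or search.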

Protein Sequence from Genscan

To scan the protein sequence for occurrences of motifs/patterns found in the PROSITE database, use ScanProsite:


Paste the protein sequence from GenScan into the ScanProsite "Sequence(s) to be scanned" input box and press the "Start the Scan" button. Paste the ScanProsite hit into your notebook. To see the PROSITE summary for the hit, click on its PSxxxx number.
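To get a feel for what a pattern scan does, the sketch below converts a PROSITE-style pattern into a Python regular expression and reports where it matches. It is a minimal illustration, not ScanProsite's implementation: the function names are assumptions, only the common syntax elements are handled (`x`, `[..]`, `{..}`, `(n)`, `(n,m)`, `<`, `>`), and the test sequence is invented. The pattern shown is the ATP/GTP-binding site motif A (P-loop), used here only as a familiar example.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regex string.

    Handles x, [ABC], {ABC}, repeats (n) and (n,m), and the < and >
    anchors at the ends of an element. Rarer syntax is not supported.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.lstrip("<").rstrip(">")
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"Unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except these
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{%s}" % low
        else:
            repeat = "{%s,%s}" % (low, high)
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def scan(sequence, pattern):
    """Return (start, end, matched_text) tuples with 1-based positions."""
    regex = prosite_to_regex(pattern)
    hits = []
    # The lookahead lets overlapping matches be reported as well.
    for m in re.finditer("(?=(" + regex + "))", sequence):
        text = m.group(1)
        hits.append((m.start() + 1, m.start() + len(text), text))
    return hits


# Invented toy sequence containing one P-loop-like stretch.
toy_sequence = "MKVGSPGVGKTTLAR"
p_loop = "[AG]-x(4)-G-K-[ST]"
print(prosite_to_regex(p_loop))
print(scan(toy_sequence, p_loop))
```

Positions are reported 1-based so they can be compared with residue numbering in a results page.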

Hit from ScanProsite

PROSITE pattern

To search for proteins with similar sequences and conserved domains, use BLAST:

Run a BLASTp search against the Swiss-Prot database: paste the protein sequence from GenScan into the input box on the BLASTp page, choose the SwissProt database from the database listbox, then press the "BLAST" button. To format your results, select the "Formatting options" link on the results page, set the alignment view to "Flat-query anchored with dots for identities", and click the "Reformat" button. Paste this alignment into your notebook.
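The "dots for identities" view can be easier to read once you see how it is built. The sketch below is a simplified illustration, not BLAST's own formatter: it takes two already-aligned rows of equal length and prints a dot wherever the subject residue is identical to the query. The function name and both sequences are invented for this example.

```python
def dots_for_identities(query, subject):
    """Render an aligned subject row with '.' where it matches the query.

    Both rows must already be aligned (same length, '-' for gaps).
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "." if s == q and q != "-" else s
        for q, s in zip(query, subject)
    )


# Invented aligned rows, for illustration only.
query_row = "MKVGSPGVGKT"
subject_row = "MRVGAPGVGKS"

print("Query  ", query_row)
print("Subject", dots_for_identities(query_row, subject_row))
```

Only the differing residues remain visible in the subject row, so substitutions stand out against a conserved background.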


BLASTP Alignment (against SwissProt)

Conserved Domain

The BLASTp search also lets you match your protein sequence to conserved protein domains in the Conserved Domain Database, generate a multiple sequence alignment based on this match, and explore 3D modeling templates for your sequence. Click on the conserved domain displayed as "Specific hits" at the top of the BLASTp results page, and again on "Specific hits" on the next page.

Conserved Domain Description

To obtain a 3D Structure Template

To align your query protein to a similar sequence from a 3D structure, run the Protein BLAST (blastp) search and choose Search Set: Protein Data Bank proteins (pdb). To format your results, select the "Formatting options" link on the results page, set the alignment view to "pairwise", and click the "Reformat" button. Click on the description link of the hit of interest, then click on the "structure" link on the right-hand side of the alignment. (You may choose to view the information about ligands by clicking on the accession number of the structure entry.) Click on the "View Structure and Alignment in Cn3D" button. This opens an interactive view of the sequence alignment and the corresponding 3D structure in NCBI's Cn3D program. Locate the PROSITE motif you found earlier within the Cn3D alignment window by using View--Find Pattern. Highlight any nearby ligand by double-clicking on it in the Structure window. Use Selected--Show selected residues from the Cn3D structure window to display the highlighted residues and ligand.
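When a motif is located inside an alignment, its position in the alignment window differs from its position in the ungapped sequence because of gap columns. The sketch below illustrates that bookkeeping under simple assumptions; it is not how Cn3D implements Find Pattern. The function name, the gapped row, and the regular expression (a P-loop-like pattern written directly as a regex) are all chosen for illustration.

```python
import re


def motif_columns(aligned_row, regex):
    """Find regex matches in a gapped row; return 1-based column ranges."""
    # Alignment column (0-based) of every residue, skipping gap characters.
    columns = [i for i, ch in enumerate(aligned_row) if ch != "-"]
    ungapped = aligned_row.replace("-", "")
    hits = []
    for m in re.finditer(regex, ungapped):
        first_col = columns[m.start()] + 1
        last_col = columns[m.end() - 1] + 1
        hits.append((first_col, last_col))
    return hits


# Invented gapped row, for illustration only.
row = "MKV--GSPGV-GKTTLAR"
print(motif_columns(row, r"[AG].{4}GK[ST]"))
```

Here the motif occupies residues 4 to 11 of the ungapped sequence but columns 6 to 14 of the alignment, which is the range you would look for in an alignment view.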

3D Structure Results

Other Tools for DNA and Protein Sequence Analysis

Questions, Comments:
Medha Bhagwat, PhD