Course No. 16 "TCGA Data Analysis"

  Problem 1

The Cancer Genome Atlas Data Analysis

This course provides background on how to access The Cancer Genome Atlas (TCGA) data and introduces several publicly available online tools, such as Firehose and cBioPortal, for analyzing the data to derive biologically meaningful information.
The course demonstrates two analysis approaches: cancer-centric (from a cancer to its significantly mutated genes) and gene-centric (from one or more genes to several cancers).

Topics to be covered:

  • TCGA program overview
      - Data types
      - Data levels
      - Access tiers
      - Data download options

  • TCGA Data Analysis using Firehose and cBioPortal
      - Mutation details and analysis
      - Obtaining significantly mutated genes and their functional/pathway analysis
      - OncoPrint compact visualization of genomic alterations
      - Information about gene copy number alterations from GISTIC
      - Visualization of the network of mutated genes

TCGA Data Overview:
Access the TCGA web page. Click on the Launch Data Portal link.

Click on the Projects tab at the top of the page. Note the primary sites, cancer abbreviations, and various data categories. Return to the Genomic Data Commons (GDC) Data Portal, and then click on the Data tab at the top of the page. Note the different projects, access levels, data formats, data types and experiment types. Return to the GDC Data Portal again. Click on the Head and Neck link and then on the Cases link.

Click on Cases or Files. You can filter the results by options such as Gender, Age and Vital Status on the left side of the page.  Some of this data is “open” and some is “controlled” access. 
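The same kind of filtering can also be expressed programmatically. The sketch below only builds a request URL for the GDC API and does not contact the server. It assumes the public endpoint https://api.gdc.cancer.gov/cases, the project identifier TCGA-HNSC for head and neck squamous cell carcinoma, and the field names project.project_id, demographic.gender and demographic.vital_status; check these names and values against the current GDC API documentation before relying on them. The helper functions build_case_filter and build_cases_url are hypothetical names introduced for this illustration.

```python
import json
from urllib.parse import urlencode

# Assumed public GDC API endpoint for case records.
GDC_CASES_ENDPOINT = "https://api.gdc.cancer.gov/cases"


def build_case_filter(project_id, gender=None, vital_status=None):
    """Build a nested GDC-style filter: all clauses are combined with 'and'."""
    clauses = [
        {"op": "in",
         "content": {"field": "project.project_id", "value": [project_id]}}
    ]
    if gender is not None:
        clauses.append(
            {"op": "in",
             "content": {"field": "demographic.gender", "value": [gender]}}
        )
    if vital_status is not None:
        clauses.append(
            {"op": "in",
             "content": {"field": "demographic.vital_status",
                         "value": [vital_status]}}
        )
    return {"op": "and", "content": clauses}


def build_cases_url(filters, size=10):
    """Encode the filter as a query string (the request itself is not sent)."""
    params = {"filters": json.dumps(filters), "format": "JSON", "size": str(size)}
    return GDC_CASES_ENDPOINT + "?" + urlencode(params)


# Assumed field values; mirrors the Gender and Vital Status facets in the portal.
filters = build_case_filter("TCGA-HNSC", gender="female", vital_status="Alive")
url = build_cases_url(filters)
print(url)
```

Pasting the printed URL into a browser, or fetching it with any HTTP client, should return open-access case metadata only. Controlled-access files still require authorization.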

TCGA Data Analysis Tools:
A comparison of the different tools was published in Sci Signal. 2013 Apr 2;6(269):pl1 (PMID 23550210).

Please note that some of the features have changed since then.

Access the GDC home page. Click on the Analysis tab.  Note links to the cBioPortal for Cancer Genomics and the Broad Firehose.

Click on the Broad Firehose link.  Then select the Broad GDAC link at the top of that page. 

Click on the Browse link under the Analyses column for Head and Neck squamous cell carcinoma.  Click on the Mutation Analyses link.  Click on the Mutation Analyses.  Access the report by clicking on MutSig2.0. Note the different kinds of data analysis available.  We will take a look at the significantly mutated genes.

Click on the + sign next to significantly mutated genes.  The list of all significantly mutated genes can be obtained by clicking on the Get Full Table link.  To see details of the variations in the PIK3CA gene, click on the PIK3CA gene name.  Note the mutation hot spots around position 540.

Go back to the main results page.  Click on Geneset Analysis link.  Access the full report by clicking on the Get Full Table link.

Access the cBioPortal page, either directly or through the GDC page.

Select Head and Neck Squamous Cell Carcinoma (TCGA, Nature 2015) as the Cancer Study.  

Scroll down and type in the genes TP53, CDKN2A, PIK3CA and TRAF3 on separate lines in the “Enter Gene Set” block. Click on the Submit button.  Mouse over the area above the picture on the right-hand side and use the scroll bar to zoom out.
Click on Clinical Tracks and select HPV status.

Note different Sort, Mutation Color, View and Download options.  

Note the differences in the types of mutations by HPV status.  Explore the other tabs, such as Mutual exclusivity, Plots, Mutations, Protein Changes, Network and finally Download.
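To make the idea behind the Mutual exclusivity tab concrete, the sketch below computes a log2 odds ratio for a pair of genes from a 2x2 table of altered and unaltered samples. A negative value suggests a tendency toward mutual exclusivity and a positive value a tendency toward co-occurrence. This is one common summary statistic, and it is an assumption that it matches what the portal reports. The sample identifiers and alteration sets are made-up toy data, not TCGA results, and the 0.5 pseudocount is an illustrative choice to avoid division by zero.

```python
import math


def contingency(altered_a, altered_b, all_samples):
    """Return (both, only_a, only_b, neither) counts for two genes."""
    both = len(altered_a & altered_b)
    only_a = len(altered_a - altered_b)
    only_b = len(altered_b - altered_a)
    neither = len(all_samples - altered_a - altered_b)
    return both, only_a, only_b, neither


def log2_odds_ratio(both, only_a, only_b, neither, pseudocount=0.5):
    """Log2 odds ratio with a small pseudocount added to every cell."""
    odds = ((both + pseudocount) * (neither + pseudocount)) / (
        (only_a + pseudocount) * (only_b + pseudocount)
    )
    return math.log2(odds)


# Toy data (hypothetical): ten samples, two genes altered in disjoint samples.
all_samples = {"S%d" % i for i in range(1, 11)}
gene_a_altered = {"S1", "S2", "S3", "S4"}
gene_b_altered = {"S5", "S6", "S7"}

table = contingency(gene_a_altered, gene_b_altered, all_samples)
score = log2_odds_ratio(*table)
print(table, round(score, 2))
```

A log odds ratio alone does not indicate statistical significance. A test such as Fisher's exact test on the same 2x2 table would be needed for a p-value.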

Plots: Select Clinical Attribute HPV Status on the horizontal axis and Clinical Attribute Overall Survival (Months) on the vertical axis.

On the Mutations tab click on the PIK3CA gene.  Note the mutation hot spots at 542 and 545.
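A hot spot is simply a residue position that is mutated in many samples. The sketch below shows how such positions could be found from a list of protein changes, for example one taken from the data on the Download tab. The input list is a small made-up example, not actual TCGA counts, and the helper names residue_position and find_hotspots are hypothetical. It assumes protein changes are written like E545K or p.E545K.

```python
import re
from collections import Counter

# Matches an optional "p." prefix, one reference residue letter, then the position.
PROTEIN_CHANGE = re.compile(r"^(?:p\.)?[A-Z*](\d+)")


def residue_position(change):
    """Extract the residue number from a change such as 'E545K' or 'p.E545K'."""
    match = PROTEIN_CHANGE.match(change)
    return int(match.group(1)) if match else None


def find_hotspots(changes, min_count=2):
    """Return sorted (position, count) pairs seen at least min_count times."""
    positions = (residue_position(c) for c in changes)
    counts = Counter(p for p in positions if p is not None)
    return sorted((pos, n) for pos, n in counts.items() if n >= min_count)


# Toy input (hypothetical), one entry per mutated sample.
protein_changes = [
    "E545K", "E542K", "H1047R", "E545K",
    "E542K", "N345K", "E545Q", "p.H1047L",
]

print(find_hotspots(protein_changes))
```

Different amino acid substitutions at the same residue (here E545K and E545Q) are counted together, because the grouping is by position rather than by the exact change.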

Click on the Network tab to visualize the network of genes. This page can also be used to explore protein and drug interactions.

Finally, the Download tab can be used to download the data.


Additional information:

TCGA Program Brochure:

TCGA Data Portal Brochure:

TCGA Data Flow:

TCGA Data Classification:

Bioinformatics Program Main Page


Questions, Comments:  Medha Bhagwat, PhD