After more than a decade, The Cancer Genome Atlas (TCGA) program is drawing to a close. A multi-institution collaboration initiated and supported by the National Human Genome Research Institute (NHGRI) and the National Cancer Institute (NCI), with over $300 million in total funding, TCGA has been hugely successful in its mission to catalog the genomic changes underlying multiple cancer types.
Ambitious in scope from the beginning, TCGA was initiated in December 2005 as a three-year pilot study of the genomic changes in three types of human cancer: brain, lung, and ovarian. After a successful pilot phase, TCGA expanded its range, with the goal of characterizing 20-25 different cancer types. The program ultimately surpassed that goal, analyzing the genomic underpinnings of 33 different types of cancer, including 10 rare cancer types.
TCGA's success stems from a combination of its scientific approaches, which capitalized on the plummeting costs of DNA sequencing, and its collaborative style, which engaged researchers from multiple disciplines. The samples from cancer patients were provided by tissue-source sites around the globe. TCGA researchers, at more than two dozen institutions across North America, used a variety of techniques to analyze the tumor and non-tumor samples, including exome and whole-genome sequencing, gene-expression profiling, copy-number variation profiling, and single-nucleotide polymorphism genotyping, among others. In this way, researchers not only looked for the molecular differences that make one cell cancerous and another normal, but also compared and contrasted different cancer types, providing the scientific community with a comprehensive, multi-dimensional map for future investigations. Further, as with all NIH data and resources, the TCGA dataset, comprising more than two petabytes of genomic data, has been made publicly available for researchers worldwide. Access to TCGA datasets is now provided through the NCI Genomic Data Commons. This expansive dataset provides researchers with an unprecedented understanding of how, where, and why tumors arise in humans, enabling better informed clinical trials and future treatments.
As the program draws to an end, researchers from the TCGA consortium recently published a capstone analysis known as the PanCancer Atlas, a collection of 29 papers published across a suite of Cell Press journals that summarize the current state of knowledge about cancer genomics based on the TCGA data. The PanCancer Atlas is divided into three main categories: cell of origin, oncogenic processes, and oncogenic pathways. Each category is anchored by a summary paper that recaps its core findings, and multiple companion papers report in-depth explorations of individual topics within these categories. The entire collection of PanCancer Atlas papers can be found online at Cell's website.
The TCGA consortium is also a member of the International Cancer Genome Consortium (ICGC), which performed its own global analysis of data collected from all available tumor types. The ICGC PanCancer Analysis of Whole Genomes (PCAWG) combined whole-genome sequences from the international projects, with TCGA contributing about half of the data. The challenging task of harmonizing data across international cohorts was feasible only with new paradigms in analysis and computing that are rapidly taking hold in biomedicine.
Finally, as the decade-long TCGA effort wraps up, there will be a three-day symposium, TCGA Legacy: Multi-Omic Studies in Cancer, in Washington, D.C., on September 27-29, 2018. The meeting, which will focus on the future of large-scale cancer studies, will showcase the latest advances in characterizing the genomic architecture of cancer and recent progress toward therapeutic targeting. Abstracts are due by June 15, 2018, and can be submitted here.