The Genomic Data Commons: Unleashing the Power of Cancer Multi-omics for Scientific Discovery

Zhenyu Zhang

Director of Bioinformatics, Center for Translational Data Science, University of Chicago

Seminar Date: June 26, 2023 3:30pm - 4:30pm

Location: University of Chicago, Knapp Center for Biomedical Discovery, Room 1103

Title: The Genomic Data Commons: Unleashing the Power of Cancer Multi-omics for Scientific Discovery

Abstract:
The NCI’s Genomic Data Commons (GDC), with 3.8 PB of data from 87,000 cancer patients and 68 primary sites, is one of the world’s largest cancer genomics data resources for the research community. Within the GDC, more than 40 data analysis pipelines harmonize and extract information from data generated by different experimental strategies, including whole-genome sequencing, whole-exome sequencing, RNA-Seq, miRNA-Seq, genotyping array, methylation array, protein array, scRNA-Seq, and others. The GDC was one of the first large-scale biomedical data platforms to support open APIs so that all its data is Findable, Accessible, Interoperable, and Reusable (FAIR), which is one reason a rich ecosystem of applications has developed around it.

In this presentation, we will give a comprehensive overview of the GDC infrastructure, the many data types it hosts, and the analysis pipelines that underpin it. We will also offer insights into the future enhancements that the GDC is poised to introduce.

Bio: Dr. Zhang is an enthusiastic data analyst and technical leader with over 20 years of experience in biomedical research and data analysis. His graduate work at Fudan University involved designing and testing genetically engineered peptide and nucleotide vaccines against animal viruses. His Ph.D. study at the University of Texas used fruit flies to investigate the role of alternative splicing in embryonic development. During his research on fruit fly genetics, he discovered his love of data analysis and pursued an M.S. degree in Statistics. He then joined the University of Chicago for postdoctoral studies in human genetics and cancer biology, focusing on the transcriptome in engineered cell fusions and on cancer pharmacogenomics. Dr. Zhang joined CTDS in 2013 and was one of the founding members who designed and implemented the NCI Genomic Data Commons (GDC). He is currently the Director of Bioinformatics at CTDS. He also serves as Co-Principal Investigator of the GDC, responsible for defining and driving the strategic direction of its cloud-based data sharing and analysis platforms.