CTDS Core Faculty and PIs

Robert L. Grossman

Robert L. Grossman, Ph.D., is the Frederick H. Rawson Distinguished Service Professor in Medicine and Computer Science and the Jim and Karen Frank Director of the Center for Translational Data Science at the University of Chicago. He joined the faculty in 2010 and has served as the chief research informatics officer of the Biological Sciences Division since 2011. He is also the Chief of the Section of Biomedical Data Science in the Department of Medicine.

He is the principal investigator for the National Cancer Institute Genomic Data Commons (GDC), a platform for the cancer research community that manages, analyzes, integrates, and shares large-scale genomic datasets in support of precision medicine. The GDC was used by more than 100,000 researchers in the past year. He has also built data commons to support research in other areas, including cardiovascular diseases, infectious diseases, gastrointestinal diseases, and the environment. His research interests include data science, machine learning, and deep learning.

He earned his Ph.D. in applied mathematics at Princeton University and an AB in mathematics from Harvard University.


Phil Schumm

L. Philip Schumm, M.A., is the Director of Biostatistics and Statistical Computing in the Center for Translational Data Science. Prior to this, he was the Director of the Research Computing Group and Assistant Director of the Biostatistics Laboratory in the Department of Public Health Sciences, where he had been since 1996.

He is the Co-PI for the HEAL Data Platform, a Gen3 data mesh providing access to data from studies funded by NIH's Helping to End Addiction Long-term (HEAL) Initiative. He also co-leads the Data and Analytics Support Core (DASC) within NIDA's Justice Community Opioid Innovation Network's (JCOIN) Methodology and Advanced Analytics Resource Center (MAARC), for which he has built a Gen3 data commons. He has built data commons and analytic platforms for several other NIH-funded research consortia and groups, including a platform at the University of Chicago for secure management and analysis of Medicare and Medicaid data from CMS. He is a co-investigator and principal biostatistician for the NIA-funded National Social Life, Health and Aging Project (NSHAP). His current methodological work focuses on the measurement and modeling of social networks, cognitive and sensory function, and physical activity, all of which are critical to understanding differences in health trajectories at older ages.

He earned his M.A. in statistics from the University of Chicago, and joined the Center in 2023.


Aarti Venkat

Aarti Venkat, Ph.D., is the Director of Clinical Informatics in the Center for Translational Data Science and Assistant Professor of Medicine at the University of Chicago.

Her research is broadly focused on oncology, specifically i) developing bioinformatics and machine learning methods for multimodal cancer data and ii) building data meshes, systems that researchers can use to perform federated search and learning. She is Co-PI for the Biomedical Research Hub, a leading example of a data mesh for biomedical research.

She earned her Ph.D. in Human Genetics at the University of Chicago, M.S. in Bioinformatics at the University of Illinois Urbana-Champaign, M.S. in Biochemistry at Seth G.S. Medical College, and B.S. in Life Sciences at St. Xavier’s College. She joined the Center and University in 2022.


Zhenyu Zhang

Zhenyu Zhang, Ph.D., is the Director of Bioinformatics in the Center for Translational Data Science at the University of Chicago.

He is a Co-PI and key contributor to the design and development of the National Cancer Institute Genomic Data Commons, a unified repository and cancer knowledge base that enables data sharing across cancer genomic studies in support of precision medicine. His research interests include multi-omics data analysis, statistical modeling and method development, and machine learning.

He earned his Ph.D. in Molecular Biology and MS in Statistics at the University of Texas at Austin. He joined the University in 2010 and has been affiliated with the Center since 2013.


Affiliated Faculty

Publications and grants relevant to CTDS are included for each faculty member.

Yuxin Chen

Yuxin Chen is an Assistant Professor of Computer Science at the University of Chicago. He is also a member of the Committee on Computational and Applied Mathematics and an affiliated faculty at the Data Science Institute. Before joining the University of Chicago, he was a postdoctoral scholar in the Department of Computing and Mathematical Sciences at the California Institute of Technology; prior to that, he received his Ph.D. in computer science from ETH Zurich in 2017. Chen’s research interest lies broadly in probabilistic reasoning and machine learning. More specifically, his research centers around the fundamentals of resource-efficient learning for real-world adaptive experimental design, with the goal of bridging the gap between theory and practice in active learning. He was a recipient of the Google European Doctoral Fellowship in Interactive Machine Learning, the Swiss SNSF Early Postdoc Mobility Fellowship, and the PIMCO Postdoctoral Fellowship in Data Science.

Papers

Zhang, R., Khan, A. A., Chen, Y., & Grossman, R. L. (2023). Enhancing Instance-Level Image Classification with Set-Level Labels. International Conference on Learning Representations (ICLR).

Zhang, R., Khan, A. A., Grossman, R. L., & Chen, Y. (2024). Scalable Batch-Mode Deep Bayesian Active Learning via Equivalence Class Annealing. International Conference on Learning Representations (ICLR).

Maryellen L. Giger

Maryellen L. Giger, Ph.D., is the A.N. Pritzker Distinguished Service Professor of Radiology, Committee on Medical Physics, and the College at the University of Chicago. She is also the Vice-Chair of Radiology (Basic Science Research) and past Director of the CAMPEP-accredited Graduate Programs in Medical Physics/Chair of the Committee on Medical Physics at the University.

She is the PI for the NIBIB Medical Imaging and Data Resource Center (MIDRC), which is based upon the Gen3 software.

For over 30 years, she has conducted research on computer-aided diagnosis, including computer vision, machine learning, and deep learning, in the areas of breast cancer, lung cancer, prostate cancer, lupus, and bone diseases.

She is a former president of the American Association of Physicists in Medicine and a former president of SPIE (the international society for optics and photonics), and she was the inaugural Editor-in-Chief of the SPIE Journal of Medical Imaging.

She is a member of the National Academy of Engineering (NAE) and was awarded the William D. Coolidge Gold Medal from the American Association of Physicists in Medicine, the highest award given by the AAPM. She is a Fellow of AAPM, AIMBE, SPIE, SBMR, IEEE, and IAMBE. In 2013, Giger was named by the International Congress on Medical Physics (ICMP) as one of the 50 medical physicists with the most impact on the field in the last 50 years.

She has more than 300 publications, over 200 of them peer-reviewed, holds more than 30 patents, and has mentored over 100 graduate students, residents, medical students, and undergraduate students.

Grants

Medical Imaging and Data Resource Center (MIDRC) for Rapid Response to COVID-19 Pandemic. Award number 75N92020D00021-P00003-759202000001-1. PI: Maryellen Giger, University of Chicago. National Institute for Biomedical Imaging and Bioengineering. 2020-2022.

Medical Imaging and Data Resource Center (MIDRC) - Medical Imaging in the Biomedical Data Fabric. Advanced Research Projects Agency for Health. 2023-2025.

Papers

Grossman, R. L., Giger, M. L., Johnson, J. A., Marks, J. D., Ridgway, J. P., Solway, J., & Stadler, W. M. (2023). Principles and Guidelines for Sharing Biomedical Data for Secondary Use: The University of Chicago Perspective. arXiv preprint arXiv:2302.02425.


Haryadi Gunawi

Haryadi S. Gunawi is an Associate Professor in the Department of Computer Science at the University of Chicago, where he leads the UCARE research group (UChicago systems research on Availability, Reliability, and Efficiency). He received his Ph.D. in Computer Science from the University of Wisconsin-Madison in 2009. He was a postdoctoral fellow at the University of California, Berkeley from 2010 to 2012. His current research focuses on cloud computing reliability and new storage technology. He has won numerous awards, including the NSF CAREER Award, the NSF Computing Innovation Fellowship, the Google Faculty Research Award, NetApp Faculty Fellowships, and an Honorable Mention for the 2009 ACM Doctoral Dissertation Award.

Grants

Collaborative Research: CSR: Small: ALL-IN-ONE: Strengthening the System Aspects of Large-Scale Genomics Processing Platforms. Award Number 2416213. PI: Robert Grossman (University of Chicago), Haryadi Gunawi (University of Chicago). National Science Foundation. 2024-2027.

Papers

Putra, M. L., Kim, I. K., Gunawi, H. S., & Grossman, R. L. (2023, December). CNT: Semi-Automatic Translation from CWL to Nextflow for Genomic Workflows. In 2023 IEEE 23rd International Conference on Bioinformatics and Bioengineering (BIBE) (pp. 22-27). IEEE Computer Society.

Tong, M. H., Grossman, R. L., & Gunawi, H. S. (2021). Experiences in Managing the Performance and Reliability of a Large-Scale Genomics Cloud Platform. In 2021 USENIX Annual Technical Conference (USENIX ATC 21) (pp. 973-988).

Aly A. Khan

Aly A. Khan, Ph.D. is a research faculty member in the Department of Pathology at the University of Chicago. His lab focuses on developing novel computational methods to better understand how immune cells interact with each other, the surrounding tissue and organ systems, and the microbiome. Prior to joining the University of Chicago in 2019, Dr. Khan was a member of the research faculty at the Toyota Technological Institute in Chicago, where he led an independent research program in computational immunology. In addition to his academic work, he has worked on computational biology in industry, including at Merck, Genentech, and Tempus Labs. He received his Ph.D. in Computational Biology jointly from Cornell University and Memorial Sloan Kettering Cancer Center.

Papers

Song, S., Mohsin, E., Kuznetsov, A., Weber, C., Grossman, R. L., & Khan, A. A. (2023, October). ATAT: Automated Tissue Alignment and Traversal. In NeurIPS 2023 AI for Science Workshop.

Zhang, R., Khan, A. A., Chen, Y., & Grossman, R. L. (2023). Enhancing Instance-Level Image Classification with Set-Level Labels. arXiv preprint arXiv:2311.05659.

Zhang, R., Khan, A. A., Grossman, R. L., & Chen, Y. (2021). Scalable Batch-Mode Deep Bayesian Active Learning via Equivalence Class Annealing. arXiv preprint arXiv:2112.13737.

Zhang, R., Khan, A. A., & Grossman, R. L. (2020, October). Evaluation of hyperbolic attention in histopathology images. In 2020 IEEE 20th International Conference on Bioinformatics and Bioengineering (BIBE) (pp. 773-776). IEEE.

Alexander Pearson

Alexander Pearson, MD, Ph.D., is Assistant Professor of Medicine in the Sections of Hematology/Oncology and Computational Biomedicine and Biomedical Data Science at the University of Chicago, and co-director of the head/neck cancer program at the University of Chicago Comprehensive Cancer Center.

He runs a laboratory that has a combined focus on integrating cancer biology techniques into mathematical modeling frameworks as well as developing machine learning-based cancer biomarkers. His goal as a practicing physician is to maintain a cutting-edge, data-driven, research-intensive clinical practice with clinical trial protocols evaluating scientific questions across the full natural history of solid tumors. He is a highly collaborative computational oncology researcher, and his research partners include multiple branches of the National Institutes of Health, the US Department of Energy, and multiple research foundations.

He earned his MD and Ph.D. in Statistics from the University of Rochester as part of the Medical Scientist Training Program. He completed an internship, Internal Medicine Residency, and Hematology/Oncology Fellowship at the University of Michigan Physician Scientist Training Program. He joined the University in 2017 and has been affiliated with the Center since 2020.

Papers

Dolezal, J. M., Wolk, R., Hieromnimon, H. M., Howard, F. M., Srisuwananukorn, A., Karpeyev, D., Ramesh, S., Kochanny, S., Kwon, J. W., Agni, M., Simon, R. C., Desai, C., Kherallah, R., Nguyen, T. D., Schulte, J. J., Cole, K., Khramtsova, G., Garassino, M. C., Husain, A. N., Li, H., … Pearson, A. T. (2023). Deep learning generates synthetic cancer histology for explainability and education. NPJ precision oncology, 7(1), 49. https://doi.org/10.1038/s41698-023-00399-4

Hieromnimon, H. M., Dolezal, J., Doytcheva, K., Howard, F. M., Kochanny, S., Zhang, Z., … Riesenfeld, S. J. (2023). Latent transcriptional programs reveal histology-encoded tumor features spanning tissue origins. bioRxiv. doi:10.1101/2023.03.22.533810

Howard, F. M., Dolezal, J., Kochanny, S., Schulte, J., Chen, H., Heij, L., Huo, D., Nanda, R., Olopade, O. I., Kather, J. N., Cipriani, N., Grossman, R. L., & Pearson, A. T. (2021). The impact of site-specific digital histology signatures on deep learning model accuracy and bias. Nature communications, 12(1), 4423. https://doi.org/10.1038/s41467-021-24698-1

John Schneider

Dr. John Schneider, MD, MPH, Professor of Medicine and Epidemiology, is a network epidemiologist and infectious disease specialist in the Departments of Medicine and Public Health Sciences at the University of Chicago. He is also Director of the University of Chicago Center for HIV Elimination (http://hivelimination.uchicago.edu/). His NIH- and CDC-funded research focuses on social network interventions that can lead to disease elimination in the most vulnerable populations in resource-restricted settings. In these settings, he also implements network interventions and computational modeling to eliminate new HIV transmission events. Clinically, he specializes in HIV prevention and has a specific interest in the provision of high-quality care to young Black men who have sex with men and transgender women. He has extensive experience with advancing the physician-patient relationship in resource-restricted settings, including at his current clinic at a Federally Qualified Health Center on the South Side of Chicago (Howard Brown Health 55th) and during his previous time working in Southern India.

Grants

Community network driven COVID-19 testing of vulnerable populations in the Central US. Award Number 3UG1DA050066-03S1. PI: Harold Alexander Pollack (University of Chicago), Mai Tuyet Pho (University of Chicago), John Schneider (University of Chicago). National Institute on Drug Abuse. 2019-2023.

Methodology and Advanced Analytics Resource Center (MAARC). Award Number 1U2CDA050098-01. PI: John Schneider (University of Chicago) and Harold Alexander Pollack (University of Chicago). National Institute on Drug Abuse. 2019-2024.

 

Samuel Volchenboum

Dr. Sam Volchenboum is a Professor in the Department of Pediatrics and Associate Chief Research Informatics Officer for the Biological Sciences Division. He is the Associate Dean of Master’s Education and the Informatics Lead for the Institute for Translational Medicine. He is the Program Director for the Clinical Informatics Fellowship Program. His clinical specialty is pediatric hematology/oncology, caring for kids with cancer and diseases of the blood.

In addition to his clinical practice, he directs Data for the Common Good, a research group dedicated to liberating and democratizing data. Its largest project, the Pediatric Cancer Data Commons, is the world’s biggest publicly available repository for data from children with cancer. Until 2019, Dr. Volchenboum directed the Center for Research Informatics, a 40-person group that supports biological research throughout the division. As director of this center, he oversaw high-performance computing, HIPAA-compliant storage and backup, application development to support clinical trials, development and maintenance of the clinical trials management system, the clinical research data warehouse, data analytics and visualization, and bioinformatics, including high-throughput genomic analyses and machine learning.

Papers

Wyatt, K. D., Graglia, L., Furner, B., Kang, B., Fitzsimons, M., Grossman, R. L., & Volchenboum, S. L. (2024). An open-source platform for pediatric cancer data exploration: a report from Data for the Common Good. JAMIA open, 7(1), ooae004.

Bao, R., Spranger, S., Hernandez, K., Zha, Y., Pytel, P., Luke, J. J., Gajewski, T. F., Volchenboum, S. L., Cohn, S. L., & Desai, A. V. (2021). Immunogenomic determinants of tumor microenvironment correlate with superior survival in high-risk neuroblastoma. Journal for immunotherapy of cancer, 9(7), e002417. https://doi.org/10.1136/jitc-2021-002417
