Our goal is to develop novel algorithms and tools that ultimately help patients with cancer. We use preclinical data (e.g., patient-derived cell lines, patient-derived xenografts (PDXs), and organoids) and clinical data from patients, including genomic, transcriptomic, and epigenetic datasets, as well as electronic health records (EHRs) and image data (e.g., H&E and IHC digital slides and CT/PET/MRI images), to identify prognostic and predictive biomarkers, evaluate treatment stratification, and predict cancer outcomes.
In particular, we draw on rich, long-term follow-up clinical outcome data from our institute to deliver biomarkers and methodologies that are readily useful in the clinical setting. A distinctive feature of our approach is the use of relatively large cohorts for a specific treatment or drug, including immunotherapy and novel agents from clinical trials.
These are a few examples of our current projects:
- Identifying clinically useful predictive biomarkers to guide treatment decisions.
The goal of this project is to identify actionable predictive biomarkers based on FDA-approved/CLIA-certified clinical genetic tests from both tumor and liquid biopsies (e.g., Foundation Medicine and Guardant).
We develop novel machine learning algorithms that integrate genetic data currently used in clinical practice with EHRs, in order to provide better treatment guidance and expand the clinical utility of currently available genetic tests, which tend to be useful for fewer than 20% of cancer patients.
- Developing novel machine learning (including deep learning) algorithms and tools to integrate image and genomic data with EHRs.
The goal of this project is to develop novel algorithms and tools to identify image-based and/or integrated image and genomic biomarkers to better stratify patients and guide treatment decisions.
- Developing new algorithms and tools for real-time long-read sequencing (i.e., Nanopore) and single-cell sequencing data to identify novel predictive biomarkers that better stratify patients who are likely to benefit from new treatments, and to understand the molecular mechanisms of treatment resistance.
In collaboration with the Angela Ting lab, we generate our own experimental data using Oxford Nanopore and single-cell sequencing technology with preclinical and clinical specimens. We are currently working on identifying predictive biomarkers for immunotherapy in lung, kidney, and bladder cancers and melanoma, as well as for CAR-T cell therapy in lymphoma.
Machine Learning and Data Mining of Electronic Health Records
We are generally interested in developing novel machine learning and data mining algorithms and tools that use complex, heterogeneous datasets to better predict a given patient’s clinical outcomes and to support biomarker discovery. We use one of the largest single-institute EHR datasets, from Cleveland Clinic, which includes more than 2 million unique patients and more than 2 billion total records.
Understanding Phenotype-Genotype Effects in Complex Diseases
We are interested in developing novel graph/network-based computational approaches to accomplish the following: 1) prioritize novel candidate disease genes identified from high-throughput genomic studies and 2) classify human diseases and discover disease-related pathways based on phenotypic and molecular information. The common goal of these approaches is to develop graph/network-based integrative algorithms that interpret results from the enormous volume of high-throughput genomic studies, refine biologically and clinically relevant gene and pathway signatures, and improve disease diagnosis, prognosis, and classification. Currently, we are expanding our approaches to include EHRs and image data to better understand novel phenotypes and their association with treatment outcomes.
Cancer Biology and Drug Development
We are broadly interested in further understanding the biology of cancer, with a focus on translational computational research. We develop and apply innovative, customized computational approaches to experimental and complex datasets from preclinical and human sources (e.g., cell lines, organoids, PDXs, and human samples) to generate new insights, deepen understanding, suggest and interpret experiments, and test hypotheses. Our computational approaches have led to a better understanding of cancer development, evolution, treatment, and drug resistance, as well as to the discovery of new biomarkers and therapeutic agents.