Cancer genomics Yao Fu March 4, 2015. Cancer is a genetic disease In the early 1970’s, Janet Rowley’s microscopy studies of leukemia cell chromosomes.

Slides:



Advertisements
Similar presentations
Evolution of genomes.
Advertisements

Vanderbilt Center for Quantitative Sciences Summer Institute Sequencing Analysis Yan Guo.
Discovery of Structural Variation with Next-Generation Sequencing Alexandre Gillet-Markowska Gilles Fischer Team – Biology.
We processed six samples in triplicate using 11 different array platforms at one or two laboratories. we obtained measures of array signal variability.
Ruibin Xi Peking University School of Mathematical Sciences
Bioinformatics at Molecular Epidemiology - new tools for identifying indels in sequencing data Kai Ye
Bioinformatics lectures at Rice University Li Zhang Lecture 10: Networks and integrative genomic analysis-2 Genome instability and DNA copy number data.
Genomics, Cancers & Infectious Diseases Qunyuan Zhang Division of Statistical Genomics Washington University School of Medicine.
Next-generation sequencing: informatics & software aspects Gabor T. Marth Boston College Biology Department Harvard Nanocourse October 7, 2009.
1000 Genomes SV detection Boston College Chip Stewart 24 November 2008.
Polymorphisms – SNP, InDel, Transposon BMI/IBGP 730 Victor Jin, Ph.D. (Slides from Dr. Kun Huang) Department of Biomedical Informatics Ohio State University.
Multi-dimensional Genomic Profiling of Acute Leukemias Characterized by MLL gene rearrangements Eunice S. Wang MD (Medicine) and Norma J. Nowak PhD (Cancer.
TGCAAACTCAAACTCTTTTGTTGTTCTTACTGTATCATTGCCCAGAATAT TCTGCCTGTCTTTAGAGGCTAATACATTGATTAGTGAATTCCAATGGGCA GAATCGTGATGCATTAAAGAGATGCTAATATTTTCACTGCTCCTCAATTT.
Supplementary Figure 1. Somatic mutation spectrum # Substitutions # Substitutions per Mb b c a Repeats Pseudogenes Whole genome Splice sites Non-coding.
NGS Workshop Variant Calling
Dr Katie Snape Specialist Registrar in Genetics St Georges Hospital
Large-Scale Copy Number Polymorphism in the Human Genome J. Sebat et al. Science, 305:525 Luana Ávila MedG 505 Feb. 24 th /24.
Whole Exome Sequencing for Variant Discovery and Prioritisation
Detecting copy number variations using paired-end sequence data Nick Furlotte CS224 May 29, 2009.
NGS Workshop Variant Calling and Structural Variants from Exomes/WGS
NGS Cancer Systems Biology Workshop Variant Calling and Structural Variants from Exomes/WGS Ramesh Nair May 30, 2014.
Epigenome 1. 2 Background: GWAS Genome-Wide Association Studies 3.
Biotechnology SB2.f – Examine the use of DNA technology in forensics, medicine and agriculture.
Constitutional (germ-line) variants in hereditary conditions
The medical relevance of genome variability Gabor T. Marth, D.Sc. Department of Biology, Boston College
Analyzing DNA Differences PHAR 308 March 2009 Dr. Tim Bloom.
Computational research for medical discovery at Boston College Biology Gabor T. Marth Boston College Department of Biology
Genetics-multistep tumorigenesis genomic integrity & cancer Sections from Weinberg’s ‘the biology of Cancer’ Cancer genetics and genomics Selected.
Fig Chapter 12: Genomics. Genomics: the study of whole-genome structure, organization, and function Structural genomics: the physical genome; whole.
Next-Generation Sequencing
High throughput sequencing: informatics & software aspects Gabor T. Marth Boston College Biology Department BI543 Fall 2013 January 29, 2013.
A Genome-wide association study of Copy number variation in schizophrenia Andrés Ingason CNS Division, deCODE Genetics. Research Institute of Biological.
Genomics Method Seminar - BreakDancer January 21, 2015 Sora Kim Researcher Yonsei Biomedical Science Institute Yonsei University College.
Doug Brutlag 2011 Genomics & Medicine Doug Brutlag Professor Emeritus of Biochemistry &
Sangwoo Kim, Ph.D. Assistant Professor,
Copy Number Variation Eleanor Feingold University of Pittsburgh March 2012.
Identification of Copy Number Variants using Genome Graphs
Lecture 6. Functional Genomics: DNA microarrays and re-sequencing individual genomes by hybridization.
MEME homework: probability of finding GAGTCA at a given position in the yeast genome, based on a background model of A = 0.3, T = 0.3, G = 0.2, C = 0.2.
Lecture 11. Topics in Omic Studies (Cancer Genomics, Transcriptomics and Epignomics) The Chinese University of Hong Kong CSCI5050 Bioinformatics and Computational.
No reference available
Supplemental Figure 1. Bias-corrected NGS bioinformatics strategies. Paired-end DNA sequencing reveals the sequence of the genomic clone, the sample ID.
Ke Lin 23 rd Feb, 2012 Structural Variation Detection Using NGS technology.
Chapter 2 Genetic Variations. Introduction The human genome contains variations in base sequence from one individual to another. Some sequence variants.
Analyzing DNA using Microarray and Next Generation Sequencing (1) Background SNP Array Basic design Applications: CNV, LOH, GWAS Deep sequencing Alignment.
GSVCaller – R-based computational framework for detection and annotation of short sequence variations in the human genome Vasily V. Grinev Associate Professor.
Notes: Human Genome (Right side page)
Recent Advances in Genomic Science Julian Sampson Institute of Medical Genetics, Cardiff.
Global Variation in Copy Number in the Human Genome Speaker: Yao-Ting Huang Nature, Genome Research, Genome Research, 2006.
Canadian Bioinformatics Workshops
Different microarray applications Rita Holdhus Introduction to microarrays September 2010 microarray.no Aim of lecture: To get some basic knowledge about.
1 Finding disease genes: A challenge for Medicine, Mathematics and Computer Science Andrew Collins, Professor of Genetic Epidemiology and Bioinformatics.
Genome evolution within the individual
Canadian Bioinformatics Workshops
From Reads to Results Exome-seq analysis at CCBR
A comparison of somatic mutation callers in breast cancer samples and matched blood samples THOMAS BRETONNET BIOINFORMATICS AND COMPUTATIONAL BIOLOGY UNIT.
Canadian Bioinformatics Workshops
Interpreting exomes and genomes: a beginner’s guide
Chapter 14 GENETIC VARIATION.
SVs and CNVs They are often confused…
DNA Marker Lecture 10 BY Ms. Shumaila Azam
What makes a mutant?.
Mutations Changes in the genetic material Gene Mutations
14-3 Human Molecular Genetics
Linking Genetic Variation to Important Phenotypes
Genomic alterations in breast cancer cell line MDA-MB-231.
BF528 - Genomic Variation and SNP Analysis
Canadian Bioinformatics Workshops
Section 20.4 Mutations and Genetic Variation
SNPs and CNPs By: David Wendel.
Presentation transcript:

Cancer genomics Yao Fu March 4, 2015

Cancer is a genetic disease In the early 1970’s, Janet Rowley’s microscopy studies of leukemia cell chromosomes suggested that specific alterations led to cancer, laying the foundation for cancer genomics. From: E Mardis, NHGRI Current Topics in Genome Analysis 2014 J Rowley, 2008, Blood 1

2 It took ~40 years from the discovery of the translocation to the drug – Gleevec Can we scale up from this example to all cancer genes ?

Somatic Mutations 3 - DNA mitosis - Not subjected to natural selection SNV – single nucleotide variants Indel – small insertion / deletion SVs – structural variations, include – Copy number variants Deletions Insertions/duplications – Inversions – Translocations Feuk et al. Nature Reviews Genetics 7, 85-97

Cancer genome sequencing 4 From: wikipedia

Variant calling – SNP, Indels 5 The Genome Analysis Tool Kit (GATK) Broad Institute

6 UnifiedGenotyper - Call SNPs and indels separately by considering each variant locus independently HaplotypeCaller - Call SNPs, indels and some SVs simultaneously by performing a local de novo assembly Genetic variant or random noise ? -- large scale Bayesian inference

7 Broad Institute UnifiedGenotyper

Variant calling – SNP, Indels 8 MuTect -Reliable and accurate identification of somatic point mutations in cancer genomes. 1.Preprocessing the aligned reads in the tumor and normal sequencing data. 2. Two Bayesian classifiers - whether tumor is non-reference at a given site - makes sure the normal does not carry the variant allele 3. Post-processing of candidate somatic mutations K Cibulskis et al., Nature biotechnology, 2013

Variant calling – SNP, Indels 9 Other Methods: VarScan Strelka SomaticSniper JointSNVMix

Submicroscopic CNVs: Array CGH* *Frequently referred to as “chromosome microarray” From: wikipedia 10

Limitations of Array CGH Can’t detect translocations and inversions Resolution still limited by number of probes on the array—typical resolution about 100 kb Still a fair amount of variability in results depending on exactly which array is used Adopted from Allen E. Bale 11

12 Variant calling - SVs -Copy number variation -Inversions -Translocations  Depth-of-coverage methods Regions that are deleted or duplicated should yield lesser or greater numbers of reads  Paired-end mapping approach Are the sequences at two ends of a fragment both from the same chromosome? Are they the right distance apart?  Split-read approach Direct sequencing of breakpoints  Sequence assembly comparison De novo assembly Adopted from Allen E. Bale

13 Image: Snyder et al. Genes Dev Mar 1;24(5): Ref: Yoon et al. Genome Res September; 19(9): 1586–1592.  Copy number neutral events  repeat-rich regions

Paired-End Mapping Insertion Deletion Reference Target Breakpoint Inversion Breakpoint Inversion Breakpoint Reference Paired-End Sequencing and Mapping Span << expected Span >> expectedAltered end orientation 14 From: Hugo  Both paired-ends map within repeats.  Limited the distance between pairs; therefore, neither large nor very small rearrangements can be detected

Split-read Analysis Deletion Reference Read Reference Read Breakpoint Insertion 15 Target Genome From: Hugo

16

 Depth-of-coverage CNVnator (Abyzov et al., 2011)  Paired-end mapping PEMer (Korbel et al., 2009): For discovery of CNVs and inversions; could also be implemented for translocations Breakdancer (Chen et al., 2009): For discovery of CNVs, inversions, and translocations GenomeSTRiP (Broad institute): whole-genome, integrating read depth, paired end; population level feature  Programs for analysis of longer reads that directly sequence breakpoints CREST (Wang et. al., 2011): Detects small and large structural variants by direct sequencing of breakpoints. SRiC (Zhang et al., 2011): Similar to CREST Algorithm for strobe reads (Ritz et al., 2010) 17 Adopted from Allen E. Bale

Driver vs. Passenger mutations The driver genes are selected for by the growth process. The passenger genes are defined by mutations that occurred in the process but have no effect on the tumor. 18

19 Identify functional genetic variants in cancer

20 Annotate and analyze individual genetic alterations SIFT (Ng et al., 2003): missense SNPs Polyphen2 (Adzhubei et al., 2010): missense SNPs, machine-learning method ANNOVAR (Wang et al., 2010): variant annotation tool CHASM (Cater et al., 2009): trained on known driver missense mutations. PANTHER (Thomas et al., 2006): missense variants MutationAssessor (Reva et al., 2011): missense variants, evolutionary conserved positions that contribute to protein functional specificity. … Base-wise: PhyloP, GERP Small regions: PhastCons

Population based analysis to identify driver genes 21 D Tamborero et al., Comprehensive identification of mutational cancer driver genes across 12 tumor types, nature 2013

Cancer Gene Census 22 M Patel et al., nature reviews drug discovery 2013 The Cancer Gene Census is an ongoing effort to catalogue those genes for which mutations have been causally implicated in cancer.

Tumor heterogeneity 23 Tumor heterogeneity describes differences between tumors of the same type in different patients, and between cancer cells within a tumor.

Whole-exome or whole-genome sequencing? CNV, SV detection Noncoding cancer drivers 24

Pan-Cancer Analysis of Whole Genomes 25 The goal is to analyze the genomes, including genome-wide sequence data, of approximately 2000 pairs of tumor and normal samples, and integrate those results with clinical and other molecular data on the same cases

Dendritic Cell-based Vaccination 26 From: E Mardis, NHGRI Current Topics in Genome Analysis 2014 B Goldman et al., Nature Biotechnology, 2009

Thank you for listening 27