DNA copy number variation and cancer risk
John F Pearson, Canterbury Statistics Open Day, University of Canterbury, 2/10/2012

Presentation transcript:


2 Breast Cancer Foulkes WD. N Engl J Med 2008; 359:

3 Missing heritability TA Manolio et al. Nature 461, (2009) doi: /nature08494

4 Evan E. Eichler.

5 Copy number variation
Figure labels: Allele 1, Allele 2; Copy number loss, Copy number gain
Affected regions: Whole gene, Partial gene, Contiguous genes, Regulatory effects

6 Copy number variants (CNVs)
o 16,000 copy number variant loci cover >50% of the human genome
o CNVs are associated with cancer risk
  o Rare CNVs detected in ~50% of familial cancer genes, e.g. BRCA1, BRCA2
  o Genome-wide association studies of cancer: prostate cancer, hepatocarcinoma, nasopharyngeal carcinoma, and neuroblastoma
  o Increased CNV load: Li-Fraumeni syndrome (cancer-related genes?), breast cancer (TP53 pathway, ESR1 pathway)

7 SNP arrays
$\mathrm{LRR} = \log_2(R_{\text{observed}} / R_{\text{expected}})$
The B Allele Frequency (BAF) is a somewhat confusing term that actually refers to a normalized measure of the relative signal intensity ratio of the B and A alleles.
Wang et al. Genome Res November; 17(11): 1665–1674.
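The LRR definition can be sketched in a few lines of Python. The function name and the intensity values are illustrative assumptions, not taken from the slides. The example also assumes that signal scales linearly with copy number, which real arrays only approximate.

```python
import math


def log_r_ratio(r_observed: float, r_expected: float) -> float:
    """Return LRR = log2(R_observed / R_expected) for one SNP."""
    if r_observed <= 0 or r_expected <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(r_observed / r_expected)


# Idealized, made-up intensities (assumption: signal is proportional to copy number).
normal = log_r_ratio(1.0, 1.0)  # two copies: observed equals expected
loss = log_r_ratio(0.5, 1.0)    # one copy: half the expected signal
gain = log_r_ratio(1.5, 1.0)    # three copies: 1.5 times the expected signal
```

Under this idealization a single-copy loss gives a negative LRR and a single-copy gain a positive one; observed values are typically closer to zero than these.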

8 Genomic location

9 Copy number
Genotype labels: AA, AB, B
Panels: Normal, Copy neutral LOH, Copy number loss

10 Copy number gain AAA AAB ABB BBB
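For the three-copy genotypes on this slide, the idealized BAF is simply the fraction of B alleles in the genotype. The sketch below assumes a noise-free signal; the function name is hypothetical.

```python
def expected_baf(genotype: str) -> float:
    """Idealized B allele frequency: fraction of B alleles in the genotype."""
    if not genotype or set(genotype) - {"A", "B"}:
        raise ValueError("genotype must be a non-empty string of A and B")
    return genotype.count("B") / len(genotype)


# The four genotypes possible with one extra copy (copy number 3).
gain_bafs = {g: expected_baf(g) for g in ["AAA", "AAB", "ABB", "BBB"]}
```

A copy number gain therefore shows BAF bands near 0, 1/3, 2/3 and 1, instead of the 0, 1/2 and 1 seen for a normal two-copy region.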

11 CNV calling
Illumina bead arrays. CNV calling algorithms:
o CNVision (workflow software)
o Gnosis
o PennCNV
o QuantiSNP
o CNV Partition

12 Hidden Markov Model
Estimate copy number at each SNP from:
o Log R ratio
o B allele frequency
o transition probability from the state at the previous SNP
Examples: PennCNV, QuantiSNP

13 PennCNV

14 PennCNV
Notation: $r_i$ is the LRR and $b_i$ the BAF at SNP $i$ ($1 \le i \le M$); $z_i$ is the copy number state.
The likelihood of the observed data is: [equation on slide]
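As a sketch, assuming that LRR and BAF are conditionally independent given the copy number state (an assumption, not confirmed by the slide text), the emission part of the likelihood would factor over SNPs as:

```latex
P(r, b \mid z) = \prod_{i=1}^{M} P(r_i \mid z_i)\, P(b_i \mid z_i)
```

Here $P(r_i \mid z_i)$ and $P(b_i \mid z_i)$ are the LRR and BAF emission probabilities for state $z_i$.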

15 PennCNV
Notation: $r_i$ is the LRR and $b_i$ the BAF at SNP $i$ ($1 \le i \le M$); $z_i$ is the copy number state.
The likelihood of the observed data is: [equation on slide]
o LRR emission probability: the model includes a term for chemical fluctuations and misannotation/assembly
o BAF emission probability: a complicated mixture model
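One way to include a term for random fluctuations and misannotation in an LRR emission probability is to mix a small uniform component with a normal density centred on the state's expected LRR. The sketch below assumes that form. The mixing weight, LRR range, means and standard deviations are hypothetical, not PennCNV's trained parameters.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def lrr_emission(r, mean, sd, pi_uniform=0.01, lrr_range=(-5.0, 5.0)):
    """Mixture density: uniform background plus a state-specific normal.

    pi_uniform and lrr_range are illustrative assumptions.
    """
    low, high = lrr_range
    uniform = 1.0 / (high - low) if low <= r <= high else 0.0
    return pi_uniform * uniform + (1.0 - pi_uniform) * normal_pdf(r, mean, sd)


# A typical value near the state mean versus an outlier far from it.
typical = lrr_emission(0.0, mean=0.0, sd=0.2)
outlier = lrr_emission(4.0, mean=0.0, sd=0.2)
```

The uniform term keeps the density of an outlying LRR bounded away from zero, so a single aberrant probe cannot dominate the state path.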

16 PennCNV
Notation: $r_i$ is the LRR and $b_i$ the BAF at SNP $i$ ($1 \le i \le M$); $z_i$ is the copy number state.
Transition probabilities between two adjacent SNPs $i-1$ and $i$, with copy number states $z_{i-1}$ and $z_i$, at distance $d_i$: [equation on slide]
$D$ = 100 Mb for state 4, 100 kb for other states.
The $p$ are unknowns, estimated by the Baum-Welch algorithm.
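A minimal sketch of a distance-dependent transition of this kind follows. It assumes that the probability of leaving a state scales by (1 - exp(-d/D)), and that D is chosen by the state being left; the exact form on the slide may differ. The baseline p values and the state labels other than 3 and 4 are hypothetical.

```python
import math

D_DEFAULT = 100_000      # 100 kb, for states other than 4
D_STATE_4 = 100_000_000  # 100 Mb, for state 4


def transition_row(p_row, from_state, distance_bp):
    """Return {to_state: probability} for one step of distance_bp base pairs.

    p_row maps each other state to its baseline probability p (assumed values).
    """
    d_const = D_STATE_4 if from_state == 4 else D_DEFAULT
    scale = 1.0 - math.exp(-distance_bp / d_const)
    row = {to: p * scale for to, p in p_row.items() if to != from_state}
    row[from_state] = 1.0 - sum(row.values())  # remaining mass stays put
    return row


# Hypothetical baseline probabilities for leaving state 3.
P_FROM_STATE_3 = {1: 0.01, 2: 0.02, 4: 0.01, 5: 0.02, 6: 0.01}
near = transition_row(P_FROM_STATE_3, 3, 5_000)
far = transition_row(P_FROM_STATE_3, 3, 1_000_000)
```

With this form, closely spaced SNPs are very likely to share a state, and the chance of a state change grows with the gap between markers.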

17 PennCNV
Notation: $r_i$ is the LRR and $b_i$ the BAF at SNP $i$ ($1 \le i \le M$); $z_i$ is the copy number state.
o Baum-Welch used to train the model
o Viterbi algorithm used to infer the most likely path
o CNV called whenever a stretch of states is different from normal (usually state 3 or 4)
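The decoding and calling steps can be illustrated with a deliberately simplified model. The sketch below uses three states, LRR only, normal emissions, and fixed hand-picked parameters instead of Baum-Welch training; it is not PennCNV's implementation, and every parameter value is an assumption.

```python
import math

STATES = ["loss", "normal", "gain"]
MEANS = {"loss": -0.6, "normal": 0.0, "gain": 0.4}  # hypothetical LRR means
SD = 0.2      # hypothetical LRR standard deviation
STAY = 0.98   # hypothetical probability of keeping the same state


def log_normal_pdf(x, mean, sd):
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))


def log_trans(prev, cur):
    if prev == cur:
        return math.log(STAY)
    return math.log((1.0 - STAY) / (len(STATES) - 1))


def viterbi(lrr_values):
    """Return the most likely state path for a sequence of LRR values."""
    if not lrr_values:
        return []
    # Uniform initial state distribution (assumption).
    score = {s: -math.log(len(STATES)) + log_normal_pdf(lrr_values[0], MEANS[s], SD)
             for s in STATES}
    backpointers = []
    for x in lrr_values[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + log_normal_pdf(x, MEANS[cur], SD))
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def call_cnvs(path, normal_state="normal"):
    """Return (start_index, end_index, state) for each run of non-normal states."""
    calls, start = [], None
    for i, s in enumerate(path + [normal_state]):  # sentinel closes a trailing run
        if start is not None and s != path[start]:
            calls.append((start, i - 1, path[start]))
            start = None
        if start is None and s != normal_state:
            start = i
    return calls


# Made-up LRR values with a run of low signal in the middle.
lrr = [0.02, -0.05, 0.01, -0.62, -0.58, -0.66, -0.61, 0.03, -0.01, 0.0]
path = viterbi(lrr)
calls = call_cnvs(path)
```

Working in log space avoids numerical underflow on long SNP sequences, and the high stay probability is what prevents isolated noisy probes from being called as CNVs.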

18 Copy number gain AAA AAB ABB BBB

19 Noisy data

20 Breast cancer
A characteristic of breast tumour cells is genomic instability.
BRCA1, BRCA2

21 BRCA1: known large deletions
Sample ID   BRCA1 mutation
EMB         del exons 2-24
EMB         del exons 3-19
EMB         del exons 1-23
EMB         del exons 1-21
EMB         del exons 1-23
EMB         del exons 1-23
EMB         del exons 1-21
GEM         del exons
PAD         del exons 9-19
EMB         del exons 1-17
EMB         del exons 1-17
KCO         del exons 1-17
EMB         del exons 8-13
GEM         del exons 8-13

Sample ID   BRCA1 mutation
EMB         del exons 3-19
EMB         del exons 1-17

Legend: Detected, Not detected

CNV prediction summary:
cnvPartition   25% (4/16)
GNOSIS         19% (3/16)
PennCNV        88% (14/16)
QuantiSNP      81% (13/16)
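The percentages in the prediction summary follow directly from the detected counts out of 16 known deletions. A small sketch of that arithmetic, using the counts from the slide (variable names are illustrative):

```python
# Number of known BRCA1 deletions detected by each algorithm (from the slide).
detected = {"cnvPartition": 4, "GNOSIS": 3, "PennCNV": 14, "QuantiSNP": 13}
TOTAL_KNOWN_DELETIONS = 16

# Detection rate (sensitivity on this set of known deletions).
rates = {name: n / TOTAL_KNOWN_DELETIONS for name, n in detected.items()}
best = max(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name}: {rate:.1%} ({detected[name]}/{TOTAL_KNOWN_DELETIONS})")
```

These rates measure sensitivity only; they say nothing about false positive calls, which would need samples without deletions to assess.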

22 Endometrial cancer
Flowchart labels: QC(1) – GWAS criteria; CNV calling by 4 algorithms; Case vs. control analyses
Sample counts, in order of appearance:
o 1343 cases (ANECS, SEARCH), 655 female controls (Hunter Community Study)
o 1279 cases, 619 controls
o 1210 cases, 612 controls
Want to find:
1. CNVs overlapping known susceptibility genes
2. Novel CNVs in the mismatch repair pathway
3. Common or rare CNV associations

23 CNV frequency: all
               Case   Control   Difference   P
1,
Total CNVs                                   NS
Deletions                                    NS
Duplications                                 NS
Exons                                        NS
Mean CNV per sample

24 CNV frequency: rare (< 1%)
               Case   Control   Difference   P
1,
Total CNVs                                   E-05
Deletions                                    E-06
Duplications                                 NS
Exons                                        E-04
Mean rare CNV per sample

25 CNV frequency: rare (< 1%)
               Case   Control   Difference   P
1,
Total CNVs                                   E-05
Deletions                                    E-06
Duplications                                 NS
Exons                                        E-04
Mean rare CNV per sample

26 Association study: CNV regions
Columns: Case, Control, P adjusted, Chr
X X X X X X

27 Association study: CNV overlapping genes
Columns: Case, Control, P adjusted, Chr
X X
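Finding CNVs that overlap genes reduces to an interval overlap test on the same chromosome. The sketch below is one simple way to do it; the gene names and all coordinates are made up for illustration and do not come from the study.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def cnvs_overlapping_genes(cnvs, genes):
    """Map each gene name to the list of CNVs that overlap it."""
    return {name: [c for c in cnvs if overlaps(c, g)] for name, g in genes.items()}


# Hypothetical gene and CNV coordinates.
genes = {
    "GENE_A": Interval("chr1", 1_000, 5_000),
    "GENE_B": Interval("chr2", 10_000, 20_000),
}
cnvs = [
    Interval("chr1", 4_500, 9_000),    # overlaps the end of GENE_A
    Interval("chr1", 6_000, 7_000),    # between genes
    Interval("chr2", 12_000, 13_000),  # inside GENE_B
]
hits = cnvs_overlapping_genes(cnvs, genes)
```

This nested scan is quadratic in the worst case; for genome-wide call sets a sorted sweep or an interval tree would be the usual choice.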

28

29 Acknowledgements
University of Otago
  Gemma Moir-Meyer, Logan Walker
  Mackenzie Cancer Research Group
Queensland Institute of Medical Research
  Mandy Spurdle, Felicity Lose, Yen Tan, Alex Metcalf
  Australian National Endometrial Cancer Study
  Bryony Thompson
University of Cambridge
  Deborah Thompson, Paul Pharoah, Alison Dunning, Douglas Easton
  Studies of Epidemiology and Risk Factors in Cancer Heredity (SEARCH)
University of Newcastle
  Rodney Scott, Mark McEvoy, John Attia, Elizabeth Holliday
  The Hunter Community Study
CIMBA consortium
Mayo Clinic
  Fergus Couch