5 Copy number variation
Diagram: Allele 1 and Allele 2, illustrating copy number loss and copy number gain.
A CNV can affect a whole gene, part of a gene, or several contiguous genes, or act through regulatory effects.
6 Copy number variants (CNVs)
16,000 copy number variant loci cover >50% of the human genome.
CNVs are associated with cancer risk:
Rare CNVs have been detected in ~50% of familial cancer genes, e.g. BRCA1 and BRCA2.
Genome-wide association studies of cancer: prostate cancer, hepatocarcinoma, nasopharyngeal carcinoma, and neuroblastoma.
Increased CNV load: Li-Fraumeni syndrome (cancer-related genes?) and breast cancer (TP53 pathway, ESR1 pathway).
7 SNP arrays
Log R Ratio: $\mathrm{LRR} = \log_2(R_{\text{observed}} / R_{\text{expected}})$
The B Allele Frequency (BAF) is a somewhat confusing term: it actually refers to a normalized measure of the relative signal intensity ratio of the B and A alleles.
Wang et al., Genome Res. 2007 Nov; 17(11): 1665–1674.
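The two signals can be sketched in a few lines of Python. The LRR function follows the formula on the slide directly. The BAF function is an assumption: it uses the piecewise-linear interpolation of the allelic angle `theta` between the canonical AA, AB and BB genotype cluster positions that is commonly described for Illumina arrays. The function names and the cluster positions in the usage note are hypothetical.

```python
import math


def log_r_ratio(r_observed: float, r_expected: float) -> float:
    """LRR = log2(R_observed / R_expected), as defined on the slide."""
    if r_observed <= 0 or r_expected <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(r_observed / r_expected)


def b_allele_frequency(theta: float, theta_aa: float,
                       theta_ab: float, theta_bb: float) -> float:
    """Assumed BAF normalisation: interpolate theta between cluster centres.

    theta_aa < theta_ab < theta_bb are the expected angles of the three
    genotype clusters for this SNP (hypothetical inputs).
    """
    if theta <= theta_aa:
        return 0.0
    if theta >= theta_bb:
        return 1.0
    if theta < theta_ab:
        # Between the AA and AB clusters: map onto [0, 0.5).
        return 0.5 * (theta - theta_aa) / (theta_ab - theta_aa)
    # Between the AB and BB clusters: map onto [0.5, 1).
    return 0.5 + 0.5 * (theta - theta_ab) / (theta_bb - theta_ab)
```

For example, `log_r_ratio(2.0, 1.0)` is 1.0 (twice the expected intensity), and with made-up cluster positions 0.1, 0.5 and 0.9 a SNP sitting exactly on the AB cluster gets a BAF of 0.5.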
14 PennCNV
Notation at SNP $i$ ($1 \le i \le M$): $r_i$ is the LRR, $b_i$ is the BAF, and $z_i$ is the copy number state.
The likelihood of the observed data is:
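The formula itself is not reproduced in this transcript. As a sketch only, assuming the standard hidden Markov model factorisation with LRR and BAF conditionally independent given the copy number state (an assumption, as is the initial state probability $P(z_1)$), the likelihood would take the following form, summing over all possible state paths:

```latex
L = P(r_1,\dots,r_M,\; b_1,\dots,b_M)
  = \sum_{z_1,\dots,z_M}
    P(z_1)\,
    \prod_{i=2}^{M} P(z_i \mid z_{i-1})\,
    \prod_{i=1}^{M} P(r_i \mid z_i)\, P(b_i \mid z_i)
```

Here $P(r_i \mid z_i)$ and $P(b_i \mid z_i)$ are the LRR and BAF emission probabilities, and $P(z_i \mid z_{i-1})$ is the transition probability between adjacent SNPs.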
15 PennCNV
The LRR emission probability model includes a term for chemical fluctuations and misannotation/assembly errors.
The BAF emission probability is a complicated mixture model.
16 PennCNV
Transition probabilities between two adjacent SNPs $i-1$ and $i$, with copy number states $z_{i-1}$ and $z_i$, depend on the distance $d_i$ between them.
$D$ = 100 Mb for state 4 and 100 kb for the other states.
The $p$ parameters are unknown and are estimated by the Baum-Welch algorithm.
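A distance-dependent transition matrix of this kind can be sketched as follows. The exact formula is not in this transcript, so the form used here is an assumption: the probability of moving from state j to a different state l is taken to be p[j][l] * (1 - exp(-d / D)), with the remaining mass staying in state j. Indexing D by the state being left is also an assumption, and `transition_matrix` is a hypothetical name.

```python
import math


def transition_matrix(p, d, D):
    """Build a K x K transition matrix for two SNPs that are d bases apart.

    p : K x K list of base transition parameters (diagonal entries ignored);
        each row's off-diagonal entries should sum to at most 1.
    d : distance between the adjacent SNPs, in bases.
    D : list of K characteristic lengths, one per state being left
        (assumed: e.g. 100e6 for state 4 and 100e3 for the others).
    """
    K = len(p)
    T = [[0.0] * K for _ in range(K)]
    for j in range(K):
        # Approaches 0 for very close SNPs and 1 for distant ones.
        scale = 1.0 - math.exp(-d / D[j])
        leaving = 0.0
        for l in range(K):
            if l != j:
                T[j][l] = p[j][l] * scale
                leaving += T[j][l]
        # Whatever probability is not spent on leaving stays in state j.
        T[j][j] = 1.0 - leaving
    return T
```

With this form, two SNPs at the same position (d = 0) can never change state, and a larger D makes a state "stickier" over distance, which would be consistent with the much larger D given for state 4.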
17 PennCNV
The Baum-Welch algorithm is used to train the model.
The Viterbi algorithm is used to infer the most likely path of copy number states.
A CNV is called whenever a stretch of states differs from the normal state (usually state 3 or 4).
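The inference and calling steps can be illustrated with a generic log-space Viterbi decoder followed by a run-length pass over the decoded path. This is a minimal sketch of the textbook algorithm, not PennCNV's own code: the function names are hypothetical, and the transition matrix is held fixed along the chromosome for simplicity, whereas the model above makes it depend on SNP distance.

```python
def viterbi(log_init, log_trans, log_emit):
    """Most likely state path for an HMM, computed in log space.

    log_init[k]     : log P(z_1 = k)
    log_trans[j][k] : log P(z_i = k | z_{i-1} = j)
    log_emit[i][k]  : log P(observation i | z_i = k)
    """
    K = len(log_init)
    score = [log_init[k] + log_emit[0][k] for k in range(K)]
    back = []  # back[i-1][k] = best predecessor of state k at position i
    for i in range(1, len(log_emit)):
        new_score, ptr = [], []
        for k in range(K):
            best_j = max(range(K), key=lambda j: score[j] + log_trans[j][k])
            new_score.append(score[best_j] + log_trans[best_j][k] + log_emit[i][k])
            ptr.append(best_j)
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(K), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def call_segments(path, normal_states):
    """Return (start, end, state) for each maximal run of one non-normal state."""
    segments = []
    start = None
    for i, s in enumerate(path + [None]):  # None is a sentinel closing the last run
        if start is not None and s != path[start]:
            segments.append((start, i - 1, path[start]))
            start = None
        if start is None and s is not None and s not in normal_states:
            start = i
    return segments
```

In a toy two-state model (0 = normal, 1 = deletion), a decoded path such as `[0, 0, 1, 1, 1, 0, 0]` yields a single call covering SNP indices 2 to 4.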
22 Endometrial cancer
Samples: 1343 cases (ANECS, SEARCH) and 655 female controls (Hunter Community Study).
Steps shown: CNV calling by 4 algorithms; QC(1), GWAS criteria; case vs. control analyses.
Sample counts shown at later steps: 1279 cases and 619 controls, then 1210 cases and 612 controls.
Want to find:
1. CNVs overlapping known susceptibility genes
2. Novel CNVs in the mismatch repair pathway
3. Associations with common or rare CNVs
23 CNV frequency: all
Mean CNVs per sample

               Case     Control   Difference   P
Samples        1,210    612
Total CNVs     26.7     26.5      0.2          NS
Deletions      17.7     18.1      -0.4         NS
Duplications   [unreadable]                    NS
Exons          [unreadable]                    NS
24 CNV frequency: rare (< 1%)
Mean rare CNVs per sample

               Case     Control   Difference   P
Samples        1,210    612
Total CNVs     [unreadable]                    [?]E-05
Deletions      [unreadable]                    [?].0E-06
Duplications   [unreadable]                    NS
Exons          [unreadable]                    [?]E-04
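The slides do not say which test produced the P column. One way to compare mean CNV counts per sample between cases and controls is a two-sided permutation test on the difference in means, sketched below. The choice of test, the function name and its parameters are all assumptions made for illustration, and no real study data is used.

```python
import random


def permutation_test_mean_diff(cases, controls, n_perm=9999, seed=0):
    """Two-sided permutation P value for a difference in mean CNV count.

    cases, controls : per-sample CNV counts for each group.
    Returns (hits + 1) / (n_perm + 1), so the result is never exactly 0.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n_cases = len(cases)
    observed = abs(sum(cases) / n_cases - sum(controls) / len(controls))
    pooled = list(cases) + list(controls)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign case/control labels.
        rng.shuffle(pooled)
        a, b = pooled[:n_cases], pooled[n_cases:]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Restricting the per-sample counts to rare CNVs, deletions or exon-overlapping CNVs before calling the function would give the analogue of each table row.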
26 Association study: CNV regions
Columns: Chr; Case (0, 1, 3, 4); Control (0, 1, 3, 4); adjusted P. The chromosome and count columns are run together in each row.

Chr and counts              P adjusted
X010005700                  0.000
X0307007800                 0.000
X020003400                  0.000
X000002400                  0.000
69100043500                 0.000
1601251270010190            0.000
X000001400                  0.001
68122034382047718427614     0.003
20220014160                 0.006
7000001240                  0.006
110383200130                0.010
X010000110                  0.016
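27 Association study: CNVs overlapping genes
Columns: Chr; Case (0, 1, 3, 4); Control (0, 1, 3, 4); adjusted P. The chromosome and count columns are run together in each row.

Chr and counts              P adjusted
X020005300                  0.000
1037200000                  0.004
1035200000                  0.004
7001001350                  0.004
1036200000                  0.004
1036200000                  0.004
1034200000                  0.005
1033200000                  0.008
1031100000                  0.011
1031100000                  0.011
7043220000                  0.011
X0226003600                 0.021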
29 Acknowledgements
University of Otago
Gemma Moir-Meyer, Logan Walker
Mackenzie Cancer Research Group
Queensland Institute of Medical Research
Mandy Spurdle, Felicity Lose, Yen Tan, Alex Metcalf
Australian National Endometrial Cancer Study
Bryony Thompson
University of Cambridge
Deborah Thompson, Paul Pharoah, Alison Dunning, Douglas Easton
Studies of Epidemiology and Risk Factors in Cancer Heredity (SEARCH)
University of Newcastle
Rodney Scott, Mark McEvoy, John Attia, Elizabeth Holliday
The Hunter Community Study
CIMBA consortium
Mayo Clinic
Fergus Couch