Copy Number Variation Eleanor Feingold University of Pittsburgh March 2012.

1 Copy Number Variation Eleanor Feingold University of Pittsburgh March 2012

2 What do we mean by “copy number variation”? [Figure: scales of variation, from a short sequence (GCTCATATATATTTG) to kb - Mb (gene or gene region)]

3 Copy number variation in a gene or gene region: “normal”; duplication of one gene; duplication of several genes; deletion; duplication of part of a gene.

4 Classical copy number study types. Cancer genetics. What: Find chromosomal segments (usually large ones) that are duplicated and/or deleted in tumor cell lines. Why: Learn something about cancer biology, or implications for treatment and prognosis. Clinical pediatrics. What: Detect inherited or de novo deletions in individuals. Why: “Diagnose” birth defects.

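5 And now: Genetic association studies for CNVs. 1) Collect cases and controls. 2) “Genotype” everyone at a CNV. 3) Test genotype/phenotype association. [Figure: individuals labeled with their copy numbers. Table: cases and controls by copy number (0, 1, 2+); the counts are run together in the transcript as cases 65133202, controls 1681316]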
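Step 3 on this slide can be sketched as a Pearson chi-square test on a 2 x 3 table of copy-number class (0, 1, 2+) by case/control status. This is only an illustration under stated assumptions: the function name `chi_square_2x3` is hypothetical, the counts in the usage lines are invented (the slide's own counts are not recoverable from the transcript), and the slide does not say which test the author used.

```python
import math


def chi_square_2x3(cases, controls):
    """Pearson chi-square test of independence for a 2 x 3 table.

    cases, controls: counts of individuals with copy number 0, 1 and 2+.
    Returns (statistic, p_value). The p-value formula exp(-x / 2) is the
    chi-square survival function for exactly 2 degrees of freedom, which
    is what a 2 x 3 table has.
    """
    table = [list(cases), list(controls)]
    if any(len(row) != 3 for row in table):
        raise ValueError("each group needs exactly three counts (0, 1, 2+)")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column needs a non-zero total")

    statistic = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected

    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical counts, for illustration only.
stat, p = chi_square_2x3(cases=(10, 40, 50), controls=(20, 50, 30))
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

With small expected counts in any cell, an exact test would be the safer choice.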

6 How do we assay copy number variation?

7 Generation 1 - Array CGH. What: Microarray of clones (e.g. BACs), usually on a glass slide. Competitive hybridization of test and reference samples. Measure fluorescence ratio clone by clone. Limitations: Large clones. Sparse coverage. High noise due to spotting process.

8 Generation 2 - SNP chips. What: High-throughput SNP genotyping platforms (e.g. Affymetrix, Illumina). Advantage: Hundreds of thousands of points of info. Disadvantages: Technology was never intended for measuring copy number. SNPs on chip selected to avoid CNV regions by design.

9 Generation 3 - SNP chips with CNV markers (Affy 6.0, Illumina 1M). Advantages: SNPs in known CNV regions are now included. Also have “non-polymorphic SNPs” (SNs?). Affymetrix: 200K probes in 5K known large CNV regions; 700K probes “evenly spaced along the genome”. Illumina: 1M markers in 10K regions of various types and sizes.

10 Generation 4 - (Illumina 2.5M, 5M). Changes: Got rid of the non-polymorphic markers. Special coverage of CNV regions??? Are these better or worse for CNVs than the previous generation?

11 What data do these technologies give us, and how do we use it?

12 Standard genotyping. Genotype information is in the angle (relative intensity of the two alleles). Copy number information is in the distance from the origin (total intensity). [Figure: genotype clusters BB, AB, AA]
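The angle/distance description on this slide can be sketched as a change of coordinates on one SNP's two allele intensities. This is a minimal sketch: the function name `angle_and_distance` is hypothetical, the inputs are assumed to be already normalized non-negative intensities, and real platforms apply their own normalization and may define total intensity as the sum A + B instead of the Euclidean distance used here.

```python
import math


def angle_and_distance(a, b):
    """Convert allele intensities (a, b) to polar form.

    Returns (theta, r):
      theta: angle in radians, 0 for a pure A signal and pi/2 for a pure
             B signal; this carries the genotype information.
      r:     Euclidean distance from the origin; this carries the copy
             number information.
    """
    if a < 0 or b < 0:
        raise ValueError("intensities are assumed to be non-negative")
    theta = math.atan2(b, a)
    r = math.hypot(a, b)
    return theta, r


# Hypothetical intensities: an AB-like point and an AA-like point.
print(angle_and_distance(1.0, 1.0))
print(angle_and_distance(2.0, 0.0))
```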

13 In theory [Figure: intensity clusters AAA, AAB, ABB, BBB; AA, AB, BB; A, B; null]

14 But when you look at the data … [Figure: trisomic (Down Syndrome) and disomic samples; clusters labeled “AAA and AA”, AAB, AB, ABB, “BBB and BB”]

15 All SNPs on chromosome 21 [Figure: total intensity (disomic) and total intensity (trisomic); groups labeled trisomic and disomic]

16 In theory [Figure: intensity clusters AAA, AAB, ABB, BBB; AA, AB, BB; A, B; null]

17 In practice [Figure: clusters A, B, null]

18 So how are copy numbers called? Look for runs of SNPs that are high or low in intensity. Many available algorithms, e.g. HMM, CBS, change-point.
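The idea of looking for runs of high or low SNPs can be illustrated with a deliberately naive thresholding pass. This is not an HMM, CBS, or change-point method, only a sketch of what those algorithms do more carefully. The function name `find_runs`, the threshold, and the minimum run length are all hypothetical, and the input is assumed to be a per-SNP intensity already expressed relative to a reference (for example a log ratio centered on 0), ordered by genomic position.

```python
def find_runs(values, threshold=0.3, min_length=3):
    """Find runs of consecutive SNPs that are all high or all low.

    values:     per-SNP relative intensities, ordered along the chromosome.
    threshold:  a SNP is "high" above +threshold and "low" below -threshold.
    min_length: shortest run (in SNPs) that is reported.
    Returns a list of (start, end, state) with end exclusive.
    """
    values = list(values)

    def label(x):
        if x > threshold:
            return "high"
        if x < -threshold:
            return "low"
        return None

    runs = []
    start, state = 0, None
    # Iterate one step past the end so that a run touching the last SNP
    # is closed and reported as well.
    for i in range(len(values) + 1):
        current = label(values[i]) if i < len(values) else None
        if current != state:
            if state is not None and i - start >= min_length:
                runs.append((start, i, state))
            start, state = i, current
    return runs


# Hypothetical data: a three-SNP gain, then a two-SNP dip that is too short.
print(find_runs([0.0, 0.0, 0.5, 0.6, 0.7, 0.0, -0.5, -0.5, 0.0]))
```

A single noisy SNP inside a true CNV splits the run here, which is one reason model-based methods are preferred in practice.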

19 Basic picture

20 Komura et al. Genome Research 2006

21 More complex examples (cancer genetics). Peiffer et al. Genome Research, 2006

22 Amplification [Figure: angle (genotype info) and total intensity; genotype groups AA, AB, BB]

23 deletion

24 Extra copy of whole chromosome total intensity high over whole chromosome 3 genotype groups

25 No copy number change, but a region of homozygosity (LOH)
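The patterns on slides 22 to 25 combine two signals: total intensity and whether heterozygous genotypes are present. A rough sketch of that logic follows. The function name `classify_region`, both cutoffs, and the use of a mean relative intensity and a heterozygous-call fraction as region summaries are assumptions for illustration, not values or methods from the talk.

```python
def classify_region(mean_intensity, het_fraction,
                    intensity_cutoff=0.2, het_cutoff=0.05):
    """Label a region from two summary statistics.

    mean_intensity: mean total intensity relative to a reference,
                    centered so that 0 means "no copy number change".
    het_fraction:   fraction of SNPs in the region called heterozygous.
    """
    if mean_intensity > intensity_cutoff:
        return "gain"      # amplification or an extra chromosome copy
    if mean_intensity < -intensity_cutoff:
        return "loss"      # deletion
    if het_fraction < het_cutoff:
        return "LOH"       # normal intensity but almost no heterozygotes
    return "no change"


# Hypothetical region summaries.
print(classify_region(0.5, 0.30))
print(classify_region(0.0, 0.01))
```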

26 Basic picture. Wang et al. Genome Research, 2007

29 Chromosome 9

30 A few statistical issues to think about … (there’s still a lot to do)

31 Many run-calling algorithms are oriented towards clinical applications. Many CNV detection algorithms are very conservative - aim for zero false positive rate. Most use normalization methods that assume a large reference population is not available. Many use models that make assumptions about what kinds of variation are likely (e.g. cancer).

32 Family data should be modeled together. CNV “calls” will be much more accurate if you use the whole family, but the model you use should depend on whether you are expecting de novo mutations or not. For some diseases you’ll expect associations with de novo changes. For others you might expect inherited variants.

33 How do we group CNVs for association testing? [Figure labels: deletion, duplication]

34 Separate methods for deletions? Deletions are easier to detect than other changes. Deletions are likely to have simpler biological effects.

35 The most important one … The technology is still NOT intended for reliably and comparably measuring total intensity! Total intensity numbers are very sensitive to DNA source, sample handling, etc., so extreme measures must be taken to ensure that cases and controls are comparable.

