Constitutional (germ-line) variants in hereditary conditions


0 SEGMENTAL VARIATION (Copy Number Variants and other gross chromosomal rearrangements)
Allen E. Bale, M.D., Dept. of Genetics

1 Importance of Copy Number Variants (CNVs) and Other Rearrangements in Health and Disease
Constitutional (germ-line) variants in hereditary conditions:
  Large and small copy number variants
  Translocations and inversions: rarely cause a phenotype but may generate CNVs due to mis-pairing during meiosis
Somatically acquired variants in cancer:
  Duplications and deletions: amplification of an oncogene; loss of a tumor suppressor
  Translocations and inversions: place an oncogene under control of an active promoter

2 What is the origin of structural variants?
An area of active research
Recurrent constitutional CNVs: often related to illegitimate recombination between homologous, but non-identical, sequences
Rare, non-recurrent constitutional CNVs: no obvious sequence homology at breakpoints; possibly non-homologous end joining
Tumor CNVs: any mechanism that creates a rearrangement favoring tumor growth, often non-homologous end joining

3 Cytogenetically visible CNVs and translocations

4 A Really Large CNV

5 Somatically acquired translocation

6 Limitations of Cytogenetics
Cells have to be proliferating in order to arrest chromosomes at metaphase (when they are visible under the microscope).
Resolution is limited (in the range of 5 Mb).
Requires highly skilled technologists and still a lot of hands-on time, even with sophisticated image processing.

7 Submicroscopic CNVs: Array CGH*
*Frequently referred to as “chromosome microarray”

8 Example: Submicroscopic 22q deletion
Abnormal nose, ears, and palate Also heart, parathyroid, and thymus abnormalities

9 Limitations of Array CGH
Can’t detect translocations and inversions.
Resolution is still limited by the number of probes on the array; typical resolution is about 100 kb.
Still a fair amount of variability in results depending on exactly which array is used.

10 Genome-scale sequencing to detect rearrangements
If you could sequence each chromosome as one continuous piece of DNA, from one end to the other with no gaps in the sequence, what structural variants would you miss?

11 Genome-scale sequencing to detect rearrangements
What methods are currently in use?
Depth-of-coverage methods
  Regions that are deleted or duplicated should yield fewer or more reads, respectively
Detection of breakpoints by:
  Short paired reads (like Illumina paired-end sequencing)
    Are the sequences at the two ends of a fragment both from the same chromosome?
    Are they the right distance apart?
  Long reads (kb-scale)
    Direct sequencing of breakpoints

12 Genome-scale sequencing to detect rearrangements
Depth-of-coverage method
Detection of breakpoints by short paired reads
Detection of breakpoints by long reads
Compared with cytogenetics and array CGH, how would the approaches above perform?
What would be missed by depth-of-coverage reading?
What would be missed by detection of breakpoints?
What problems do you foresee with these two approaches?

13 Depth-of-coverage example: Whole exome sequencing as a tool to identify both sequence variants and CNVs

14 Whole exome sequencing (see Dr. Lifton’s lecture)
Capture portions of the genome containing exons in order to efficiently sequence coding regions.
Not designed for CNV detection, but potentially contains information on gene dosage.
For any gene, the number of fragments captured on the array and sequenced should be proportional to the representation in the starting material.

15 Array CGH vs. Exome Sequencing

16 Does this work at all?
Total reads on the X chromosome were counted in a series of males and females.
Gene dosage for the X chromosome in males should be half the gene dosage for the X chromosome in females.
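The X-chromosome sanity check can be sketched in a few lines. This is only an illustration: the read counts are invented, and dividing X reads by autosomal reads to correct for sequencing depth is an assumption, since the slide does not say how the counts were normalised.

```python
from statistics import mean

# Hypothetical samples: (sex, reads mapped to X, reads mapped to autosomes).
# All numbers are invented for illustration.
samples = [
    ("F", 400_000, 9_600_000),
    ("F", 410_000, 10_000_000),
    ("M", 205_000, 9_900_000),
    ("M", 190_000, 9_500_000),
]

def x_dosage(x_reads, autosomal_reads):
    """X reads per autosomal read, a depth-corrected proxy for X gene dosage."""
    return x_reads / autosomal_reads

female = mean(x_dosage(x, a) for sex, x, a in samples if sex == "F")
male = mean(x_dosage(x, a) for sex, x, a in samples if sex == "M")

# If read depth tracks gene dosage, this ratio should be close to 0.5.
male_to_female_ratio = male / female
print(f"male/female X dosage ratio: {male_to_female_ratio:.2f}")
```

A ratio near 0.5 is what one would expect if capture and sequencing preserve dosage; a ratio near 1.0 would suggest that depth carries no dosage information.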

17 Does it work for single exons?
Reads were counted for each exon of the OTC gene on the X chromosome.
Males should have one half the female dosage.
Read number varies among exons because of different capture efficiencies but is consistent from subject to subject.
Exons with sufficient read numbers show the dosage effect.
The method performs very well for this 70 kb gene taken as a single unit.

18 Approach to scanning the whole genome for CNVs
The genome was divided into 50 kb windows.
Intervals with zero reads were removed.
Mean number of reads and standard deviations for each interval were calculated from 10 exome sequences.
Depth of coverage in a single patient was compared to average and standard deviation of depth of coverage.
Algorithms were developed for:
  Classifying X chromosome as being deleted in males compared with females
  Classifying X chromosome as being duplicated in females compared with males
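The window-based comparison above can be sketched as a per-window z-score. This is a minimal illustration, not the algorithm actually used: the function names, the threshold of 3 standard deviations, and the requirement that counts are already corrected for sequencing depth are all assumptions, and the toy data are invented.

```python
from statistics import mean, stdev

def window_z_scores(reference, patient):
    """Compare a patient's per-window depth with a reference panel.

    reference: list of samples, each a list of depth-corrected read counts per window.
    patient:   depth-corrected read counts for the test sample, same window order.
    Returns {window index: z-score}, skipping windows with no reads in the panel.
    """
    z = {}
    for i, value in enumerate(patient):
        ref_values = [sample[i] for sample in reference]
        if all(v == 0 for v in ref_values):
            continue  # zero-read interval: no captured sequence, so remove it
        mu, sd = mean(ref_values), stdev(ref_values)
        if sd == 0:
            continue  # no spread in the panel, z-score undefined
        z[i] = (value - mu) / sd
    return z

def call_windows(z, threshold=3.0):
    """Label windows whose depth deviates by at least `threshold` SDs (assumed cutoff)."""
    calls = {}
    for i, score in z.items():
        if score <= -threshold:
            calls[i] = "deleted"
        elif score >= threshold:
            calls[i] = "duplicated"
    return calls

# Toy panel of 10 reference samples over 4 windows; window 2 has no reads.
reference = [[1000 + 10 * d, 2000 - 20 * d, 0, 1500 + 15 * d]
             for d in (-4, -3, -2, -1, 0, 0, 1, 2, 3, 4)]
patient = [1010, 1000, 0, 1490]  # window 1 has about half the expected depth

calls = call_windows(window_z_scores(reference, patient))
print(calls)
```

With real exome data the panel should be sequenced with the same capture kit and chemistry as the patient, because batch differences change per-window depth and inflate the z-scores.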

19 Chromosomal coverage with non-zero, 50 kb intervals corresponds exactly to density of coding sequences

20 Test case: Female with a 338 kb duplication on 5q35
Diagram shows all loci passing initial algorithm

21 Filter #1: Require two adjacent intervals to both be deleted or duplicated
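A minimal sketch of this adjacency filter follows. It assumes each call is a `(chromosome, window index, state)` tuple and that window indices number the retained (non-zero) intervals consecutively along each chromosome; both the data layout and the function name are hypothetical.

```python
def require_adjacent_support(calls):
    """Keep a call only if a neighbouring window on the same chromosome has the same state.

    calls: list of (chrom, window_index, state), state being "deleted" or "duplicated".
    """
    state_at = {(chrom, idx): state for chrom, idx, state in calls}
    kept = []
    for chrom, idx, state in calls:
        left = state_at.get((chrom, idx - 1))
        right = state_at.get((chrom, idx + 1))
        if left == state or right == state:
            kept.append((chrom, idx, state))
    return kept

calls = [
    ("chr5", 10, "duplicated"),
    ("chr5", 11, "duplicated"),  # adjacent pair: both kept
    ("chr5", 20, "deleted"),     # isolated: dropped
    ("chr2", 4, "deleted"),
    ("chr2", 5, "duplicated"),   # adjacent but opposite state: both dropped
]
filtered = require_adjacent_support(calls)
print(filtered)
```

The trade-off is explicit: with 50 kb windows, a CNV that affects only one window can no longer be called, in exchange for removing isolated outlier windows.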

22 Filter #2: Remove “deleted regions” that contain heterozygous variants
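The logic behind this filter is that a heterozygous deletion leaves a single copy of the region, so a genuinely heterozygous genotype cannot occur inside it. A minimal sketch, assuming half-open `(chromosome, start, end)` intervals and a list of heterozygous variant positions taken from the same exome; the function name and the rule that a single heterozygous call is enough to reject a region are assumptions:

```python
def drop_deletions_with_het_variants(deletions, het_variants):
    """Remove candidate deletions that contain at least one heterozygous variant.

    deletions:    list of (chrom, start, end), with start inclusive and end exclusive.
    het_variants: list of (chrom, pos) for heterozygous calls in the same sample.
    """
    kept = []
    for chrom, start, end in deletions:
        has_het = any(c == chrom and start <= pos < end for c, pos in het_variants)
        if not has_het:
            kept.append((chrom, start, end))
    return kept

deletions = [("chr22", 18_900_000, 19_000_000), ("chr5", 1_000_000, 1_100_000)]
het_variants = [("chr5", 1_050_000), ("chr22", 25_000_000)]
surviving = drop_deletions_with_het_variants(deletions, het_variants)
print(surviving)
```

In practice a single heterozygous call may itself be a genotyping error, so requiring two or more heterozygous variants before rejecting a region might be a safer rule.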

23 Filter #3: Remove intervals with read counts <200
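One way to see why a minimum read count helps: if read counts are treated as roughly Poisson, the standard deviation of a count n is sqrt(n), so the relative noise is 1/sqrt(n). The sketch below applies the cutoff and reports that noise level. The Poisson model, the function names, and the choice to apply the cutoff to a per-interval count supplied by the caller are assumptions; the slide does not say whether the 200-read threshold refers to the patient or to the reference mean.

```python
from math import sqrt

def poisson_cv(read_count):
    """Relative noise (SD / mean) of a Poisson-distributed read count."""
    return 1 / sqrt(read_count)

def drop_low_count_intervals(interval_counts, min_reads=200):
    """Keep only intervals with at least `min_reads` reads.

    interval_counts: {interval label: read count}
    """
    return {label: n for label, n in interval_counts.items() if n >= min_reads}

# Hypothetical interval labels and counts.
counts = {"chr5:win10": 1800, "chr5:win11": 150, "chr5:win12": 200}
reliable = drop_low_count_intervals(counts)
print(reliable)
print(f"relative noise at 200 reads: {poisson_cv(200):.1%}")
```

At 200 reads the sampling noise alone is about 7%, which is small next to the 50% change expected for a heterozygous deletion or duplication; at a few dozen reads it is no longer small.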

24 Application to 7 subjects with deletions or duplications in 500 kb to 1 Mb range

25 Some problems with use of exome data
Intervals with no genes are not covered (important?).
Intervals with large genes having close homologs elsewhere in the genome cannot be accurately evaluated.
Because this technology is evolving rapidly, the normal standard to which a test sample is compared needs to be a pool of recent exome sequences (the false discovery rate is huge with non-homogeneous samples).

26 For a review of published depth-of-coverage methods for exome or genome data see:
Klambauer, G. et al. (2012). "cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate." Nucleic Acids Res.
The paper compares several programs, none of which works really well.
Two newer programs for exome sequencing are in your reading list.

27 Paired-end methods
Illumina HiSeq, the current industry leader in high-throughput sequencing, generates short reads from fragments 200 to 600 bp long.
Reading both ends of the same fragment gives you sequences that should lie 200 to 600 bp apart.
Other methods can generate paired fragments that lie even farther apart.

28 Long paired-end methods
Paired-end mapping: up to thousands of bp apart.
From Korbel et al., 2009

29 Identifying Structural Mutations: Deletions & Duplications

30 Identifying Structural Mutations: Inversions

31 Identifying Structural Mutations: Translocations
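The read-pair signatures behind these three slides can be summarised in a small classifier. This is a sketch under stated assumptions: standard Illumina paired-end libraries in which a normal pair maps forward/reverse on the same chromosome, positions given as the 5' end of each read, and the 200 to 600 bp fragment range quoted earlier. The `ReadPair` type and `classify_pair` function are hypothetical, not the interface of any published tool.

```python
from dataclasses import dataclass

@dataclass
class ReadPair:
    chrom1: str
    pos1: int      # 5' end of read 1
    strand1: str   # "+" or "-"
    chrom2: str
    pos2: int      # 5' end of read 2
    strand2: str

def classify_pair(pair, min_span=200, max_span=600):
    """Assign one read pair to the structural-variant signature it is consistent with."""
    if pair.chrom1 != pair.chrom2:
        return "translocation candidate"
    # Order the two reads by genomic position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "inversion candidate"            # both reads on the same strand
    if left_strand == "-" and right_strand == "+":
        return "tandem duplication candidate"   # reads point away from each other
    span = right_pos - left_pos
    if span > max_span:
        return "deletion candidate"             # ends map too far apart
    if span < min_span:
        return "insertion candidate"            # ends map too close together
    return "concordant"

print(classify_pair(ReadPair("chr1", 1000, "+", "chr1", 9000, "-")))
```

A single discordant pair is only a hint; callers of this kind normally require several independent pairs that agree on the same breakpoint before reporting a variant.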

32 Analyzing structural variations from paired end data
PEMer (Korbel et al., 2009): for discovery of CNVs and inversions; could also be implemented for translocations
BreakDancer (Chen et al., 2009): for discovery of CNVs, inversions, and translocations

33 Identifying Structural Mutations with paired end sequence: What goes wrong?

34 How to overcome problems with paired end detection of CNVs
Separating the wheat from the chaff:
  Technical artifacts (ligation of unrelated fragments during library preparation) may be numerous but will be random.
  Artifacts related to homologous sequences (see previous slide) will be reproducible but common to all samples.
  Real structural variants will be reproducible within a sample and not common to all samples.
How much reading depth do you need to detect the real variants?
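The three cases above translate into two simple tests: enough supporting read pairs within the sample, and not present in most other samples. The sketch below assumes discordant pairs have already been clustered into named candidate rearrangements; the function name, the candidate labels, and both thresholds (`min_pairs`, `max_sample_fraction`) are illustrative assumptions.

```python
from collections import Counter

def likely_real_variants(support_by_sample, sample, min_pairs=3, max_sample_fraction=0.5):
    """Return candidates that are reproducible in `sample` but not common to the cohort.

    support_by_sample: {sample name: Counter({candidate: supporting read pairs})}
    """
    n_samples = len(support_by_sample)
    kept = []
    for candidate, n_pairs in support_by_sample[sample].items():
        if n_pairs < min_pairs:
            continue  # too little support: consistent with random ligation artifacts
        carriers = sum(
            1 for counts in support_by_sample.values()
            if counts.get(candidate, 0) >= min_pairs
        )
        if carriers / n_samples > max_sample_fraction:
            continue  # seen in most samples: consistent with a homology artifact
        kept.append(candidate)
    return sorted(kept)

support = {
    "S1": Counter({"del_A": 8, "chimera_X": 1, "homolog_H": 9}),
    "S2": Counter({"homolog_H": 10, "chimera_Y": 1}),
    "S3": Counter({"homolog_H": 7}),
}
print(likely_real_variants(support, "S1"))
```

The `min_pairs` setting is where the reading-depth question bites: at low coverage a real heterozygous variant may be spanned by fewer pairs than the threshold, and a genuinely common polymorphism would be discarded by the cohort test.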

35 Toward direct sequencing of breakpoints
Long reads
  PacBio can generate reads of 1000 bp or so
  Nanopore sequencing is said to generate reads in the tens of thousands of bp
Strobe sequencing with PacBio:
  Normally, read length is limited by laser-induced inactivation of the polymerase.
  Short bursts of laser give sample sequences along a stretch of DNA in the 20 kb range.

36 Programs for analysis of longer reads that directly sequence breakpoints
CREST (Wang et al., 2011): detects small and large structural variants by direct sequencing of breakpoints.
SRiC (Zhang et al., 2011): similar to CREST.
Algorithm for strobe reads (Ritz et al., 2010).

37 Conclusions
Structural variation in the genome accounts for a great deal of human phenotypic variability, including disease.
Depth-of-coverage methods can detect many CNVs but not inversions and translocations. Variation from sample to sample limits sensitivity and specificity.
Whole genome sequencing, which can identify all types of structural variants, will supersede depth-of-coverage methods.
Large-scale and small-scale duplications and repetitive sequences remain a major obstacle.

38 Acknowledgments for exome CNV analysis
Department of Genetics: Patricia Gordon, Christopher Heffelfinger, Murim Choi, Shrikant Mane, Richard Lifton, Allen Bale
Neuropsychiatric Genetics Program: Stephan Sanders, Matthew State
School of Public Health, Biostatistics Division: Annette Molinaro

