Published by Malcolm Warner. Modified over 8 years ago.
1
Discriminating somatic and germline mutations in tumour DNA samples without matching normals
Saskia Hiltemann
Erasmus Medical Center, Rotterdam
MGC Symposium, September 2015
s.hiltemann@erasmusmc.nl
2
The Matched Normal
Goal: Find somatic mutations
- Problem 1: Parts of normal genome no-called or low confidence
- Problem 2: Normal tissue not always available
- Problem 3: Sequencing cost doubles if you always have to sequence pairs
3
The Virtual Normal
- Majority of variants are polymorphic
- Use a set of samples from healthy, unrelated individuals as a virtual normal
Advantage
- No need to sequence matching normal.
- More robust against missed variants in a single normal sample.
Disadvantage
- Cannot detect personal germline mutations.
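The virtual normal idea can be sketched in a few lines of Python. This is a minimal illustration only, assuming each variant is already reduced to a comparable `(chrom, pos, ref, alt)` tuple; the function name `filter_with_virtual_normal` and the `min_normals` threshold are hypothetical and not taken from the talk, and the exact-match lookup stands in for the more tolerant variant comparison the method actually relies on.

```python
from collections import Counter


def filter_with_virtual_normal(tumour_variants, normal_samples, min_normals=1):
    """Return tumour variants seen in fewer than `min_normals` normal samples.

    tumour_variants: list of (chrom, pos, ref, alt) tuples from one tumour.
    normal_samples:  iterable of per-individual variant collections, one per
                     healthy, unrelated genome (the "virtual normal").
    min_normals:     hypothetical threshold; a variant found in at least this
                     many normals is treated as germline and removed.
    """
    seen_in = Counter()
    for sample in normal_samples:
        # Count each variant once per individual, not once per record.
        seen_in.update(set(sample))
    return [v for v in tumour_variants if seen_in[v] < min_normals]


tumour = [
    ("chr1", 100, "A", "T"),
    ("chr1", 200, "G", "C"),
    ("chr2", 50, "C", ""),
]
normals = [
    {("chr1", 100, "A", "T")},
    {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")},
]

candidates = filter_with_virtual_normal(tumour, normals)
print(candidates)
```

A variant that is private to the patient's germline never appears in the panel, so it survives this filter alongside the true somatic calls. That is the disadvantage listed on the slide.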
4
The Virtual Normal
- Complete Genomics Diversity Panel (54 genomes)
- Complete Genomics in 1000 Genomes (433 genomes)
- GoNL (Illumina-sequenced) normals from Dutch population (498 genomes)
5
The variant annotation problem
“Why not just use the already available variant databases?”
- dbSNP, COSMIC, 1KG, ExAC, ESP6500, …
Answer:
- Different descriptions of the same variant possible
- Simple position-based algorithms suboptimal
- There are no community standards
- Context of variant lost in databases
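A small Python sketch shows why position-based matching falls short when the same change can be described in more than one way. It assumes 0-based positions on a toy reference string and non-overlapping variants. The helper `apply_variants` is hypothetical and written for illustration; it is not the comparison algorithm used in the talk.

```python
def apply_variants(reference, variants):
    """Apply non-overlapping (pos, ref_allele, alt_allele) edits to a sequence.

    Positions are 0-based. Edits are applied right to left so that earlier
    coordinates stay valid after each change.
    """
    seq = reference
    for pos, ref_allele, alt_allele in sorted(variants, reverse=True):
        if seq[pos:pos + len(ref_allele)] != ref_allele:
            raise ValueError(f"reference mismatch at position {pos}")
        seq = seq[:pos] + alt_allele + seq[pos + len(ref_allele):]
    return seq


reference = "ACGATT"

# One caller reports a single block substitution ...
as_substitution = [(2, "GA", "CT")]
# ... another reports the same change as two adjacent SNPs.
as_two_snps = [(2, "G", "C"), (3, "A", "T")]

# Naive comparison on (position, ref, alt) keys finds nothing in common.
position_based_match = set(as_substitution) == set(as_two_snps)

# Comparing the resulting sequences shows the two descriptions are equivalent.
haplotype_match = (
    apply_variants(reference, as_substitution)
    == apply_variants(reference, as_two_snps)
)

print(position_based_match, haplotype_match)
```

Comparing the sequences that result from each description is only practical over a local window. It also needs the individual's neighbouring variants, which is one reason aggregated databases that discard that context are harder to match against.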
6
VCF variant description
Example: Genome in a Bottle
[Figure: variant calls labelled ins T, del AG, sub GA->CT, del C, sub AG->CT, sub GA->CT, snp C->T, del A, snp G->C, sub GA->CT, del CAGTGA, ins TCTCT]
7
The variant annotation problem
There are no standards. Even for the simple canonicalisation issue:
- NGS: left-most notation.
- HGVS: most 3’ notation (= right-most alignment for forward-strand genes).
- Often unknown which conventions were used for variants deposited online
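The canonicalisation issue is easiest to see with a deletion inside a repeat, where several positions describe the same event. The Python sketch below is illustrative only: it assumes 0-based coordinates on a toy forward-strand reference, and the function names are hypothetical rather than part of any named tool.

```python
def left_align_deletion(reference, start, length):
    """Shift a deletion of `length` bases as far left as it can go."""
    while start > 0 and reference[start - 1] == reference[start + length - 1]:
        start -= 1
    return start


def right_align_deletion(reference, start, length):
    """Shift a deletion of `length` bases as far right (most 3') as it can go."""
    while (start + length < len(reference)
           and reference[start] == reference[start + length]):
        start += 1
    return start


def apply_deletion(reference, start, length):
    """Return the sequence with `length` bases removed at `start`."""
    return reference[:start] + reference[start + length:]


reference = "GGCACACATT"   # CA repeated three times, flanked by GG and TT
start, length = 4, 2       # one reported placement of a 2 bp deletion

leftmost = left_align_deletion(reference, start, length)
rightmost = right_align_deletion(reference, start, length)

print(leftmost, rightmost)
print(apply_deletion(reference, leftmost, length))
print(apply_deletion(reference, rightmost, length))
```

Both placements produce the same sequence, yet a database entry stored at one position will not match a call reported at the other unless both are normalised to the same convention first.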
8
Solution to the variant annotation problem?
- Our method uses the CGATools algorithm, capable of resolving many of these equivalencies
- Context of variants preserved by using full individual-level samples rather than an aggregated list of variants
9
Results
Small Variants (snp, ins, del, sub up to 50 bp in length)
10
Results
Structural Variation
- Even harder to compare
- Horrible agreement between different tools / sequencing platforms
11
Results
How many normals do we need?
[Figure; surviving label: 400]
12
Conclusions
- When no matched normal is available, our VN method is a great improvement over relying solely on the public variant databases
- Depending on the research question, VN can replace a matched normal
- For optimal performance, use both MN and VN
- Need improved algorithms for variant comparisons
- Need to retain contextual information about area surrounding variants
13
Availability
- Available as a Galaxy Tool from the main tool shed
- 433 + 54 Complete Genomics Normal Genomes publicly available for download
- Ability to easily add your own (CG-sequenced) normals to the tool
- Tool installed on our demo Galaxy server: http://galaxy-demo.ctmm-trait.nl
- Paper recently published in Genome Research: http://genome.cshlp.org/content/early/2015/07/24/gr.183053.114
14
Acknowledgements
Erasmus MC: Andrew Stubbs, Guido Jenster, Peter van der Spek, Jan Trapman, Youri Hoogstrate, Bas Horsman, David van Zessen, Ivo Palli, Sylvia de Does
Complete Genomics: Rick Tearle, Steve Lincoln
15
Results