Published by Μάξιμος Μανιάκης. Modified over 6 years ago.
1
Assessment of HaloPlex Amplification for Sequence Capture and Massively Parallel Sequencing of Arrhythmogenic Right Ventricular Cardiomyopathy–Associated Genes
Anna Gréen, Henrik Gréen, Malin Rehnberg, Anneli Svensson, Cecilia Gunnarsson, Jon Jonasson
The Journal of Molecular Diagnostics, Volume 17, Issue 1, Pages 31-42 (January 2015)
DOI: /j.jmoldx
Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology
2
Figure 1 Bioinformatics overview. In pipeline 1 (the bioinformatics pipeline recommended by the HaloPlex development team), two Burrows-Wheeler Aligner (BWA) modes for paired-end analysis were compared: bwa sampe and bwa mem. In pipeline 2, the data were analyzed using SureCall software. The adaptor removal algorithm in SureCall is not known to us and is therefore indicated by a question mark. In the version of SureCall used herein, BWA was used as the aligner and SAMtools as the variant caller. Pipeline 3, composed of custom scripts relying on bwa mem single-end analysis and SAMtools, is described in detail in Materials and Methods, with code provided in Supplemental Code S1. GATK, Genome Analysis Toolkit. The Journal of Molecular Diagnostics 2015, 17:31-42. DOI: /j.jmoldx. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology.
3
Figure 2 Quality of sequencing reads. FastQC per-base quality scores for arrhythmogenic right ventricular cardiomyopathy panel design 1, where A shows a typical example of forward reads and B the reverse reads. A quality score of 10, 20, or 30 means that the probability that the base is called incorrectly is 10%, 1%, or 0.1%, respectively. Base quality scores are generally >Q30, with a slight drop at the end of each read. The central red line indicates the median value; the yellow boxes, the interquartile range (25% to 75%); the lower and upper whiskers, the 10% and 90% points; the blue line, the mean quality.
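The relationship between quality score and error probability quoted in the Figure 2 caption follows the standard Phred definition, P = 10^(-Q/10). A minimal Python sketch of that conversion is given here; the function name is illustrative and is not taken from the paper or from FastQC.

```python
def phred_to_error_probability(q):
    """Return the probability that a base call with Phred score q is wrong.

    Standard Phred definition: P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


# The three scores mentioned in the caption.
for q in (10, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: {p:.1%} chance of a wrong call")
```

Run as written, the loop reports 10.0%, 1.0%, and 0.1% for Q10, Q20, and Q30, matching the caption.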
4
Figure 3 Sequencing quality in a Context-Specific Sequencing Error motif. Integrative Genomics Viewer screenshot of a .vcf file containing called variants (top track) and a .bam file containing the actual reads aligned to hg19 (middle track) for part of DSP exon 1 in a patient with arrhythmogenic right ventricular cardiomyopathy. Bases deviating from the hg19 reference are color coded. A false homozygous variant (insertion) was found in the .vcf file (top track). The middle track shows the actual reads from the .bam file. The presence of reads of variable length is atypical for amplicon-based assays. In addition, the many faint color-coded bases indicate the presence of erroneous, low-quality bases.
5
Figure 4 Nonrandom sequencing errors in the presumptive Context-Specific Sequencing Errors motif of DSP exon 1 from a patient with arrhythmogenic right ventricular cardiomyopathy. The HaloPlex reads were aligned to hg19 with bwa mem. The figure shows the chaotic results as visualized in the SAMtools mpileup format. The first line in each block gives the chromosome, the 1-based coordinate, the reference base, and the number of reads covering the site. The second line gives the read bases and the third line the base qualities. Dots and commas in the read-bases string represent reads carrying the reference nucleotide in the forward and reverse direction, respectively. Capital and lowercase letters represent reads carrying an alternative base in the forward and reverse direction, respectively. The third line shows the Phred quality scores of the bases, ie, Q scores, which are logarithmically linked to error probabilities and are encoded as ASCII characters whose code minus 33 gives the Q score. For example, the ASCII code of the character ' is 39. Hence, the corresponding base on the line above has a Q score of 39 – 33 = 6, which means that there is a 25% chance that it is a false call. Statistically, a “c” instead of “t” at chr6: (in the middle of the figure) should have a fair chance of being correct, despite the low Q scores. However, every “c” on that line represents a false call.
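The mpileup conventions described in the Figure 4 caption can be illustrated with a short Python sketch. It is not the authors' code (that is in Supplemental Code S1); the function names are illustrative, and the example line, including the contig name chrN and position 100, is invented for demonstration and does not reproduce data from the figure.

```python
from collections import Counter


def decode_qualities(qual_string, offset=33):
    """Convert an ASCII-encoded quality string to Q scores (code minus 33)."""
    return [ord(ch) - offset for ch in qual_string]


def count_read_bases(bases):
    """Tally the symbols in an mpileup read-bases string."""
    counts = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":            # read start; the next character is a mapping quality
            i += 2
            continue
        if ch == "$":            # read end marker, carries no base
            i += 1
            continue
        if ch in "+-":           # indel: sign, length digits, then that many bases
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            counts["insertion" if ch == "+" else "deletion"] += 1
            i = j + int(bases[i + 1:j])
            continue
        if ch == ".":
            counts["ref_forward"] += 1
        elif ch == ",":
            counts["ref_reverse"] += 1
        elif ch.isupper():
            counts["alt_forward_" + ch] += 1
        elif ch.islower():
            counts["alt_reverse_" + ch] += 1
        else:                    # e.g. '*' placeholders for deleted bases
            counts["other"] += 1
        i += 1
    return counts


# Invented example line: chromosome, position, reference base, depth, bases, qualities.
line = "chrN\t100\tT\t6\t.,c,.c\tII'I5'"
chrom, pos, ref, depth, bases, quals = line.split("\t")[:6]

q_scores = decode_qualities(quals)
print(q_scores)                      # [40, 40, 6, 40, 20, 6]
print(count_read_bases(bases))
print(f"Q6 error probability: {10 ** (-6 / 10):.0%}")
```

In this invented line, both reverse-strand "c" calls carry the character ', that is Q6, which corresponds to roughly a 25% error probability per base, as in the caption's worked example.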