High Throughput Sequencing

1 Tutorial 6: High Throughput Sequencing

2 HTS tools and analysis
Visualization: IGV
Analysis platform: Galaxy
Tuning up the pipelines

3 Working with IGV









12 Why and how to work with IGV

13 Base qualities, comparison between samples

14 False positive indels

15 Same mapping statistics – different meaning
What might cause this low percentage of mapping?

16 The sample contains a high percentage of contamination
The sample is very different from the reference genome
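The mapping percentage discussed on the last two slides can be recomputed directly from the SAM records, which helps when checking what a summary statistic really counts. The following is a minimal sketch, not the tutorial's own tool: it assumes plain-text SAM input, and the function name `mapping_rate` and the example records are made up for illustration. Only the flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM format.

```python
def mapping_rate(sam_lines):
    """Return (mapped, total, percent) over primary SAM records.

    Hypothetical helper for illustration; not part of any library.
    """
    UNMAPPED = 0x4         # SAM flag bit: read is unmapped
    SECONDARY = 0x100      # SAM flag bit: secondary alignment
    SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment
    mapped = total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read only once
        total += 1
        if not (flag & UNMAPPED):
            mapped += 1
    percent = 100.0 * mapped / total if total else 0.0
    return mapped, total, percent


# Made-up records: two mapped reads, one unmapped read, one secondary hit.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t16\tchr1\t20\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
    "r1\t256\tchr1\t500\t0\t5M\t*\t0\t0\tACGTA\tIIIII",
]
mapped, total, percent = mapping_rate(example)
print(f"{mapped}/{total} reads mapped ({percent:.1f}%)")
```

The number alone does not say whether a low value comes from contamination or from a divergent reference; that still needs a look at the unmapped reads themselves.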

17 One image is worth a thousand words…

18 Structural Variations
Large deletion in the sample compared to the reference genome

19 Galaxy


21 Use your account name and password to log in to Galaxy:

22 Uploading data to Galaxy






28 Mapping, filtering and conversion to BAM

29 Mapping

30 Filter SAM file
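To make the filtering step concrete, here is a rough sketch of what a SAM filter does: it drops records by flag bits and by mapping quality. This is not the Galaxy tool itself. The function name `filter_sam`, the default MAPQ cutoff of 20 and the example records are assumptions chosen for illustration; the flag bits (0x4 unmapped, 0x400 duplicate) are from the SAM format.

```python
def filter_sam(sam_lines, min_mapq=20, drop_flags=0x4 | 0x400):
    """Yield header lines plus records that pass the flag and MAPQ filters.

    Hypothetical helper; the defaults are illustrative, not recommended values.
    """
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # always keep the header
            continue
        fields = line.split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & drop_flags:
            continue  # unmapped or duplicate (by default)
        if mapq < min_mapq:
            continue  # ambiguous or low-confidence placement
        yield line


# Made-up records for illustration.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",     # kept
    "r2\t0\tchr1\t12\t3\t5M\t*\t0\t0\tACGTA\tIIIII",      # low MAPQ
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",           # unmapped
    "r4\t1024\tchr1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",  # duplicate
]
kept = list(filter_sam(example))
print([line.split("\t")[0] for line in kept])
```

Passing `drop_flags=0x4` keeps duplicates, and lowering `min_mapq` lets non-unique mappings through.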

31 Convert SAM to BAM

32 Variant calling

33 Create pileup
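A pileup is a per-position summary of the bases that the aligned reads place on the reference. The sketch below shows the idea on a toy scale. It assumes ungapped alignments given as (1-based start, sequence) pairs, which real pileup tools do not require; `build_pileup` and the reads are invented for illustration.

```python
from collections import Counter, defaultdict


def build_pileup(reads):
    """Count observed bases per reference position.

    reads: iterable of (start, sequence) with 1-based start and no gaps
    (a simplifying assumption; real alignments carry CIGAR strings).
    """
    pileup = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1
    return pileup


# Three made-up overlapping reads.
reads = [(1, "ACGT"), (2, "CGTA"), (3, "TTAC")]
pileup = build_pileup(reads)
for pos in sorted(pileup):
    print(pos, dict(pileup[pos]))
```

Position 3 ends up with two different bases, which is the kind of column a variant caller then has to interpret.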

34 Find variants
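Finding variants from a pileup comes down to comparing the observed bases at each position with the reference base and applying thresholds. The following is a deliberately naive sketch under stated assumptions: it handles single-base substitutions only, and `call_variants`, `min_depth` and `min_alt_fraction` are hypothetical names with illustrative defaults, not the parameters of any particular caller.

```python
def call_variants(pileup, reference, min_depth=4, min_alt_fraction=0.25):
    """Return (pos, ref, alt, depth, alt_fraction) for candidate substitutions.

    pileup: {pos: {base: count}}; reference: {pos: base}.
    Thresholds are illustrative assumptions, not recommended values.
    """
    calls = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little coverage to say anything
        ref = reference[pos]
        alts = {base: n for base, n in counts.items() if base != ref}
        if not alts:
            continue  # every read agrees with the reference
        alt, alt_count = max(alts.items(), key=lambda item: item[1])
        fraction = alt_count / depth
        if fraction >= min_alt_fraction:
            calls.append((pos, ref, alt, depth, fraction))
    return calls


# Made-up pileup counts and reference bases.
pileup = {
    100: {"A": 10},          # matches the reference
    101: {"C": 5, "T": 5},   # looks heterozygous
    102: {"G": 19, "A": 1},  # probably a sequencing error
    103: {"T": 2, "C": 1},   # low coverage
}
reference = {100: "A", 101: "C", 102: "G", 103: "T"}

strict = call_variants(pileup, reference)
relaxed = call_variants(pileup, reference, min_depth=2, min_alt_fraction=0.04)
print("strict: ", [c[0] for c in strict])
print("relaxed:", [c[0] for c in relaxed])
```

The strict settings report only position 101, while the relaxed settings also report 102 and 103: more sensitivity, at the price of more likely false positives.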

35 Tuning up the pipelines

36 How can mapping parameters affect the results?
1 mismatch per read vs. 5 mismatches per read
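The effect of the mismatch limit can be illustrated with a toy example. The sketch below assumes each read already comes with a candidate position and simply counts mismatches against the reference (a Hamming distance); real mappers search for positions and also handle gaps. The names `count_mismatches` and `mappable`, the reference string and the reads are all made up for illustration.

```python
def count_mismatches(read, reference, pos):
    """Hamming distance between a read and the reference window at 0-based pos."""
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    return sum(1 for a, b in zip(read, window) if a != b)


def mappable(reads, reference, max_mismatches):
    """Names of reads whose mismatch count is within the allowed limit."""
    return [
        name
        for name, pos, seq in reads
        if count_mismatches(seq, reference, pos) <= max_mismatches
    ]


# Made-up reference and reads: (name, 0-based candidate position, sequence).
reference = "ACGTACGTACGTACGTACGT"
reads = [
    ("exact", 0, "ACGTACGTAC"),      # 0 mismatches
    ("one_snp", 4, "ACGTTCGTAC"),    # 1 mismatch
    ("divergent", 2, "GTTCCTTCAT"),  # 4 mismatches
]

print("max 1 mismatch:  ", mappable(reads, reference, 1))
print("max 5 mismatches:", mappable(reads, reference, 5))
```

With a limit of 1, the divergent read is lost, along with any true variants it carries. With a limit of 5 it maps, but so would reads that belong elsewhere in the genome.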

37 False positives vs. true negatives
One pipeline for all projects?
Example: 3-base insertion

38 How can you tune your analysis?
Try different programs.
Mapping:
  Change mapping parameters
  Use non-unique mappings
  Don’t filter duplicates
Variants:
  Change variant filtration
  Change variant merging (penetrance, different heredity, low coverage in one individual…)
  Look for bigger variants: big insertions/deletions, inversions, copy number variations, etc.
Gene expression:
  Change the test threshold

