Canadian Bioinformatics Workshops


1 Canadian Bioinformatics Workshops www.bioinformatics.ca

2 Module #: Title of Module

3 Module 4 Visual Analysis of HT-seq data

4 Module 4 bioinformatics.ca Learning Objectives of Module
– to appreciate the different data viz tools in genomics
– to know when to use a particular tool
– to gain more experience with genome browsers
– to become an expert in variation inspection (single nucleotide and structural variants)
– to become familiar with next-gen variant analysis tools

5 Module 4 bioinformatics.ca Organization
Part I (9:00-10:30)
– genome browsers
– visualizing single nucleotide and structural variants
Part II (11:00-12:30)
– variant search engines
– finding disease-causing genetic mutations

6 Module 4 bioinformatics.ca Part I : browsing HT-seq data, inspecting variants

7 Module 4 bioinformatics.ca Why visualize our data?

8 Module 4 bioinformatics.ca Anscombe’s quartet: each of these four datasets has nearly the same mean and variance, yet they look very different when plotted
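The claim on slide 8 can be checked numerically. The sketch below is not part of the original deck; it hard-codes the published Anscombe (1973) values and uses only the Python standard library. The helper name `summarize` is illustrative.

```python
from statistics import mean, variance

# Anscombe's quartet: datasets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return (mean x, sample variance x, mean y, sample variance y), rounded to 2 places."""
    return tuple(round(v, 2) for v in (mean(xs), variance(xs), mean(ys), variance(ys)))


summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, stats)
```

The four summary tuples agree to within rounding, which is why summary statistics alone cannot tell these datasets apart and a plot is needed.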

9 Module 4 bioinformatics.ca Preattentive processing: when data is encoded properly, outliers are easily identified

10 Module 4 bioinformatics.ca Preattentive processing (video)

11 Module 4 bioinformatics.ca Why visualize?
the human visual system is a low-cost* and high-performance
– sense maker, to identify patterns
– debugger, to identify issues and outliers
* compared to the cost of writing, debugging, and running computational scripts

12 Module 4 bioinformatics.ca Visualization Tools in Genomics

13 Module 4 bioinformatics.ca Which tool to use?
there are over 40 different genome browsers; which one to use depends on
– task at hand
– kind and size of data
– data privacy

14 Module 4 bioinformatics.ca HT-seq Genome Browsers
– task at hand : visualizing HT-seq reads, especially good for inspecting previously identified variants
– kind and size of data : large BAM files, stored locally or remotely
– data privacy : run on the desktop, can keep all data private
tools: Integrative Genomics Viewer (IGV), Savant Genome Browser

15 Module 4 bioinformatics.ca You might also want to try
– new web technologies are being applied to make HT-seq data browsing more interactive
– UCSC Genome Browser has been retrofitted to display BAM files
– Trackster is a genome browser that can perform visual analytics on small windows of the genome, then deploy the full analysis with Galaxy
tools: UCSC Genome Browser, Trackster (part of Galaxy)

16 Module 4 bioinformatics.ca Savant desktop genome browser, designed for HT-seq data – emphasis on manually inspecting single nucleotide and structural variations

17 Module 4 bioinformatics.ca Review: structural variation detection (covered in Module 3)
two complementary approaches:
– depth of coverage (DOC)
– paired end mapping (PEM)
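As a rough illustration of the depth-of-coverage idea (not taken from the deck), the sketch below compares the mean read depth in a window against the expected depth for the sample. In a diploid genome, losing one copy roughly halves the depth and losing both copies drops it to near zero. The function name `doc_call` and the ratio thresholds are assumptions chosen for illustration; real callers also correct for GC content and mappability.

```python
def mean_depth(depths):
    """Average per-base read depth over a window."""
    return sum(depths) / len(depths)


def doc_call(window_depths, expected_depth, low=0.25, high=0.75, gain=1.25):
    """Label a window by its depth ratio; all thresholds are illustrative assumptions."""
    ratio = mean_depth(window_depths) / expected_depth
    if ratio < low:
        return "possible homozygous deletion"
    if ratio < high:
        return "possible heterozygous deletion"
    if ratio > gain:
        return "possible duplication"
    return "normal"


# Hypothetical windows from a sample sequenced to about 30x coverage.
print(doc_call([30] * 100, expected_depth=30))
print(doc_call([15] * 100, expected_depth=30))
```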

18 Module 4 bioinformatics.ca PEM: small insertions donor reference

19 Module 4 bioinformatics.ca PEM: large insertions donor reference

20 Module 4 bioinformatics.ca PEM: deletions reference donor

21 Module 4 bioinformatics.ca PEM: inversions reference donor one read inverted when mapped

22 Module 4 bioinformatics.ca PEM: tandem duplications reference donor order of read mappings reversed
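The PEM signatures on slides 18 to 22 can be summarized as a small decision rule. The sketch below is an illustration rather than the method used by Savant or any particular caller. It assumes a forward-reverse paired-end library (the left read maps to the + strand, the right read to the - strand), and the expected mapped distance and tolerance are hypothetical. Large insertions, where one read may fail to map at all, are not handled.

```python
from dataclasses import dataclass


@dataclass
class MappedPair:
    pos1: int      # reference start of the first read
    strand1: str   # "+" or "-"
    pos2: int      # reference start of the second read
    strand2: str


def classify_pair(pair, expected=300, tolerance=50):
    """Classify one read pair by orientation and mapped distance (illustrative thresholds)."""
    # Order the two mappings by reference position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "inversion"            # one read inverted when mapped
    if left_strand == "-" and right_strand == "+":
        return "tandem duplication"   # order of read mappings reversed
    distance = right_pos - left_pos
    if distance > expected + tolerance:
        return "deletion"             # reads map farther apart than expected
    if distance < expected - tolerance:
        return "insertion"            # reads map closer together than expected
    return "concordant"


print(classify_pair(MappedPair(1000, "+", 3000, "-")))
```

Single discordant pairs are noisy, so a practical caller would require several pairs supporting the same event before reporting it.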

23 Module 4 bioinformatics.ca Structural Variants in Savant
Savant has a visualization mode for BAM files called “Matepair (Arc)” that is specialized for identifying structural variants using the PEM methodology
it connects the locations of paired mappings by an arc
– arc height represents the mapped distance
– arc color represents the relative orientation of the reads (for complex rearrangements, like inversions)

24 Module 4 bioinformatics.ca Savant demo

25 Module 4 bioinformatics.ca Lab Time

26 Module 4 bioinformatics.ca We are on a Coffee Break & Networking Session

27 Canadian Bioinformatics Workshops www.bioinformatics.ca

28 Module #: Title of Module

29 Module 4 Visual Analysis of HT-seq data

30 Module 4 bioinformatics.ca Quiz for Module 4 Part I

31 Module 4 bioinformatics.ca Question 1 which visualization mode in Savant is best for finding SNPs? why?

32 Module 4 bioinformatics.ca Question 2 which visualization mode in Savant is best for finding structural variations? why?

33 Module 4 bioinformatics.ca Question 3 what kind of event does this image depict? e.g. chr1: 5,195,017 - 5,199,144

34 Module 4 bioinformatics.ca A: INSERTION donor reference

35 Module 4 bioinformatics.ca Question 4 what kind of event does this image depict? chr1: 26,489,321 - 26,490,661

36 Module 4 bioinformatics.ca A: DELETION reference donor

37 Module 4 bioinformatics.ca Question 5 what would a heterozygous deletion look like? chr1: 31,574,172 - 31,578,242

38 Module 4 bioinformatics.ca Question 6 what kind of event does this image depict? chr1: 81,659,802 - 81,661,916

39 Module 4 bioinformatics.ca A: Inversion reference donor one read inverted when mapped

40 Module 4 bioinformatics.ca Question 7 what kind of event does this image depict? chr1: 11,050,416 - 11,055,457

41 Module 4 bioinformatics.ca A: Tandem Duplication reference donor order of read mappings reversed

42 Module 4 bioinformatics.ca Part II : visual analytics for variants
– this is bonus material, covered if time permits
– contact mfiume@cs.toronto.edu for questions

43 Module 4 bioinformatics.ca Genetic Variant Analysis
finding a disease-causing genetic mutation is “like trying to find a needle in a needlestack” (rather than a haystack)
lots of variants, many distractors
– many false positives: errors in sequencing, errors in variant prediction
– most true positives are not causal: not related to phenotype of interest, not damaging

44 Module 4 bioinformatics.ca Genetic Variant Analysis
filter variants based on quality, effect, and relevance to disease
pipeline: variant calling → annotation → filtration → visualization
(diagram labels: Modules 1-3, Module 4.1)
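The filtration step can be pictured as three successive tests on each annotated variant: quality, effect, and relevance to disease. The sketch below is a hypothetical illustration, not MedSavant's implementation. The `Variant` fields, the quality cutoff, the set of damaging effects, and the gene names are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    qual: float   # variant call quality score
    effect: str   # predicted effect from annotation
    gene: str


# Hypothetical set of effects treated as potentially damaging.
DAMAGING = {"missense", "nonsense", "frameshift"}


def filter_variants(variants, min_qual=30.0, damaging_effects=DAMAGING, candidate_genes=None):
    """Keep variants that pass quality, effect, and (optionally) gene relevance filters."""
    kept = []
    for v in variants:
        if v.qual < min_qual:
            continue  # likely a sequencing or calling error
        if v.effect not in damaging_effects:
            continue  # not predicted to be damaging
        if candidate_genes is not None and v.gene not in candidate_genes:
            continue  # not relevant to the phenotype of interest
        kept.append(v)
    return kept


# Made-up example records; gene names are placeholders.
variants = [
    Variant("chr1", 1000, 12.0, "missense", "GENE_A"),
    Variant("chr1", 2000, 55.0, "synonymous", "GENE_A"),
    Variant("chr1", 3000, 60.0, "nonsense", "GENE_B"),
    Variant("chr1", 4000, 48.0, "frameshift", "GENE_A"),
]
shortlist = filter_variants(variants, candidate_genes={"GENE_A"})
print(shortlist)
```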

45 Module 4 bioinformatics.ca Existing Tools
– command-line is powerful but not interactive
– Excel / Genome Browsers are interactive but not powerful

46 Module 4 bioinformatics.ca chr1 : 102,435,394 – 129,485,349 GO

47 Module 4 bioinformatics.ca MedSavant, a variant search engine

48 Module 4 bioinformatics.ca MedSavant
visual analytics from variant calling to disease mutation discovery
pipeline: variant calling → annotation → filtration → visualization
(diagram label: MedSavant)

49 Module 4 bioinformatics.ca MedSavant demo

50 Module 4 bioinformatics.ca You might also want to try
– VarSifter works in memory, good for small projects
– this space is evolving; difficult to do a comprehensive comparison
– much more commercial activity compared to genome browsers
tools: VarSifter, Golden Helix SVS (commercial)

51 Module 4 bioinformatics.ca We are on a Coffee Break & Networking Session

