CS/BioE 598AGB: Genome Assembly, part II Tandy Warnow.


1 CS/BioE 598AGB: Genome Assembly, part II Tandy Warnow

2 Nature Biotechnology, volume 29, number 11, November 2011

3 Supplementary Figure 1. De Bruijn graph from reads with sequencing errors. (a) A de Bruijn graph E on our set of reads with k = 4. Finding an Eulerian cycle is already a straightforward task, but for this value of k, it is trivial. (b) If TGGAGTG is incorrectly sequenced as a sixth read (in addition to the correct TGGCGTG read), then the result is a bulge in the de Bruijn graph, which complicates assembly. (Supplementary materials from the Compeau, Pevzner, and Tesler paper, Nature Biotech, 2011)
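To make the bulge in panel (b) concrete, here is a minimal Python sketch that builds a de Bruijn graph with k = 4 from just the two reads named on the slide. The function names (`kmers`, `de_bruijn_edges`, `adjacency`) are hypothetical, and the other reads from the figure are not reproduced in this transcript, so they are left out.

```python
from collections import defaultdict

def kmers(read, k):
    """Return every length-k substring of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def de_bruijn_edges(reads, k):
    """Count each k-mer; every distinct k-mer is one edge of the graph."""
    counts = defaultdict(int)
    for read in reads:
        for kmer in kmers(read, k):
            counts[kmer] += 1
    return dict(counts)

def adjacency(edge_counts):
    """Each k-mer edge runs from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    adj = defaultdict(set)
    for kmer in edge_counts:
        adj[kmer[:-1]].add(kmer[1:])
    return {node: sorted(successors) for node, successors in adj.items()}

# Only the two reads quoted on the slide (assumption: the rest are omitted).
reads = ["TGGCGTG", "TGGAGTG"]
graph = adjacency(de_bruijn_edges(reads, 4))

# Nodes with more than one successor mark where a bulge opens.
branching = [node for node, successors in graph.items() if len(successors) > 1]
print(graph["TGG"])
print(branching)
```

With these two reads, the node TGG has two outgoing edges (towards GGC and GGA), and both branches rejoin at GTG. These two parallel paths between the same pair of nodes are the bulge.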

4 (c) An illustration of a de Bruijn graph E with many bulges. The process of bulge removal should leave only the red edges remaining, yielding an Eulerian path in the resulting graph. (Supplementary materials from the Compeau, Pevzner, and Tesler paper, Nature Biotech, 2011)
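Once the bulges are gone, the remaining edges can be traversed as an Eulerian path. The sketch below uses Hierholzer's algorithm, a standard choice that the slides do not name, and assumes the input graph really has an Eulerian path. The names `eulerian_path` and `spell` are hypothetical, and the input is the error-free read TGGCGTG from the previous slide, cut into 4-mers.

```python
from collections import defaultdict

def eulerian_path(edges):
    """Return the node sequence of an Eulerian path over directed edges.

    Assumes such a path exists; no validity check is performed.
    """
    adj = defaultdict(list)
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for u, v in edges:
        adj[u].append(v)
        out_deg[u] += 1
        in_deg[v] += 1

    # Start where out-degree exceeds in-degree by one, if such a node exists.
    start = edges[0][0]
    for node in list(adj):
        if out_deg[node] - in_deg[node] == 1:
            start = node
            break

    # Hierholzer: follow unused edges, backtrack when stuck.
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if adj[u]:
            stack.append(adj[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]

def spell(path):
    """Reconstruct the sequence spelled by a path of overlapping (k-1)-mers."""
    return path[0] + "".join(node[-1] for node in path[1:])

# 4-mers of the correctly sequenced read from the slide.
four_mers = ["TGGC", "GGCG", "GCGT", "CGTG"]
edges = [(kmer[:-1], kmer[1:]) for kmer in four_mers]
path = eulerian_path(edges)
assembled = spell(path)
print(assembled)
```

Each k-mer is used exactly once along the path, which is why bulges (extra edges caused by sequencing errors) have to be removed first.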

5 (Supplementary materials from the Compeau, Pevzner, and Tesler paper, Nature Biotech, 2011)


10 N50: The N50 value is the size of the smallest contig (or scaffold) such that 50% of the genome is contained in contigs of size N50 or larger. This is the standard metric used to evaluate the quality of an assembly. Salzberg et al. computed “corrected N50” values by splitting contigs (or scaffolds) where errors are identified.
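The definition above can be sketched in a few lines of Python. The function name `n50` and the contig lengths are made up for illustration. When the true genome size is unknown, this sketch falls back to the total assembly length as the denominator, which is an assumption on top of the slide's definition.

```python
def n50(lengths, genome_size=None):
    """Smallest contig length L such that contigs of length >= L
    cover at least half of the genome.

    If genome_size is None, the sum of the contig lengths is used
    in its place (an assumption, not part of the slide's definition).
    Returns 0 if the contigs never reach half of the genome.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0

# Hypothetical contig lengths, for illustration only.
contigs = [80, 70, 50, 40, 30, 20]
print(n50(contigs))

# "Corrected" variant in the spirit of the slide: split a contig at a
# detected error (here the 80 is assumed to break into 50 + 30).
corrected = [50, 30, 70, 50, 40, 30, 20]
print(n50(corrected))
```

In this made-up example the total length is 290, so the two largest contigs (80 + 70 = 150) already cover half and N50 is 70. After the split, three contigs are needed (70 + 50 + 50 = 170) and the corrected value drops to 50.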


22 From Mihai Pop’s paper

23 Differing Conclusions Compeau et al.: “De Bruijn graphs are not a cure-all… Short read sequencing technologies … favor the use of de Bruijn graphs... and are also well suited to representing genomes with repeats. However, if a future sequencing technology produces high quality reads with tens of thousands of bases, …, the pendulum could swing back toward favoring overlap-based approaches for assembly.”

24 Mihai Pop’s conclusion

25 Salzberg’s conclusions


