The past, present, and future of DNA sequencing Dan Russell.


1

2 The past, present, and future of DNA sequencing Dan Russell

3 Overview Prologue: Assembly and Finishing The Past: Sanger The Present: Next-Gen (454, Illumina, …) The Future: ? (Nanopore, MinION, Single-molecule)

4 Overview Prologue: Assembly and Finishing The Past: Sanger The Present: Next-Gen (454, Illumina, …) The Future: ? (Nanopore, MinION, Single-molecule)

5 Method and Read Length: Sanger bp bp; Illumina ~100 bp; Ion Torrent ~200 bp. But… Phage Genome: 30,000 to 500,000 bp; Bacteria: Several million bp; Human: 3 billion bp
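The mismatch between read length and genome size can be put in numbers. The following is a minimal sketch, not part of the original slides: `reads_needed` is a hypothetical helper, and the 30x coverage figure is an assumed example value chosen only for illustration.

```python
import math

def reads_needed(genome_bp, read_bp, coverage=1.0):
    """Reads required so that total sequenced bases = coverage * genome size."""
    return math.ceil(coverage * genome_bp / read_bp)

# Slide figures: ~100 bp Illumina reads, phage genomes of 30,000 to 500,000 bp.
# coverage=30 is an assumed example depth, not a number from the slides.
for genome_bp in (30_000, 500_000):
    single_pass = reads_needed(genome_bp, 100)
    at_30x = reads_needed(genome_bp, 100, coverage=30)
    print(genome_bp, single_pass, at_30x)
```

Even a single pass over the smallest phage genome takes hundreds of reads, which is why the reads have to be reassembled by software.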

6 Shotgun Genome Sequencing: Complete genome copies → Fragmented genome chunks

7 Shotgun Genome Sequencing: Fragmented genome chunks. NOT REALLY DONE BY DUCK HUNTERS. Hydroshearing, sonication, enzymatic shearing

8 Assembly, aka All the King's horses and all the King's men…
ATTGTTCCCACAGACCG
CGGCGAAGCATTGTTCC
ACCGTGTTTTCCGACCG
TTTCCGACCGAAATGGC
TTGTTCCCACAGACCGTG
AGCTCGATGCCGGCGAAG
ATGCCGGCGAAGCATTGT
TAATGCGACCTCGATGCC
ACAGACCGTGTTTCCCGA
AAGCATTGTTCCCACAG
TGTTTTCCGACCGAAAT
CCGACCGAAATGGCTCC
TGCCGGCGAAGCCTTGT
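The idea of putting overlapping reads back together can be sketched with a toy greedy assembler. This is an illustration only: it assumes error-free reads and no repeats, and it is not how phredPhrap, Newbler, or velvet actually work. The function names, the `min_len` threshold, and the three short reads are all made up for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more: the remaining contigs are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

# Toy reads invented for this example (not taken from the slide).
toy_reads = ["AGCTCGATGC", "GATGCCGGCG", "CGGCGAAGCA"]
print(greedy_assemble(toy_reads))
```

When no pair overlaps, the function returns more than one contig, which is the "GAP!" situation that finishing has to resolve.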

9 Dan's recommended assemblers
Your Sequencing Technology: Recommended Assembler
Sanger: phredPhrap
Ion Torrent/454: Newbler
Illumina: velvet
REGARDLESS OF ASSEMBLY PROGRAM, I'D RECOMMEND USING CONSED FOR FINISHING!

10 Some words have special meanings in scientific context: THEORY, FINISH. Special use of the word "finish": before annotation, phage genomes should be sequenced AND finished.

11 What is finishing?

12 What is finishing? When we put all the reads back together this time: GAP! But now we at least know the sequence on each side, so we can design primers to run a sequencing reaction towards the gap, and hopefully connect our contigs.

13

14 A combination of computer and wet-bench work to ensure that the entire genome sequence is present and that all bases are high quality.

15 From DNA to Annotatable Sequence
1. Shotgun sequencing to generate reads
2. Assembly of reads
3. Identification of weak areas
4. Targeted sequencing runs to fix
5. Verification of finished sequence
6. Generation of final fasta file
Done for all phages sequenced at Pitt
Done by most independent seq facilities
NOT DONE by most seq facilities

16 From DNA to Annotatable Sequence
1. Shotgun sequencing to generate reads
2. Assembly of reads
3. Identification of weak areas
4. Targeted sequencing runs to fix
5. Verification of finished sequence
6. Generation of final fasta file
Done for all phages sequenced at Pitt
Done by most independent seq facilities
NOT DONE by most seq facilities = FINISHING
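Step 3, identification of weak areas, can be illustrated with a small scan over per-base read depth. This sketch assumes that "weak" simply means low coverage. In practice, finishing tools such as Consed also weigh base quality, so treat the `weak_regions` helper and its `min_depth` threshold as hypothetical.

```python
def weak_regions(depths, min_depth=3):
    """Return (start, end) pairs, 0-based and end-exclusive, where depth < min_depth."""
    regions = []
    start = None
    for pos, depth in enumerate(depths):
        if depth < min_depth and start is None:
            start = pos                    # a weak stretch begins here
        elif depth >= min_depth and start is not None:
            regions.append((start, pos))   # the weak stretch ended just before pos
            start = None
    if start is not None:
        regions.append((start, len(depths)))  # weak stretch runs to the end
    return regions

# Made-up depth values, one per base position of a contig.
print(weak_regions([5, 5, 2, 1, 6, 7, 0]))
```

Each reported interval is a candidate for step 4, a targeted sequencing run with a primer designed from the flanking high-quality sequence.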

17 Overview Prologue: Assembly and Finishing The Past: Sanger The Present: Next-Gen (454, Illumina, …) The Future: ? (Nanopore, MinION, Single-molecule)

18 Fragments were cloned:

19

20 Sanger Sequencing Reactions
For given template DNA, it's like PCR except:
Uses only a single primer and polymerase to make new ssDNA pieces.
Includes regular nucleotides (A, C, G, T) for extension, but also includes dideoxy nucleotides.
Regular Nucleotides: A, C, G, T
Dideoxy Nucleotides: A, C, G, T (1. Labeled 2. Terminators)

21 Sanger Sequencing
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Primer being extended: GTCTTGGGCT

22 Sanger Sequencing
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Primer being extended: GTCTTGGGCTAGCGC
Terminated fragments:
5' TGCGCGGCCCA GTCTTGGGCT (21 bp)

23 Sanger Sequencing
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Primer being extended: GTCTTGGGCTA
Terminated fragments:
5' TGCGCGGCCCA GTCTTGGGCT (21 bp)
5' TGCGCGGCCCA GTCTTGGGCTAGCGC (26 bp)

24 Sanger Sequencing
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Primer being extended: G
Terminated fragments:
5' TGCGCGGCCCA GTCTTGGGCT (21 bp)
5' TGCGCGGCCCA GTCTTGGGCTAGCGC (26 bp)
5' TGCGCGGCCCA GTCTTGGGCTA (22 bp)

25 Sanger Sequencing
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Primer being extended: GTCTTGGGC
Terminated fragments:
5' TGCGCGGCCCA GTCTTGGGCT (21 bp)
5' TGCGCGGCCCA GTCTTGGGCTAGCGC (26 bp)
5' TGCGCGGCCCA GTCTTGGGCTA (22 bp)
5' TGCGCGGCCCA G (12 bp)

26 Sanger Sequencing
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Primer being extended: GTCTT
Terminated fragments:
5' TGCGCGGCCCA GTCTTGGGCT (21 bp)
5' TGCGCGGCCCA GTCTTGGGCTAGCGC (26 bp)
5' TGCGCGGCCCA GTCTTGGGCTA (22 bp)
5' TGCGCGGCCCA G (12 bp)
5' TGCGCGGCCCA GTCTTGGGC (20 bp)

27 Sanger Sequencing
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Terminated fragments:
5' TGCGCGGCCCA GTCTTGGGCT (21 bp)
5' TGCGCGGCCCA GTCTTGGGCTAGCGC (26 bp)
5' TGCGCGGCCCA GTCTTGGGCTA (22 bp)
5' TGCGCGGCCCA G (12 bp)
5' TGCGCGGCCCA GTCTTGGGC (20 bp)
5' TGCGCGGCCCA GTCTT (16 bp)

28 Sanger Sequencing
Template: 3' ACGCGCCGGGT??????????????? 5'
Terminated fragments:
5' TGCGCGGCCCA GTCTTGGGCT (21 bp)
5' TGCGCGGCCCA GTCTTGGGCTAGCGC (26 bp)
5' TGCGCGGCCCA GTCTTGGGCTA (22 bp)
5' TGCGCGGCCCA G (12 bp)
5' TGCGCGGCCCA GTCTTGGGC (20 bp)
5' TGCGCGGCCCA GTCTT (16 bp)

29 Sanger Sequencing
Fragments separated by size and passed through the Laser Reader:
5' TGCGCGGCCCA G (12 bp)
5' TGCGCGGCCCA GT (13 bp)
5' TGCGCGGCCCA GTC (14 bp)
5' TGCGCGGCCCA GTCT (15 bp)
5' TGCGCGGCCCA GTCTT (16 bp)
5' TGCGCGGCCCA GTCTTG (17 bp)
5' TGCGCGGCCCA GTCTTGG (18 bp)
5' TGCGCGGCCCA GTCTTGGG (19 bp)
5' TGCGCGGCCCA GTCTTGGGC (20 bp)
5' TGCGCGGCCCA GTCTTGGGCT (21 bp)
5' TGCGCGGCCCA GTCTTGGGCTA (22 bp)
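The chain-termination logic in the Sanger slides can be reproduced in a few lines. This is a simplified sketch: it assumes exactly one terminated fragment per possible stop position (a real reaction contains many copies of each), and the name `sanger_fragments` is invented for the example. The template and primer are the ones shown on the slides.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sanger_fragments(template_3to5, primer_5to3):
    """Return (length, terminal base) for each terminated fragment, shortest first."""
    # The strand polymerase would build, written 5'->3', is the complement of the
    # template read 3'->5'.
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    if not new_strand.startswith(primer_5to3):
        raise ValueError("primer does not anneal at the 3' end of the template")
    first_stop = len(primer_5to3) + 1
    # One fragment for each position where a labeled dideoxy terminator could land.
    fragments = [new_strand[:end] for end in range(first_stop, len(new_strand) + 1)]
    random.shuffle(fragments)   # the reaction tube holds them in no particular order
    fragments.sort(key=len)     # the capillary separates them by size
    return [(len(frag), frag[-1]) for frag in fragments]

frags = sanger_fragments("ACGCGCCGGGTCAGAACCCGATCGCG", "TGCGCGGCCCA")
read = "".join(base for _, base in frags)
print(frags[0], frags[-1], read)
```

Reading the labeled terminal base of each fragment in size order recovers the unknown sequence downstream of the primer, which is what the laser reader does.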

30 Sanger Sequencing Output Each sequencing reaction gives us a chromatogram, usually ~ bp:

31 Sanger Throughput Limitations
Must have 1 colony picked for every 2 reactions
Must have 1 PCR tube for each reaction
Must have 1 capillary for each reaction
[Chart from The Economist, annotated: Improvements in cost from making Sanger higher throughput; Improvements in cost from Next-Gen sequencing technologies]

32 Overview Prologue: Assembly and Finishing The Past: Sanger The Present: Next-Gen (454, Illumina, …) The Future: ? (Nanopore, MinION, Single-molecule)

33 Shotgun sequencing by Ion Torrent Personal Genome Machine and 454

34 Genomic Fragment Adapters Shotgun sequencing by PGM/454

35 Genomic Fragment Barcode

36 Shotgun sequencing by PGM/454

37 Bead/ISP Adapter Complement Sequences The idea is that each bead should be amplified all over with a SINGLE library fragment.

38 Shotgun sequencing by PGM/454

39

40

41

42

43

44

45

46

47

48

49 ~3.5 µm for Ion Torrent, ~30 µm for 454

50 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
Nucleotide supplied: T

51 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
Nucleotide supplied: C

52 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
Nucleotide supplied: A

53 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
Nucleotide supplied: G

54 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: G
Nucleotide supplied: T

55 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: GT
Nucleotide supplied: C

56 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: GTC
Nucleotide supplied: A

57 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: GTC
Nucleotide supplied: G

58 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: GTC
Nucleotide supplied: T

59 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: GTCT
Nucleotide supplied: C

60 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: GTCT
Nucleotide supplied: A

61 Shotgun sequencing by PGM/454
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: GTCT
Nucleotide supplied: G
The real power of this method is that it can take place in millions of tiny wells in a single plate at once.

62 Raw 454 data
Template: 3' ACGCGCCGGGTCAGAACCCGATCGCG 5'
Primer: 5' TGCGCGGCCCA
Only give polymerase one nucleotide at a time. If that nucleotide is incorporated, enzymes turn by-products into light.
Flow order: T C A G T C A G T C A G
New strand so far: GTCT
Nucleotide supplied: G
The real power of this method is that it can take place in millions of tiny wells in a single plate at once.
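The one-nucleotide-at-a-time scheme on these slides can be modeled as a flowgram: for each flow, record how many bases were incorporated. This is an idealized sketch that assumes the signal is exactly proportional to the length of the run incorporated, with no noise. The `flowgram` function is a made-up name, and the strand is the complement of the slide's template downstream of the primer.

```python
from itertools import cycle, islice

def flowgram(new_strand, flow_order="TCAG", n_flows=12):
    """Return (nucleotide, bases incorporated) for each flow, in flow order."""
    pos = 0
    signals = []
    for nuc in islice(cycle(flow_order), n_flows):
        run = 0
        # Polymerase keeps adding the supplied nucleotide while it matches the
        # next base to be synthesized, so a homopolymer run goes in during one flow.
        while pos < len(new_strand) and new_strand[pos] == nuc:
            run += 1
            pos += 1
        signals.append((nuc, run))
    return signals

# Bases synthesized after the primer for the template shown on the slides.
for nuc, count in flowgram("GTCTTGGGCTAGCGC"):
    print(nuc, count)
```

A flow with zero incorporation gives no signal, and a run of identical bases gives one larger signal rather than several separate ones.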

63 Ion Torrent Sequencing

64 Illumina Sequencing

65 Next-Gen Sequencing Take home message: Massively Parallel. 1,000 monkeys at 1,000 typewriters is nothing. We're talking 100,000 to 100 million concurrent reads.

66 Overview Prologue: Assembly and Finishing The Past: Sanger The Present: Next-Gen (454, Illumina, …) The Future: ? (Nanopore, MinION, Single-molecule)

67 Largely because of PHIRE and SEA-PHAGES…

68 DNA Sequencing over Time from The Economist

69

70 Single Molecule Sequencing

71

72 MinION (Wired): "The MinION has been used to successfully read the genome of a lambda bacteriophage, which has 48,500-ish base pairs, twice during one pass. That's impressive, because reading 100,000 base pairs during a single DNA capture has never been managed before using traditional sequencing techniques. The operational life of the MinION is only about six hours, but during that time it can read more than 150 million base pairs. That's somewhat short of the larger human chromosomes (which contain up to 250 million base pairs), but Oxford Nanopore has also introduced GridION -- a platform where multiple cartridges can be clustered together. The company reckon that a 20-node GridION setup can sequence a complete human genome in just 15 minutes."
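The throughput figures in the Wired quote can be turned into rates with simple arithmetic. The sketch below uses only the quoted numbers plus the 3 billion bp human genome size from the read-length slide. It assumes a single pass over the genome, which is an assumption of this example and not something the quote states.

```python
# Figures quoted for the MinION.
MINION_BASES = 150_000_000        # "more than 150 million base pairs"
MINION_SECONDS = 6 * 3600         # "about six hours"
minion_rate = MINION_BASES / MINION_SECONDS

# Figures quoted for the GridION claim.
HUMAN_GENOME_BP = 3_000_000_000   # "Human: 3 billion bp"
GRIDION_NODES = 20
GRIDION_SECONDS = 15 * 60         # "just 15 minutes"
gridion_rate_per_node = HUMAN_GENOME_BP / GRIDION_SECONDS / GRIDION_NODES

print(round(minion_rate), "bp/s per MinION")
print(round(gridion_rate_per_node), "bp/s per GridION node (single pass assumed)")
print(round(gridion_rate_per_node / minion_rate), "x difference")
```

Under these assumptions the GridION claim implies each node running roughly 24 times faster than the quoted MinION rate.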

73

74 Epilogue So should we really still be sequencing more mycobacteriophage genomes? We have 250+…

75 At the DNA level… Chimps vs. Humans: > 95% similar. Cluster A vs. Cluster B Mycobacteriophages: < 50% similar. …but that's just one pair of clusters, how many are there?

76 DNA Sequencing over Time from The Economist

77 Comparing Different Technologies: Sanger Sequencing
Advantages: Lowest error rate; Long read length (~750 bp); Can target a primer
Disadvantages: High cost per base; Long time to generate data; Need for cloning; Amount of data per run

78 Comparing Different Technologies: 454 Sequencing
Advantages: Low error rate; Medium read length (~ bp)
Disadvantages: Relatively high cost per base; Must run at large scale; Medium/high startup costs

79 Comparing Different Technologies: Ion Torrent Sequencing
Advantages: Low startup costs; Scalable (10 – 1000 Mb of data per run); Medium/low cost per base; Low error rate; Fast runs (<3 hours)
Disadvantages: New, developing technology; Cost not as low as Illumina; Read lengths only ~ bp so far

80 Comparing Different Technologies: Illumina Sequencing
Advantages: Low error rate; Lowest cost per base; Tons of data
Disadvantages: Must run at very large scale; Short read length (50-75 bp); Runs take multiple days; High startup costs; De novo assembly difficult

81 Comparing Different Technologies: PacBio Sequencing
Advantages: Can use single molecule as template; Potential for very long reads (several kb+)
Disadvantages: High error rate (~10-15%); Medium/high cost per base; High startup costs

