The Past, Present, and Future of DNA Sequencing


The Past, Present, and Future of DNA Sequencing
Craig A. Praul, Co-Director, Genomics Core Facility
Huck Institutes of the Life Sciences, Penn State University

A very short history of DNA sequencing

"I started from the conviction that, if different DNA species exhibited different biological activities, there should also exist chemically demonstrable differences between deoxyribonucleic acids." (Erwin Chargaff)

Milestones
First isolation of DNA: 1869 (Friedrich Miescher)
Composition of nucleic acids; tetranucleotide theory: 1909-1940 (Phoebus Levene)
G=C and A=T; however, the G/C and A/T content of different organisms varies: 1950 (Erwin Chargaff)
G/C content measured by annealing: 1968 (Mandel and Marmur)
Maxam-Gilbert and Sanger sequencing: 1977
Next-generation sequencing: 2005

Genomes Sequenced
Virus: 3222 (bacteriophage phi X 174, 5386 nt, 1977)
Bacteria: 2289 (Haemophilus influenzae, 1.8 x 10^6 nt, 1995)
Eukarya: 168 (S. cerevisiae, 1.2 x 10^7 nt, 1996; H. sapiens, 3 x 10^9 nt, 2001)
Archaea: 152 (Methanococcus jannaschii, 1.7 x 10^6 nt, 1996)

Next-Generation Sequencing Liu et al. Journal of Biomedicine and Biotechnology Volume 2012 (2012), Article ID 251364, 11 pages doi:10.1155/2012/251364

Changes in instrument capacity* ER Mardis. Nature 470, 198-203 (2011) doi:10.1038/nature09796

Sequencing Cost

Date      Cost per Mb    Cost per Genome
Sep-01    $5,292.39      $95,263,072
Sep-02    $3,413.80      $61,448,422
Oct-03    $2,230.98      $40,157,554
Oct-04    $1,028.85      $18,519,312
Oct-05    $766.73        $13,801,124
Oct-06    $581.92        $10,474,556
Oct-07    $397.09        $7,147,571
Oct-08    $3.81          $342,502
Oct-09    $0.78          $70,333
Oct-10    $0.32          $29,092
Oct-11    $0.09          $7,743
Oct-12    $0.07          $6,618
Jan-13    $0.06          $5,671

Source: NHGRI, http://www.genome.gov/sequencingcosts/

Central Dogma of Molecular Biology (James Watson version, 1965)
DNA -> RNA -> Protein
So once we have the genomic DNA sequence of a species, we have all of the information there is? Really?

No, not really.

Illumina HiSeq and MiSeq
Massively parallel
  HiSeq: 150 or 180 million reads per lane
  MiSeq: 15 million reads per run
Intermediate read length
  HiSeq: 100 nt or 150 nt
  MiSeq: 250 nt
High total output per run (in gigabases)
  HiSeq: 90 Gb or 288 Gb
  MiSeq: 8 Gb

Sequencing Types
Single read
Paired-end read
Mate-pair read

Library Types
Many different library preps: DNA, mate-pair, mRNA, miRNA, ChIP
Fragmentation
  DNA: 300-500 nt
  RNA: 150-200 nt
Attachment of appropriate adapters
  Complex: flow cell binding, forward and reverse (F & R) sequencing, barcode (BC)
  Custom: avoid if possible
Removal of dimers/small inserts
Amplification (or not)

Applications
De novo sequencing (genomes, transcriptomes)
Resequencing (genomes, exomes, custom sequence capture)
RNA-seq (mRNA, miRNA, degradome)
ChIP-seq
Methyl-seq
RIP-seq
Amplicon

De Novo Experimental Design
Estimate of genome size
Coverage (30x to 100x)
Sequencing type (paired-end or mate-pair)
Example: 100 Mb genome, 100 x 100 nt paired-end reads (two 100 nt reads per fragment)
  (100 Mb) x (30x coverage) = 3 Gb
  3 Gb / (200 nt for each pair of paired-end reads) = 15 million read pairs
Replicates
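The de novo example can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of the slide's calculation; the function name `read_pairs_needed` and its parameters are made up for this example and do not come from any sequencing toolkit.

```python
def read_pairs_needed(genome_size_nt, coverage, read_length_nt, reads_per_fragment=2):
    """Estimate how many read pairs give the requested coverage.

    Hypothetical helper: it assumes every base of every read is usable,
    which real runs will not quite achieve.
    """
    total_bases = genome_size_nt * coverage              # bases to sequence
    bases_per_fragment = read_length_nt * reads_per_fragment
    # Integer ceiling division, so a partial fragment still counts as one.
    return -(-total_bases // bases_per_fragment)


# Slide example: 100 Mb genome, 30x coverage, paired 100 nt reads.
pairs = read_pairs_needed(100_000_000, 30, 100)
print(f"{pairs:,} read pairs")
```

Raising `coverage` toward the 100x end of the range scales the answer linearly, so it is worth settling the coverage target before booking lanes.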

Resequencing : Sequence Capture

RNA-seq Experimental Design
Estimate of transcriptome size (1-5% of genome?)
Coverage (30x?)
mRNA or rRNA-depleted RNA
Relative abundance of the transcripts you are interested in
Sequencing type (single read or paired-end)
Simple transcriptome vs. complex transcriptome
Splice variants
Example: 3 Gb genome, 100 nt single reads
  (3 Gb genome) x (5% transcriptome) = 150 Mb transcriptome
  (150 Mb transcriptome) x (30x coverage) = 4.5 Gb total sequence
  4.5 Gb / (100 nt for each read) = 45 million reads
Replicates: Yes! Biological, not technical
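The RNA-seq estimate follows the same pattern with one extra step: first scale the genome down to the expressed fraction. For a 3 Gb (3 x 10^9 nt) genome with 5% transcribed, that fraction is 150 Mb. The sketch below is a rough planning aid under that assumption; `transcriptome_size` and `reads_needed` are illustrative names, and the calculation ignores the uneven abundance of transcripts that the slide warns about.

```python
def transcriptome_size(genome_size_nt, transcribed_percent):
    """Size of the expressed portion of the genome, in nt (integer percent)."""
    return genome_size_nt * transcribed_percent // 100


def reads_needed(target_size_nt, coverage, read_length_nt):
    """Single reads needed to cover the target at the given average depth."""
    total_bases = target_size_nt * coverage
    # Integer ceiling division: round up to a whole read.
    return -(-total_bases // read_length_nt)


# Slide example: 3 Gb genome, 5% transcribed, 30x coverage, 100 nt single reads.
size = transcriptome_size(3_000_000_000, 5)
reads = reads_needed(size, 30, 100)
print(f"transcriptome: {size:,} nt; reads per sample: {reads:,}")
```

Because the result is per sample, the total for an experiment is this number multiplied by the number of biological replicates and conditions.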

ChIP-Seq http://www.nature.com/nmeth/journal/v4/n8/images/nmeth0807-613-F1.gif

RIP-seq Source : http://openi.nlm.nih.gov/imgs/rescaled512/3269675_ijms-13-00097f6.png

Methyl-seq
About 20 different types of base modification are known in DNA, and there are perhaps 200 modifications of RNA.

Experimental Space: Next-Gen Platform
PacBio: 0.075 x 10^6 reads/sample, 1000-3000 nt
  Whole transcript
Roche 454 FLX+: 0.5-1 x 10^6 reads/sample, 800-1000 nt
  Small to medium genome de novo sequencing
  Long amplicon
  Transcriptome
PGM: 1-2 x 10^6 reads/sample, 400 nt
  Small genome de novo
  Medium amplicon
MiSeq: 1-2 x 10^6 reads/sample, 50-250 nt
  Small genome de novo
  Small amplicon
HiSeq: 10-100 x 10^6 reads/sample, 50-150 nt
  Counting applications: RNA-seq, ChIP-seq, RIP-seq, Methyl-seq
  Large genome de novo and resequencing

Experimental Space: The Relevancy of "Classic" Techniques
Differential gene expression
  Northern blotting (1977): 1 probe, 20 samples
  Dot blots (1987): 100s of probes, 1 sample
  RT-PCR (1992): 100s of probes, 10-100 samples
  Microarrays (1995): 100,000s of probes, 1 sample
  Next-gen sequencing (2005): 10-100 x 10^6 reads, 1 sample

The Future
More reads
Longer reads
Faster sequencing
Cheaper sequencing
New applications