The Past, Present, and Future of DNA Sequencing


The Past, Present, and Future of DNA Sequencing
Craig A. Praul, Co-Director, Genomics Core Facility, Huck Institutes of the Life Sciences, Penn State University

A very short history of DNA sequencing

"I started from the conviction that, if different DNA species exhibited different biological activities, there should also exist chemically demonstrable differences between deoxyribonucleic acids." (Erwin Chargaff)

Milestones
1869: First isolation of DNA (Friedrich Miescher)
1909 - 1940: Composition of nucleic acids; tetranucleotide theory (Phoebus Levene)
1950: G=C and A=T; however, the G/C and A/T content of different organisms varies (Erwin Chargaff)
1968: G/C content measured by annealing (Mandel and Marmur)
1977: Maxam-Gilbert and Sanger sequencing
2005: Next-generation sequencing

Genomes Sequenced
Virus: 3222 (bacteriophage phi X 174, 5386 nt, 1977)
Bacteria: 2289 (Haemophilus influenzae, 1.8 x 10^6 nt, 1995)
Eukarya: 168 (S. cerevisiae, 1.2 x 10^7 nt, 1996; H. sapiens, 3 x 10^9 nt, 2001)
Archaea: 152 (Methanococcus jannaschii, 1.7 x 10^6 nt, 1996)

Next-Generation Sequencing Liu et al. Journal of Biomedicine and Biotechnology Volume 2012 (2012), Article ID 251364, 11 pages doi:10.1155/2012/251364

Changes in instrument capacity* ER Mardis. Nature 470, 198-203 (2011) doi:10.1038/nature09796

Sequencing Cost
Date     Cost per Mb   Cost per Genome
Sep-01   $5,292.39     $95,263,072
Sep-02   $3,413.80     $61,448,422
Oct-03   $2,230.98     $40,157,554
Oct-04   $1,028.85     $18,519,312
Oct-05   $766.73       $13,801,124
Oct-06   $581.92       $10,474,556
Oct-07   $397.09       $7,147,571
Oct-08   $3.81         $342,502
Oct-09   $0.78         $70,333
Oct-10   $0.32         $29,092
Oct-11   $0.09         $7,743
Oct-12   $0.07         $6,618
Jan-13   $0.06         $5,671
Source: NHGRI, http://www.genome.gov/sequencingcosts/
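The size of the drop is easier to judge as fold changes between consecutive dates. The following is a minimal sketch that uses only the cost-per-genome figures quoted on this slide; the names COST_PER_GENOME, fold_drops, and overall_fold_drop are hypothetical helpers introduced for illustration, not part of any NHGRI tool.

```python
# Cost per genome in USD, copied from the NHGRI figures quoted on the slide.
COST_PER_GENOME = [
    ("Sep-01", 95_263_072),
    ("Sep-02", 61_448_422),
    ("Oct-03", 40_157_554),
    ("Oct-04", 18_519_312),
    ("Oct-05", 13_801_124),
    ("Oct-06", 10_474_556),
    ("Oct-07", 7_147_571),
    ("Oct-08", 342_502),
    ("Oct-09", 70_333),
    ("Oct-10", 29_092),
    ("Oct-11", 7_743),
    ("Oct-12", 6_618),
    ("Jan-13", 5_671),
]


def fold_drops(costs):
    """Return (from_date, to_date, fold_drop) for each consecutive pair of dates."""
    return [
        (d0, d1, c0 / c1)
        for (d0, c0), (d1, c1) in zip(costs, costs[1:])
    ]


def overall_fold_drop(costs):
    """Ratio of the first listed cost to the last listed cost."""
    return costs[0][1] / costs[-1][1]


if __name__ == "__main__":
    for start, end, fold in fold_drops(COST_PER_GENOME):
        print(f"{start} -> {end}: {fold:.1f}-fold cheaper")
    print(f"Overall: {overall_fold_drop(COST_PER_GENOME):,.0f}-fold cheaper")
```

Run on these figures, the largest single step is Oct-07 to Oct-08, roughly a 20-fold drop, which falls shortly after the 2005 arrival of next-generation sequencing listed in the milestones.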

Central Dogma of Molecular Biology (James Watson version, 1965)
DNA -> RNA -> Protein
So once we have the genomic DNA sequence of a species, we have all of the information there is? Really?

No, not really.

Illumina HiSeq and MiSeq
Massively parallel
  HiSeq: 150 or 180 million reads per lane
  MiSeq: 15 million reads per run
Intermediate read length
  HiSeq: 100 nt or 150 nt
  MiSeq: 250 nt
High total output per run
  HiSeq: 90 Gb or 288 Gb
  MiSeq: 8 Gb

Sequencing Types
Single read
Paired-end read
Mate-pair read

Library Types
Many different library preps: DNA, mate-pair, mRNA, miRNA, ChIP
Fragmentation
  DNA: 300 - 500 nt
  RNA: 150 - 200 nt
Attachment of appropriate adapters
  Complex: flow cell binding, forward and reverse sequencing, barcode (BC)
  Custom: avoid if possible
Removal of dimers/small inserts
Amplification (or not)

Applications
De novo sequencing (genomes, transcriptomes)
Resequencing (genomes, exomes, custom sequence capture)
RNA-seq (mRNA, miRNA, degradome)
ChIP-seq
Methyl-seq
RIP-seq
Amplicon

De Novo Experimental Design
Estimate of genome size
Coverage (30x - 100x)
Sequencing type (paired-end or mate-pair)
Example: 100 Mb genome, 2 x 100 nt paired-end reads
  (100 Mb genome) x (30x coverage) = 3 Gb total sequence
  3 Gb / (200 nt for each pair of paired-end reads) = 15 million read pairs
Replicates
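The arithmetic in this example can be sketched as a small calculation: total bases needed = genome size x coverage, and read pairs needed = total bases / bases per pair. The function name and parameters below are hypothetical, chosen only to mirror the slide; real projects would also budget for duplicates and low-quality reads, which this sketch ignores.

```python
def de_novo_read_pairs(genome_size_nt, coverage, read_length_nt, reads_per_fragment=2):
    """Estimate sequencing needs for a de novo project.

    genome_size_nt     -- estimated genome size in nucleotides
    coverage           -- desired fold coverage (the slide suggests 30x to 100x)
    read_length_nt     -- length of each individual read
    reads_per_fragment -- 2 for paired-end, 1 for single reads

    Returns (total_bases, fragments), where fragments is the number of
    read pairs (or single reads), rounded up.
    """
    total_bases = genome_size_nt * coverage
    bases_per_fragment = read_length_nt * reads_per_fragment
    # Integer ceiling division, so a partial fragment counts as a whole one.
    fragments = -(-total_bases // bases_per_fragment)
    return total_bases, fragments


if __name__ == "__main__":
    # The slide's example: 100 Mb genome, 30x coverage, 2 x 100 nt paired-end reads.
    total, pairs = de_novo_read_pairs(100_000_000, 30, 100)
    print(f"Total sequence: {total / 1e9:.1f} Gb")
    print(f"Read pairs:     {pairs / 1e6:.0f} million")
```

With the slide's inputs this gives 3 Gb of total sequence and 15 million read pairs; raising coverage to 100x scales both figures linearly.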

Resequencing : Sequence Capture

RNA-seq Experimental Design
Estimate of transcriptome size (1-5% of genome?)
Coverage (30x?)
mRNA or rRNA-depleted RNA
Relative abundance of the transcripts you are interested in
Sequencing type (single read or paired-end)
Simple transcriptome vs. complex transcriptome
Splice variants
Example: 3 Gb genome, 100 nt single reads
  (3 Gb genome) x (5% transcriptome) = 150 Mb transcriptome
  (150 Mb transcriptome) x (30x coverage) = 4.5 Gb total sequence
  4.5 Gb / (100 nt for each read) = 45 million reads
Replicates: Yes! Biological, not technical
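The same estimate can be written out for RNA-seq, where the target is the transcriptome rather than the whole genome. Note that 5% of a 3 Gb genome is 150 Mb, which at 30x coverage gives the 4.5 Gb total and 45 million single reads. This is a rough sketch under the slide's own assumptions (uniform coverage, a guessed transcriptome fraction); the function name and parameters are hypothetical, and the estimate does not account for the uneven abundance of transcripts that the slide warns about.

```python
def rnaseq_reads(genome_size_nt, transcriptome_fraction, coverage, read_length_nt):
    """Estimate the number of single reads for an RNA-seq experiment.

    genome_size_nt         -- genome size in nucleotides
    transcriptome_fraction -- fraction of the genome that is transcribed
                              (the slide guesses 0.01 to 0.05)
    coverage               -- desired fold coverage of the transcriptome
    read_length_nt         -- length of each single read

    Returns (transcriptome_nt, total_bases, reads), with reads rounded up.
    """
    # round() guards against floating-point noise in the fraction.
    transcriptome_nt = round(genome_size_nt * transcriptome_fraction)
    total_bases = transcriptome_nt * coverage
    reads = -(-total_bases // read_length_nt)  # integer ceiling division
    return transcriptome_nt, total_bases, reads


if __name__ == "__main__":
    # The slide's example: 3 Gb genome, 5% transcriptome, 30x, 100 nt single reads.
    size, total, reads = rnaseq_reads(3_000_000_000, 0.05, 30, 100)
    print(f"Transcriptome:  {size / 1e6:.0f} Mb")
    print(f"Total sequence: {total / 1e9:.1f} Gb")
    print(f"Single reads:   {reads / 1e6:.0f} million per sample")
```

The figure is per sample, so the number of biological replicates multiplies the total sequencing required.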

ChIP-seq Source : http://www.nature.com/nmeth/journal/v4/n8/images/nmeth0807-613-F1.gif

RIP-seq Source : http://openi.nlm.nih.gov/imgs/rescaled512/3269675_ijms-13-00097f6.png

Methyl-seq
20 different types of base modifications in DNA are known, and there are perhaps 200 modifications of RNA.

Experimental Space: Next-Gen Platform
PacBio: 0.075 x 10^6 reads/sample, 1000 - 3000 nt
  Whole transcript
Roche 454 FLX+: 0.5 - 1 x 10^6 reads/sample, 800 - 1000 nt
  Small - medium genome de novo sequencing
  Long amplicon
  Transcriptome
PGM: 1 - 2 x 10^6 reads/sample, 400 nt
  Small genome de novo
  Medium amplicon
MiSeq: 1 - 2 x 10^6 reads/sample, 50 - 250 nt
  Small genome de novo
  Small amplicon
HiSeq: 10 - 100 x 10^6 reads/sample, 50 - 150 nt
  Counting applications: RNA-seq, ChIP-seq, RIP-seq, Methyl-seq
  Large genome de novo and resequencing

Experimental Space: The Relevancy of “Classic” Techniques
Differential gene expression
  Northern blotting (1977): 1 probe, 20 samples
  Dot blots (1987): 100s of probes, 1 sample
  RT-PCR (1992): 100s of probes, 10 - 100 samples
  Microarrays (1995): 100,000s of probes, 1 sample
  Next-gen sequencing (2005): 10 - 100 x 10^6 reads, 1 sample

The Future
More reads
Longer reads
Faster sequencing
Cheaper sequencing
New applications