Presentation on theme: "Next Generation Sequencing – An Overview"— Presentation transcript:
1 Next Generation Sequencing – An Overview Olga Vinnere Pettersson, PhDNational Genomics Infrastructure hosted by ScilifeLab,Uppsala Node (UGC)Version b
2 Today we will talk about: National Genomics Infrastructure – SwedenHistory and current state of genomic researchSequencing technologies:TypesPrinciplesTheir “+” and “-”Couple of pieces of advise
3 DNA sequencing revolution Massively parallel sequencing (454, Illumina, Life Tech)Human genomeJames Watsons genomeTight schedule, Lunch (wrap) at level 2, Dinner at level 2, speakers hand in USB, Thanks to Anders AnderssonCenter for MetagenomicSequence Analysis (KAW)Swedish National Infrastructurefor Large-Scale Sequencing(SNISS)Science for LifeLaboratory (SciLifeLab)
6 DEFINITION“In genetics and biochemistry, sequencing means to determine the primary structure (or primary sequence) of an unbranched biopolymer.”(http://en.wikipedia.org/wiki/Sequencing)
7 Once upon a time… Fredrik Sanger and Alan Coulson Chain Termination Sequencing (1977)Nobel prize 1980Principle:SYNTHESIS of DNA is randomly TERMINATED at different pointsSeparation of fragments that are 1 nucleotide different in size
8 Sanger’s sequencing Max fragment length – 750 bp P32 labelled ddNTPs ! Lack of OH-group at 3’ position of deoxyribose!P32 labelled ddNTPsFluorescent dye terminatorsMax fragment length – 750 bp
10 Sequencing genomes using Sanger’s method Extract & purify genomic DNAFragmentationMake a clone librarySequence clonesAlign sequencies ( -> contigs -> scaffolds)Close the gapsCost/Mb=1000 $, and it takes TIME
11 At the very beginning of genome sequencing era… First genome: virus X bp (1977)First organism: Haemophilus influenzae Mb (1995)First eukaryote: Saccharomyces cerevisiae Mb (1996)First multicellular organism: Cenorhabditis elegans MB ( )First plant: Arabidopsis thaliana Mb (2000)
12 Just an interesting comparison: Human genome project, 2007Genome of Craig Wenter costs 70 mln $Sanger’s sequencingGenome of James Watson costs 2 mln $454 pyrosequencingUltimate goal: 1000 $ / individualAlmost there!
13 Paradigm change From single genes to complete genomes From single transcripts to whole transcriptomesFrom single organisms to complex metagenomic poolsFrom model organisms to the species you are studying
17 NGS technologies Company Platform Amplification Sequencing method Roche454**emPCRPyrosequencingIlluminaHiSeqMiSeqBridge PCRSynthesisLifeTechSOLiD**emPCR/ WildfireLigationIon Torrent Ion ProtonSynthesis (pH)Pacific BioscienceRSIINoneComplete genomicsNanoballsOxford Nanopore*GridIONFlowRIP technologies: Helicos, Polonator, etc.In development: Tunneling currents, nanopores, etc.
18 Differences between platforms Technology: chemistry + signal detectionRun times vary from hours to daysProduction range from Mb to GbRead length from <100 bp to > 20 KbpAccuracy per base from 0.1% to 15%Cost per base varies
19 Making a NGS library Amplification Sharing & size selection DNA QC – paramount importanceSharing & size selectionAmplificationLigation of sequencing adaptors, technology specific
20 Roche Instrument Yield and run time Read Length Error rate Error type 454 FLX+0.9 GB, 20 hrs7001%Indels454 FLX Titanium0.5 GB, 10 hrs450454 FLX Jr0.050 GB, 10 hrs400Main applications:Microbial genomics and metagenomicsTargeted resequencing
22 Illumina Instrument Yield and run time Read Length Error rate Error typeUpgrade HiSeq2500120 GB in 27h or standard run100x1000.1%SubstMiSeq540 Mb – 15 Gb(4 – 48 hours)Upp to 350x350Main applicationsWhole genome, exome and targeted reseqTranscriptome analysesMethylome and ChiPSeqRapid targeted resequencing (MiSeq)
24 Life Technologies SOLiD InstrumentYield and run timeRead LengthError rateError typeSOLiD 5500 wildfire600 GB, 8 days75x35 PE60x60 MP0.01%A-T BiasFeaturesHigh accuracy due to two-base encodingTrue paired-end chemistry - ligation from either endMate-pair librariesMain applications (currently)ChiPSeq
26 Life Technologies - Ion Torrent & Ion Proton ChipYield - run timeReadLengthPGM 3140.1 GB, 3 hrs200 – 400PGM 3160.5GB, 3 hrsPGM 3181 GB, 3 hrsP-I10 GB200Main applicationsMicrobial and metagenomic sequencingTargeted resequencingClinical sequencing
27 Ion Torrent - H+ ion-sensitive field effect transistors
28 Single-Molecule, Real-Time DNA sequencing Pacific BioscienceInstrumentYield and run timeRead LengthError rateError typeRS II500 MB/180 min SMRTCell250 bp – bp( bp)15%(on a single passage!)Insertions, randomSingle-Molecule, Real-Time DNA sequencing
37 Check list: Have others done similar work? Is your methodology sound? Sample size? Repetitions?Is there people to analyze the data?Is there computer capacity to analyze the data?Will you be able to publish NGS data by yourself?PLEASE consult the sequencing facility PRIOR to onset of your project!
38 Common pitfalls and a piece of advise: If you give us low quality DNA/RNA - expect low quality dataIf you give us too little DNA/RNA – expect biased dataDo not try to do everything by yourselfMake sure there is a dedicated bioinformatician availableNever underestimate time and money needed for data analysisGoogle often!Use online forums, e.g. SeqAnswers.com
39 Summary Progress is FAST- keep yourselves updated! Chose technology based on:What is most feasibleWhat is most accessibleWhat is most cost-effectiveSciLifeLab Genomics & Bioinformatics are here for you!
42 National Genomics Infrastructure SciLifeLab, UppsalaSciLifeLab, StockholmMid 2010Uppmax, Uppsala
43 3. Access to genomics platform Projects at CMSPortal project flow3. Access to genomics platformNGI Project coordinators meet every second day via SkypeUlrika LiljedahlSNP&SEQUppsala nodeOlga Vinnere PetterssonUGCUppsala NodeMattias OrmestadStockholm NodeProject distribution is based on:Wish of PIType of sequencing technologyType of applicationQueue at technology platformsProject is then assigned to a certain node and a coordinator contacts the PI
44 NGI Equipment One of 5 best-equipped NGS sites in Europe Illumina HiSeq 2000/Illumina MiSeqLife Technologies SOLiD 5500wildfire 1Life Technologies Ion Torrent 2Life Technologies Ion Proton 6Life Technologies Sanger ABIPacific Biosciences RSIIArgus Whole Genome Mapping System 1One of 5 best-equipped NGS sites in Europe
45 3. Access to genomics platform Projects at CMSProject meeting3. Access to genomics platformWhat we can help you with:Design your experiment based on the scientific question.Chose the best suited application for your project.Find the most optimal sequencing setup.Answer all questions about our technologies and applications, as well as bioinformatics.Get UPPNEX account if you do not have one.In special cases, we can give extra-support with bioinformatics analysis – development of novel methods and applications
46 Downstream Data Analysis Bioinformatics competence IS NOT present in research groupBioinformatics competence IS present in research groupCooperation with platform personnel: R&DBILS:Bioinformatics Infrastructurefor LifeSciencesWABI:WallenbergAdvancedBioinformaticsInitiativeCo-authorshipShort-term commitmentLong-term commitment