Presentation on theme: "Next Generation Sequencing – An Overview"— Presentation transcript:
1Next Generation Sequencing – An Overview Olga Vinnere Pettersson, PhDNational Genomics Infrastructure hosted by ScilifeLab,Uppsala Node (UGC)Version b
2Today we will talk about: National Genomics Infrastructure – SwedenHistory and current state of genomic researchSequencing technologies:TypesPrinciplesTheir “+” and “-”Couple of pieces of advise
3DNA sequencing revolution Massively parallel sequencing (454, Illumina, Life Tech)Human genomeJames Watsons genomeTight schedule, Lunch (wrap) at level 2, Dinner at level 2, speakers hand in USB, Thanks to Anders AnderssonCenter for MetagenomicSequence Analysis (KAW)Swedish National Infrastructurefor Large-Scale Sequencing(SNISS)Science for LifeLaboratory (SciLifeLab)
6DEFINITION“In genetics and biochemistry, sequencing means to determine the primary structure (or primary sequence) of an unbranched biopolymer.”(http://en.wikipedia.org/wiki/Sequencing)
7Once upon a time… Fredrik Sanger and Alan Coulson Chain Termination Sequencing (1977)Nobel prize 1980Principle:SYNTHESIS of DNA is randomly TERMINATED at different pointsSeparation of fragments that are 1 nucleotide different in size
8Sanger’s sequencing Max fragment length – 750 bp P32 labelled ddNTPs ! Lack of OH-group at 3’ position of deoxyribose!P32 labelled ddNTPsFluorescent dye terminatorsMax fragment length – 750 bp
10Sequencing genomes using Sanger’s method Extract & purify genomic DNAFragmentationMake a clone librarySequence clonesAlign sequencies ( -> contigs -> scaffolds)Close the gapsCost/Mb=1000 $, and it takes TIME
11At the very beginning of genome sequencing era… First genome: virus X bp (1977)First organism: Haemophilus influenzae Mb (1995)First eukaryote: Saccharomyces cerevisiae Mb (1996)First multicellular organism: Cenorhabditis elegans MB ( )First plant: Arabidopsis thaliana Mb (2000)
12Just an interesting comparison: Human genome project, 2007Genome of Craig Wenter costs 70 mln $Sanger’s sequencingGenome of James Watson costs 2 mln $454 pyrosequencingUltimate goal: 1000 $ / individualAlmost there!
13Paradigm change From single genes to complete genomes From single transcripts to whole transcriptomesFrom single organisms to complex metagenomic poolsFrom model organisms to the species you are studying
17NGS technologies Company Platform Amplification Sequencing method Roche454**emPCRPyrosequencingIlluminaHiSeqMiSeqBridge PCRSynthesisLifeTechSOLiD**emPCR/ WildfireLigationIon Torrent Ion ProtonSynthesis (pH)Pacific BioscienceRSIINoneComplete genomicsNanoballsOxford Nanopore*GridIONFlowRIP technologies: Helicos, Polonator, etc.In development: Tunneling currents, nanopores, etc.
18Differences between platforms Technology: chemistry + signal detectionRun times vary from hours to daysProduction range from Mb to GbRead length from <100 bp to > 20 KbpAccuracy per base from 0.1% to 15%Cost per base varies
19Making a NGS library Amplification Sharing & size selection DNA QC – paramount importanceSharing & size selectionAmplificationLigation of sequencing adaptors, technology specific
20Roche Instrument Yield and run time Read Length Error rate Error type 454 FLX+0.9 GB, 20 hrs7001%Indels454 FLX Titanium0.5 GB, 10 hrs450454 FLX Jr0.050 GB, 10 hrs400Main applications:Microbial genomics and metagenomicsTargeted resequencing
22Illumina Instrument Yield and run time Read Length Error rate Error typeUpgrade HiSeq2500120 GB in 27h or standard run100x1000.1%SubstMiSeq540 Mb – 15 Gb(4 – 48 hours)Upp to 350x350Main applicationsWhole genome, exome and targeted reseqTranscriptome analysesMethylome and ChiPSeqRapid targeted resequencing (MiSeq)
26Life Technologies - Ion Torrent & Ion Proton ChipYield - run timeReadLengthPGM 3140.1 GB, 3 hrs200 – 400PGM 3160.5GB, 3 hrsPGM 3181 GB, 3 hrsP-I10 GB200Main applicationsMicrobial and metagenomic sequencingTargeted resequencingClinical sequencing
27Ion Torrent - H+ ion-sensitive field effect transistors
28Single-Molecule, Real-Time DNA sequencing Pacific BioscienceInstrumentYield and run timeRead LengthError rateError typeRS II500 MB/180 min SMRTCell250 bp – bp( bp)15%(on a single passage!)Insertions, randomSingle-Molecule, Real-Time DNA sequencing
37Check list: Have others done similar work? Is your methodology sound? Sample size? Repetitions?Is there people to analyze the data?Is there computer capacity to analyze the data?Will you be able to publish NGS data by yourself?PLEASE consult the sequencing facility PRIOR to onset of your project!
38Common pitfalls and a piece of advise: If you give us low quality DNA/RNA - expect low quality dataIf you give us too little DNA/RNA – expect biased dataDo not try to do everything by yourselfMake sure there is a dedicated bioinformatician availableNever underestimate time and money needed for data analysisGoogle often!Use online forums, e.g. SeqAnswers.com
39Summary Progress is FAST- keep yourselves updated! Chose technology based on:What is most feasibleWhat is most accessibleWhat is most cost-effectiveSciLifeLab Genomics & Bioinformatics are here for you!
42National Genomics Infrastructure SciLifeLab, UppsalaSciLifeLab, StockholmMid 2010Uppmax, Uppsala
433. Access to genomics platform Projects at CMSPortal project flow3. Access to genomics platformNGI Project coordinators meet every second day via SkypeUlrika LiljedahlSNP&SEQUppsala nodeOlga Vinnere PetterssonUGCUppsala NodeMattias OrmestadStockholm NodeProject distribution is based on:Wish of PIType of sequencing technologyType of applicationQueue at technology platformsProject is then assigned to a certain node and a coordinator contacts the PI
44NGI Equipment One of 5 best-equipped NGS sites in Europe Illumina HiSeq 2000/Illumina MiSeqLife Technologies SOLiD 5500wildfire 1Life Technologies Ion Torrent 2Life Technologies Ion Proton 6Life Technologies Sanger ABIPacific Biosciences RSIIArgus Whole Genome Mapping System 1One of 5 best-equipped NGS sites in Europe
453. Access to genomics platform Projects at CMSProject meeting3. Access to genomics platformWhat we can help you with:Design your experiment based on the scientific question.Chose the best suited application for your project.Find the most optimal sequencing setup.Answer all questions about our technologies and applications, as well as bioinformatics.Get UPPNEX account if you do not have one.In special cases, we can give extra-support with bioinformatics analysis – development of novel methods and applications
46Downstream Data Analysis Bioinformatics competence IS NOT present in research groupBioinformatics competence IS present in research groupCooperation with platform personnel: R&DBILS:Bioinformatics Infrastructurefor LifeSciencesWABI:WallenbergAdvancedBioinformaticsInitiativeCo-authorshipShort-term commitmentLong-term commitment