Genome Biology for Programmers Lecture Series: Illumina Sequencing


1 Genome Biology for Programmers Lecture Series: Illumina Sequencing
Chris Daum JGI Illumina Group Lead April 1, 2011

2 Outline
Workflow Overview
Process Science:
  Sample Prep & qPCR quantification
  Cluster Generation
  Sequencing
Sequencer instruments: GA & HiSeq
Illumina Developments
Illumina quality & continuous improvement

3 Illumina Workflow
Sample Preparation -> Sample Quantification -> Clustering -> Sequencing -> Analysis

4 Sample Preparation
Library Preparation – Main Goals:
Prepares sample nucleic acids for sequencing
Many library types and creation procedures exist
However, all preparation results in the same general template structure: double-stranded DNA flanked by two different adapters
Variables include:
  Sequencing application & starting material (e.g. gDNA, mRNA, Mate Pair, Active Chromatin, ChIP-Seq)
  Insert size
  Adapter type
  Index for multiplexing

5 Example Sample Prep Workflow: TruSeq Paired-end Library
DNA RNA

6 Library Quantification - qPCR
Real-time qPCR allows accurate quantification of DNA templates:
qPCR is based on the detection of a fluorescent reporter molecule whose signal increases as PCR product accumulates with each cycle of amplification.
By using primers specific to the Illumina universal adapters in a qPCR reaction containing library template, only cluster-forming templates are amplified and quantified.

7 Library Quantification - qPCR
Threshold of fluorescence that the amplicon must reach to produce a Cq
Plot a standard curve using controls and determine the concentration of the library
Phases of qPCR: Geometric phase – amplicons doubling every cycle; greatest precision & accuracy for quantitation
Cq – Cycle of Quantification
Standard curve axes: Cycle Threshold vs. Log initial concentration
Take home: qPCR mimics what is happening on the surface of the flowcell during cluster generation and allows for determining optimal loading concentrations.
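The standard-curve step above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's actual procedure: the dilution series, the Cq values, and the function names are all made up for the example, and it assumes the usual linear relationship between Cq and log10 of the starting concentration within the geometric phase.

```python
import math

def fit_standard_curve(concentrations, cq_values):
    """Least-squares fit of Cq = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def concentration_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate a library's concentration."""
    return 10 ** ((cq - intercept) / slope)

def amplification_efficiency(slope):
    """1.0 means the amplicon doubles every cycle (slope near -3.32)."""
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical control dilution series (pM) and their measured Cq values.
standards_pm = [20.0, 2.0, 0.2, 0.02]
standards_cq = [10.0, 13.32, 16.64, 19.96]

slope, intercept = fit_standard_curve(standards_pm, standards_cq)
efficiency = amplification_efficiency(slope)

# Hypothetical library that crossed the threshold at cycle 15.
library_pm = concentration_from_cq(15.0, slope, intercept)
print(f"slope={slope:.2f} efficiency={efficiency:.2f} library={library_pm:.2f} pM")
```

A slope close to -3.32 corresponds to amplicons doubling every cycle, which is why quantification is most reliable in the geometric phase.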

8 Cluster Generation
Process occurs on the cBot instrument:
Aspirates DNA samples into flow cell
Automates the formation of amplified clonal clusters from single DNA molecules
1000x amplification generates clusters
Hybridizes sequencing primer(s)

9 Illumina cBot Cluster Generation 2.0
Automated system significantly reduces workload for generation of flowcells
Compact design saves lab space
Reagent cartridge reduces prep time

10 Flowcell

11 Cluster Generation Prep
Prepare reagents, then denature & dilute the library.
The goal is to reach the cluster density that maximizes yield (bp); this is achieved via optimized loading concentrations as determined by qPCR.
Considerations:
  Too low density: fewer clusters, less sequence generated
  Too high density: overlapping clusters, removed by analysis filters, poor quality

12 Cluster Generation Chemistry
Hybridization, Amplification, Linearization, Blocking, Primer hybridization

13 Cluster Generation Chemistry
Hybridize Sample fragments & extend:

14 Cluster Generation Chemistry
Bridge Amplification:

15 Cluster Generation Chemistry
Linearization, Blocking & Sequencing Primer Hybridization:

16 Sequencing
Main Goals:
Translate the chemical information of the nucleotides into fluorescence information which can be captured optically.
The optical information is then transformed into text, which can be searched, aligned, or otherwise mined for biologically relevant data.
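For programmers, the "optical information into text" step can be pictured with a toy base caller. The sketch below is only an illustration under strong assumptions: it treats each cycle as four already-cleaned intensity values (one per base) and picks the brightest. The function name and the intensity values are hypothetical, and production base callers additionally correct for effects such as channel crosstalk and phasing and assign quality scores.

```python
def call_bases(intensities, channels="ACGT"):
    """Turn per-cycle channel intensities for one cluster into a read string.

    intensities: list of cycles, each a sequence of 4 numbers ordered
    like `channels` (an assumed layout for this example).
    """
    read = []
    for cycle in intensities:
        # The brightest channel in this cycle is taken as the incorporated base.
        brightest = max(range(len(channels)), key=lambda i: cycle[i])
        read.append(channels[brightest])
    return "".join(read)

# Hypothetical intensities for one cluster over three sequencing cycles.
cluster_signal = [
    [0.9, 0.1, 0.0, 0.1],  # cycle 1: channel A is brightest
    [0.1, 0.2, 0.8, 0.1],  # cycle 2: channel G is brightest
    [0.0, 0.1, 0.1, 0.7],  # cycle 3: channel T is brightest
]
read = call_bases(cluster_signal)
print(read)
```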

17 Sequencing Workflow
HiSeq Run Type: Approx. Run Days
1x50 Flowcells: 2
[run type missing in transcript]: 9
2x150 Flowcells: 13

18 Sequencing by Synthesis
Clustered Flowcell is loaded on Illumina sequencer:

19 Sequencing Chemistry: First Cycle Base Incorporation
To initiate the first sequencing cycle, add all 4 fluorescently labeled reversible terminators and DNA polymerase enzyme to the flowcell. The complementary nucleotide will be added to the first position of each cluster. A laser is then used to excite the attached fluorophore.

20 Sequencing Chemistry: First Cycle Imaging

21 Sequencing Chemistry: Cycle 2 and so on…

22 Sequencing Read 2
Resynthesis of second strand for Read 2 occurs on sequencer without removing flowcell:

23 Index for Multiplex Sequencing
Sample multiplexing involves 3 reads:
  A: Sample Read 1 is sequenced
  B: Read 1 product is removed and the Index Read is sequenced
  C: Template strand is used to generate the complementary strand, and sample Read 2 is sequenced
Analysis software identifies the index sequence from each cluster so that sample reads 1 & 2 can be assigned to a single sample.
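The assignment step at the end of that slide (often called demultiplexing) might look roughly like the following. This is a sketch, not Illumina's analysis software: the sample sheet, the index sequences, the reads, and the one-mismatch tolerance are all assumptions chosen for illustration.

```python
from collections import defaultdict

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def demultiplex(clusters, sample_indexes, max_mismatches=1):
    """Group (read1, read2) pairs by the sample whose index matches.

    clusters: iterable of (read1, index_read, read2) tuples, one per cluster.
    sample_indexes: dict mapping index sequence -> sample name.
    Clusters matching no sample, or more than one, go to "undetermined".
    """
    assigned = defaultdict(list)
    for read1, index_read, read2 in clusters:
        matches = [
            name
            for seq, name in sample_indexes.items()
            if len(seq) == len(index_read)
            and hamming(seq, index_read) <= max_mismatches
        ]
        sample = matches[0] if len(matches) == 1 else "undetermined"
        assigned[sample].append((read1, read2))
    return dict(assigned)

# Hypothetical sample sheet and clusters.
sample_indexes = {"ATCACG": "sampleA", "CGATGT": "sampleB"}
clusters = [
    ("ACGT", "ATCACG", "TTGA"),  # exact match to sampleA
    ("GGCA", "CGATGA", "CCAT"),  # one mismatch from sampleB's index
    ("TTTT", "GGGGGG", "AAAA"),  # matches nothing
]
by_sample = demultiplex(clusters, sample_indexes)
print(by_sample)
```

Allowing a small number of mismatches tolerates sequencing errors in the index read, at the cost of requiring that the chosen indexes differ from each other by more than that tolerance.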

24 Illumina HiSeq2000 Sequencer
Nifty Lights

25 HiSeq2000 Reagents

26 1 HiSeq = 2 GAs

27 HiSeq2000 Fluidics
Fluidics were the Achilles heel of the GA, and there are now twice as many (2X) in the HiSeq

28 HiSeq2000 Fluidics

29 FY11 Service Metrics: Pareto

30 HiSeq: Temperature control
3 mechanisms:
  Heat extraction via liquid coolant
  Flow cell temperature control via Peltier
  Reagent temperature maintained via cooled compartment
Reagent Chiller:
  All reagents cooled at 4 °C
  Condensation pump runs every 4 min for 30 sec
Flow cell sits on Peltier blocks and is water cooled (heat extraction from underneath)

31 HiSeq Flowcell Loading

32 HiSeq Imaging

33 HiSeq Optics

34 HiSeq Lasers

35 HiSeq Software Interface

36 HiSeq Software Interface

37 HiSeq – Real Time Metrics

38 HiSeq vs GA

39 Cost & Throughput Comparison
Notes:
Throughput metrics are averages from runs performed in FY11 to date for each of the run types.
Italicized HiSeq Bases & Reads throughput metrics are estimates based on the 2x100 run type, since we have limited data on other run types.
Only vendor reagent costs are shown here; library creation and overhead costs are not included, but are roughly equal and mostly independent of run type.
Cost per million reads goes up with the longer run types, but the read length increases as well, which makes each read more valuable for some assembly applications.
The HiSeq 2x150 run type is not yet supported, and the current HiSeq chemistry has worse quality than the GA beyond [number missing in transcript] bases.
The HiSeq platform is still new and we are experiencing a higher number of hardware failures than on the GA; Illumina does replace reagents for failed runs, and we rerun failed flowcells immediately whenever possible.
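The trade-off between cost per read and read length can be made concrete with a little arithmetic. The numbers below are invented purely for illustration (they are not the FY11 figures from the slide, which did not survive in this transcript), and the function and parameter names are hypothetical. The only point is that two runs can rank differently depending on whether cost is divided by reads or by bases.

```python
def cost_metrics(reagent_cost, clusters_millions, read_length, paired=True):
    """Reagent cost per million reads and per gigabase for one run.

    clusters_millions: millions of clusters (each yields one read,
    or one read pair for a paired-end run).
    """
    bases_per_cluster = read_length * (2 if paired else 1)
    gigabases = clusters_millions * 1e6 * bases_per_cluster / 1e9
    return {
        "cost_per_million_reads": reagent_cost / clusters_millions,
        "cost_per_gigabase": reagent_cost / gigabases,
    }

# Made-up example runs with the same number of clusters.
short_run = cost_metrics(1000.0, 100.0, 50, paired=False)  # 1x50
long_run = cost_metrics(3000.0, 100.0, 100, paired=True)   # 2x100

print("1x50 :", short_run)
print("2x100:", long_run)
```

With these assumed inputs the longer run costs three times as much per million reads but less per gigabase, which is one way to read the slide's remark that longer reads are more valuable for some applications.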

40 HiSeq Development
Coming in early Summer:

41 HiSeq Development

42 HiSeq Development

43 Introducing MiSeq

44 MiSeq: all-in-one

45 MiSeq: Fast, low throughput

46 Providing Quality Sequence
Incident Reporting & Resolution (JIRA)
Troubleshooting Procedures
Throughput Goals & Metrics
Continuous Improvement - Lean Six Sigma
Failure Tracking & SPC Charts; RQC
Instrument Status & real-time run monitoring
Instrument Utilization & Efficiency

47 LLNL – Six Sigma Training
Tools and methodologies to:
  Improve work quality
  Improve process efficiencies & eliminate waste
  Improve employee and customer satisfaction
Lean Six Sigma is about:
  Eliminating waste and improving process flow
  Focusing on reducing variation and improving process yield by following a problem-solving approach using statistical tools

48 What is Six Sigma?
A Six Sigma process is literally one that’s statistically 99.99966% successful. This is not always cost effective to achieve, so as a methodology it’s about gaining control of a process and implementing improvements.
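The success rate associated with a given sigma level can be computed from the normal distribution. The sketch below assumes the convention commonly used in Six Sigma material, a one-sided limit with a 1.5 sigma long-term shift of the process mean; that convention and the function names are assumptions of this example, not something stated on the slide.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def process_yield(sigma_level, shift=1.5):
    """Fraction of defect-free outcomes at a given sigma level.

    shift: assumed long-term drift of the process mean, in sigmas.
    """
    return normal_cdf(sigma_level - shift)

def defects_per_million(sigma_level, shift=1.5):
    """Expected defects per million opportunities at a given sigma level."""
    return (1.0 - process_yield(sigma_level, shift)) * 1e6

for level in (3, 4, 5, 6):
    print(level, f"{process_yield(level):.7f}", f"{defects_per_million(level):.1f}")
```

Under this convention a six sigma process comes out at roughly 3.4 defects per million opportunities, while a three sigma process produces tens of thousands, which illustrates why the last few sigma levels are expensive to reach.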

49 What is Six Sigma? Y = f(x)
Six Sigma is a data-driven problem-solving approach where process inputs (Xs) are identified and optimized to impact the output (Y).
The output is a function of the inputs and process: Y = f(x)
  Y: output
  f: function
  X: variables that must be controlled to consistently predict Y

