4 Sample Preparation Library Preparation – Main Goals: Prepares sample nucleic acids for sequencing. Many library types and creation procedures exist; however, all preparation results in the same general template structure: double-stranded DNA flanked by two different adapters. Variables include: sequencing application and starting material (e.g. gDNA, mRNA, Mate Pair, Active Chromatin, ChIP-Seq); insert size; adapter type; index for multiplexing.
5 Example Sample Prep Workflow: TruSeq Paired-end Library [Figure: workflow panels for DNA and RNA]
6 Library Quantification - qPCR Real-time qPCR allows accurate quantification of DNA templates. qPCR is based on the detection of a fluorescent reporter molecule whose signal increases as PCR product accumulates with each cycle of amplification. By using primers specific to the Illumina universal adapters in a qPCR reaction containing library template, only cluster-forming templates are amplified and quantified.
7 Library Quantification - qPCR The Cq (cycle of quantification, also called the cycle threshold) is the cycle at which the fluorescence of an amplicon crosses the threshold. Plot a standard curve of Cq against log initial concentration using controls, and determine the concentration of the library from it. Phases of qPCR: Geometric phase – amplicons double every cycle; greatest precision and accuracy for quantitation. Take home: qPCR mimics what happens on the surface of the flowcell during cluster generation and allows optimal loading concentrations to be determined.
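The standard-curve step can be sketched in a few lines of Python. This is a minimal illustration, not a validated quantification procedure: the standard concentrations, Cq values, and the library Cq below are invented for the example, and the function names are hypothetical. It fits Cq against log10 of the starting concentration by least squares, then inverts the line to estimate an unknown library.

```python
import math

def fit_standard_curve(concentrations, cq_values):
    """Least-squares fit of Cq = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def concentration_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate a starting concentration."""
    return 10 ** ((cq - intercept) / slope)

def amplification_efficiency(slope):
    """1.0 means the product doubles every cycle (ideal geometric phase)."""
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical ten-fold dilution series of control templates (pM) and their Cq values.
standards_pm = [20.0, 2.0, 0.2, 0.02]
standards_cq = [10.0, 13.32, 16.64, 19.96]

slope, intercept = fit_standard_curve(standards_pm, standards_cq)
efficiency = amplification_efficiency(slope)

# Hypothetical Cq measured for a diluted library sample.
library_pm = concentration_from_cq(15.0, slope, intercept)
print(f"slope={slope:.2f}, efficiency={efficiency:.3f}, library={library_pm:.3f} pM")
```

A slope near -3.32 cycles per ten-fold dilution corresponds to doubling every cycle, which is why quantification is read from the geometric phase.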
8 Cluster Generation Process occurs on cBot instrument: Aspirates DNA samples into flow cell. Automates the formation of amplified clonal clusters from single DNA molecules. 1000x amplification generates clusters. Hybridizes sequencing primer(s).
9 Illumina cBot Cluster Generation 2.0 Automated system significantly reduces workload for generation of flowcells. Compact design saves lab space. Reagent cartridge reduces prep time.
11 Cluster Generation Prep Prepare reagents and denature & dilute library: The goal is to reach the optimal cluster density to maximize yield (bp); this is achieved via optimized loading concentrations as determined by qPCR. Considerations: Too low density: fewer clusters, less sequence generated. Too high density: overlapping clusters, which are removed by analysis filters, and poor quality.
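The dilution arithmetic behind a loading concentration follows C1 x V1 = C2 x V2. The sketch below is only an illustration: the stock concentration, target loading concentration, and final volume are hypothetical values, not recommendations from the slides or from any protocol, and `dilution_volumes` is a name chosen for this example.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock volume, diluent volume) from C1 * V1 = C2 * V2.

    Both concentrations must use the same unit; the volumes come back
    in the unit of final_volume.
    """
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_conc > stock_conc:
        raise ValueError("cannot dilute to a higher concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume

# Hypothetical numbers: a library measured at 2000 pM by qPCR,
# diluted to a 10 pM loading concentration in 1000 uL.
stock_ul, diluent_ul = dilution_volumes(2000.0, 10.0, 1000.0)
print(f"{stock_ul:.1f} uL library + {diluent_ul:.1f} uL diluent")
```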
15 Cluster Generation Chemistry Linearization, Blocking & Sequencing Primer Hybridization:
16 Sequencing Main Goals: Translate the chemical information of the nucleotides into fluorescence information that can be captured optically. The optical information is then transformed into text, which can be searched, aligned, or otherwise mined for biologically relevant data.
17 Sequencing Workflow HiSeq Run Type / Approx. Run Days: 1x50 Flowcells, 2; [...], 9; 2x150 Flowcells, 13
18 Sequencing by Synthesis Clustered Flowcell is loaded on Illumina sequencer:
19 Sequencing Chemistry: First Cycle Base Incorporation To initiate the first sequencing cycle, add all 4 fluorescently labeled reversible terminators and DNA polymerase enzyme to the flowcell. The complementary nucleotide will be added to the first position of each cluster. A laser is then used to excite the attached fluorophore.
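The cycle described here (incorporate one complementary terminator, image it, call the base) can be sketched as a toy simulation. This is an idealized model with assumptions that do not hold on a real instrument: noise-free signal, one channel per base, and no phasing. The function names and the template string are made up for the example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def image_cycle(incorporated_base):
    """Idealized imaging: only the incorporated base's channel lights up."""
    return {base: (1.0 if base == incorporated_base else 0.0) for base in "ACGT"}

def call_base(intensities):
    """Call the base whose channel has the strongest signal."""
    return max(intensities, key=intensities.get)

def sequence_by_synthesis(template, cycles):
    """Run one incorporation + imaging step per cycle along the template.

    The template is given in the order the polymerase traverses it.
    Because each nucleotide is a reversible terminator, exactly one
    base is added per cycle.
    """
    read = []
    for position in range(min(cycles, len(template))):
        incorporated = COMPLEMENT[template[position]]
        intensities = image_cycle(incorporated)
        read.append(call_base(intensities))
    return "".join(read)

read = sequence_by_synthesis("TACGGA", cycles=4)
print(read)  # prints "ATGC"
```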
22 Sequencing Read 2 Resynthesis of second strand for Read 2 occurs on sequencer without removing flowcell:
23 Index for Multiplex Sequencing Sample multiplexing involves 3 reads: A: Sample Read 1 is sequenced. B: Read 1 product is removed and the Index Read is sequenced. C: The template strand is used to generate the complementary strand, and sample Read 2 is sequenced. Analysis software identifies the index sequence from each cluster so that sample Reads 1 & 2 can be assigned to a single sample.
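The assignment step performed by the analysis software can be sketched as follows. This is not the vendor's algorithm, only a minimal illustration of the idea: the sample names, index sequences, and reads are illustrative, and `demultiplex` with its `max_mismatches` parameter is a hypothetical helper.

```python
from collections import defaultdict

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def demultiplex(clusters, sample_indexes, max_mismatches=0):
    """Group (read1, read2) pairs by the sample whose index matches.

    clusters: iterable of (read1, index_read, read2), one tuple per cluster.
    sample_indexes: mapping of sample name to its index sequence.
    A cluster matching no sample, or more than one, is "undetermined".
    """
    by_sample = defaultdict(list)
    for read1, index_read, read2 in clusters:
        matches = [
            sample
            for sample, index in sample_indexes.items()
            if len(index) == len(index_read)
            and hamming(index, index_read) <= max_mismatches
        ]
        sample = matches[0] if len(matches) == 1 else "undetermined"
        by_sample[sample].append((read1, read2))
    return dict(by_sample)

# Illustrative 6-base indexes and reads.
sample_indexes = {"sample_A": "ATCACG", "sample_B": "CGATGT"}
clusters = [
    ("ACGT", "ATCACG", "TTGA"),
    ("GGCA", "CGATGT", "CCAT"),
    ("TTTT", "GGGGGG", "AAAA"),
]
assigned = demultiplex(clusters, sample_indexes)
for sample, reads in assigned.items():
    print(sample, reads)
```

Allowing one mismatch (`max_mismatches=1`) tolerates a single sequencing error in the index read, at the cost of ambiguity if two sample indexes are too similar.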
30 HiSeq: Temperature control 3 mechanisms: Heat extraction via liquid coolant; flow cell temperature control via Peltier; reagent temperature maintained via cooled compartment. Reagent Chiller: All reagents are cooled at 4 °C. Condensation pump runs every 4 min for 30 sec. Flow cell sits on Peltier blocks and is water cooled (heat extraction from underneath).
39 Cost & Throughput Comparison Notes: Throughput metrics are averages from runs performed in FY11 to date for each of the run types. Italicized HiSeq Bases & Reads throughput metrics are estimates based on the 2x100 run type, since we have limited data on other run types. Only vendor reagent costs are shown here; library creation and overhead costs are not included, but are roughly equal and mostly independent of run type. Cost per million reads goes up with the longer run types, but the read length increases as well, which makes each read more valuable for some assembly applications. The HiSeq 2x150 run type is not yet supported, and the current HiSeq chemistry has worse quality beyond [...] bases than GA. The HiSeq platform is still new and we are experiencing a higher number of hardware failures than with GA; Illumina does replace reagents for failed runs, and we rerun failed flowcells immediately whenever possible.
46 Providing Quality Sequence Incident Reporting & Resolution (JIRA). Troubleshooting Procedures. Throughput Goals & Metrics. Continuous Improvement - Lean Six Sigma. Failure Tracking & SPC Charts; RQC. Instrument Status & real-time run monitoring. Instrument Utilization & Efficiency.
47 LLNL – Six Sigma Training Tools and methodologies to: improve work quality; improve process efficiencies & eliminate waste; improve employee and customer satisfaction. Lean Six Sigma is about: eliminating waste and improving process flow; focusing on reducing variation and improving process yield by following a problem-solving approach using statistical tools.
48 What is Six Sigma? A Six Sigma process is literally one that’s statistically [...]% successful. This is not always cost effective to achieve, so as a methodology it’s about gaining control of a process and implementing improvements.
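For orientation, the success rate usually quoted for a given sigma level can be computed from the normal distribution. The sketch below assumes the common industry convention of a one-sided specification limit with a 1.5 sigma long-term shift; that convention is an assumption of this example, not something stated on the slide, and the function names are hypothetical.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def dpmo(sigma_level, shift=1.5):
    """Defects per million opportunities under the assumed shift."""
    return upper_tail(sigma_level - shift) * 1_000_000

def yield_percent(sigma_level, shift=1.5):
    """Percentage of opportunities that are defect free."""
    return 100.0 * (1.0 - upper_tail(sigma_level - shift))

for level in (3, 4, 5, 6):
    print(f"{level} sigma: {dpmo(level):.1f} DPMO, {yield_percent(level):.5f}% yield")
```

Under this convention a six sigma process comes out at roughly 3.4 defects per million opportunities.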
49 What is Six Sigma? Y = f(X) Six Sigma is a data-driven problem-solving approach where process inputs (Xs) are identified and optimized to impact the output (Y). The output is a function of the inputs and process. Y: output; f: function; X: variables that must be controlled to consistently predict Y.