3 Goals of biomedical investigation
  Understand normal, healthy and disease biology
  Enable prevention and early diagnosis of disease
  Enable new effective treatments
  Utility of next-generation sequencing/genetics in medicine: unbiased approach to identify new pathways underlying basic physiology, health and disease
4 Evolution of genomic technologies
  Genetic mapping studies: discovery of genes for well-characterized Mendelian diseases.
  Dense SNP genotyping using microarray technology: GWAS for discovery of common variants in common disease.
  High-throughput sequencing: discovery of rare variants in previously unrecognized Mendelian diseases and in common diseases.
  Speaker notes: Constant and rapid changes in genomic technologies have driven successive eras of discovery of loci underlying human traits. The development of complete genetic maps of the human genome in the 1980s fueled the mapping of Mendelian loci in extended kindreds for dominant traits and predominantly in consanguineous kindreds for recessive traits. Further accelerated by the acquisition of the sequence of the human genome in 2001, this first Mendelian era identified over 2,800 disease loci and profoundly changed our understanding of the biology and pathophysiology of every organ system. It was, however, a labor-intensive and slow process.
  A second era was defined by the development of microarray technology and the identification of more than 10 million common variants in the human genome. Microarrays were developed to genotype 500K to 5 million SNPs in order to identify common variants associated with human disorders. This era led to the identification of more than 1,000 loci that show robust association with human disease and that have changed the understanding of disease biology.
  We have recently entered a third era of discovery, this one driven by spectacular reductions in the cost of DNA sequencing, from ~$100,000 per million bases in 1998 to ~$0.10 today on the HiSeq instrument. Coupled with our development of robust methods for selectively sequencing the complete coding regions of the genome, which harbor the overwhelming majority of Mendelian loci, and analytic methods that identify variations from the reference sequence rapidly and with high sensitivity and specificity, one can now sequence ostensibly all the genes in the human genome (the exome) to high levels of completion for ~$1,000 (direct cost). This has provided fundamental new opportunities for identifying Mendelian loci that were previously elusive.
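The cost figures quoted in the notes can be put in perspective with a little arithmetic. The sketch below is only illustrative: the function names are made up for this example, and the ~3,100 Mb genome size is an assumption that does not appear on the slide.

```python
import math

# Assumption (not from the slide): haploid human genome of roughly 3,100 Mb.
HUMAN_GENOME_MB = 3_100

def fold_reduction(old_cost_per_mb, new_cost_per_mb):
    """How many times cheaper the new price is than the old one."""
    return old_cost_per_mb / new_cost_per_mb

def genome_equivalent_cost(cost_per_mb, genome_mb=HUMAN_GENOME_MB):
    """Raw cost of one genome-equivalent of bases (1x, no redundancy)."""
    return cost_per_mb * genome_mb

cost_1998 = 100_000.0  # ~$ per million bases in 1998 (from the notes)
cost_today = 0.10      # ~$ per million bases on the HiSeq (from the notes)

fold = fold_reduction(cost_1998, cost_today)
print(f"Fold reduction: {fold:,.0f}x (~{math.log10(fold):.0f} orders of magnitude)")
print(f"1x genome-equivalent in 1998: ${genome_equivalent_cost(cost_1998):,.0f}")
print(f"1x genome-equivalent today:   ${genome_equivalent_cost(cost_today):,.0f}")
```

These are raw per-base costs at 1x; a real genome or exome needs many-fold redundancy plus library preparation, so they are not directly comparable to the ~$1,000 exome figure.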
5 Why High-Throughput DNA Sequencing
  DNA sequencing can provide a deeper understanding of DNA/RNA than any other technology
  Microarray technology revolutionized biomedical research but has several limitations, which DNA sequencing may overcome
  As the cost of sequencing is rapidly decreasing, it is becoming affordable to perform sequencing at the genome level
  Figure: Number of PubMed articles
  Speaker notes: Why do we need a high-throughput DNA sequencing center at Yale? There is no doubt that microarray technology has revolutionized biomedical research, but it has several limitations. It relies on indirect observation based on hybridization signals, which can be nonspecific due to cross-hybridization, and it is not sensitive enough to identify low-level changes. Microarrays can also provide information only about what is represented on the chip. DNA sequencing may be able to overcome some of these limitations. In recent years there has been an explosion of research articles using next-generation sequencing technologies.
6 Applications of Next-gen Sequencing
  DNA sequencing applications:
    Re-sequencing: exome sequencing; mutation/SNP discovery and profiling
    Interactome: DNA-protein interactions; ChIP-Seq
    Transcriptome analysis: alternative splicing and allele-specific expression
    microRNA: expression and discovery
    Diagnostic services: CLIA certification
    Epigenomics: DNA methylation
    De novo sequencing
    Population metagenomics
    Copy number variation
7 First Generation: Sanger sequencing ( )
  1980 Nobel Prize in Chemistry
  phi X 174, ~5300 bp
  Gels read by hand
  Radiolabeled dideoxyNTPs
  One lane per nucleotide
  800 bp reads
  Low throughput (several kb/gel)
8 Second-generation sequencing
  Massively parallel sequencing of millions of templates
  454/Roche
  Illumina
  Ion Torrent-Proton
11 HiSeq 2500 Sequencing System
  Fast turnaround and highest output in a single instrument
  1 instrument, 2 run modes (user configurable)
  High Output Mode (highest output): 600 Gb in ~10.5 days; current v3 flow cell; current v3 reagents; cBot required; 6 human genomes in 10.5 days
  Rapid Run Mode (fastest turnaround): 120 Gb in ~1 day; new 2-lane flow cell; new reagents; no cBot required; 1 human genome in a day
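The "genomes per run" figures imply a sequencing depth per genome, which can be estimated as a rough sketch. The ~3.1 Gb genome size and the function names below are assumptions made for this illustration; the output and run-time numbers are the ones on the slide.

```python
# Assumption (not on the slide): haploid human genome of roughly 3.1 Gb.
HUMAN_GENOME_GB = 3.1

def mean_coverage(run_output_gb, genomes_per_run, genome_gb=HUMAN_GENOME_GB):
    """Average raw depth per genome if the run output is split evenly."""
    return run_output_gb / (genomes_per_run * genome_gb)

def daily_output_gb(run_output_gb, run_days):
    """Average output per day over the run."""
    return run_output_gb / run_days

# Figures from the slide.
modes = {
    "High Output": {"output_gb": 600, "days": 10.5, "genomes": 6},
    "Rapid Run": {"output_gb": 120, "days": 1.0, "genomes": 1},
}

for name, m in modes.items():
    depth = mean_coverage(m["output_gb"], m["genomes"])
    per_day = daily_output_gb(m["output_gb"], m["days"])
    print(f"{name}: ~{depth:.0f}x per genome, ~{per_day:.0f} Gb/day")
```

Under these assumptions both modes work out to roughly 30x to 40x raw depth per genome; usable depth after filtering and duplicate removal would be lower.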
12 New sequencing platforms by Illumina
  HiSeq X Ten and HiSeq X Five: production-scale human whole-genome sequencing; 18,000 genomes/year at $1,500 cost/genome
  HiSeq 3000/HiSeq 4000: up to 1.5 Tb/run; whole genome as well as other applications, including exome sequencing
13 Overall Illumina Sequencing Workflow
  Sample preparation: sequencing library preparation (construct: Adapter1, Adapter2, sequencing primer, insert)
  Cluster generation: hybridizing the library to the flow cell; creating clusters from individual molecules
  Sequencing by synthesis: add all 4 bases with reversible terminators; image 4 colors; remove terminator, repeat
  Speaker notes: Introductory workflow; it is good to start with the basics and go on from here. Explain that these 3 steps are 3 separate kits that one purchases. Customers can work with their salesperson to determine which kits, and in what amounts, they want to purchase. Emphasize that for any of our products (genomic, expression, ChIP, etc.) you follow these 3 basic steps: sample prep (library prep), cluster generation on a flow cell, and sequencing on the Genome Analyzer.
  Sample prep: the processes used in the kits end up with the construct illustrated, for all sequencing types: 2 different adapters, a sequencing primer, and an insert. If the group will do paired-end sequencing, you can mention that it uses slightly different adapters and different sequencing primers on both ends of the insert (this will be confusing for a new group; you can come back to this slide later if someone asks).
  Cluster generation: show them the flow cell picture, with 8 lanes for 8 different samples. The library hybridizes to the flow cell, with individual molecules forming clusters that will be sequenced. The different molecules of the library are physically separated from one another so the sequence of each one can be determined.
  Sequencing by synthesis: describe the general process with the reversible terminators. You can introduce the concept that the GA has a “chemistry cycle”, in which the last block is removed and the next base is added, followed by an imaging cycle.
14 Genomic Sample Prep Workflow
  Starting material: purified genomic DNA
  1. Genomic DNA fragmentation: fragments of less than 800 bp
  2. End repair: blunt-ended fragments with 5’-phosphorylated ends
  3. Klenow exo- with dATP: 3’-dA overhang
  4. Adapter ligation: adapter-modified ends
  5. Gel purification/bead: removal of unligated adapter
  6. PCR: genomic DNA library (construct: Adapter1, Adapter2, sequencing primer, insert)
  Speaker notes: We’re using the genomic sample prep workflow as an example of the basic sample prep protocol; each one is different. All sample prep methods come with their own protocols, which follow standard molecular biology cloning techniques.
15 What is a Flow Cell?
  A flow cell is a thick glass slide with 8 channels or lanes
  Each lane is randomly coated with a lawn of oligos that are complementary to library adapters
  Figure labels: P5 oligo, P7 oligo, Adapter1, Adapter2, insert, sequencing primer, index
16 Reversible Terminator Seq Chemistry
  All 4 labeled nucleotides in 1 reaction (green, orange, red and blue)
  Advantages of reversible terminators:
    Only one base is added at a time
    The fluor can be cleaved off after imaging, so it does not emit at the next cycle; only the newly added base (with its attached fluor) emits light
  Cycle: incorporation; detection; deblock and fluor removal; next cycle
  Figure labels: DNA, free 3’ end, cleavage site, fluor, 3’ block
18 Sequencing By Synthesis (SBS)
  Cycle 1: add sequencing reagents; first base incorporated; remove unincorporated bases; detect signal/imaging; cleave off fluor and deblock
  Cycle 2-n: add sequencing reagents and repeat
  All four labeled nucleotides in one reaction
  High accuracy
  Base-by-base sequencing
  No problems with homopolymer repeats
  HCS: 1.8.6
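The cycle described above can be mimicked with a toy simulation. This is a deliberately simplified sketch, not a model of the instrument: it assumes perfect chemistry (no phasing, no signal decay), and the function name and example template are invented for illustration.

```python
# Watson-Crick pairing used to pick the nucleotide incorporated opposite
# each template base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequence_by_synthesis(template_3_to_5, n_cycles):
    """Toy SBS run: one cycle yields exactly one base call.

    template_3_to_5 is the template strand written 3'->5', so position 0 is
    the first base copied after the sequencing primer.
    """
    read = []
    for cycle in range(min(n_cycles, len(template_3_to_5))):
        # Incorporation: the 3' block on the labeled nucleotide stops the
        # polymerase after a single addition, even inside a homopolymer.
        incorporated = COMPLEMENT[template_3_to_5[cycle]]
        # Detection: the fluor on the newly added base is imaged.
        read.append(incorporated)
        # Cleavage: fluor and 3' block are removed before the next cycle
        # (nothing to track in this idealized model).
    return "".join(read)

# A template with a run of four T's is still read one base per cycle.
print(sequence_by_synthesis("TTTTGCA", n_cycles=7))
```

Because every cycle adds a single blocked base, a homopolymer of length four simply takes four cycles, which is why the slide lists homopolymer repeats as unproblematic for this chemistry.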
19 Ion Torrent PGM and Proton
  Ion PGM™ Sequencer
  4 Ion Protons: coming soon
  First PostLight sequencing technology: instead of using light as an intermediary, the PGM creates a direct connection between the chemical and the digital worlds.
20 The Chip is the Machine
  Uses semiconductor chips for sequencing
  Ion PI chip: >165 million wells per chip; 8 to 10 Gb of data per run
  Ion PII chips: ~100 Gb of data in ~4 hours
21 Base Calling
  When a nucleotide is incorporated into a strand of DNA, a hydrogen ion is released as a by-product. The hydrogen ion carries a charge, which the PGM’s ion sensor detects; that signal is then called as a base.
  Ion Torrent technology video.
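A toy model can make the base-calling idea concrete. In this sketch, unlabeled nucleotides are flowed over the chip one type at a time and the signal from each flow is taken to be the number of hydrogen ions released, that is, the number of bases incorporated in that flow. The flow order "TACG", the function names, and the noise-free integer signals are all simplifying assumptions for illustration.

```python
def ion_flow_signals(synthesized_strand, flow_order="TACG", max_flows=200):
    """Return (nucleotide, incorporations) for each flow of an idealized run.

    synthesized_strand is the sequence the polymerase will build, 5'->3'.
    flow_order is a hypothetical repeating order of nucleotide flows.
    """
    signals = []
    position = 0
    flow = 0
    while position < len(synthesized_strand) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # With no terminator, the polymerase extends through the whole
        # homopolymer run in a single flow, releasing one H+ per base.
        while (position < len(synthesized_strand)
               and synthesized_strand[position] == nucleotide):
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
        flow += 1
    return signals

def call_bases(signals):
    """Turn per-flow incorporation counts back into a sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)

signals = ion_flow_signals("TTAACG")
print(signals)
print(call_bases(signals))
```

In this idealized model a homopolymer shows up as a single larger signal; on a real sensor that signal is analog and noisy, so telling apart long runs of the same base becomes progressively harder.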
22 Advantages and Current Limitations
  Advantages: low equipment cost; rapid run times (3 to 4 hours); simple chemistry
  Limitations: homopolymer detection; error rates; slow to introduce newer chips (overpromising); PGM and Proton are two separate sequencing instruments; library prep requires emulsion PCR/new protocols
24 The Third Generation Sequencing Platform: PacBio RS
  Pacific Biosciences has developed Single Molecule Real Time (SMRT™) DNA sequencing technology: PacBio RS.
  This technology enables, for the first time, the observation of natural DNA synthesis by a DNA polymerase as it occurs.
  This technology delivers long reads at the single-molecule level and fast time to result, enabling a new paradigm in genomic analysis.
  Speaker notes: Most people here are familiar with Sanger sequencing, the so-called first-generation sequencing, and with second-generation sequencing technology such as the Illumina HiSeq system. The latter starts with library prep, with PCR amplification and cluster building; after sequencing, it generates tens of millions of short reads. Today I am going to introduce the third-generation sequencing platform, the PacBio RS, developed by Pacific Biosciences, which performs single-molecule real-time sequencing. The technology is called SMRT, for single molecule real time. It enables, for the first time, the observation of natural DNA synthesis by a DNA polymerase as it occurs. The major advantages of this new sequencing technology are that it delivers long reads at the single-molecule level and a fast time to results.
25 Pacific Biosciences SMRT® Technology Technology Video
26 Key Applications for PacBio RS
  Targeted sequencing: SNP and structural variant detection; repetitive regions; full-length transcript profiling
  De novo assembly and genome finishing: bacterial genomes; fungal genomes
  Gap-captured sequencing; targeted captured sequencing
  Base modification detection: methylation; DNA damage
  Projects at YCGA: YCGA PacBio RS
  Speaker notes: First I want to show you a paper published in Nature last week that used PacBio sequencing to identify mutations in the kinase FLT3, which is associated with AML. % of AML patients have this ITD mutation. This is an activating mutation, and there are drugs that can effectively inhibit the kinase. The problem is that resistance to the drug develops after a certain time, and the resistance is likely caused by a few mutations in the kinase domain. So, to find out whether the ITD mutation and the drug-resistance mutations are really the disease-causing mutations, the authors had to determine whether any resistance mutations found were on the same strand as the FLT3-ITD.
27 Comparisons Between PacBio RS and Illumina HiSeq
  Attribute | PacBio RS (third generation) | Illumina HiSeq (second generation)
  Sequencing chemistry | Single Molecule Real Time (SMRT) | Sequencing by synthesis (SBS)
  Sequencing substrate | SMRT Cell made up of 150,000 ZMWs | Flow cell made up of 8 separate lanes
  Data output per day | 1 to 2 billion bases/day, $1.5/Mb | 60 billion bases/day at a cost of $.06 per Mb
  Read length | Average up to 5 Kb | 50 bp to 150 bp
  Error rates | Raw: %. With 30x coverage: Q50 (< 0.01) | 0.5 to 1 %
  Sample library | SMRT Bell template (single-strand circular DNA), 250 bp to 10 Kb insert | dsDNA with adaptors (175 bp to 1 Kb)
  Speaker notes: As shown in this table, both technologies use sequencing-by-synthesis chemistry. The difference is that PacBio performs the sequencing at the single-molecule level and in real time. The sequencing consumables are also different. On the HiSeq, sequencing is carried out on a flow cell that has 8 lanes, each with millions of DNA clusters. On the PacBio, sequencing is carried out on a SMRT Cell, which comprises 150K microscopic holes called ZMWs, which stands for zero-mode waveguides. Each ZMW can hold one DNA molecule with primer and polymerase. For base calling, Illumina uses the images taken during the sequencing run, whereas PacBio uses movies collected in real time while DNA synthesis is happening.
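The error-rate row mixes percentages with a Phred-style quality value (Q50). The standard Phred relation, Q = -10 log10(p), converts between the two; the short sketch below applies it to the figures in the table. The function names are chosen for this example only.

```python
import math

def error_to_phred(error_probability):
    """Phred quality score for a given per-base error probability."""
    return -10.0 * math.log10(error_probability)

def phred_to_error(q_score):
    """Per-base error probability for a given Phred quality score."""
    return 10.0 ** (-q_score / 10.0)

# Q50 consensus accuracy quoted for PacBio at 30x coverage.
print(f"Q50 -> error probability {phred_to_error(50):.0e}")

# 0.5% to 1% raw error quoted for Illumina HiSeq.
for pct in (0.5, 1.0):
    print(f"{pct}% error -> Q{error_to_phred(pct / 100):.0f}")
```

Under this definition a raw error rate of 0.5% to 1% corresponds to roughly Q20 to Q23, while Q50 corresponds to about one error in 100,000 bases.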
29 Oxford Nanopore Technology
  Figure labels: exonuclease; protein nanopore (alpha-hemolysin); cyclodextrin; electrically resistant lipid bilayer; silicon nitride or graphene
  Speaker notes: This diagram shows a protein nanopore set in an electrically resistant membrane bilayer. An ionic current is passed through the nanopore by setting a voltage across this membrane. If an analyte passes through the pore or near its aperture, this event creates a characteristic disruption in current. Measurement of that current makes it possible to identify the molecule in question. For example, this system can be used to distinguish between the four standard DNA bases G, A, T and C, and also modified bases. It can be used to identify target proteins and small molecules, or to gain rich molecular information, for example to distinguish between the enantiomers of ibuprofen or to study molecular binding dynamics.
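The principle of identifying a base from its characteristic current disruption can be illustrated with a minimal nearest-level classifier. Everything numeric here is invented for the example: the per-base current levels are hypothetical placeholders, not measured values, and real base callers work on noisy, overlapping signals with far more elaborate models.

```python
# Hypothetical mean blockade currents (picoamps) for each base.
# These numbers are placeholders chosen only to make the example work.
HYPOTHETICAL_LEVELS_PA = {"A": 50.0, "C": 40.0, "G": 60.0, "T": 30.0}

def call_base(current_pa, levels=HYPOTHETICAL_LEVELS_PA):
    """Assign the base whose reference level is closest to the measurement."""
    return min(levels, key=lambda base: abs(levels[base] - current_pa))

def call_sequence(current_trace_pa):
    """Call one base per measured current event."""
    return "".join(call_base(value) for value in current_trace_pa)

# Four simulated events, each slightly off its reference level.
print(call_sequence([51.2, 29.0, 41.5, 58.9]))
```

The closer together the reference levels sit relative to the measurement noise, the more often a call like this goes wrong, which is one way to picture why slowing the DNA and averaging more signal per base helps accuracy.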
31 Recent advances in nanopore sequencing
  Two types of nanopores: protein and synthetic (silicon nitride). Protein nanopores appear to be better at recognizing nucleotides.
  The rapid speed at which DNA strands pass through the tiny hole makes distinguishing bases more difficult. Currently an enzyme is used to control the rate.
  By shining a low-power green laser on a synthetic nanopore immersed in salt water, it is possible to manipulate DNA speed at will (Meller A. et al, Nat Biotech 2013).
  Using nanopores, long stretches of DNA can be zipped back and forth through the pore and read several times.
  Protein nanopores can also identify epigenetic changes.
  Speaker notes: The rapid speed at which DNA strands pass through the tiny holes makes distinguishing bases more difficult. Dr. Meller's group showed that shining a certain wavelength of light could slow the flow of DNA through synthetic nanopores, potentially making it easier to read the four bases that make up each molecule. Reporting in the November 2013 issue of Nature Nanotechnology, the group found that by shining a low-power green laser on a synthetic nanopore made of a thin layer of silicon nitride, it was possible to increase the electric charge near the walls of the pore, which is immersed in salt water. As the current increases, positive ions drag water molecules in the opposite direction of incoming DNA, acting as a brake and slowing its passage through the pore. As a result, nanoscale sensors in the pore would be able to read each nucleotide going into the pore more accurately.
32 Performance/Limitations…..?
  Advantages:
    Nanopores offer a label-free, electrical, single-molecule DNA sequencing method
    No costly fluorescent labeling reagents
    No need for expensive optical hardware and sophisticated instrumentation to detect DNA bases
  Performance/limitations:
    First data was released in Feb. Since then, slow to release new data
    Very little data available for evaluation
    High error rates: >5%
34 Located in a newly renovated building.
  YCGA was established in January 2009 through generous funding support and strong commitment from Yale University and the School of Medicine
  Approximately 7,000 sq ft of laboratory and ~4,000 sq ft of office space
  23 staff
  Figure caption: Portion of the laboratory showing sequencing systems through the glass wall partition that separates the laboratory from the office and administrative area.
35 Sequencing Platforms at YCGA
  11 Illumina HiSeqs (2000 and 2500)
  One MiSeq
  Ion PGM™ Sequencer
  One PacBio RS
  YCGA has kept pace with cutting-edge sequencing technologies
  Speaker notes: YCGA is well equipped with cutting-edge technologies. Because the technology keeps improving at a very fast pace, keeping up with it has been a challenge. New technologies are expensive, and sometimes we have to change platforms before we have recovered the investment. Despite numerous challenges, YCGA has been very successful in keeping up with change while maintaining data production and balancing the operating budget.
38 Types of samples processed and runs of sequence read lengths carried out at YCGA in a typical month
39 Need for strong R&D efforts for Next-Generation sequencing operation
  Optimization of sample preparation protocols for exome capture, which has decreased the cost of a single human exome from $8,000 in 2009 to the current price of ~$500 while improving the quality of the data.
  Development of a highly efficient protocol to extract and repair DNA from formalin-fixed paraffin-embedded blocks for exome analysis.
  Improved protocols for gDNA-seq, RNA-seq, and ChIP-seq that show higher data complexity than traditional protocols, allow users to start with less material, and cost less.
  Continuous improvement of the various analysis pipelines.
  Speaker notes: The first point is extremely significant because >90% of our sequencing is human exomes. The improvements we have made have increased our data quality, decreased our costs, and allowed us to dramatically increase our throughput.
  There are likely billions of formalin-fixed paraffin-embedded (FFPE) samples around the world. The fixation/storage process destroys the DNA, and it was thought that these samples would be unusable for genetic analysis. Our protocol allows us to use these samples for exome analysis and makes possible many new and interesting experiments that could not otherwise be performed.
  By spending the time and money to improve all of our protocols, not just those for human exomes, we are able to offer the Yale community a variety of sample preparation options that produce the most complex data possible at some of the lowest costs in the country.
40 Whole-Genome vs. Whole-Exome Sequencing
  Protein-coding genes (the exome) constitute 1% of the human genome but harbor 85% of disease-causing mutations
  Significantly cheaper than sequencing the entire genome
  Maq,2.1M probes cover ~300,000 exons of 19,000 genes
  Total covered bases: 44.1 Mb
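The size argument on this slide can be checked with a few lines of arithmetic. In this sketch the 44.1 Mb target comes from the slide, while the ~3,100 Mb genome size is an assumption and the function names are illustrative; capture inefficiency (off-target reads, uneven coverage) is ignored.

```python
# From the slide: total bases covered by the capture design.
EXOME_TARGET_MB = 44.1
# Assumption (not on the slide): haploid human genome of roughly 3,100 Mb.
HUMAN_GENOME_MB = 3_100.0

def target_fraction(target_mb=EXOME_TARGET_MB, genome_mb=HUMAN_GENOME_MB):
    """Fraction of the genome covered by the capture target."""
    return target_mb / genome_mb

def sequencing_savings(target_mb=EXOME_TARGET_MB, genome_mb=HUMAN_GENOME_MB):
    """How many times fewer bases are needed at equal depth, ideally."""
    return genome_mb / target_mb

print(f"Capture target is {target_fraction() * 100:.2f}% of the genome")
print(f"Ideal reduction in bases sequenced: ~{sequencing_savings():.0f}x")
```

With these inputs the target is a little over 1% of the genome, in line with the slide's figure, and in the ideal case an exome needs about 70 times fewer sequenced bases than a whole genome at the same depth.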
41 Scientific and economic impact of high throughput sequencing at Yale
42 List of select publications resulting from the next-generation sequencing at YCGA
  Whole-exome sequencing identifies recessive WDR62 mutations in severe brain malformations. Bilguvar. Nature, v467, 2010
  A Novel miRNA Processing Pathway Independent of Dicer Requires Argonaute2 Activity. Cifuentes. Science, v328, 2010
  Mitotic recombination in ichthyosis causes reversion of dominant mutations in KRT10. Choate K. Science, v330, 2010
  Transcriptomic analysis of avian digits reveals conserved and derived digit identities in birds. Wang S. Nature, v477, 2011
  Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Lynch and Wagner. Nat Genet., v43, 2011
  K+ channel mutations in adrenal aldosterone-producing adenomas and hereditary hypertension. Choi M. Science, v331, 2011
  Recessive LAMC3 mutations cause malformations of occipital cortical development. Barak and Gunel. Nat Genet., v43, 2011
  Spatio-temporal transcriptome of the human brain. Kang and Sestan. Nature, v478, 2011
  Langerhans cells facilitate epithelial DNA damage and squamous cell carcinoma. Modi and Girardi. Science, v335, 2012
  Mutations in kelch-like 3 and cullin 3 cause hypertension and electrolyte abnormalities. Boyden et al. Nature, v482, 2012
  De novo point mutations, revealed by whole-exome sequencing, are strongly associated with Autism Spectrum Disorders. Sanders and State. Nature, v485, 2012
  Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma. Krauthammer. Nat Genet., v44, 2012
  Genomic Analysis of Non-NF2 Meningiomas Reveals Mutations in TRAF7, KLF4, AKT1, & SMO. Clark V et al. Science, v339, 2013
  De novo mutations in histone-modifying genes in congenital heart disease. Zaidi and Lifton. Nature, v498, 2013
  Recessive mutations in DGKE cause atypical hemolytic-uremic syndrome. Lemaire and Lifton. Nat Genet., v45, 2013
  Somatic and germline CACNA1D calcium channel mutations in aldosterone-producing adenomas and primary aldosteronism. Scholl and Lifton.
  The evolution of lineage-specific regulatory activities in the human embryonic limb. Cotney and Noonan. Cell, v154, 2013
  Mutations in DSTYK and dominant urinary tract malformations. Sanna-Cherchi and Gharavi. N Engl J Med., 2013
  Nanog, and SoxB1 activate zygotic gene expression during the maternal-to-zygotic transition. Lee et al. Nature, 2013
  Co-expression networks implicate human mid-fetal deep cortical projection neurons in the pathogenesis of autism. Willsey and State. Cell, 2013
  CLP1 Founder Mutation Links tRNA Splicing and Maturation to Cerebellar Development and Neurodegeneration. Schaffer AE and Gleeson JG. Cell, v157, 2014
  Exome sequencing links corticospinal motor neuron disease to common neurodegenerative disorders. Novarino G and Gleeson JG. Science, v343, 2014
43 Impact of High Throughput Sequencing: Grant Funding (partial list)
  Mendelian center grant, NIH: $12M (3y)
  Gilead cancer grant: $40M (4y)
  Brain tumor gift: $12M (4y)
  ARRA brain development (NIH): $3M (2y)
  ARRA kidney disease (NIH): $2M (2y)
  Simons autism sequencing: $4M (3y)
  Brain transcriptome (NIH): $10M (2y)
  Congenital heart disease (NIH): $5M (4y)
  Pediatric Cardiac Genomic Consortium: $2M (2y)
  Melanoma SPORE (NIH): $12M (5y)
  Biogen Inc. (PPMS): $2M
  VA, Schizophrenia/Bipolar disorder: $12M
  Yale Comprehensive Cancer Center: $14M
  Total: $128M
44 NGS and Personalized Medicine
  Use of genomics to tailor medical care to individuals based on their genetic makeup.
  Diagnosis: Is it benign?
  Classification: Which class of cancer?
  Prognosis: What are my chances?
  Therapeutic choice: Which treatment?
  Discovery (how and why): elucidation of mechanism of cause; identification of cancer biomarkers; therapeutic targets
  Speaker notes: The use of microarrays can tell us a lot about a specific disease such as cancer. They can help us to (1) diagnose the specific cancer quickly and accurately; (2) classify the specific subtype of cancer to allow for the best treatment; (3) give the most accurate prognosis of recovery based on the genetics of the tumor; and (4) identify the specific treatment for each genetic profile. The arrays can also be used to study the genetics behind how a certain type of cancer reacts to a specific treatment.
45 CLIA: The New Paradigm in Molecular Diagnostics
  Conventional molecular testing: gene by gene
  Genomic testing using exome analysis
  YCGA is carrying out clinical diagnostic work in collaboration with Dr. Allen Bale
  Over 1,000 exomes have been analyzed for various disorders
46 Challenges
  Equipment, reagents, protocols, analysis
  What is valid and what is significant?
  Individual judgment versus consensus guidelines
47 Sequencing a genome is simple; finding the cause of a disease is not
  First clinical use of whole-genome sequencing shows just how challenging it can be.
  Study of fraternal twins with a monogenic disorder: the genomes of fraternal twins diagnosed with a movement disorder were sequenced.
  Genomes on prescription: Nature 2011
  Bainbridge M, Sci Transl Med 2011
  Reference: Whole-genome sequencing for optimized patient management. Bainbridge MN, Wiszniewski W, Murdock DR, Friedman J, Gonzaga-Jauregui C, Newsham I, Reid JG, Fink JK, Morgan MB, Gingras MC, Muzny DM, Hoang LD, Yousaf S, Lupski JR, Gibbs RA. Sci Transl Med Jun 15;3(87):87re3. doi: /scitranslmed
  Abstract: Whole-genome sequencing of patient DNA can facilitate diagnosis of a disease, but its potential for guiding treatment has been under-realized. We interrogated the complete genome sequences of a 14-year-old fraternal twin pair diagnosed with dopa (3,4-dihydroxyphenylalanine)-responsive dystonia (DRD; Mendelian Inheritance in Man #128230). DRD is a genetically heterogeneous and clinically complex movement disorder that is usually treated with l-dopa, a precursor of the neurotransmitter dopamine. Whole-genome sequencing identified compound heterozygous mutations in the SPR gene encoding sepiapterin reductase. Disruption of SPR causes a decrease in tetrahydrobiopterin, a cofactor required for the hydroxylase enzymes that synthesize the neurotransmitters dopamine and serotonin. Supplementation of l-dopa therapy with 5-hydroxytryptophan, a serotonin precursor, resulted in clinical improvements in both twins.
48 Acknowledgement
  Jim Noonan
  Yale University, School of Medicine and West Campus
  NHGRI: CMG
  YCGA staff
51 Data Analysis Overview
  Primary analysis
  Secondary analysis
  Data visualization
52 Primary and Secondary Analysis Overview
  Analysis type | Software | Outputs
  Sequencing | ICS/RTA | Images/TIFF files
  Primary analysis | ICS/RTA | Base calling; intensities
  Secondary analysis | | Alignments and variant detection
53 Cluster Generation: Amplification
  Template hybridization and initial extension
  Steps: template hybridization to the grafted flow cell; initial extension (3' extension); denaturation; original template is washed away
  Single molecules are bound to the flow cell in a random pattern
  > million single molecules hybridize to the lawn of primers
  Figure labels: OH, P5
  Speaker notes: The initial steps for the PE chemistry are the same as for the Single Read chemistry.
54 Cluster Generation: Amplification
  Single strand flips over to hybridize to adjacent oligos to form a bridge
  Hybridized primer is extended by polymerases
  Double-stranded bridge is denatured
  Result: two copies of covalently bound single-stranded templates
  Cycle steps: 1st cycle annealing, 1st cycle extension, 1st cycle denaturation; 2nd cycle annealing, 2nd cycle extension, 2nd cycle denaturation; n = 35 total
  Speaker notes: The amplification steps are also the same, except that 28 cycles is recommended for any sample where the insert is greater than 200 bp. More cycles for samples with inserts greater than 200 bp will cause the clusters to get too large after P5 resynthesis (see slide 6).
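Since each bridge-amplification cycle turns one bound strand into two, the cycle count sets an upper bound on cluster size. The sketch below computes that idealized bound for the 35 and 28 cycles mentioned on the slide; it assumes perfect doubling every cycle, which real clusters do not achieve because growth is limited by the primers within reach and by per-cycle efficiency. The function name is illustrative.

```python
def ideal_copies(n_cycles, starting_templates=1):
    """Upper bound on strands per cluster with perfect doubling each cycle."""
    return starting_templates * 2 ** n_cycles

standard = ideal_copies(35)      # n = 35 total cycles (slide)
long_insert = ideal_copies(28)   # 28 cycles recommended for inserts > 200 bp (notes)

print(f"35 cycles: up to {standard:,} strands (theoretical bound)")
print(f"28 cycles: up to {long_insert:,} strands (theoretical bound)")
print(f"Ratio: {standard // long_insert}x")
```

Only the ratio is meaningful here: dropping seven cycles reduces the theoretical bound by a factor of 128, which is consistent with the note that fewer cycles keep clusters from growing too large.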
55 Cluster Generation: Linearization, Blocking and sequencing primer hybridization
  Steps: cluster amplification; P5 linearization; block with ddNTPs; denaturation and sequencing primer hybridization
  dsDNA bridges are denatured
  Complement strands are cleaved and washed away
  Enzymatic cleavage, uracil incorporation enzyme
  Free 3’ ends are blocked to prevent unwanted DNA priming
  Speaker notes: The first linearization step uses the Linearization 1 Enzyme instead of periodate. The blocking step still uses ddNTPs, but the PE protocol uses Blocking Enzyme 1 and 2 where the Single Read protocol uses terminal transferase. Read 1 primer hybridization uses the Read 1 PE Sequencing Primer.
56 Sequencing
  Steps: sequencing first read; denaturation and de-protection; resynthesis of P5 strand (15 cycles); P7 linearization; block with ddNTPs; denaturation and hybridization; sequencing second read
  Speaker notes: The steps up to and including the first-read sequencing are much the same as for a single read; the first read is where the single read protocol would stop. The PE protocol continues by deprotecting the P5 primer using the deprotection enzyme. Resynthesis of the P5 strand occurs over 15 cycles. P7 linearization uses the Linearization 2 Enzyme. Blocking again uses ddNTPs and Blocking Enzyme 1 and 2. Sequencing of read 2 uses the Read 2 PE Sequencing Primer.
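The net result of the paired-end steps above is two reads from opposite ends of the same insert, with the second read coming from the complementary strand. A minimal sketch of that relationship follows; the function names and the example fragment are invented for illustration, and adapters, indexes and sequencing errors are ignored.

```python
# Translation table for Watson-Crick complements.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return sequence.translate(_COMPLEMENT)[::-1]

def paired_end_reads(insert, read_length):
    """Return (read 1, read 2) for an insert given 5'->3' on the read-1 strand.

    Read 1 starts at one end of the insert; read 2 starts at the other end
    and is read from the opposite strand, so it is reported as the reverse
    complement of the insert's far end.
    """
    read1 = insert[:read_length]
    read2 = reverse_complement(insert)[:read_length]
    return read1, read2

read1, read2 = paired_end_reads("ACGTTTGACCAG", read_length=4)
print(read1, read2)
```

Aligners use exactly this geometry: the two reads should map to opposite strands, facing each other, at a distance consistent with the insert size.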