1 Chem 395 Bioanalytical Chemistry
Storrs, Connecticut
February 8, 2011
Modern Genomic Analysis Methods
Janet Hager
Translational Genomics Core
Department of Genetics and Developmental Biology
University of Connecticut Health Center
2 Genomic Resources at the UCHC
Translational Genomics Core
Department of Genetics and Developmental Biology
Cell and Genome Sciences Center Farmington Avenue4
Sample QC
  Agilent Bioanalyzer 2100
  Nanodrop 1000
Microarrays
  Illumina BeadStation and BeadChip: gene expression; Infinium and GoldenGate genotyping
  Illumina BeadXpress: custom Veracode genotyping panels
  Affymetrix Chip and Fluidics Station: gene expression, CNV, genotyping
Deep Sequencing
  Illumina Genome Analyzer IIx
  Illumina HiSeq2000
  Ill
4 Illumina Next Gen Sequencing Method
The basic goal of sample prep is the same for all 2nd generation (next gen) sequencing methods: generate large numbers of unique "polonies" (polymerase-generated colonies) or "clusters" that can be sequenced simultaneously.
These parallel reactions occur on the surface of a "flow cell" (basically a water-tight microscope slide), which provides a large surface area for many thousands of parallel chemical reactions.
6 Illumina Next Gen Sequencing Method
Step 1: Sample Preparation
The DNA sample of interest is sheared to an appropriate size (average ~800 bp) using a compressed-air device known as a nebulizer.
The ends of the DNA are polished, and two unique adapters are ligated to the fragments.
Ligated fragments of the size range of bp are isolated via gel extraction and amplified using limited cycles of PCR.
Complete detailed protocols for DNA, RNA and small RNA library preparation
This is a fairly straightforward multi-step molecular biology process.
8 Illumina Next Gen Sequencing Method
Steps 2-6: Cluster Generation by Bridge Amplification
The 454 and ABI SOLiD methods use bead-based emulsion PCR to generate "polonies".
Illumina uses a unique "bridged" amplification reaction that occurs on the surface of the flow cell.
The flow cell surface is coated with single-stranded oligonucleotides that correspond to the sequences of the adapters ligated during the sample preparation stage.
Single-stranded, adapter-ligated fragments are bound to the surface of the flow cell and exposed to reagents for polymerase-based extension.
11 Illumina Next Gen Sequencing Method
Steps 2-6: Cluster Generation by Bridge Amplification
Repeated denaturation and extension result in localized amplification of single molecules in millions of unique locations across the flow cell surface.
This process occurs in what is referred to as Illumina's "cluster station", an automated flow cell processor.
Priming occurs as the free/distal end of a ligated fragment "bridges" to a complementary oligo on the surface.
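The "repeated denaturation and extension" step can be pictured with a deliberately idealized growth model. This is only a sketch: the function name and the `efficiency` parameter are hypothetical, and the slides give no cycle counts or yields. It assumes every strand in a cluster is copied with the same probability in each cycle and ignores the limit set by the number of oligos on the surface.

```python
def expected_cluster_copies(cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of strands in one cluster after bridge amplification.

    Idealized model (an assumption, not taken from the slides): in every
    denaturation/extension cycle each strand is copied with probability
    `efficiency`, so the expected strand count grows by a factor of
    (1 + efficiency) per cycle, starting from a single bound molecule.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles
```

With perfect efficiency the model simply doubles the cluster each cycle, for example `expected_cluster_copies(10)` gives 1024 strands from one starting molecule.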
15 Illumina Next Gen Sequencing Method
Steps 7-12: Sequencing by Synthesis
The flow cell, containing millions of unique clusters, is now loaded into the sequencer for automated cycles of extension and imaging.
The first cycle of sequencing consists of the incorporation of a single fluorescent nucleotide, followed by high-resolution imaging of the entire flow cell.
These images represent the data collected for the first base. Any signal above background identifies the physical location of a cluster (or polony), and the fluorescent emission identifies which of the four bases was incorporated at that position.
This cycle is repeated, one base at a time, generating a series of images, each representing a single base extension at a specific cluster.
Base calls are derived with an algorithm that identifies the emission color over time. Illumina reads range from bases.
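The base-calling idea above (one image per cycle, emission color identifies the base) can be sketched as follows. This is a toy illustration, not Illumina's actual algorithm: it assumes each cluster has already been reduced to four per-cycle intensities in the order A, C, G, T, and the `background` threshold is a hypothetical parameter.

```python
from typing import Sequence

BASES = "ACGT"  # assumed channel order for this sketch


def call_bases(intensities: Sequence[Sequence[float]], background: float = 0.0) -> str:
    """Call one base per cycle for a single cluster.

    intensities[cycle] holds four channel intensities (A, C, G, T).
    The brightest channel gives the base; if even the brightest channel
    is not above `background`, the cycle is reported as 'N' (no call).
    """
    read = []
    for cycle in intensities:
        if len(cycle) != 4:
            raise ValueError("each cycle needs exactly four channel intensities")
        best = max(range(4), key=lambda i: cycle[i])
        read.append(BASES[best] if cycle[best] > background else "N")
    return "".join(read)
```

For example, three cycles whose brightest channels are A, G and T yield the read `AGT`. Real base callers also correct for spectral crosstalk between dyes and for clusters drifting out of phase, which this sketch ignores.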
16 Illumina Next Gen Sequencing Method
Structure of reversible terminator
19 Illumina Genome Analyzer IIx
The Genome Analyzer IIx has a single 8-sample flow cell = 8 samples per run
up to 2 x 150 bp read lengths
up to 640 million paired-end reads per flow cell, enabling a broad range of high-throughput sequencing applications:
  RNA-seq: expression, isoform, splice variants
  small RNA-seq: microRNA, ncRNA
  Whole genome-seq: SNP variants, CNV
  Exome-seq: expression, variants
  ChIP-seq: transcription factor regulatory sites
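The figures on this slide can be turned into a rough yield estimate. The sketch below takes the numbers at face value and makes two assumptions that the slide does not state: that the 640 million figure counts individual reads of 150 bp each, and that the human genome is about 3.1 Gb. The function names are illustrative only.

```python
def yield_gigabases(reads: float, read_length_bp: int) -> float:
    """Total sequenced bases, in gigabases, for `reads` reads of equal length."""
    return reads * read_length_bp / 1e9


def fold_coverage(yield_gb: float, genome_size_gb: float) -> float:
    """Average depth of coverage if all bases map uniformly to the genome."""
    return yield_gb / genome_size_gb


# Slide figures: up to 640 million reads, up to 150 bp each, 8 samples per flow cell.
run_yield = yield_gigabases(640e6, 150)      # gigabases per flow cell
per_sample = run_yield / 8                   # gigabases per sample if split evenly
human_depth = fold_coverage(run_yield, 3.1)  # 3.1 Gb genome size is an assumption
```

Under these assumptions a full flow cell is on the order of 96 Gb, or roughly 30-fold coverage of one human genome. These are upper bounds, since the slide quotes "up to" values.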
39 Other 2nd Generation Sequencing Platforms
ABI SOLiD – sequencing by ligation; two-base calling; short read.
Roche 454 – emulsion PCR; pyrosequencing; long read.
Life Technologies Ion Torrent PGM – Personal Genome Machine. Ion Torrent uses the simplest sequencing chemistry, including natural nucleotides, no enzymatic cascade, no fluorescence, no chemiluminescence, no optics, no light: The Chip is the Machine™.
Helicos – single molecule sequencing.
42 Ion Torrent
Ion Torrent uses a high-density array of micro-machined wells to perform this biochemical process in a massively parallel way.
Each well holds a different DNA template generated by emulsion PCR. Beneath the wells is an ion-sensitive layer, and beneath that a proprietary ion sensor.
The Ion Personal Genome Machine (PGM™) sequencer then sequentially floods the chip with one nucleotide after another.
If the next nucleotide that floods the chip is not a match, no voltage change will be recorded and no base will be called.
43 Ion Torrent Semiconductor-Based Technology
When a nucleotide is incorporated into a strand of DNA by a polymerase, a hydrogen ion is released as a byproduct.
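The flow-by-flow logic described on the two Ion Torrent slides can be sketched for a single well. This is an idealized illustration, not the vendor's signal processing: the flow order `TACG` is an assumption, and the sketch assumes a noise-free signal equal to the number of nucleotides incorporated in a flow (one hydrogen ion per incorporation), so a flow that does not match gives a signal of zero.

```python
from typing import List, Tuple


def simulate_flows(incorporated: str, flow_order: str = "TACG",
                   max_flows: int = 1000) -> List[Tuple[str, int]]:
    """Return (nucleotide, signal) for each flow over a single well.

    `incorporated` is the sequence the polymerase adds, in order.
    For each flowed nucleotide, the signal counts how many consecutive
    positions it extends; a non-matching flow yields a signal of 0.
    """
    if any(base not in flow_order for base in incorporated):
        raise ValueError("sequence contains a base that is never flowed")
    flows: List[Tuple[str, int]] = []
    pos = 0
    flow_index = 0
    while pos < len(incorporated) and flow_index < max_flows:
        nucleotide = flow_order[flow_index % len(flow_order)]
        signal = 0
        while pos < len(incorporated) and incorporated[pos] == nucleotide:
            signal += 1
            pos += 1
        flows.append((nucleotide, signal))
        flow_index += 1
    return flows


def call_from_flows(flows: List[Tuple[str, int]]) -> str:
    """Rebuild the sequence: each flow contributes `signal` copies of its base."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)
```

For the sequence `TTAG`, the four flows T, A, C, G give signals 2, 1, 0, 1: the C flow is not a match, so no base is called for it, and the remaining flows reconstruct `TTAG`.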
44 3rd Generation Sequencing Platforms
Pacific Biosciences PacBio RS – Single molecule, real-time analysis. SMRT DNA sequencing is performed on SMRT Cells, each patterned with 150,000 zero-mode waveguides (ZMWs). Each ZMW contains a single DNA polymerase, providing the window to observe DNA sequencing in real time. The PacBio RS system continuously monitors ZMWs in sets of 75,000 at a time.
Oxford Nanopore – Nanopores offer a label-free, electrical, single-molecule DNA sequencing method that detects a direct electrical signal, obviating the need for amplification or labeling. While the evolution of other technologies relies on improvements in existing chemical, optical or bioinformatics procedures, nanopores will bypass these to deliver a genuinely revolutionary sequencing method.
45 Nanopore sequencing technology development
Exonuclease sequencing: Using a processive enzyme to cleave individual nucleotides from a DNA strand and pass them through a protein nanopore. Oxford Nanopore has a commercialisation agreement with Illumina for this method.
Strand sequencing: Identifying individual nucleotides on a DNA strand as it passes intact through a protein nanopore.
Solid state sequencing: Using synthetic materials, rather than protein pores, to create nanopores.
46 Structure of the Protein Nanopore
Exonuclease sequencing: Combining a protein nanopore and processive enzyme for the sequential identification of DNA bases as they pass through the pore.
Alpha haemolysin nanopore showing cyclodextrin adapter molecule (the DNA binding site).
In the exonuclease sequencing method, a protein nanopore is coupled with a processive enzyme, an exonuclease. The enzyme cleaves individual DNA bases from a DNA strand. These bases enter the nanopore and undergo a binding event before passing through the pore. During this binding event, they cause a characteristic disruption in current that can be used to identify the DNA bases in sequence. This method is combined with Oxford Nanopore's proprietary electronics system for scaled-up nanopore sensing.
47 Covalent attachment of a cyclodextrin molecule to the inside surface of the nanopore.
Acts as a binding site for individual DNA bases and allows accurate measurement of their passage through the nanopore binding site.
Allows the identification of individual nucleoside 5’-monophosphate molecules (DNA bases) to a standard commensurate with a high resolution DNA sequencing technology.
Electronic trace showing DNA bases in solution entering the nanopore.
The bases individually, transiently bind to the cyclodextrin adapter.
Each time a base passes through the pore there is a disruption in the current.
Diagram shows four different magnitudes of disruption which can be classified as C, G, A or T.
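The classification of four disruption magnitudes into C, G, A or T can be sketched as a nearest-level lookup. The reference levels below are invented purely for illustration (the slides give no numeric values or units), and a real system would have to handle noise and overlapping distributions rather than single points.

```python
from typing import Dict, Iterable

# Hypothetical reference magnitudes for the current disruption caused by
# each base. These numbers are placeholders, not measured values.
REFERENCE_LEVELS: Dict[str, float] = {"C": 10.0, "G": 20.0, "A": 30.0, "T": 40.0}


def classify_event(disruption: float,
                   levels: Dict[str, float] = REFERENCE_LEVELS) -> str:
    """Return the base whose reference level is closest to the observed disruption."""
    return min(levels, key=lambda base: abs(levels[base] - disruption))


def call_sequence(disruptions: Iterable[float],
                  levels: Dict[str, float] = REFERENCE_LEVELS) -> str:
    """Classify each binding event in order and join the calls into a sequence."""
    return "".join(classify_event(d, levels) for d in disruptions)
```

With these placeholder levels, the events 11.0, 38.5, 29.0 and 21.2 are called as `CTAG`. Because the exonuclease releases bases one at a time, the order of events is taken to be the order of bases in the strand.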
49 Examples of Computational Solutions
Crossbow – a cloud-computing software tool that combines the aligner Bowtie and the SNP caller SOAPsnp. Executing in parallel using Hadoop, Crossbow analyzes data comprising 38-fold coverage of the human genome in three hours using a 320-CPU cluster rented from a cloud computing service for about $85. Crossbow is available from webcite.
DNAnexus – a cost-effective storage and computational toolset; web based, running on the cloud.
Galaxy – a suite of genomic tools developed and hosted by Penn State; web based, running on Penn State servers with NHGRI support.
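The Crossbow figures quoted above imply a rough price per CPU-hour. The sketch below only restates that arithmetic; it assumes all 320 CPUs are billed for the full three hours, and since the slide says "about $85", the result is approximate. The function names are illustrative.

```python
def cpu_hours(cpus: int, wall_clock_hours: float) -> float:
    """Total CPU-hours consumed when every CPU runs for the whole job."""
    return cpus * wall_clock_hours


def cost_per_cpu_hour(total_cost_usd: float, cpus: int, wall_clock_hours: float) -> float:
    """Average price paid per CPU-hour for the whole run."""
    return total_cost_usd / cpu_hours(cpus, wall_clock_hours)


# Slide figures: 320 CPUs, three hours, about $85 for one 38-fold human genome.
total_cpu_hours = cpu_hours(320, 3)
unit_cost = cost_per_cpu_hour(85.0, 320, 3)
```

Under these assumptions the run uses 960 CPU-hours at a little under nine cents per CPU-hour, which gives a feel for why rented clusters are attractive for occasional whole-genome analyses.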
50 Development of Computational Biology Programs
The data analysis challenge is huge!
We have made significant progress and investment in the tools for genomic studies.
Progress is being made in building infrastructure for collection and integration of clinical data.
Focus is needed on developing a program in computational biology to attract faculty and research analysts to assist clinicians and scientists with high-throughput genomic data analysis.