“First generation” sequencing technologies and genome assembly


“First generation” sequencing technologies and genome assembly
Roger Bumgarner, Associate Professor, Microbiology, UW
Rogerb@u.washington.edu

Overview
- How to sequence any DNA
- How to sequence a lot of DNA
- What have we learned from 20 years of the genome project?
- What’s next?

Intended outcomes
An understanding of:
- The process of DNA sequencing
- The types and rates of errors in DNA sequence data
- The history of genome sequencing
- The outcomes of the genome project and the post-genome challenges
- Some of the related ethical issues (an introduction)

Automated DNA Sequencing

Goal: to read the sequence of the base pairs in a region of DNA

DNA Structure

DNA Sequencing: Process Overview
1. Generation of a nested set of fragments
2. Separation of the fragments
3. Detection
4. Analysis or base calling

Maxam-Gilbert Sequencing

DNA Replication (figure labels: helicase, single-stranded DNA binding proteins, primosome, primase, replicating DNA polymerase III active sites, ligase, RNA primer, DNA polymerase I)

The 3’ hydroxyl group is the point of attachment of the next nucleotide. What happens if the 3’ OH is not there? Without it, the polymerase cannot add another nucleotide and the chain terminates.

Sanger Sequencing

An “AutoRad” (autoradiograph) of a Sequencing Gel (lanes: A, C, G, T)

With 4 colors, all reactions can be run in one lane
- Label each reaction (A, C, G, T) with a different color
- Mix all reactions prior to loading

The Principle of 4-color Fluorescent DNA Sequencing

The Perkin Elmer/ABI 373 Fluorescence Based DNA Sequencer

A Sequencing Gel Image

Automated DNA Sequencing (figure: the nested fragments A, AC, ACG, ACGT, ACGTT of the sequence ACGTT…, migrating from the “-” end of the gel toward the “+” end)
The technology for four-color automated DNA sequencing was developed in Dr. Leroy Hood’s lab at the California Institute of Technology in the early to mid 1980s. The DNA to be sequenced is copied in such a way as to generate fragments of all lengths starting at one end. Each fragment is labeled with a fluorescent “tag” whose color depends on the base at which the fragment ends, e.g. green for fragments that end in A, blue for fragments that end in C. The fragments are generated as a mixture and are then separated by size in a process called gel electrophoresis. The mixture is placed on top of a gel, which acts like a sieve. A voltage is applied to drive the mixture through the gel (DNA is negatively charged and travels towards the “plus” end of the gel). Small fragments travel through the gel more quickly than large fragments. A scanner near the bottom of the gel reads the fluorescence and stores the information in a computer. Software then processes this information to yield the sequence of bases in the original DNA. Automated sequencing helped build the company Applied BioSystems Inc. (ABI) into a large and powerful supplier of technology to the biotech industry. ABI was acquired by Perkin Elmer in 199_.
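The idea of reading a sequence from a size-sorted set of end-labeled fragments can be sketched in a few lines of Python. This is only a toy model of the principle described on this slide, not the sequencer's software; the function names `nested_fragments` and `read_by_size` are made up for illustration, and the end base itself stands in for the dye color.

```python
import random


def nested_fragments(template):
    """Return every fragment that starts at one end, tagged with its final base.

    The tag plays the role of the dye color (one color per end base).
    """
    return [(template[:n], template[n - 1]) for n in range(1, len(template) + 1)]


def read_by_size(fragments):
    """Sort fragments shortest first (as the gel does) and read off the tags."""
    ordered = sorted(fragments, key=lambda frag: len(frag[0]))
    return "".join(tag for _, tag in ordered)


fragments = nested_fragments("ACGTT")
random.shuffle(fragments)  # the fragments are generated as a mixture
print(read_by_size(fragments))
```

Because each fragment length occurs exactly once, sorting by length and reading the end labels in order recovers the original sequence regardless of how the mixture was shuffled.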

Gel Analysis Process
1. Lane finding: look for local correlations in the vertical dimension
2. Lane extraction: sum up pixels across the lanes, straighten if necessary
3. Transform from wavelength domain to concentration domain
4. Apply mobility and spacing correction
5. Filter noise from data: low and high pass filters
6. Find and identify peaks: numerical derivative
7. Output called data
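As a rough illustration of the peak-finding step, the sketch below locates peaks where the numerical derivative of a trace changes from positive to non-positive, then merges the peaks of the four base channels in order of position. The traces are invented toy data, the `min_height` threshold is an assumption, and real base callers do considerably more (mobility correction, spacing models, quality scores).

```python
def find_peaks(trace, min_height=0.0):
    """Return indices where the trace stops rising and starts falling."""
    peaks = []
    for i in range(1, len(trace) - 1):
        rising = trace[i] - trace[i - 1] > 0       # derivative before the point
        falling = trace[i + 1] - trace[i] <= 0     # derivative after the point
        if rising and falling and trace[i] >= min_height:
            peaks.append(i)
    return peaks


def call_bases(channels, min_height=0.0):
    """Merge the peaks of all channels by position and read off the bases."""
    calls = []
    for base, trace in channels.items():
        calls.extend((pos, base) for pos in find_peaks(trace, min_height))
    return "".join(base for _, base in sorted(calls))


# Toy concentration-domain traces, one per base (hypothetical values).
channels = {
    "A": [0, 1, 5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "C": [0, 0, 0, 0, 1, 5, 1, 0, 0, 0, 0, 0, 0],
    "G": [0, 0, 0, 0, 0, 0, 0, 1, 5, 1, 0, 0, 0],
    "T": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 5, 1],
}
print(call_bases(channels, min_height=2.0))
```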

Raw Sequencing Data

Idealized Dye Spectra

Actual Dye Spectra

“Chromaticity” Transformation
- Measured: signal in four filters (channels)
- Want: signal in four concentrations ([dye] = [fragments])
- 4 equations, 4 unknowns:
\[
\begin{aligned}
I_1 &= a_1[A] + c_1[C] + g_1[G] + t_1[T] \\
I_2 &= a_2[A] + c_2[C] + g_2[G] + t_2[T] \\
I_3 &= a_3[A] + c_3[C] + g_3[G] + t_3[T] \\
I_4 &= a_4[A] + c_4[C] + g_4[G] + t_4[T]
\end{aligned}
\]
- Matrix formulation:
\[
\mathbf{I} = X\,\mathbf{Conc}, \qquad \mathbf{Conc} = X^{-1}\,\mathbf{I}
\]
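The four channel equations on this slide form a small linear system, so the dye concentrations can be recovered by solving it. The sketch below uses plain Gauss-Jordan elimination to stay dependency-free; the mixing matrix `X` holds invented coefficients (real ones come from the measured dye spectra), and `solve_linear` is an illustrative helper rather than any instrument's actual routine.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("mixing matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Hypothetical mixing matrix: row k holds (a_k, c_k, g_k, t_k) for filter k.
X = [
    [1.0, 0.3, 0.0, 0.0],
    [0.2, 1.0, 0.3, 0.0],
    [0.0, 0.2, 1.0, 0.3],
    [0.0, 0.0, 0.2, 1.0],
]

# Simulate the four filter signals for known concentrations of A, C, G, T ...
true_conc = [0.0, 2.0, 0.5, 0.0]
signal = [sum(x * c for x, c in zip(row, true_conc)) for row in X]

# ... then recover the concentrations from the signals alone.
recovered = solve_linear(X, signal)
print([round(c, 6) for c in recovered])
```

Solving the system directly avoids forming the inverse explicitly; for a fixed matrix applied to every scan line, precomputing the inverse once would be the more natural choice.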

Processed Electropherogram

Higher Voltages Produce Faster Rates of Electrophoresis
- Speed is proportional to voltage (V).
- Current (I) depends on the resistance (R) of the gel: I = V/R.
- Power dissipated as heat, in watts, is P = V*I.
- Thinner gels give higher R, so they draw less current and generate less heat at a given voltage. Hence, thin or otherwise small gels must be used for higher voltages.
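The heating argument follows from combining the two relations on this slide, Ohm's law and the expression for electrical power. This short derivation is added for clarity and is not part of the original slides:

\[
P = V I = V \cdot \frac{V}{R} = \frac{V^{2}}{R}
\]

At a fixed voltage the dissipated power is inversely proportional to the resistance, so doubling the resistance of the gel halves the heat it generates.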

8 Capillary Array

Beckman CEQ 8000 DNA Sequencer
- 8-capillary array
- Linear polyacrylamide separation matrix
- 4-color terminator sequencing chemistry
- Windows NT based operating system

Beckman CEQ 8000

Different Labeling Chemistries Can Be Used
- Dye primer: dye is attached to the 5’ end of the sequencing primer.
- Dye terminator: dye is attached to the ddNTP, which allows all 4 reactions to be run in the same tube.
- Internal labeling: dye is attached to a dNTP, so signal per molecule increases with length.

Large Scale Sequencing

The (Human) Genome Project. The ultimate goal of the Human Genome Project is to decode, letter by letter, the exact sequence of all of the roughly 3 billion nucleotide bases that make up the human genome. A single misplaced letter can be enough to cause disease.
GCTTACTGAGTACATGTGCTAATCGT … (roughly 3 billion letters in total)

The (Human) Genome Project. Begun in 1990 with a 15-year budget of $3.0B overall.
Goals:
- To obtain the sequences of human and model organisms: E. coli, Drosophila (fruit fly), C. elegans (a worm), yeast, mouse
- To develop the technologies necessary to obtain the above

Sizes and status of a sampling of Genomes

Overview of the goal

How do we begin to analyze a genome? We want the DNA sequence of the entire genome (roughly 3 Gbp for human, 4 Mbp for a bacterium). A single sequencing reaction reads only about 750 base pairs per sample, so we need a method to sequence bigger pieces.
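A quick calculation shows the scale of the problem. The sketch below computes only a lower bound: the number of 750 bp reads needed if they could be laid end to end with no overlap and no errors. Real projects need several times more, since reads must overlap to be assembled; `min_reads` is an illustrative helper name.

```python
def min_reads(genome_bp, read_bp=750):
    """Smallest number of reads that could cover the genome end to end."""
    return -(-genome_bp // read_bp)  # integer ceiling division


# The 4 Mbp bacterial genome from the slide.
print(min_reads(4_000_000))
```

For the bacterial genome this already comes to more than five thousand reads, and the count scales linearly with genome size.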

Primer Walking
1. Start with a primer in the vector, next to the clone to sequence
2. Sequence
3. Design a new primer from the end of the sequence just obtained
4. Repeat
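The walking loop can be sketched as follows. This is a toy simulation under strong assumptions: reads are error-free, each read has a fixed length, and every new primer must be unique within the clone (the sketch raises an error otherwise). The clone sequence, `read_len`, and `primer_len` are illustrative values only.

```python
def primer_walk(clone, read_len=10, primer_len=5):
    """Reconstruct a clone by repeated read / design-primer / read cycles.

    Returns the assembled sequence and the number of sequencing rounds used.
    """
    known = clone[:read_len]  # first read, primed from the flanking vector
    rounds = 1
    while True:
        primer = known[-primer_len:]        # new primer from the end of what we know
        site = clone.find(primer)           # where that primer would anneal
        if site != len(known) - primer_len:
            raise ValueError("primer is not unique in the clone")
        start = site + primer_len           # extension begins after the primer
        new_read = clone[start:start + read_len]
        if not new_read:                    # nothing left to read: end of clone
            break
        known += new_read
        rounds += 1
    return known, rounds


clone = "GTCTACCTGTACTGATCTAGCATTACG"  # toy sequence
assembled, rounds = primer_walk(clone)
print(assembled == clone, rounds)
```

Each round depends on the result of the previous one, which is why primer walking is inherently serial and slow for large clones.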

“Shotgun” sequencing
Clone to sequence: ….GTCTACCTGTACTGATCTAGC...
Copy, sub-clone, then sequence and “assemble”:
….GTCTACCTGTACTGATCTAGC...
….     CCTGTACTGATCTAGCATTA...
….        GTACTGATCTAGCATTACG...
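The “assemble” step can be illustrated with the three reads shown on this slide. The sketch below is a minimal greedy overlap merger, not a production assembler: it assumes error-free reads, all from the same strand, with no read contained inside another and no repeats. The function names and the `min_len` threshold are choices made for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:  # no remaining overlaps: leave separate contigs
            break
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


reads = [
    "GTCTACCTGTACTGATCTAGC",
    "CCTGTACTGATCTAGCATTA",
    "GTACTGATCTAGCATTACG",
]
print(greedy_assemble(reads))
```

The greedy choice works on this small example because the overlaps are long and unambiguous; repeated sequence is exactly what breaks this strategy on real genomes.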

Shotgun vs. walking

Methods for very large scale sequencing
- A hierarchical approach: map on a large scale (physical mapping), then sequence specific clones whose position in the genome is known.
- Shotgun sequencing: “tear up” the genome and sequence random fragments until it is done.
- Sequence tagged connectors (STC): sequence the ends of many clones and use this information to pick overlapping clones.

Making a genomic “library”: cells, isolate DNA, fragment DNA, clone. The resulting collection of clones is the “library”.

Library Types
- Chromosome-specific libraries: chromosomes can be sorted from one another based on size and GC content.
- Genomic libraries: made from the entire genome.
- Large insert / small insert: determined by the combination of vector choice (YAC, BAC, plasmid, M13), fragmentation method (enzymatic, shearing, sonication), and size selection (by gel or other method).

Another view of a library
- Multiple copies of the genome (stretched out)
- Randomly fragment and clone
- Can we order these fragments relative to one another?

Restriction Enzymes - 1970 Copyright 1998 Access Excellence www.gene.com

Physical Mapping: digest and look for common features in clones (figure labels: A, B)
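One way to picture “digest and look for common features” is restriction fingerprinting: cut each clone with an enzyme, record the fragment lengths, and treat shared lengths as evidence of overlap. The sketch below is a simplified illustration with an invented toy genome. It uses the EcoRI recognition sequence GAATTC but, as a simplification, places the cut at the start of the site rather than within it, and the helper names are made up.

```python
from collections import Counter


def digest(seq, site="GAATTC"):
    """Sorted fragment lengths after cutting at each occurrence of the site.

    Simplification: the cut is placed at the first base of the site.
    """
    cuts, pos = [], seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def shared_fragments(clone_a, clone_b, site="GAATTC"):
    """Fragment lengths present in the digests of both clones."""
    common = Counter(digest(clone_a, site)) & Counter(digest(clone_b, site))
    return sorted(common.elements())


S = "GAATTC"
# Toy genome with four restriction sites separated by spacers of distinct length.
genome = "A" * 5 + S + "C" * 7 + S + "T" * 9 + S + "G" * 4 + S + "A" * 11
clone_a = genome[0:45]
clone_b = genome[15:60]

print(digest(clone_a), digest(clone_b), shared_fragments(clone_a, clone_b))
```

Only fragments bounded by two restriction sites are reproducible between clones; the end fragments depend on where each clone happens to start and stop, and in small examples lengths can also match by chance.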

Repeat many times to construct a physical map. Pick a “minimal tiling path”. Sequence these mapped clones (typically by the shotgun method).
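Once clones have map positions, choosing a minimal tiling path resembles the classic interval-covering problem. The sketch below uses the standard greedy rule (among clones that start within the region already covered, take the one reaching furthest). The clone names and coordinates are invented, and the sketch accepts clones that merely abut, whereas a real project would require a minimum overlap between neighbors.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Pick few clones (name, start, end) that together cover the region."""
    path, covered = [], region_start
    while covered < region_end:
        # Clones that begin inside the covered part and extend beyond it.
        candidates = [c for c in clones if c[1] <= covered and c[2] > covered]
        if not candidates:
            raise ValueError(f"gap in the map at position {covered}")
        best = max(candidates, key=lambda c: c[2])  # reaches furthest right
        path.append(best[0])
        covered = best[2]
    return path


# Hypothetical mapped clones: (name, start, end) in kbp.
clones = [
    ("A", 0, 40),
    ("B", 10, 60),
    ("C", 30, 90),
    ("D", 50, 100),
    ("E", 85, 150),
]
print(minimal_tiling_path(clones, 0, 150))
```

Here clones B and D are redundant: everything they cover is also covered by A, C, and E, so they need not be sequenced.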

Path that was used for genome sequencing
- YACs: map (Mbp)
- BACs or cosmids: map (200 kbp)
- M13, plasmid: sequence (kbp)

“Shotgun” the genome
Genome to sequence: sub-clone, then sequence and “assemble”:
….GTCTACCTGTACTGATCTAGC...
….     CCTGTACTGATCTAGCATTA...
….        GTACTGATCTAGCATTACG...

Sequence tagged connectors (STC)
1. Sub-clone the genome to sequence
2. Sequence the ends of the clones and store them in a database
3. Sequence a clone, then look for overlaps in the database
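The lookup at the heart of the STC approach can be sketched with a dictionary of clone end sequences. This is a toy model under several assumptions: exact matches only, very short end tags (`tag_len=5`, far shorter than real end reads), and invented clone names and sequences.

```python
def build_stc_db(clones, tag_len=5):
    """Map each clone's left and right end tag to (clone name, which end)."""
    db = {}
    for name, seq in clones.items():
        db.setdefault(seq[:tag_len], []).append((name, "left"))
        db.setdefault(seq[-tag_len:], []).append((name, "right"))
    return db


def find_overlapping(seed_name, seed_seq, db):
    """List (clone, end, position) for every other clone whose end tag
    occurs inside the fully sequenced seed clone."""
    hits = []
    for tag, owners in db.items():
        pos = seed_seq.find(tag)
        if pos != -1:
            hits.extend((name, end, pos) for name, end in owners if name != seed_name)
    return sorted(hits)


genome = "GTCTACCTGTACTGATCTAGCATTACG"  # toy genome
clones = {
    "c1": genome[0:14],
    "c2": genome[9:22],
    "c3": genome[15:27],
}
db = build_stc_db(clones)

# Fully sequence c2, then ask which clones connect to it.
print(find_overlapping("c2", clones["c2"], db))
```

A hit on a clone's right end suggests that clone extends to the left of the seed, and a hit on a left end suggests it extends to the right, which is how the next clone to sequence would be chosen.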

Which method?
- Whole-genome shotgun: very successful for bacteria; Celera’s approach to the human genome, but what about repeats?
- Physical mapping: the traditional method.
- STC: a hybrid method; not as difficult as physical mapping, and can resolve some issues with repeats.

Issues with genome sequencing
- Whose genome?
- Quality
- Contiguity
- Publication
- Patenting

What are the fruits of genome sequencing?
- A nearly complete list of genes
- A reference against which to compare other sequences
  - Identify polymorphisms in the population
- Comparative genomics
  - Identify highly conserved regions
  - Evolutionary inferences
- A tremendously enabling reagent resource
  - PCR primers
  - Microarrays for expression
  - SNPs for genetic mapping