Automatic DNA and Genome Sequencing

Automatic DNA and Genome Sequencing Yuki Juan 2003.5.5

Genetic Mapping

Automated DNA Sequencing: Principle of Sanger Sequencing; High-Throughput Sequencing; Reading Sequence Traces; Contig Assembly; Emerging Sequencing Methods

http://www.mun.ca/biology/scarr/4241chaptertwo/Biology4241chaptertwo/Chapter2GenomeSequencingandAnnotation.htm#sanger

The First Cycle in PCR

The Second Cycle in PCR

The Third Cycle in PCR

The Principle of Dideoxy (Sanger) Sequencing: The basic chain-termination method was developed by Frederick Sanger and colleagues and published in 1977.

Strategy of the Chain-termination Method for Sequencing DNA

Strategy of the Chain-termination Method for Sequencing DNA

Fluorescence Detection

High-Throughput Sequencing: The new techniques and equipment include four-color fluorescent dyes, which have replaced the radioactive label; automated trace reading; improvements in the chemistry of template purification and the sequencing reaction; and capillary electrophoresis.

Automated Sequencing Method

Automated Sequencing Method

ABI PRISM® 3700 DNA Sequencer

ABI PRISM® 3700 DNA Sequencer: Price: $65,50. A fully automated, multi-capillary electrophoresis instrument designed to automatically analyze multiple runs of 96 samples.

MegaBACE 1000 DNA Sequencer

MegaBACE 1000 DNA Sequencer: An automated machine capable of high-throughput DNA analysis, processing 96 samples in just a few hours. Applications: DNA sequencing, genotyping, and fragment analysis. It has 6 arrays of 16 capillaries, each with an interior diameter of about 100 µm. The system uses high-pressure nitrogen gas to inject the capillaries with linear polyacrylamide, a denaturing gel.

Reading Sequence Traces: Base-calling is done with automated software, such as the Phred program developed at the University of Washington.

Phred Program: Uses algorithms to convert trace files into base sequences and to assign a quality value to each base call in the sequence.
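The quality values that Phred assigns follow the standard definition Q = -10 log10(P), where P is the estimated probability that the base call is wrong. The sketch below only illustrates that relation; the function names are made up for this example and are not part of the Phred program itself.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality value Q into an estimated error probability P."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an estimated error probability P into a Phred quality value Q."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Print the error probability implied by a few common quality values.
    for q in (10, 20, 30, 40):
        print(q, phred_to_error_prob(q))
```

Under this definition each step of 10 in Q corresponds to a tenfold drop in the estimated error rate, so Q20 means roughly one error in 100 calls and Q30 roughly one in 1,000.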

The Phred Base-calling Algorithm

Automated Sequence Chromatograms SNP: Single nucleotide polymorphism

Phred Quality Value Distributions: Dark blue: bases 100-400 in each sequence. Light blue: all bases. The predicted error rate increases for longer fragments.

Contig Assembly: A contig is a contiguous (touching; adjoining) stretch of cloned DNA. Contig assembly is the finishing step in sequencing a multi-clone stretch of DNA, and involves alignment, editing, and error correction. Sequence editing software (from the University of Washington): the phrap assembler and the consed graphic editor.

Phrap assembler http://www. mrc-lmb. cam. ac

An aligned reads window in Consed

Alignment algorithms The Needleman-Wunsch method (1970) was the first computationally feasible algorithm for sequence alignment.

Alignments based on these algorithms may vary due to differences in the weighting of their default parameters: the weighting of the effects of indels relative to single-base mismatches; the weighting attached to quality scores of bases from contributing sequences; and the weighting attached to the frequency of mismatches.
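To make the role of these weightings concrete, here is a minimal sketch of Needleman-Wunsch global alignment with a linear gap penalty. The default scores (`match=1`, `mismatch=-1`, `gap=-2`) are illustrative assumptions, not the defaults of phrap or any other tool, and quality scores are not modeled.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, or left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "ACT"))
```

Rerunning the same pair of sequences with a different `gap` or `mismatch` value can change which alignment scores best, which is the effect described above.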

Emerging Sequencing Methods: Sequencing by Hybridization (SBH); Mass Spectrometric Sequencing; Direct Visualization of Single DNA Molecules by Atomic Force Microscopy (AFM); Single Molecule Sequencing Techniques; Single Nucleotide Cutting

Sequencing by Hybridization (SBH): Uses the complementarity of the two strands of a DNA molecule to determine whether a match to an oligonucleotide is present in the DNA. Feasible for short sequences.
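The hybridization test can be sketched in a few lines, under the simplifying assumption that a probe binds only where its exact reverse complement occurs in the target strand. Real hybridization tolerates some mismatches and depends on reaction conditions, and the function names here are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_hybridizes(target_strand, probe):
    """True if the probe is exactly complementary to some stretch of the target."""
    return reverse_complement(probe) in target_strand


def hybridizing_probes(target_strand, probes):
    """Return the probes from an oligonucleotide set that match the target."""
    return [p for p in probes if probe_hybridizes(target_strand, p)]


if __name__ == "__main__":
    print(hybridizing_probes("TTATGCAA", ["GCAT", "AAAA", "TTGC"]))
```

The set of probes that hybridize gives the collection of short words present in the target, from which a short sequence can in principle be reconstructed.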

Mass Spectrometric Sequencing: Fragmented oligonucleotides can be identified by their time of flight through a vacuum chamber. Useful for fragmented DNA molecules under 50 bases long. It is likely possible to determine the full sequence of a molecule that has been divided into all possible oligonucleotides. The methods are fast, and should become cheap.

Direct Visualization of Single DNA Molecules by AFM: Can observe bumps in ssDNA, but cannot resolve individual bases. One possibility is to hybridize the molecule to oligonucleotides carrying bulky modified side groups.

Single Molecule Sequencing Technique: Extremely fast and relatively cheap. Can accommodate long DNA fragments. Example: nanopore sequencing.

Single-molecule Nanopore Sequencing

Nanopore Sequencing: A protein pore forms a channel in an electrically polarized membrane. A single DNA molecule is pulled through by electrophoresis. Nucleotides transiently block ion movement, resulting in a drop in current. If slowed to about 1 base per millisecond, the method could sequence 1 kb per second, three orders of magnitude faster than capillary sequencers.

Single Nucleotide Cutting: A long strand of DNA can be suspended in a vacuum by molecular tweezers. An exonuclease molecule cuts off single nucleotides, which are read by fluorescent signal or by imprinting on a grid.

Genome Sequencing: Hierarchical Sequencing; Shotgun Sequencing; Sequence Verification

Hierarchical versus Shotgun Sequencing

Hierarchical versus Shotgun Sequencing: Both processes involve fragmenting the genome and aligning the fragments by their overlapping sequences. Both aim for 5-10x redundancy in sequence representation. The main difference is that hierarchical sequencing first attempts to align large cloned fragments (~100 kb) into a tiling path. Shotgun sequencing omits this step: the entire genome is fragmented into small pieces, which are then aligned using computer algorithms. Hierarchical sequencing was the basis of the publicly funded Human Genome Project. Shotgun sequencing was the basis of the privately funded human genome effort.
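The 5-10x redundancy target translates directly into a read count through the usual coverage relation c = N × L / G (N reads of length L over a genome of size G). The sketch below assumes reads are sampled uniformly at random, in which case the expected fraction of bases left uncovered is about e^(-c). The 500 bp read length is an illustrative assumption, not a figure from the slides.

```python
import math


def reads_needed(genome_size_bp, read_length_bp, coverage):
    """Number of reads N such that N * L / G reaches the requested coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def expected_fraction_uncovered(coverage):
    """Expected fraction of bases hit by no read, assuming uniform random sampling."""
    return math.exp(-coverage)


if __name__ == "__main__":
    genome = 3_000_000_000  # roughly the size of the human genome in base pairs
    for c in (5, 10):
        print(c, reads_needed(genome, 500, c), expected_fraction_uncovered(c))
```

Real genomes deviate from this idealized model because of cloning bias and repeats, so the uncovered fraction is a lower bound on what an actual project leaves as gaps.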

Hierarchical Sequencing: Also known as top-down, map-based, or clone-by-clone sequencing. Steps involved: (1) Shear the DNA into manageable units (50-200 kb); this is accomplished by sonication, followed by amplification (PCR) and cloning into the vector of choice (usually BACs). (2) Create a DNA library, aiming for 5-10x redundancy. (3) Select a tiling path.

Cloning Vectors Used in Genome Sequencing

Hierarchical Assembly of a Sequence-contig Scaffold

The Tiling Path: Can be assembled using a combination of three methods: hybridization, fingerprinting, and end-sequencing.

Hybridization: Create probes for specific sequences. Robots are often used to replicate the plated clones that show probe hybridization. The genome can be probed for many different sequences, leading to islands of overlapping clones that will be joined later in the process. Chromosome walking: use the end sequence of a clone to create a probe for an adjacent clone.

Fingerprinting: Uses the restriction digest profile of each clone to determine sequence overlap. Done by complex computer algorithms.
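The basic idea behind fingerprint comparison can be sketched as counting restriction fragments of similar size shared by two clones. This is a deliberately simplified stand-in for the "complex computer algorithms" mentioned above: the 2% size tolerance, the `min_shared` threshold, and the function names are all assumptions made for illustration, and real fingerprinting software scores overlaps statistically.

```python
def shared_fragments(sizes_a, sizes_b, tolerance=0.02):
    """Count fragments of clone A that pair with a distinct, similar-sized fragment of clone B."""
    remaining = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        for k, other in enumerate(remaining):
            # Sizes match if they differ by at most the relative tolerance.
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[k]  # each fragment of B may be used only once
                break
    return shared


def likely_overlap(sizes_a, sizes_b, min_shared=3, tolerance=0.02):
    """Crude call: clones overlap if they share at least min_shared fragments."""
    return shared_fragments(sizes_a, sizes_b, tolerance) >= min_shared


if __name__ == "__main__":
    clone_a = [1200, 3400, 5100, 800, 2600]   # fragment sizes in bp (made-up data)
    clone_b = [3410, 5090, 810, 9900, 450]
    print(shared_fragments(clone_a, clone_b), likely_overlap(clone_a, clone_b))
```

Clones that share many fragments are candidates for adjacent positions in the tiling path, while clones sharing few or none are treated as non-overlapping.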

Alignment of BAC Clones by Hybridization and Fingerprinting

End-sequencing: Sequence the ends of BAC clones. Create a probe for each end sequence, and hope that it hybridizes near the middle of another clone.

Assembly of the Draft Genome: 3 steps. Filtering: removal of contaminating fragments, which may be bacterial in origin or clones that show evidence of recombination. Assembling the layout: generating and ordering each BAC contig; the position of each contig can be confirmed by alignment with previously characterized Sequence Tagged Sites (STS). Merging: aligning BAC contigs that are known to be adjacent to each other.

Shotgun Sequencing Computer algorithms are used to assemble contigs from thousands of overlapping sequences

Tasks Performed by Computational Algorithms: Screener; Overlapper; Unitigger; Scaffolder

Screener: Masks (marks and hides) sequences that contain repetitive DNA, e.g. microsatellites, Alu repeats, and ribosomal DNA. These sequences are not taken into consideration when determining overlap.
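Masking can be illustrated with a small sketch that replaces every exact occurrence of a known repeat with `N` so that later overlap detection ignores it. The repeat library and function name are hypothetical, and exact string matching is a simplification: real screeners match repeats approximately.

```python
def mask_repeats(read, repeat_library, mask_char="N"):
    """Return the read with every exact occurrence of a library repeat masked."""
    masked = list(read)
    for repeat in repeat_library:
        start = read.find(repeat)
        while start != -1:
            # Overwrite the matched stretch with the mask character.
            for k in range(start, start + len(repeat)):
                masked[k] = mask_char
            # Continue one position later so overlapping occurrences are also found.
            start = read.find(repeat, start + 1)
    return "".join(masked)


if __name__ == "__main__":
    # A short CA-microsatellite stands in for a real repeat library entry.
    print(mask_repeats("ACGTCACACACAGGT", ["CACACACA"]))
```

Masking keeps the read length and coordinates unchanged, so the unmasked flanking sequence can still be used to find overlaps.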

Overlapper: Compares every unscreened read against every other unscreened read. This is essentially the same as performing a BLAST search. Searches for overlaps of a predetermined minimum length (40 bp for the Human Genome Project).
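The all-against-all comparison can be sketched as a search for suffix-prefix overlaps of at least a minimum length. This simplified version assumes exact matches on a single strand. An actual overlapper tolerates sequencing errors, considers reverse complements, and uses indexing rather than the quadratic loop shown here. The function names are hypothetical.

```python
def suffix_prefix_overlap(a, b, min_len=40):
    """Length of the longest suffix of a equal to a prefix of b (0 if shorter than min_len)."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def all_overlaps(reads, min_len=40):
    """Compare every read against every other read; return {(i, j): overlap_length}."""
    overlaps = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                n = suffix_prefix_overlap(a, b, min_len)
                if n:
                    overlaps[(i, j)] = n
    return overlaps


if __name__ == "__main__":
    # Toy reads with a small minimum overlap so the example stays readable.
    reads = ["ACGTTGCA", "TGCAGGAT", "GGATCCAA"]
    print(all_overlaps(reads, min_len=3))
```

The resulting pairs form a graph of reads joined by overlaps, which is the input the later unitig and scaffold steps work from.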

Blast Output

A Local Alignment

Unitigger: A unitig is a contig formed from a series of overlapping, unambiguously unique sequences.

U-unitigs and Repeat Resolution

Scaffolder: Uses mate-pair information to link U-unitigs into scaffold contigs. Most of the gaps remaining at this point are due to repeat elements, and can be resolved as follows: unitigs that were not classified as U-unitigs (often referred to as overcollapsed unitigs) are placed in the gaps. If a placement is supported by two or more mate-pairs, it is referred to as a ROCK. If a placement is supported by one mate-pair, it is referred to as a STONE. Small gaps can be filled in by chromosome walking.
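The ROCK and STONE rule reduces to a threshold on the number of supporting mate-pairs, as in this minimal sketch. The function name and the `UNPLACED` label for zero support are assumptions added for illustration and do not come from the slides.

```python
def classify_placement(supporting_mate_pairs):
    """Label a gap-filling unitig placement by its number of supporting mate-pairs."""
    if supporting_mate_pairs >= 2:
        return "ROCK"      # two or more mate-pairs support the placement
    if supporting_mate_pairs == 1:
        return "STONE"     # exactly one mate-pair supports the placement
    return "UNPLACED"      # hypothetical label: no mate-pair support


if __name__ == "__main__":
    for n in (0, 1, 2, 5):
        print(n, classify_placement(n))
```

More supporting mate-pairs mean a more trustworthy placement, which is why ROCKs are treated as more reliable than STONEs.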

Assembly of a Mapped Scaffold

Proportion of Fly and Human Genomes in Large Scaffolds

Sequence Verification: Completeness; Accuracy; Validity of assembly

Alignment of Two Draft Human Genome Assemblies