Genome Characterization: DNA sequence-ULTIMATE Map, DNA sequencing-methods, Assembly/sequencing. BIO520 Bioinformatics, Jim Lund. Assigned reading: Service 2006.

Genome Characterization
DNA sequence: the ULTIMATE map
DNA sequencing methods
Assembly/sequencing
BIO520 Bioinformatics, Jim Lund
Assigned reading: Service 2006 review paper
Assigned listening: Eric Lander genomics lecture

DNA Sequence Project Size/Type
500 bases: 1 EST, STS
2500 bases: whole cDNA/EST
10 kbp: gene, virus
150 kbp: BAC, big virus
3 Mbp: bacterial genome, YAC-size
–simple
–repeats
3 Gbp: human, mouse
31 Gbp: salamander

Metazoan genome sizes
Nematode (Caenorhabditis elegans): 100 Mb
Thale cress (Arabidopsis thaliana): 160 Mb
Fruit fly (Drosophila melanogaster): 180 Mb
Puffer fish (Takifugu rubripes): 400 Mb
Rice (Oryza sativa): 490 Mb
Human (Homo sapiens): 3.5 Gb
Leopard frog (Rana pipiens): 6.5 Gb
Onion (Allium cepa): 16.4 Gb
Mountain grasshopper (Podisma pedestris): 16.5 Gb
Tiger salamander (Ambystoma tigrinum): 31 Gb
Easter lily (Lilium longiflorum): 34 Gb
Marbled lungfish (Protopterus aethiopicus): 130 Gb

DNA Sequencing Methods
Chain termination/Dideoxy/Sanger (ABI)
–Fluorescence paradigm, ABI
–Main method
Next generation sequencing
–Polymerase addition sequencing
–454 Sequencing, Illumina
Affymetrix
–Chips: Affymetrix

Dideoxy / Chain Terminator / Sanger
Template
Primer
Extension Chemistry
–polymerase
–termination
–labeling
Separation
Detection

Chain Terminator Basics
Target, Template-Primer
Extend with labeled terminators: ddA, ddG, ddC, ddT
Terminated products: ddA, AddC, ACddG, ACGddT
TGCA
dN : ddN = 100 : 1
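The terminated products listed on this slide can be reproduced with a small sketch. It assumes, for illustration only, that the newly synthesized strand is ACGT and that a dideoxy base is chosen at each step with probability 1/101 (one simple reading of the 100 : 1 dN : ddN ratio). The function names are hypothetical.

```python
def terminated_fragments(new_strand):
    """Return every prefix of the synthesized strand.

    Each prefix stands for a product whose last base is a labeled
    dideoxy terminator (A -> ddA, AC -> AddC, ...).
    """
    return [new_strand[:i] for i in range(1, len(new_strand) + 1)]


def read_from_fragments(fragments):
    """Sort products by length, as electrophoresis does, and read the
    labeled terminal base of each one in order."""
    return "".join(frag[-1] for frag in sorted(fragments, key=len))


def termination_probability(position, dd_fraction=1 / 101):
    """Probability that extension stops exactly at `position` (1-based).

    Assumes dNTP and ddNTP are incorporated with equal efficiency, so
    each step terminates with probability dd_fraction.
    """
    return (1 - dd_fraction) ** (position - 1) * dd_fraction


fragments = terminated_fragments("ACGT")
sequence = read_from_fragments(fragments)
```

Under this simplified model the chance of stopping at a given base falls off geometrically with distance from the primer, which is one reason long products are rarer than short ones.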

Electrophoresis
Sequencing reaction products
Polyacrylamide Gel Electrophoresis (PAGE)

DNA sequencing trace file

Separation
Gel Electrophoresis
Capillary Electrophoresis
–suited to automation
–rapid (2 hrs vs 12 hrs)
–re-usable
–simple temperature control
–96 well format

Paradigm Instrument
Applied Biosystems
–ABI3730XL (2002, 96 samples, 1000 base reads, ~$350,000, higher sensitivity, lower reagent cost, ~$1/reaction)
–700 Kbp / 24 hours.
384 capillary sequencers
–5700 sequences / 24 hr day
–2.8 Mbp / 24 hours.
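The throughput figures on this slide lend themselves to quick arithmetic. The sketch below uses only the slide's numbers plus one hypothetical target (24 Mbp of raw sequence, chosen purely for illustration); the function names are not from the lecture.

```python
def implied_read_length(bases_per_day, reads_per_day):
    """Average bases per read implied by daily totals."""
    return bases_per_day / reads_per_day


def instrument_days(target_bases, bases_per_day):
    """Days one instrument needs to produce target_bases of raw sequence."""
    return target_bases / bases_per_day


# 384-capillary figures from the slide: 2.8 Mbp and 5700 sequences per day.
mean_read = implied_read_length(2_800_000, 5700)

# Hypothetical target of 24 Mbp on the 96-capillary rate of 700 Kbp per day.
days_needed = instrument_days(24_000_000, 700_000)
```

The implied average read comes out well under the 1000-base maximum, which suggests the daily totals count usable bases rather than full-length reads.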

384-well capillary sequencing
Results are shown as an electropherogram showing a peak for each base. From the peak heights and widths, a Phred score is assigned to each individual base. A high Phred score indicates a high certainty as to the identity of that particular base.
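To make "high Phred score means high certainty" concrete, here is a minimal sketch of the standard Phred definition, Q = -10 log10(P), where P is the estimated probability that the base call is wrong. It covers only the score-to-probability conversion, not how a base caller derives P from peak shapes; the function names are illustrative.

```python
import math


def phred_to_error_prob(q):
    """Error probability for a Phred quality score q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# Q20 corresponds to a 1-in-100 chance of a wrong call, Q30 to 1 in 1000.
q20_error = phred_to_error_prob(20)
q_for_one_in_thousand = error_prob_to_phred(0.001)
```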

Sample Output 1 lane

1 trace = 1000 bases or less
–ABI: 1000 bp reads
–Illumina: bp reads
–454 Sequencing: bp reads
How do we cover a genome?
–DIVIDE AND CONQUER: assemble these short sequence fragments.

Assembly/Trace Editing
Consed
–UNIX
EBI's Phusion
EditView (ABI PRISM)
–Mac
Chromas (free/pay versions)
–Windows

Sequencing Strategies
Ordered
–Divide and Conquer
Random Sequence
–Brute Force
The random approach now predominates for big projects.

Random Method (details for Sanger seq)
Shear DNA (nebulize)
–finish ends, ligate into vector
Produce template
Sequence to 8X – 10X coverage
–Sequence both ends of templates.
–Read length (1,000 bp typical)
–Accuracy (99% good)
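The coverage target translates directly into a read count. The sketch below computes reads needed as genome size × coverage / read length, and adds the idealized Lander-Waterman estimate that a fraction e^(-c) of bases remains unsequenced at coverage c. That estimate assumes reads land uniformly at random, which real libraries only approximate; the function names are hypothetical.

```python
import math


def reads_needed(genome_bp, coverage, read_len_bp):
    """Number of reads needed to reach the requested average coverage."""
    return math.ceil(genome_bp * coverage / read_len_bp)


def expected_uncovered_fraction(coverage):
    """Idealized fraction of bases hit by no read (Poisson model)."""
    return math.exp(-coverage)


# A 3 Mbp bacterial genome at 8X with 1,000 bp reads.
n_reads = reads_needed(3_000_000, 8, 1000)
gap_fraction = expected_uncovered_fraction(8)
```

At 8X the model leaves only a few bases in ten thousand uncovered, which is why coverage in this range is a common target for random sequencing.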

Assembly Problem CONTIG

Contigs, Islands (figure labels: contigs, island)
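How overlapping reads group into contiguous stretches can be sketched by merging intervals. This toy version assumes each read's position on the genome is already known, given as a half-open (start, end) pair; a real assembler has to infer placement from sequence overlap. The function names are illustrative, and the slide's own contig/island distinction is not modeled.

```python
def contigs_from_reads(reads):
    """Merge overlapping (start, end) read intervals into contigs."""
    merged = []
    for start, end in sorted(reads):
        if merged and start < merged[-1][1]:
            # Read overlaps the current contig: extend it if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # No overlap: this read starts a new contig.
            merged.append([start, end])
    return [tuple(c) for c in merged]


def gaps_between(contigs):
    """Uncovered intervals between consecutive contigs."""
    return [(a[1], b[0]) for a, b in zip(contigs, contigs[1:])]


reads = [(0, 1000), (800, 1800), (2500, 3500)]
contigs = contigs_from_reads(reads)
gaps = gaps_between(contigs)
```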

Assembling random sequences
Figure labels: No coverage; Only 1 strand; DISAGREEMENT (T, T, C)
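The three problem cases labeled on this slide can be flagged per position once reads are aligned. The sketch below is a simple majority-vote consensus, not the method of any particular assembler; it assumes each base call arrives as a (base, strand) pair, and the flag strings are made up for illustration.

```python
from collections import Counter


def consensus_column(calls):
    """Return (consensus_base, flag) for one aligned position.

    calls: list of (base, strand) pairs from the reads covering it.
    """
    if not calls:
        return ("N", "no coverage")
    counts = Counter(base for base, _ in calls)
    base, _ = counts.most_common(1)[0]
    if len(counts) > 1:
        flag = "disagreement"
    elif len({strand for _, strand in calls}) == 1:
        flag = "one strand only"
    else:
        flag = "ok"
    return (base, flag)


# Two reads call T and one calls C, as in the slide's T T C example.
example = consensus_column([("T", "+"), ("T", "+"), ("C", "-")])
```

Positions flagged this way are the ones a trace editor would be used to inspect by hand.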

Assembly programs
Celera Assembler (Eugene Myers et al.)
Arachne (Serafim Batzoglou et al.)
PCAP (Xiaoqiu Huang, Iowa State University)
Phusion (EBI)

Continuing rapid improvement in sequencing technology

1990's: Human genome (3 Gbp), $300 million (just sequencing)
Current: Mammalian genome (3 Gbp): $1 million
Goal: $100,000 genome, 10X cheaper (and faster), likely 2012!
New goal! $1,000 genome.
UK's sequencing center has one:

454 Sequencing's Genome Sequencer FLX
Pyrosequencing (sequencing by detection of nucleotides added during DNA synthesis).
million bases per run (10 hrs.).
400 bp sequence reads.
1,000,000 reads per run.
$6,600 per run, 60kb/$1, or $ /bp.
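The per-run economics can be checked from the figures that survive on this slide. The sketch below multiplies reads per run by read length to get an upper-bound yield, then divides by the run cost. The yield is derived here, not quoted from the slide, and assumes every read reaches the full 400 bp; the function names are hypothetical.

```python
def run_yield_bp(reads_per_run, read_len_bp):
    """Upper-bound bases per run if every read reaches full length."""
    return reads_per_run * read_len_bp


def bases_per_dollar(yield_bp, run_cost_dollars):
    """Bases of raw sequence obtained per dollar of run cost."""
    return yield_bp / run_cost_dollars


yield_bp = run_yield_bp(1_000_000, 400)
per_dollar = bases_per_dollar(yield_bp, 6600)
```

Under that assumption the result is roughly 60,000 bases per dollar, in line with the 60kb/$1 figure on the slide.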