Genome sequencing. Vocabulary: BAC (bacterial artificial chromosome), PAC, cosmid, fosmid, plasmid: cloning vectors for E. coli.

Presentation transcript:

Genome sequencing

Vocabulary
BAC: bacterial artificial chromosome, a cloning vector for E. coli (the yeast counterpart is the YAC, yeast artificial chromosome)
PAC, cosmid, fosmid, plasmid: cloning vectors for E. coli
Library: collection of fragments of a genome in cloning vectors
Draft: crude first-generation sequence assembly
Scaffold: sequences which are anchored to a genetic map

Vocabulary 2
Minimal tiling path: minimal set of overlapping clones that together provides complete coverage across a genomic region
Coverage: the number of times a genomic region is represented in a collection of clones or sequence reads
Contig: alignment of overlapping reads
N50 length: the largest length L such that 50% of all nucleotides are contained in contigs of size at least L
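As a rough illustration of the coverage definition above, average coverage can be estimated as the total number of sequenced bases divided by the genome size. This is a minimal sketch: the function name and the example numbers are assumptions made for illustration and do not come from the slides.

```python
def average_coverage(read_lengths, genome_size):
    """Average coverage = total bases in all reads / genome size.

    read_lengths: iterable of read (or clone insert) lengths in bp.
    genome_size:  size of the target genome or region in bp.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Hypothetical example: 60,000 reads of 500 bp from a 3 Mb genome.
example_reads = [500] * 60_000
print(average_coverage(example_reads, 3_000_000))  # 10.0
```

An average of 10 means each position is represented about ten times on average; individual regions can still be covered more or less often, or not at all.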

BAC by BAC vs. whole genome shotgun

BAC by BAC sequencing (slow)

Minimal tiling path
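One way to pick a minimal tiling path, assuming every clone already has start and end coordinates on a physical map, is the classic greedy interval cover: repeatedly take the clone that begins within the region covered so far and extends furthest. The sketch below is an illustration under that assumption. The function name, the tuple layout and the example clones are hypothetical, and a real pipeline would also require a minimum overlap between neighbouring clones.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedy minimum interval cover of [region_start, region_end].

    clones: list of (name, start, end) tuples with start < end,
            coordinates taken from an assumed physical map.
    Returns the chosen clone names in map order, or None if the
    clones leave a gap in the region.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path = []
    covered_to = region_start
    i = 0
    while covered_to < region_end:
        best = None
        # Scan clones that start at or before the covered position
        # and remember the one reaching furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            return None  # no clone extends the covered region: gap
        path.append(best[0])
        covered_to = best[2]
    return path


# Hypothetical clones on a 400 kb region (coordinates in kb).
example_clones = [
    ("A", 0, 150), ("B", 100, 250), ("C", 120, 300),
    ("D", 280, 400), ("E", 50, 200),
]
print(minimal_tiling_path(example_clones, 0, 400))  # ['A', 'C', 'D']
```

Clones B and E are skipped because C starts inside A and reaches further, so three clones are enough to cover the region.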

Whole genome shotgun sequencing and assembly (WGSA)

Hybrid shotgun sequencing

N50
[Figure: cumulative contig content (% of genome) plotted against contig size (kb)]
Order contigs according to size
Compute cumulative size
N50 = contig size (sequence length) which marks 50% of genome content
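The N50 procedure on this slide can be sketched in a few lines. This is an illustrative sketch, not the method used by any particular assembler: the function name and the optional genome_size argument are assumptions. Without genome_size, the 50% mark refers to the total length of all contigs, as in the vocabulary definition; with it, the mark refers to the genome size, as on this slide.

```python
def n50(contig_lengths, genome_size=None):
    """Return the contig length at which the cumulative sum of
    contigs, sorted from longest to shortest, reaches 50%.

    genome_size: optional reference total (assumption); if omitted,
                 the sum of all contig lengths is used.
    Returns None if the contigs never reach the 50% mark.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = genome_size if genome_size is not None else sum(lengths)
    cumulative = 0
    for length in lengths:
        cumulative += length
        if 2 * cumulative >= total:
            return length
    return None


# Hypothetical contig sizes in kb (total 290 kb).
example_contigs = [80, 70, 50, 40, 30, 20]
print(n50(example_contigs))                   # 70
print(n50(example_contigs, genome_size=400))  # 50
```

In the first call the two longest contigs (80 + 70 = 150 kb) already hold more than half of the 290 kb assembly, so N50 is 70 kb. Measured against a 400 kb genome, the third contig is needed to reach 200 kb, so the value drops to 50 kb.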

Human genome
2001: 2 draft sequences published
Public BAC by BAC sequence
Celera's WGSA
– 90% of euchromatic sequence
– gaps
– N50: 81 kb
– Error rate: 1:
Finished public sequence
– 99% of euchromatic sequence
– 341 gaps
– N50: kb
– Error rate: 1:

The problem with complex genomes
– Gaps
– Orientation of contigs not known
– Near-identical repeats hard to resolve

Finishing the sequence
[Figure labels: gap, draft sequence]

Resolving repeats

Detecting and resolving repeats in WGSA

Clone orientation

Segmental duplications / gaps
Blue: duplications of size > 10 kb
Red: gaps of size > 300 kb