Maize Production Sequencing

Maize Production Goals
- BAC end sequencing of 220,000 clones
- Fosmid end sequencing of 500,000 clones
- Shotgun of 16,000 BAC clones

Maize BAC End Sequences
- 580,000 reads processed
- Average read length: 567 bases
- 60% success

Maize Fosmid End Sequences
- 850,000 reads processed
- 79% success
- Average read length: 543 bases
- Completed today
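As a rough reading of the two end-sequencing slides, the success percentages can be turned into approximate counts of passing reads. The sketch below assumes that "success" means the percentage of processed reads that passed, which the slides do not state. The function name `passing_reads` is hypothetical.

```python
def passing_reads(processed: int, success_pct: int) -> int:
    """Approximate number of passing reads (integer arithmetic, rounded down)."""
    return processed * success_pct // 100


# Figures taken from the BAC end and fosmid end slides.
bac_pass = passing_reads(580_000, 60)
fosmid_pass = passing_reads(850_000, 79)

print("BAC end reads passing:   ", bac_pass)
print("Fosmid end reads passing:", fosmid_pass)
```

Under that assumption, the BAC end set yields about 348,000 passing reads and the fosmid end set about 671,500.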

Library Construction Pipeline
- Receipt of sheared DNA from AGI
- Size selection of insert DNA
- Ligation into pSMART vector

Constructed 17,034 Libraries as of August 31st

Average Fail Rate for Library Construction was less than 5%

Shotgun Criteria
- 3.5X coverage
- Clone size verification
- 50% paired ends
- BES agreement

25% of clones failed:
- 22% need more data
- 3% BES disagreement
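One way to picture how the shotgun criteria might be applied to a single clone is sketched below. This is an illustration only: the slide does not say how the criteria are combined, so the ordering of the checks, the reading of "50% paired ends" as a minimum fraction of paired reads, and the mapping of a failed size check to "needs more data" are all assumptions. The names `CloneStats` and `shotgun_status` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CloneStats:
    coverage: float         # shotgun sequence coverage, in X
    paired_end_frac: float  # fraction of reads with a mate (assumed meaning)
    size_verified: bool     # clone size verification passed
    bes_agrees: bool        # assembly agrees with the BAC end sequences


def shotgun_status(clone: CloneStats,
                   min_coverage: float = 3.5,
                   min_paired: float = 0.50) -> str:
    """Classify a clone against the slide's criteria (assumed combination)."""
    if not clone.bes_agrees:
        return "BES disagreement"
    if (clone.coverage < min_coverage
            or clone.paired_end_frac < min_paired
            or not clone.size_verified):
        return "needs more data"
    return "shotgun complete"


print(shotgun_status(CloneStats(4.1, 0.62, True, True)))
print(shotgun_status(CloneStats(2.9, 0.62, True, True)))
print(shotgun_status(CloneStats(4.1, 0.62, True, False)))
```

The two failure labels mirror the two categories reported on the slide (22% needing more data, 3% with BES disagreement).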

Shotgun Complete for 12,211 Clones as of August 31st

Final Production Work
- 660 clones need library construction
- 2,100 clones in production pipeline
- Expected completion date: December 2007

Sequence Improvement
Bob Fulton, Dick McCombie, Rod Wing

Sequence Improvement Pipeline
1. Shotgun_done triggers the prefinishing pipeline.
2. Initial identification of do finish regions.
3. Manual sorting and use of autoedit (Gordon) to break apart misassemblies.
4. Autofinish (Gordon) is used to choose directed reactions for all gaps and regions of low quality in do finish regions.
5. Reassembly and a 2nd iteration of the prefinishing pipeline.
6. Final identification of do finish regions and handoff to the finishing pipeline.

Clone Improvement through the Prefinishing Pipeline

Assembly View: Entire Clone
[Figure labels: End Spanning Plasmids; Coverage (green)]

Assembly View: Do Finish Region
[Figure labels: Repeat Tags; Do Finish; GSS sequence; EST sequence]

[Figure labels: Alignment with cDNA read pairs; Alignment with End Sequences]

[Chart legend: Actual; Projected]

Maize GenBank Submissions
Joanne Nelson

Submission Landmarks
- HTGS_FULLTOP
- HTGS_PREFIN
- HTGS_ACTIVEFIN
- HTGS_IMPROVED

Improved Sequence

Non-repetitive portions of the sequence have had sequence improvement (directed attempts) and have been labeled as improved. Improved regions are double stranded, sequenced with an alternate chemistry or covered by high quality data (i.e. phred quality greater than or equal to 30 or approval by an experienced finisher), unless otherwise noted. Regions of low sequence complexity (such as dinucleotide repeats and small unit tandem repeats) in the improved regions have not been resolved to previously established finishing standards. BAC end sequence, cot and methyl filtered genome survey sequence and data from overlapping projects of strain B73 may have been included in this project. Where possible, contigs have been ordered and oriented based on read pairing. These regions are designated as scaffolds. Additional order and orientation will be provided upon completion of detailed analysis of the complete finished tiling path.
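The phred threshold quoted above can be read as an error probability using the standard phred definition, Q = -10 log10(P). The sketch below shows that conversion; the function names are illustrative and not taken from any tool mentioned in the slides.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Estimated per-base error probability for a phred quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality score for a per-base error probability."""
    return -10 * math.log10(p)


# The "improved" standard requires phred quality >= 30.
print(phred_to_error_prob(30))
print(error_prob_to_phred(0.001))
```

A phred quality of 30 corresponds to an estimated error rate of one miscalled base in 1,000.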

Improved Sequence

FEATURES             Location/Qualifiers
     source          /organism="Zea mays"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:4577"
                     /chromosome="1"
                     /clone="CH J17; ZMMBBc0132J17"
     misc_feature    /note="scaffold_name:Scaffold1"
     misc_feature    /note="assembly_name:Contig28 vector_side:SP6"
     misc_feature    /note="Improved sequence."
     unsure          /note="Non-repetitive but unresolved region"
     gap             /estimated_length=unknown
     misc_feature    /note="assembly_name:Contig27"
     misc_feature    /note="Improved sequence."
     unsure          /note="Non-repetitive but unresolved region"
     misc_feature    /note="Improved sequence."
     gap             /estimated_length=unknown
     misc_feature    /note="assembly_name:Contig14"
     gap             /estimated_length=unknown
     misc_feature    /note="scaffold_name:Scaffold2

Submission Totals
HTGS_FULLTOP      3342
HTGS_PREFIN       2014
HTGS_ACTIVEFIN    4151
HTGS_IMPROVED     2660
TOTAL            12167
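The submission totals can be cross-checked with a few lines of arithmetic. The sketch below sums the four counts from the slide and computes each keyword's share of the total; the variable names are illustrative.

```python
# Counts as given on the Submission Totals slide.
counts = {
    "HTGS_FULLTOP": 3342,
    "HTGS_PREFIN": 2014,
    "HTGS_ACTIVEFIN": 4151,
    "HTGS_IMPROVED": 2660,
}

total = sum(counts.values())

# Share of all submissions per keyword, as a percentage rounded to one decimal.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print("Total:", total)
for name, pct in shares.items():
    print(f"{name:<15} {counts[name]:>5}  {pct:>5}%")
```

The four counts sum to 12,167, matching the TOTAL reported on the slide.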