Meet the ants Camponotus floridanus Carpenter ant Harpegnathos saltator Jumping ant Solenopsis invicta Red imported fire ant Pogonomyrmex barbatus Harvester.

Presentation transcript:

Meet the ants
Camponotus floridanus: Carpenter ant
Harpegnathos saltator: Jumping ant
Solenopsis invicta: Red imported fire ant
Pogonomyrmex barbatus: Harvester ant
Linepithema humile: Argentine ant
Atta cephalotes, Acromyrmex echinatior: Leafcutter ants

Now meet their genomes…

| Species | Citation | Platform (coverage) | Assembly program(s) | Scaffold N50 (total length) |
|---|---|---|---|---|
| Harpegnathos saltator (Jumping ant) | Bonasio et al 2010 Science | Illumina (104x) | SOAPdenovo; 6 lib. (3 paired end, 3 mate pair) | 598 Kb (297 Mb) |
| Camponotus floridanus (Carpenter ant) | Bonasio et al 2010 Science | Illumina (102x) | SOAPdenovo; 3 paired end, 3 mate pair | 603 Kb (238 Mb) |
| Acromyrmex echinatior (Leafcutter ant) | Nygaard et al 2011 Genome Research | Illumina (123x) | SOAPdenovo; 5 lib. (2 paired end, 3 mate pair) | 1.1 Mb (300 Mb) |
| Atta cephalotes (Leafcutter ant) | Suen et al PNAS | 454 (18-20x) | Roche GS Assembler | 5.1 Mb (317 Mb) |
| Solenopsis invicta (Fire ant) | Wurm et al PNAS | Illumina (~55x) | SOAPdenovo + Roche GS Assembler | 720 Kb (353 Mb) |
| Linepithema humile (Argentine ant) | Smith et al PNAS | Illumina (23x) | Roche GS Assembler + Celera CABOG | 1.3 Mb (43 Mb) |
| Pogonomyrmex barbatus (Harvester ant) | Smith et al PNAS | 454 (10-12x) | Celera CABOG | 793 Kb (235 Mb) |
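The scaffold N50 values in the table are a standard summary of assembly contiguity: the length L such that scaffolds of length L or longer contain at least half of the total assembled bases. A minimal sketch of that calculation follows. The function name `n50` and the example lengths are illustrative only and are not taken from any of the cited assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


# Hypothetical scaffold lengths (in bp), for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

Because N50 depends on the total assembled length, two assemblies of the same genome are only comparable on this metric when their totals are similar, which is why the table reports both numbers.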

Generic assembly procedure
Assemble fragments into contigs
Scaffolding: connecting contigs using mate-pair information
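The scaffolding idea can be illustrated with a toy sketch: if the two reads of a mate pair align to different contigs, that pair is evidence the contigs are neighbours, and a link is trusted only when several pairs agree. The function name `supported_links`, the `min_links` threshold, and the contig names are assumptions made for illustration. Real scaffolders also use read orientation and insert size to order contigs and estimate gaps, which this sketch ignores.

```python
from collections import Counter


def supported_links(pair_hits, min_links=2):
    """Count mate pairs that bridge two different contigs.

    pair_hits: iterable of (contig_of_read1, contig_of_read2) tuples.
    Returns {(contig_a, contig_b): count} for contig pairs supported by
    at least min_links mate pairs, with contig names sorted in each key.
    """
    counts = Counter()
    for first, second in pair_hits:
        if first == second:
            continue  # both reads fall in one contig: no scaffolding evidence
        counts[tuple(sorted((first, second)))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_links}


# Hypothetical alignments of four mate pairs to contigs c1..c3.
hits = [("c1", "c2"), ("c2", "c1"), ("c1", "c3"), ("c2", "c2")]
print(supported_links(hits))
```

Requiring more than one supporting pair is a simple guard against chimeric pairs and misaligned reads creating false joins.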

Steps involved in Illumina assembly
1) Download data (qseq file: sequences with quality scores)
2) Filter data
   A) Filter low-quality reads
   B) Trim adapter sequences
3) SOAPdenovo steps
   A) Preassembly error correction: identify pairs of reads sharing a common sequence (k-mer, e.g ), estimate k-mer frequency, and remove erroneous k-mers
   B) Construct contigs based on short insert libraries ( bp)
   C) Join contigs into scaffolds using information from large insert mate pair libraries (1Kb-10Kb)
   D) Do local reassembly of unresolved gap regions using GapCloser for SOAPdenovo
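The k-mer frequency idea behind step 3A can be sketched in a few lines: at high coverage, a k-mer from the true genome is seen many times, while a k-mer created by a sequencing error is usually seen once or twice. The sketch below is not SOAPdenovo's corrector, which tries to repair reads. It simply drops any read containing a rare k-mer. The function names, the value of `k`, and the `min_count` cutoff are illustrative assumptions.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def drop_reads_with_rare_kmers(reads, k, min_count=2):
    """Keep only reads whose k-mers all occur at least min_count times.

    Simplification: a real corrector would try to fix the erroneous base
    rather than discard the whole read.
    """
    counts = count_kmers(reads, k)
    rare = {kmer for kmer, n in counts.items() if n < min_count}
    return [
        read
        for read in reads
        if not any(read[i:i + k] in rare for i in range(len(read) - k + 1))
    ]


# Toy reads: the third differs from the others by one base.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
print(drop_reads_with_rare_kmers(reads, k=4))
```

In practice the cutoff is chosen from the k-mer frequency histogram of the data set rather than fixed in advance.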

2) Filtering data (specifics)
   A) Remove low-quality reads
      - Remove reads that do not pass the GA analysis Failed_Chastity filter (those with an N in the last column of the GA export file)
      - Can use the R Bioconductor ShortRead package (may have to convert files from qseq to fastq format)
   B) Remove adapter sequences
      - Need adapter sequence information from the person who did the sequencing
      - Can use vectorstrip in EMBOSS
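As a rough alternative to the ShortRead and vectorstrip routes named above, the filter, convert, and trim steps can be sketched in plain Python. The sketch assumes the commonly described qseq layout of 11 tab-separated fields (machine, run, lane, tile, x, y, index, read number, sequence, quality, filter flag), with `1` in the last field meaning the read passed the chastity filter and `.` marking an uncalled base. Check this against your own files before relying on it. The function names and the example adapter are hypothetical.

```python
def qseq_line_to_fastq(line):
    """Convert one qseq line to a FASTQ record string.

    Returns None when the read failed the filter (last field is not "1").
    The quality string is copied unchanged, so its original encoding
    (Phred+64 in older Illumina pipelines) is preserved.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 tab-separated fields, got {len(fields)}")
    machine, run, lane, tile, x, y, index, read_no, seq, qual, passed = fields
    if passed != "1":
        return None
    name = f"{machine}_{run}:{lane}:{tile}:{x}:{y}#{index}/{read_no}"
    return f"@{name}\n{seq.replace('.', 'N')}\n+\n{qual}\n"


def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter.

    Exact matching only: partial adapters at the read end and adapters
    containing sequencing errors are not detected.
    """
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


# Hypothetical qseq line and adapter, for illustration only.
example = "\t".join(
    ["M1", "1", "3", "10", "100", "200", "0", "1", "ACG.T", "hhhhh", "1"]
)
print(qseq_line_to_fastq(example), end="")
print(trim_adapter("ACGTAGATCGG", "IIIIIIIIIII", "AGATCGG"))
```

Downstream tools need to be told which quality offset the FASTQ file uses, since this sketch does not rescale the scores.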

Computational power and time required for SOAPdenovo?
Li et al 2010 Genome Research

And compared to other programs
Lin et al 2011 Genomics

Acromyrmex echinatior genome raw data
NCBI: SRA Acromyrmex genome
Mate pair libraries (more redundant, to build scaffolds)
Shotgun libraries (broader coverage, to build contigs)

Paired end sequencing (<1Kb)
Mate pair library, paired end sequencing (>1Kb)