Identification of large-scale genomic rearrangements between closely related organisms Bob Mau 1,2, Aaron Darling 1,3, Fred Blattner 4,5, Nicole Perna.


Identification of large-scale genomic rearrangements between closely related organisms
Bob Mau 1,2, Aaron Darling 1,3, Fred Blattner 4,5, Nicole Perna 1,5
Departments of Animal Health and Biomedical Sciences (1), Oncology (2), Computer Science (3), Laboratory of Genetics (4), Genome Center (5)
University of Wisconsin – Madison

The Amazing Variety of Diseases Caused by E. coli Strains, in Bacterial Pathogenesis: A Molecular Approach: “… is due to the fact different strains have acquired different sets of virulence genes. Most strains of E. coli are avirulent because they lack these virulence genes. E. coli is an excellent example of the maxim that it is the set of virulence genes carried by an organism that makes it a pathogen, not its species or genus designation.”

Categories of Bacterial Genome Evolution
Local
    Single base mutations
    Indels (small insertions and deletions)
Global (large-scale) rearrangements
    Inversions, translocations, inverted translocations
Gene gain and loss
    Horizontal or lateral transfer: transformation, transduction, and conjugation
    Phage integration
    Mobile elements: transposons and insertion sequences
    Gene duplication (mediated by mobile elements)

From the two E. coli genomes sequenced at the Blattner lab, we’ve identified:
    ~3900 genes common to both K-12 and O157:H7
    528 genes unique to K-12
    […] genes unique to O157:H7
40% of these genes are of unknown function.
The primary reasons for these wholesale differences are lateral transfer, phage integration, and one whopper of a duplication.

Strategy of Global Alignment of Two Highly Related Genomes: K (K-12) and O (O157:H7)
STEP 1: Using partially sorted suffix arrays, quickly find all 16-mer matches between the genomes: (K_1, O_1), …, (K_i, O_i), …, (K_n, O_n).
STEP 2: Collapse consecutive pairs to form a collection of maximal exact matches (MEMs). Use an LIS (longest increasing subsequence) algorithm to construct a maximal collinear, ordered set of matches.
STEP 3: Extend across intervening regions via anchored alignments from individual MEM endpoints.
Figure labels: Unique Insert, Substitution.
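Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' implementation: a dictionary index stands in for the partially sorted suffix arrays, only forward-strand matches are considered, overlaps between chained MEMs are ignored, and the function names (`seed_matches`, `collapse_to_mems`, `collinear_subset`) are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def seed_matches(k_seq, o_seq, k=16):
    """Return all (i, j) with k_seq[i:i+k] == o_seq[j:j+k] (forward strand only)."""
    index = defaultdict(list)
    for i in range(len(k_seq) - k + 1):
        index[k_seq[i:i + k]].append(i)
    pairs = []
    for j in range(len(o_seq) - k + 1):
        for i in index.get(o_seq[j:j + k], ()):
            pairs.append((i, j))
    return pairs


def collapse_to_mems(pairs, k=16):
    """Merge seeds that are consecutive on one diagonal into (i, j, length) matches."""
    by_diagonal = defaultdict(list)
    for i, j in pairs:
        by_diagonal[j - i].append(i)
    mems = []
    for diag, starts in by_diagonal.items():
        starts.sort()
        run_start = prev = starts[0]
        for i in starts[1:]:
            if i != prev + 1:  # gap on the diagonal: close the current run
                mems.append((run_start, run_start + diag, prev - run_start + k))
                run_start = i
            prev = i
        mems.append((run_start, run_start + diag, prev - run_start + k))
    return sorted(mems)


def collinear_subset(mems):
    """Longest chain of MEMs whose start positions strictly increase in both genomes."""
    # Ties in the first genome are ordered by descending second coordinate,
    # so two MEMs with the same first start can never be chained together.
    ordered = sorted(mems, key=lambda m: (m[0], -m[1]))
    tails, tail_idx = [], []          # patience-sorting state for the LIS
    back = [None] * len(ordered)      # back-pointers to rebuild the chain
    for n, (_, j, _) in enumerate(ordered):
        r = bisect_left(tails, j)
        if r == len(tails):
            tails.append(j)
            tail_idx.append(n)
        else:
            tails[r] = j
            tail_idx[r] = n
        back[n] = tail_idx[r - 1] if r > 0 else None
    chain = []
    n = tail_idx[-1] if tail_idx else None
    while n is not None:
        chain.append(ordered[n])
        n = back[n]
    return chain[::-1]
```

The LIS here maximizes the number of chained MEMs; weighting the chain by match length would be a natural variation.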

K-12 vs O157:H7 MEM Stats
    43,235 total MEMs (≥ 24 bp)
    31,640 form the maximal collinear subset
    The largest exact match is 2,632 bases
    62 MEMs exceed 1,000 bp
    Over 11,000 exceed 100 bp
    18,212 single-base differences (SNPs)
The result is a segmentation of O157:H7 into 357 intervals of backbone or unique insert.

A Three-way Genomic Comparison (Parkhill et al., Nature): E. coli K-12 MG1655, S. Typhi CT18, and S. Typhimurium LT2

The “Traditional” Way to View MEMs: $\{(a_0, b_0), (a_1, b_1), \dots, (a_K, b_K)\}$ for $K+1$ genomes. For the reference genome $G_0$, $a_0 < b_0$ by convention. For the non-reference genomes, $a_k > b_k$ means the match occurs on the opposite strand (reverse complement).

A novel approach, wherein:
    Extensibility: works just as well for N genomes as it does for 2, provided there is sufficient sequence similarity.
    Automatically identifies inversions, translocations, and inverted translocations.
    Determines a maximal collinear subset within each locally collinear region, without recourse to an LIS step.
    Very space efficient and very fast.

Multiple Oriented Offset: For each non-reference genome, determine the polarity with respect to $G_0$, as well as the offset. The Multiple Oriented Offset is the N-vector:

Canonical MEM Equivalence Classes: By appending the interval in reference genome coordinates, $(a_0, b_0)$, to the Moo, the MEM is completely specified. We aggregate MEMs by their generalized offset, inducing a partition on the set of MEMs. This defines a CMemEC: $\{\mathrm{Moo}, \{(a_0^1, b_0^1), (a_0^2, b_0^2), \dots, (a_0^M, b_0^M)\}\}$
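The grouping of MEMs by generalized offset might look like the sketch below. The slide's own formulas for polarity and offset did not survive in the transcript, so the definitions used here are assumptions: polarity is taken as +1 when $a_k < b_k$ and -1 when $a_k > b_k$, and the offset as $a_k - \text{polarity} \cdot a_0$, a quantity that stays constant along a diagonal (forward) or anti-diagonal (reverse complement). The function names are hypothetical.

```python
from collections import defaultdict


def oriented_offset(a0, ak, bk):
    """Return (polarity, offset) of a match in genome k relative to the reference.

    Assumed definitions (not taken from the slides): polarity is +1 if
    a_k < b_k, else -1; offset is a_k - polarity * a_0.
    """
    polarity = 1 if ak < bk else -1
    return polarity, ak - polarity * a0


def group_by_moo(mems):
    """Partition MEMs by their offset vector.

    Each MEM is a sequence [(a0, b0), (a1, b1), ..., (aK, bK)] with the
    reference interval first. The result maps each offset vector to the
    sorted list of reference intervals that share it.
    """
    classes = defaultdict(list)
    for mem in mems:
        (a0, b0), others = mem[0], mem[1:]
        moo = tuple(oriented_offset(a0, ak, bk) for ak, bk in others)
        classes[moo].append((a0, b0))
    return {moo: sorted(intervals) for moo, intervals in classes.items()}
```

Under these assumptions, two MEMs land in the same class exactly when they lie on the same diagonal (or anti-diagonal) in every non-reference genome, which is why the reference interval alone is enough to tell members of a class apart.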

In this example, it’s abundantly clear from the plot that there are two large rearrangements, one around the origin and the other about the terminus of replication. We could probably get by with modest extensions of existing methods (MUMmer or our earlier algorithm) to account for the large amounts of laterally transferred lineage-specific sequence. But, hey, biology ain’t easy...

Figure 1: Simplest Block and Strip Diagram. Strips for genomes G_1 through G_4 are drawn against the reference strip G_0.

Figure 2: Example with Variable Block Lengths. Reference G_0 and genomes G_1 through G_5; labeled positions: cut point, terminus, origin.

Figure 1: Large-scale Genomic Rearrangements. Genomes 1 through 5 on a species tree rooted at the MRCA; labeled positions: zero point, terminus, origin.

Figure 3: Segmentation Graph $S(G_0)$

LOOk at the Picture and

Sorted Merge Lists of Six Enterobacterial Strains
Six SMLs of bimers, one for each genome. A bimer is the lexicographically lesser of an n-mer (we use n = 23) and its reverse complement, together with an orientation flag.
Strains:
    Escherichia coli K-12: MG1655, W3110
    Escherichia coli O157:H7: EDL933, Sakai
    Salmonella enterica Typhi: CT18
    Salmonella enterica Typhimurium: LT2
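The bimer definition above translates directly into code. The sketch below is a minimal illustration, assuming uppercase sequences over A, C, G, T only; it stores the canonical word alongside each position for readability, which a space-efficient implementation would likely avoid. The names `bimer` and `sorted_merge_list` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def bimer(nmer):
    """Return (word, forward): the lexicographically lesser of the n-mer and
    its reverse complement, with forward=True when the n-mer itself was kept."""
    rc = reverse_complement(nmer)
    return (nmer, True) if nmer <= rc else (rc, False)


def sorted_merge_list(genome, n=23):
    """List of (canonical word, position, forward flag), sorted by word, then position."""
    entries = []
    for pos in range(len(genome) - n + 1):
        word, forward = bimer(genome[pos:pos + n])
        entries.append((word, pos, forward))
    entries.sort()
    return entries
```

Because a word and its reverse complement map to the same bimer, a match between two genomes is found by comparing canonical words, and the two orientation flags then tell whether the match is on the same or the opposite strand.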

Figure: A Transformation of CO92 to KIM by Inversions Near the Origin. CO92 blocks: C20 C21 C22 C22.5 C23 C24 C25 C1 C2 C3 C4 C5 C6 C7. KIM blocks: K5 K4 K3 K2 K1 K25 K24 K23 K22 K21 K20.5 K20 K19 K