Original Synteny Vincent Ferretti, Joseph H. Nadeau, David Sankoff, 1996 Presented by: Suzy Sun.



Synteny: Two genes are syntenic if they are assigned to the same chromosome

Introduction: More is known about which chromosome a gene is assigned to than about where exactly the gene lies on that chromosome. Comparing species that lack chromosomal maps therefore becomes a question of comparing syntenic sets of genes, disregarding gene order and gene orientation. Only interchromosomal events (translocation, fusion, and fission) affect synteny, so only these events can be deduced from synteny data.

Motivation: Using the synteny data of present-day organisms, what can we infer about the synteny sets of their ancestors? How many chromosomes did these ancestors possess, and what genes did they contain?

Problems: (1) Calculate the syntenic edit distance between two genomes by inferring the number of translocations, fusions, and fissions. (2) Use the calculated distance to analyze the median problem for synteny, i.e. find the genome that minimizes the sum of distances to three given genomes. (3) Optimize the internal vertices of a given phylogenetic tree.

Problem 1: Calculate the syntenic edit distance between two genomes by inferring the number of translocations, fusions, and fissions.

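Syntenic Distance
Genome 1: Chromosome 1: {x,y}; Chromosome 2: {p,q,r}; Chromosome 3: {a,b,c}
Genome 2: Chromosome 1: {p,q,x}; Chromosome 2: {a,b,r,y,z}
Compact representation of Genome 2: {1,2}, {1,2,3}. Each Genome 2 chromosome is written as the set of Genome 1 chromosomes that its genes come from.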
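The relabeling on this slide can be reproduced mechanically. The following is a minimal Python sketch; the function name `compact_representation` and the dict/list layout are illustrative assumptions, not something taken from the paper. It also assumes that a gene absent from Genome 1 (z here) is simply dropped.

```python
def compact_representation(genome1, genome2):
    """Relabel each Genome 2 chromosome by the Genome 1 chromosomes its genes come from.

    genome1: dict mapping chromosome number -> set of genes
    genome2: list of gene sets, one per chromosome
    """
    # Map every Genome 1 gene to the chromosome that carries it.
    home = {gene: chrom for chrom, genes in genome1.items() for gene in genes}
    # Assumption: genes unknown to Genome 1 carry no synteny information and are skipped.
    return [{home[g] for g in genes if g in home} for genes in genome2]


genome1 = {1: {"x", "y"}, 2: {"p", "q", "r"}, 3: {"a", "b", "c"}}
genome2 = [{"p", "q", "x"}, {"a", "b", "r", "y", "z"}]
compact = compact_representation(genome1, genome2)
print([sorted(c) for c in compact])  # [[1, 2], [1, 2, 3]]
```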

Syntenic Distance, solution: Find the series of translocations, fusions, and fissions that transforms Genome 2 into the k chromosomes of Genome 1, i.e. {1}, {2}, …, {k}. Here {1,2}, {1,2,3} is transformed by a translocation into {1}, {2,3}, and {1}, {2,3} is then transformed by a fission into {1}, {2}, {3}. Distance = 2.
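The two steps of this example can be replayed on label sets. The sketch below is only illustrative: the helper names `translocate` and `fission` are hypothetical, the translocation is assumed to preserve the union of the two chromosomes involved, and the fission is simplified to a split into two disjoint non-empty pieces.

```python
def translocate(genome, i, j, new_i, new_j):
    """Replace chromosomes i and j by new_i and new_j."""
    # Assumption: a translocation only redistributes labels between the two chromosomes.
    if new_i | new_j != genome[i] | genome[j]:
        raise ValueError("a translocation must preserve the union of the two chromosomes")
    result = list(genome)
    result[i], result[j] = set(new_i), set(new_j)
    return result


def fission(genome, i, part):
    """Split chromosome i into `part` and the remaining labels."""
    rest = genome[i] - part
    # Simplification: the two pieces are disjoint and both non-empty.
    if not part or not rest or not part <= genome[i]:
        raise ValueError("a fission must split a chromosome into two non-empty pieces")
    return genome[:i] + [set(part), rest] + genome[i + 1:]


steps = [[{1, 2}, {1, 2, 3}]]
steps.append(translocate(steps[-1], 0, 1, {1}, {2, 3}))  # {1}, {2,3}
steps.append(fission(steps[-1], 1, {2}))                 # {1}, {2}, {3}
distance = len(steps) - 1
print(distance)  # 2
```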

Syntenic Distance for r(l)=1: Suppose label l appears in r(l) chromosomes of Genome 2, and write l' for the labels syntenic with l, i.e. those that share a chromosome with it. If r(l)=1 and the labels l' do not appear in any other chromosome, effect a fission to produce {l} as an individual chromosome. If r(l)=1 and all labels l' appear in r(l') ≥ rmin > 1 chromosomes, effect a translocation to produce {l}.

Example: {1,2,3,4}, {2,3,5}, {2,3,4}, {4,5,6}, {4,8,9}. Choose l=1; then r(l)=1, rmin=3, and l'=2 or l'=3. If {2,3,4} is the second chromosome in the translocation with {1,2,3,4}, we get {1}, {2,3,4}, {2,3,5}, {4,5,6}, {4,8,9}.
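The quantities in this example (r(l), the syntenic labels l', and rmin) can be checked with a few lines of Python. The helper names below are illustrative assumptions; rmin is taken to be the smallest r(l') among the labels that share a chromosome with l, which matches the numbers on the slide.

```python
from collections import Counter


def occurrence_counts(genome):
    """r(l): the number of chromosomes in which each label l appears."""
    return Counter(label for chrom in genome for label in chrom)


def syntenic_labels(genome, label):
    """Labels l' that share at least one chromosome with `label`."""
    return set().union(*(c for c in genome if label in c)) - {label}


genome = [{1, 2, 3, 4}, {2, 3, 5}, {2, 3, 4}, {4, 5, 6}, {4, 8, 9}]
r = occurrence_counts(genome)
partners = syntenic_labels(genome, 1)
r_min = min(r[p] for p in partners)
candidates = sorted(p for p in partners if r[p] == r_min)
print(r[1], r_min, candidates)  # 1 3 [2, 3]
```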

Syntenic Distance for r(l)>1: If r(l)>1, effect r(l)-1 fusions and one translocation to produce a separate {l}.

How do we know which l to choose? In order of preference: any l for which r(l)=1; then any l for which r(l)=2; if all r(l)>2, choose the l that minimizes r(l) and r(l').

Simulations and Tests: If the algorithm indeed yields the true minimum distance, the distance computed from Genome 1 to Genome 2 should equal the distance computed from Genome 2 to Genome 1. The two directions gave identical results in 65% of cases, differed by 1 in 34%, and differed by 2 or more in 1%.

Simulations and Tests: To test how well syntenic distance reflects evolutionary history, random genomes are generated by applying a number of random translocations to the chromosomes {1}, …, {k}. When the number of translocations is below k/2, the algorithm yields the correct number of translocations, but as the number of translocations increases, the algorithm underestimates the true distance.
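One way such random genomes could be generated is sketched below. This is not the authors' simulation code: the gene-level model, the `genes_per_chromosome` parameter, and the coin-flip choice of which genes are exchanged are all assumptions made for illustration. A draw may occasionally exchange nothing, and a chromosome may in principle end up empty.

```python
import random


def random_translocation(genome, rng):
    """Exchange random gene subsets between two randomly chosen chromosomes."""
    i, j = rng.sample(range(len(genome)), 2)
    # Assumption: each gene moves to the other chromosome with probability 1/2.
    moved_i = {g for g in genome[i] if rng.random() < 0.5}
    moved_j = {g for g in genome[j] if rng.random() < 0.5}
    result = [set(c) for c in genome]
    result[i] = (genome[i] - moved_i) | moved_j
    result[j] = (genome[j] - moved_j) | moved_i
    return result


def simulate(k, genes_per_chromosome, n_translocations, seed=0):
    """Start from chromosomes {1}, ..., {k} and apply random translocations."""
    rng = random.Random(seed)
    # Gene (c, n) starts on chromosome c, so its label in the compact representation is c.
    genome = [{(c, n) for n in range(genes_per_chromosome)} for c in range(1, k + 1)]
    for _ in range(n_translocations):
        genome = random_translocation(genome, rng)
    # Compact representation: keep only the ancestral chromosome label of each gene.
    return [{c for c, _ in chrom} for chrom in genome]


compact = simulate(k=6, genes_per_chromosome=10, n_translocations=2)
print(compact)
```

Feeding the resulting compact representation to a syntenic distance routine and comparing the result with `n_translocations` would reproduce the kind of test described on this slide.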

Problem 2: Use the calculated distance to analyze the median problem for synteny, i.e. find the genome that minimizes the sum of distances to three given genomes.

The Median Problem: Let d(Genome 1, Genome 2) be the syntenic distance between Genome 1 and Genome 2. Median problem: given three genomes 1, 2, and 3, construct a genome S so that d(S,1) + d(S,2) + d(S,3) is minimized.
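Written as a single optimization, with G_1, G_2, G_3 denoting the three given genomes and S^{*} the median (this notation is introduced here only for illustration), the problem reads:

```latex
S^{*} = \arg\min_{S} \bigl[\, d(S, G_1) + d(S, G_2) + d(S, G_3) \,\bigr]
```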

Median Content Constraint (MCC): Genome S must contain certain genes: those present in all of genomes 1, 2, and 3, OR in two out of the three, OR even in any one of the three. Bottom line: S cannot be empty; otherwise the sum of the three distances is 0 and the problem is trivial. MCC is a rather loose term whose exact form depends on the particular context in which medians are calculated.

The Median Problem: Choose any gene to be in S according to the MCC. The initial chromosome in S contains this one gene. If there are unassigned genes that fulfill the MCC, they are added only if they do not increase the current cost. Otherwise, genes are assigned in whichever way minimizes the sum of the distances to the terminal nodes. Then perform iterations that move each gene into a different chromosome and recompute the sum of the three distances, until the minimum distance is reached.

Problem 3: Optimize the internal vertices of a given phylogenetic tree.

Optimizing a Given Phylogeny: In the most parsimonious solution, each internal node together with its three neighbours is a solution to the median problem.
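The link between tree parsimony and the median problem can be made concrete with a small sketch, assuming the cost of a tree is the sum of a distance d over its edges. The names `tree_cost` and `placeholder_distance` are hypothetical, and the placeholder is deliberately NOT the syntenic distance; it only makes the example runnable.

```python
def tree_cost(edges, genomes, d):
    """Total length of a tree: the sum of d over its edges."""
    return sum(d(genomes[u], genomes[v]) for u, v in edges)


def placeholder_distance(g1, g2):
    # NOT the syntenic distance: counts chromosomes present in one genome but not the other.
    return len(set(g1) ^ set(g2))


# Unrooted tree with leaves A, B, C joined at the single internal node M.
edges = [("M", "A"), ("M", "B"), ("M", "C")]
genomes = {
    "A": [frozenset({1, 2}), frozenset({3})],
    "B": [frozenset({1, 2}), frozenset({3})],
    "C": [frozenset({1}), frozenset({2, 3})],
    "M": [frozenset({1, 2}), frozenset({3})],
}
cost = tree_cost(edges, genomes, placeholder_distance)
print(cost)  # 4
```

With one internal node and three neighbours, the tree cost is exactly the median objective d(M,A) + d(M,B) + d(M,C), which is why optimizing each internal node reduces to a median problem.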

Optimizing a Given Phylogeny, MCC: ‘…include those genes in only one of the three genomes if they can be added after all the other genes are assigned chromosomes, in only one cost-free way.’

Limitations and Conclusions: To find the most parsimonious tree, we would have to compute all possible trees and their total syntenic distances, which was not computationally feasible at the time. Syntenic distance is nevertheless useful for comparing competing hypotheses.

Conclusions

Thank you