A Simpler 1.5-Approximation Algorithm for Sorting by Transpositions Tzvika Hartman Weizmann Institute
Genome Rearrangements During evolution, genomes undergo large-scale mutations that change gene order (reversals, transpositions, translocations). Given 2 genomes, genome rearrangement (GR) algs infer the most economical sequence of rearrangement events that transforms one genome into the other.
Genome Rearrangements Model Chromosomes are viewed as ordered lists of genes. Unichromosomal genome; every gene appears once. Genomes are represented by unsigned permutations of genes. Circular genomes (e.g., bacteria & mitochondria) are represented by circular perms.
Sorting by Transpositions A transposition exchanges 2 consecutive segments of a perm. Example : 1 2 3 4 5 6 7 8 9 → 1 2 6 7 3 4 5 8 9 (the segments 3 4 5 and 6 7 are exchanged). Sorting by transpositions : finding a shortest sequence of transpositions which sorts the perm.
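The definition above can be sketched in Python. This is a minimal illustration, not the algorithm from the talk: the function names `transpose` and `transposition_distance` and the 0-based cut indices `i < j < k` are assumptions made here, and the distance is found by exhaustive breadth-first search, which is only feasible for very small perms.

```python
from collections import deque

def transpose(perm, i, j, k):
    """Exchange the adjacent segments perm[i:j] and perm[j:k].

    Assumes 0 <= i < j < k <= len(perm); works on lists and tuples.
    """
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]

def transposition_distance(perm):
    """Exact transposition distance by brute-force BFS (small perms only)."""
    start, target = tuple(perm), tuple(sorted(perm))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        n = len(cur)
        # Try every choice of 3 cut points.
        for i in range(n):
            for j in range(i + 1, n):
                for k in range(j + 1, n + 1):
                    nxt = transpose(cur, i, j, k)
                    if nxt not in dist:
                        dist[nxt] = dist[cur] + 1
                        queue.append(nxt)

# The example from the slide: exchange "3 4 5" with "6 7".
example = transpose([1, 2, 3, 4, 5, 6, 7, 8, 9], 2, 5, 7)
```

Here `example` is the perm 1 2 6 7 3 4 5 8 9 from the slide, and one more transposition with the same cut points restores the identity.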
Previous work 1.5-approximation algs for sorting by transpositions [BafnaPevzner98, Christie99]. An alg that sorts every perm of size n in at most 2n/3 transpositions [Eriksson et al 01]. Complexity of the problem is still open.
Main Results 1. The problem of sorting circular permutations by transpositions is equivalent to sorting linear perms by transpositions. 2. A new and simple 1.5-approximation alg for sorting by transpositions, which runs in quadratic time.
Linear & Circular Perms [Figure: a linear transposition t and a circular transposition t, each acting on segments labeled A, B, C, D.] Circular transpositions can be represented by exchanging any 2 of the 3 segments. A transposition “cuts” the perm at 3 points.
Linear & Circular Equivalence Thm : Sorting linear perms by transpositions is computationally equivalent to sorting circular perms. Pf sketch: Circularize the linear perm $\pi_1 \dots \pi_n$ by adding an element $\pi_{n+1} = n+1$ and closing the circle. [Figure: the linear perm $\pi_1 \dots \pi_n$ and the resulting circle $\pi_1, \dots, \pi_n, \pi_{n+1}$.] Every linear transposition is equivalent to a circular transposition that exchanges the 2 segments that do not include n+1.
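The proof sketch can be checked on small cases with a short Python sketch. The helper names (`rotate_to_end`, `circular_segments`, `equivalent`) and the representation of a circular perm as a list read from an arbitrary starting point are assumptions made for this illustration.

```python
def linear_transposition(perm, i, j, k):
    """Exchange the adjacent segments perm[i:j] and perm[j:k]."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]

def rotate_to_end(circ, anchor):
    """Read a circular perm so that `anchor` comes last (a canonical form)."""
    p = circ.index(anchor)
    return circ[p + 1:] + circ[:p + 1]

def circular_segments(circ, a, b, c):
    """Cut the circle at 3 points a < b < c; return the 3 segments in order."""
    return circ[a:b], circ[b:c], circ[c:] + circ[:a]

def equivalent(perm, i, j, k):
    """Compare a linear transposition with the matching circular ones."""
    n = len(perm)
    circ = perm + [n + 1]                      # circularize with element n+1
    A, B, C = circular_segments(circ, i, j, k) # C is the segment holding n+1
    # Exchanging any 2 of the 3 segments gives the same circular perm.
    results = [B + A + C, A + C + B, C + B + A]
    canon = [rotate_to_end(r, n + 1) for r in results]
    expected = linear_transposition(perm, i, j, k) + [n + 1]
    return all(c == expected for c in canon)

ok = equivalent([1, 2, 3, 4, 5, 6, 7, 8, 9], 2, 5, 7)
```

All three ways of exchanging two segments produce the same circle, and reading it with n+1 last gives the result of the linear transposition followed by n+1.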
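Breakpoint Graph (Cont.) The max # of odd cycles, n, is attained by the id perm, thus: Lower bound [BP98]: For all $\pi$, $d(\pi) \ge [n - c_{odd}(\pi)]/2$, where $d(\pi)$ is the transposition distance and $c_{odd}(\pi)$ is the # of odd cycles in G. Goal : increase the # of odd cycles in G. t is a k-transposition if $\Delta c_{odd}(\pi, t) = k$. A cycle that admits a 2-transposition is oriented.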
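The lower bound can be evaluated with a short Python sketch. The definition of the breakpoint graph is not part of this slide, so the code assumes a Bafna-Pevzner-style cycle graph on the circularized perm: elements are 1..n on a circle, black edge i runs from the element at position i to the element before it, and a gray edge runs from each value v to v+1 (with n wrapping to 1). The names `cycle_lengths` and `lower_bound` are hypothetical.

```python
def cycle_lengths(circ):
    """Lengths (# of black edges) of the alternating cycles of a circular perm.

    Assumes circ is a permutation of 1..n read along the circle.
    """
    n = len(circ)
    pos = {v: p for p, v in enumerate(circ)}
    seen = [False] * n
    lengths = []
    for start in range(n):
        if seen[start]:
            continue
        length, i = 0, start
        while not seen[i]:
            seen[i] = True
            length += 1
            prev_value = circ[i - 1]   # black edge i ends at the previous element
            succ = prev_value % n + 1  # gray edge: v -> v+1, and n -> 1
            i = pos[succ]              # continue with the black edge leaving succ
        lengths.append(length)
    return lengths

def lower_bound(circ):
    """[n - c_odd] / 2, rounded up as a safeguard."""
    n = len(circ)
    c_odd = sum(1 for length in cycle_lengths(circ) if length % 2 == 1)
    return (n - c_odd + 1) // 2

# Circularized version of 1 2 6 7 3 4 5 8 9 (element 10 added).
bound = lower_bound([1, 2, 6, 7, 3, 4, 5, 8, 9, 10])
```

Under these assumptions the id perm has n cycles of length 1, so its bound is 0, and the circularized perm above has seven 1-cycles and one 3-cycle, so a single 2-transposition suffices.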
Simple Permutations A perm is simple if its breakpoint graph contains only short (length $\le 3$) cycles. The theory is much simpler for simple perms. Thm : Every perm can be transformed into a simple one, while maintaining the lower bound. Moreover, the sorting sequence can be mimicked. Cor : We can focus only on simple perms.
3-Cycles 2 possible configurations of 3-cycles: non-oriented 3-cycle and oriented 3-cycle. [Figure: a non-oriented 3-cycle and an oriented 3-cycle.]
(0,2,2)-Sequence of Transpositions A (0,2,2)-sequence is a sequence of 3 transpositions: the 1st is a 0-transposition and the next two are 2-transpositions. A series of (0,2,2)-sequences preserves a 1.5 approximation ratio. Throughout the alg, we show that there is always a 2-transposition or a (0,2,2)-sequence.
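The 1.5 ratio follows from a short count, sketched here in LaTeX using the lower bound $d(\pi) \ge [n - c_{odd}(\pi)]/2$ from the breakpoint graph slide. The sketch assumes the alg applies only 2-transpositions and (0,2,2)-sequences, so every block of steps gains at least 4 odd cycles per 3 transpositions.

```latex
% A 2-transposition:   1 step,  \Delta c_{odd} = 2.
% A (0,2,2)-sequence:  3 steps, \Delta c_{odd} = 0 + 2 + 2 = 4.
% Sorting must raise c_{odd} from c_{odd}(\pi) to n, hence
\[
  \#\text{transpositions used}
  \;\le\; \frac{3}{4}\,\bigl(n - c_{odd}(\pi)\bigr)
  \;=\; \frac{3}{2}\cdot\frac{n - c_{odd}(\pi)}{2}
  \;\le\; \frac{3}{2}\, d(\pi).
\]
```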
Interleaving Cycles 2 cycles interleave if their black edges appear alternately along the circle. Lemma : If G contains 2 interleaving 3-cycles, then there exists a (0,2,2)-sequence.
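The interleaving condition can be expressed as a small predicate. In this sketch each cycle is given as a collection of the positions of its black edges along the circle; that representation and the name `interleave` are assumptions made for illustration.

```python
def interleave(cycle_a, cycle_b):
    """True if the black edges of the 2 cycles alternate along the circle.

    Each cycle is a collection of distinct black-edge positions.
    """
    labeled = sorted([(p, "a") for p in cycle_a] + [(p, "b") for p in cycle_b])
    labels = [label for _, label in labeled]
    m = len(labels)
    # Alternation must also hold across the wrap-around point of the circle.
    return all(labels[t] != labels[(t + 1) % m] for t in range(m))

alternating = interleave((0, 2, 4), (1, 3, 5))
separated = interleave((0, 1, 2), (3, 4, 5))
```

The first pair of hypothetical 3-cycles alternates around the circle, the second does not.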
Shattered Cycles Lemma : If G contains a shattered cycle, then there exists a (0,2,2)-sequence. 2 pairs of black edges intersect if they appear alternately along the circle. Cycle A is shattered by cycles B and C if every pair of black edges in A intersects with a pair in B or with a pair in C.
Shattered Cycles (Cont.) Lemma : If G contains no 2-cycles, no oriented cycles and no interleaving cycles, then there exists a shattered cycle.
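The two definitions above translate directly into predicates. As a sketch, cycles are again represented by the positions of their black edges along the circle; the names `pairs_intersect` and `is_shattered` and the example positions are hypothetical and are not claimed to come from an actual breakpoint graph.

```python
from itertools import combinations

def pairs_intersect(p, q):
    """True if the 2 pairs of positions alternate along the circle."""
    lo, hi = sorted(p)
    # Exactly one endpoint of q lies strictly between the endpoints of p.
    return (lo < q[0] < hi) != (lo < q[1] < hi)

def is_shattered(a, b, c):
    """True if every pair of black edges of a intersects a pair of b or of c."""
    return all(
        any(
            pairs_intersect(pair_a, pair_o)
            for other in (b, c)
            for pair_o in combinations(other, 2)
        )
        for pair_a in combinations(a, 2)
    )

shattered = is_shattered((0, 3, 6), (1, 2, 4), (5, 7, 8))
not_shattered = is_shattered((0, 1, 2), (3, 4, 5), (6, 7, 8))
```

In the first example, the pairs (0,3) and (3,6) of cycle A are intersected by pairs of B, and the pair (0,6) by a pair of C, so A is shattered even though it interleaves with neither cycle.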
The Algorithm While G contains a 2-cycle, apply a 2-transposition [Christie99]. If G contains an oriented 3-cycle, apply a 2-transposition on it. If G contains a pair of interleaving 3-cycles, apply a (0,2,2)-sequence. If G contains a shattered unoriented 3-cycle, apply a (0,2,2)-sequence. Repeat until perm is sorted.
Conclusions We introduced 2 new ideas which simplify the theory and the alg: 1. Working with circular perms simplifies the case analysis. 2. Simple perms avoid the complication of dealing with long cycles (similarly to the HP theory for sorting by reversals).
Open Problems Complexity of sorting by transpositions. Models which allow several rearrangement operations, such as trans-reversals, reversals and translocations (both signed & unsigned).