1 Michal Ozery-Flato and Ron Shamir 2 The Genomic Sorting Problem HOW?

2 1 Michal Ozery-Flato and Ron Shamir

3 2 The Genomic Sorting Problem HOW?

4 3 Overview Preliminaries Reduction to a simpler case The main algorithm (reduced case)

5 4 Genome Modeling [Figure: chromosomes drawn as strings of signed genes: +4 +3 +2 +1 | +3 -2 +1 | +5 | +7 +6 -4]

6 5 Genome Modeling [Figure: signed gene strings +3 -2 +1 | +5 | +7 +6 -4 | -2 -3 -4, illustrating a chromosome flip]

7 6 Reciprocal Translocations
Exchange non-empty ends between two chromosomes X = X1 X2 and Y = Y1 Y2
– Prefix-prefix: (X1 X2), (Y1 Y2) → (X1 Y2), (Y1 X2)
– Prefix-postfix: (X1 X2), (Y1 Y2) → (X1 -Y1), (-X2 Y2)
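The two translocation types can be sketched in Python as follows. This is a minimal illustration, assuming chromosomes are tuples of signed integers and that all four pieces X1, X2, Y1, Y2 must be non-empty; the function names and the cut indices `i`, `j` are illustrative, not from the slides.

```python
def flip(chrom):
    """Read a chromosome from its other end: reverse the order, negate the signs."""
    return tuple(-g for g in reversed(chrom))

def _split(x, y, i, j):
    """Cut x after its first i genes and y after its first j genes."""
    if not (0 < i < len(x) and 0 < j < len(y)):
        raise ValueError("all four pieces must be non-empty")
    return x[:i], x[i:], y[:j], y[j:]

def prefix_prefix(x, y, i, j):
    """(X1 X2), (Y1 Y2) -> (X1 Y2), (Y1 X2); no gene changes sign."""
    x1, x2, y1, y2 = _split(x, y, i, j)
    return x1 + y2, y1 + x2

def prefix_postfix(x, y, i, j):
    """(X1 X2), (Y1 Y2) -> (X1 -Y1), (-X2 Y2); the exchanged pieces are flipped."""
    x1, x2, y1, y2 = _split(x, y, i, j)
    return x1 + flip(y1), flip(x2) + y2

pp = prefix_prefix((1, -5, 6), (3, -4, 2), 2, 1)
ps = prefix_postfix((3, 6), (1, -5, -4, 2), 1, 3)
```

Here `pp` is `((1, -5, -4, 2), (3, 6))` and `ps` is `((3, 4, 5, -1), (-6, 2))`; up to chromosome flips these are two of the genomes that appear in the example run on slides 21 and 22.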

8 7 Sorting by Reciprocal Translocations
Tails{(1, 2, -4), (-3, 5), (6, -8, -7, 9)} = {1, 4, -3, -5, 6, -9}
A → B by reciprocal translocations, where:
– genes(A) = genes(B)
– Tails(A) = Tails(B)
An O(n^3) algorithm (Hannenhalli 96, Bergeron et al. 06)
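The tail set and the two input conditions can be checked with a few lines of Python. This is a sketch under the assumption that a genome is a list of tuples of signed integers; the helper name `same_genes_and_tails` is made up for this note.

```python
def tails(genome):
    """Tails of a genome: the first gene and the negated last gene of each chromosome."""
    out = set()
    for chrom in genome:
        out.add(chrom[0])
        out.add(-chrom[-1])
    return out

def genes(genome):
    """Unsigned gene content of a genome."""
    return {abs(g) for chrom in genome for g in chrom}

def same_genes_and_tails(a, b):
    """The two preconditions stated on the slide."""
    return genes(a) == genes(b) and tails(a) == tails(b)

slide_tails = tails([(1, 2, -4), (-3, 5), (6, -8, -7, 9)])
ok = same_genes_and_tails([(1, -5, 6), (3, -4, 2)], [(1, 2), (3, 4, 5, 6)])
```

`slide_tails` reproduces the set on the slide, and `ok` is true for the genomes used in the later example run.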

9 8 The Cycle Graph
A = {(4, -1), (-3, -2, 5), (6, -7, 8)}
B = {(1, 2, 3), (4, 5), (6, 7, 8)}
cycle graph(A,B), vertices in the order of A: 4^0 4^1 1^1 1^0 | 3^1 3^0 2^1 2^0 5^0 5^1 | 6^0 6^1 7^1 7^0 8^0 8^1
[Figure labels: external, internal, adjacency]
#cycles(A,B) = 3
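The cycle count for this example can be reproduced with a short script. It is a sketch, assuming each gene g contributes two vertices (g, 0) and (g, 1), that neighbouring genes in A are joined by one kind of edge and neighbouring genes in B by another, and that A and B have the same genes and tails; the names `black` and `grey` for the two edge sets are this note's choice.

```python
def extremities(chrom):
    """Expand each signed gene into its two end vertices, in reading order."""
    out = []
    for g in chrom:
        out += [(g, 0), (g, 1)] if g > 0 else [(-g, 1), (-g, 0)]
    return out

def neighbour_map(genome):
    """Map each non-tail vertex to the vertex it touches across a gene boundary."""
    partner = {}
    for chrom in genome:
        ends = extremities(chrom)
        for k in range(1, len(ends) - 1, 2):
            u, v = ends[k], ends[k + 1]
            partner[u], partner[v] = v, u
    return partner

def count_cycles(a, b):
    """Number of alternating cycles formed by the edges of A and the edges of B."""
    black, grey = neighbour_map(a), neighbour_map(b)
    seen, cycles = set(), 0
    for start in black:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:          # walk: edge of A, then edge of B
            seen.add(v)
            w = black[v]
            seen.add(w)
            v = grey[w]
    return cycles

A = [(4, -1), (-3, -2, 5), (6, -7, 8)]
B = [(1, 2, 3), (4, 5), (6, 7, 8)]
n_cycles = count_cycles(A, B)
```

For the slide's genomes `n_cycles` is 3, matching #cycles(A,B) = 3.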

10 9 The Overlap Graph (with Chromosomes)
π_A = (4, -1, -3, -2, 5, 6, -7, 8) (concatenation of A’s chrs)
Overlap graph(A, B, π_A), with edge vertices (1,2), (4,5), (2,3), (6,7), (7,8)
[Figure: vertices in the order of π_A: 4^0 4^1 1^1 1^0 3^1 3^0 2^1 2^0 5^0 5^1 6^0 6^1 7^1 7^0 8^0 8^1; legend: edge, chromosome]

11 10 (Connected) Components
Overlap graph(A, B, π_A), with edge vertices (1,2), (4,5), (2,3), (6,7), (7,8)
bad component = non-trivial internal component
trivial component = adjacency
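The component definitions can be tried out on the running example with the sketch below. It makes several simplifying assumptions that are not on the slides: B consists of positive genes only, chromosome vertices are left out of the overlap graph, two edges overlap when their intervals on the concatenation of A properly cross, and a component counts as internal when every edge in it joins two vertices of the same chromosome of A.

```python
from itertools import combinations

def bad_components(a, b):
    """Non-trivial internal components of the (edge-only) overlap graph."""
    pos, chrom_of, p = {}, {}, 0
    for c, chrom in enumerate(a):                 # positions on the concatenation of A
        for g in chrom:
            for end in ((0, 1) if g > 0 else (1, 0)):
                pos[(abs(g), end)], chrom_of[(abs(g), end)] = p, c
                p += 1
    edges = {}                                    # (g, h) -> (lo, hi, is_internal)
    for chrom in b:
        for g, h in zip(chrom, chrom[1:]):
            u, v = (g, 1), (h, 0)
            lo, hi = sorted((pos[u], pos[v]))
            edges[(g, h)] = (lo, hi, chrom_of[u] == chrom_of[v])
    parent = {e: e for e in edges}                # union-find over edges
    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e
    for e, f in combinations(edges, 2):
        (lo1, hi1, _), (lo2, hi2, _) = edges[e], edges[f]
        if lo1 < lo2 < hi1 < hi2 or lo2 < lo1 < hi2 < hi1:
            parent[find(e)] = find(f)
    comps = {}
    for e in edges:
        comps.setdefault(find(e), []).append(e)
    def trivial(comp):                            # a single edge that is an adjacency
        return len(comp) == 1 and edges[comp[0]][1] - edges[comp[0]][0] == 1
    return [sorted(c) for c in comps.values()
            if all(edges[e][2] for e in c) and not trivial(c)]

A = [(4, -1), (-3, -2, 5), (6, -7, 8)]
B = [(1, 2, 3), (4, 5), (6, 7, 8)]
bad = bad_components(A, B)
```

Under these assumptions the example has one bad component, made of the edges (6,7) and (7,8), while (2,3) is a trivial component.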

12 11 Overview Preliminaries Reduction to a simpler case The main algorithm (reduced case)

13 12 The Reciprocal Translocation Distance
d_RT(A,B) = reciprocal translocation distance
Theorem [Hannenhalli 96, Bergeron et al. 06]: d_RT(A,B) = #genes - #chrs - #cycles(A,B) + F(A,B)
– F(A,B) depends on the topology of the bad components. If there are no bad components then F = 0.
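As a worked instance of the theorem, take the genomes of the example run later in the deck, A = {(1,-5,6), (3,-4,2)} and B = {(1,2), (3,4,5,6)}. Counting by hand (a calculation made for this note, not a figure taken from the slides), the cycle graph has a single cycle and there are no bad components, so F = 0:

```latex
d_{RT}(A,B) = \#\text{genes} - \#\text{chrs} - \#\text{cycles}(A,B) + F(A,B)
            = 6 - 2 - 1 + 0
            = 3
```

This agrees with the example run, whose solution uses three translocations (edges 4, 3 and 1).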

14 13 Reduced Case: No Bad Components Result 1: The problem “Sorting by Reciprocal Translocations” can be reduced to the problem “Sorting by Reciprocal Translocations, No Bad Components” in linear time.

15 14 Reduction’s Main Idea
Isolation: all bad components are found in one chromosome.
Goal: eliminate the bad components without creating
– Maintain two lists of chromosomes:
   Exactly one minimal bad component
   Two or more minimal bad components
– Use prefix-prefix translocations (no sign changes)

16 15 Overview Preliminaries Reduction to a simpler case The main algorithm (reduced case)

17 16 Translocations Defined by External Edges
e = external edge
ρ(e) = the translocation that transforms e into an adjacency
– Increases #cycles(A,B)
– May create a bad component
d_RT(A,B) = #genes - #chrs - #cycles(A,B) + F(A,B)
[Figure: graph G with external edge e between x and y on chromosomes 1 and 2, shown before and after ρ(e)]

18 17 The Main Algorithm
1. Mark all edges (except adjacencies) as “unused”, S ← ∅, L ← ∅
2. While there is an unused external edge e
   a. Mark e as “used”
   b. If ρ(e) ≠ ρ(FIRST(L)): Apply ρ(e) to A and APPEND(S, e)
3. If all the edges are used → return (S, L)
4. While all the unused edges are internal: Undo last translocation and PREPEND(L, POP(S))
5. Goto 2
Solution = “Forward part” (S) followed by “Backward part” (L)

19-24 18-23 The Main Algorithm (example run, one row per slide; the steps are those listed on slide 17)
B = {(1,2), (3,4,5,6)}, edge (i,i+1) identified by i

Slide   L   S     Unused edges   A
18      ∅   ∅     1,3,4,5        (1,-5,6) (3,-4,2)
19      ∅   1     3,4,5          (3,-4,-5,6) (1,2)
20      1   ∅     3,4,5          (1,-5,6) (3,-4,2)
21      1   4     3,5            (3,6) (1,-5,-4,2)
22      1   4,3   5              (-2,6) (1,-5,-4,-3)
23      1   4,3   ∅              (-2,6) (1,-5,-4,-3)
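The steps and the example run above can be sketched in Python. This is an illustration under stated assumptions, not the authors' implementation: genomes are lists of signed tuples, edge i stands for (i, i+1), the smallest unused external edge is picked first (which happens to match the slides), two translocations are considered equal when they produce the same genome, and a FIRST(L) that is not external in the current genome is treated as different. After step 4 the loop resumes at step 2, as the example does by keeping L and the used marks. All function names are made up for this note.

```python
def flip(chrom):
    """Same chromosome read from the other end."""
    return tuple(-g for g in reversed(chrom))

def canon(genome):
    """Form of a genome that ignores chromosome order and flips."""
    return frozenset(min(c, flip(c)) for c in genome)

def locate(genome, gene):
    """Return (chromosome index, chromosome oriented so that gene is positive)."""
    for k, c in enumerate(genome):
        if gene in c:
            return k, c
        if -gene in c:
            return k, flip(c)
    raise ValueError(f"gene {gene} not in genome")

def is_external(genome, i):
    """Edge i is external when genes i and i+1 lie on different chromosomes."""
    return locate(genome, i)[0] != locate(genome, i + 1)[0]

def rho(genome, i):
    """Translocation defined by external edge i: afterwards i is followed by i+1."""
    (kx, x), (ky, y) = locate(genome, i), locate(genome, i + 1)
    if kx == ky:
        raise ValueError(f"edge {i} is not external")
    px, py = x.index(i) + 1, y.index(i + 1)
    rest = [c for k, c in enumerate(genome) if k not in (kx, ky)]
    return rest + [x[:px] + y[py:], y[:py] + x[px:]]

def same_translocation(genome, e, f):
    """True when rho(f) is defined here and gives the same genome as rho(e)."""
    return is_external(genome, f) and canon(rho(genome, e)) == canon(rho(genome, f))

def sort_reduced(a, edges):
    """Return (S, L); applying S and then L to `a` is the proposed solution."""
    genome, used, S, L, history = list(a), set(), [], [], []
    while True:
        while True:                                   # step 2
            external = [e for e in edges if e not in used and is_external(genome, e)]
            if not external:
                break
            e = external[0]
            used.add(e)                               # step 2a
            if not L or not same_translocation(genome, e, L[0]):
                history.append(genome)                # step 2b
                genome = rho(genome, e)
                S.append(e)
        if used == set(edges):                        # step 3
            return S, L
        while not any(is_external(genome, e) for e in edges if e not in used):
            if not S:                                 # nothing left to undo
                raise ValueError("stuck: input is outside the reduced case")
            genome = history.pop()                    # step 4: undo and move to L
            L.insert(0, S.pop())

a = [(1, -5, 6), (3, -4, 2)]
S, L = sort_reduced(a, [1, 3, 4, 5])
result = a
for e in S + L:
    result = rho(result, e)
```

On the slides' example this returns S = [4, 3] and L = [1], and applying the three translocations in that order turns A into B = {(1,2), (3,4,5,6)} up to chromosome flips.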

25 24 Implementation of the Algorithm
Simple O(n^2) time implementation
… time implementation using a data structure that:
– Maintains a fragmented signed permutation
– Allows one to find an external edge e and perform the translocation ρ(e) in … time
– Based on a data structure by Kaplan & Verbin 05

26 25 Thank You !

27 26 Simulating Translocations by Reversals [Hannenhalli & Pevzner]
A translocation can be simulated by:
– A reversal on π_A, or
– A chromosome flip in π_A + a reversal on π_A
[Figure: cycle graph(A,B) drawn for two vertex orders: 1^0 1^1 2^0 2^1 3^0 3^1 4^0 4^1 5^0 5^1 and 1^0 1^1 4^1 4^0 3^1 3^0 2^1 2^0 5^0 5^1]
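The simulation can be made concrete for two chromosomes. The sketch below assumes chromosomes are tuples of signed integers, concatenates X and Y, and reverses the segment between the two cut points; the `flip_y` switch and the function names are illustrative. Without the flip the reversal yields (X1 -Y1), (-X2 Y2); with Y flipped first it yields (X1 Y2) and the flip of (Y1 X2).

```python
def flip(chrom):
    """Same chromosome read from the other end."""
    return tuple(-g for g in reversed(chrom))

def reversal(seq, i, j):
    """Reverse seq[i:j] and negate the signs of its genes."""
    return seq[:i] + flip(seq[i:j]) + seq[j:]

def translocate_via_reversal(x, y, i, j, flip_y=False):
    """Cut x after i genes and y after j genes; simulate the translocation
    by (optionally) flipping y and then applying one reversal to x + y."""
    if flip_y:
        y, j = flip(y), len(y) - j     # the cut point is counted from the other end
    pi = reversal(x + y, i, len(x) + j)
    boundary = i + j                   # new chromosome boundary in the concatenation
    return pi[:boundary], pi[boundary:]

no_flip = translocate_via_reversal((3, 6), (1, -5, -4, 2), 1, 3)
with_flip = translocate_via_reversal((1, -5, 6), (3, -4, 2), 2, 1, flip_y=True)
```

`no_flip` is `((3, 4, 5, -1), (-6, 2))`, a prefix-postfix result, and `with_flip` is `((1, -5, -4, 2), (-6, -3))`, which equals the prefix-prefix result (1,-5,-4,2), (3,6) once the second chromosome is flipped.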

28 27 Working on the overlap graph
H = overlap graph(A, B, π_A)
H is sorted if every component is trivial
Operations:
– ρ(v): a reversal on an oriented external vertex v (cost = 1)
– ρ(X): a flip on chromosome X (cost = 0)

29 28 H●ρ(v) (two chromosomes only) [Figure: overlap graph H before and after applying ρ(v) at vertex v; legend: unoriented edge, oriented edge, chromosome]

30 29 H●ρ(X) [Figure: overlap graph H before and after applying ρ(X) to chromosome X; legend: unoriented edge, oriented edge, chromosome]

