Inferring Phylogeny using Permutation Patterns on Genomic Data 1 Md Enamul Karim 2 Laxmi Parida 1 Arun Lakhotia 1 University of Louisiana at Lafayette.


1 Inferring Phylogeny using Permutation Patterns on Genomic Data. Md Enamul Karim (1), Laxmi Parida (2), Arun Lakhotia (1). (1) University of Louisiana at Lafayette; (2) IBM T. J. Watson Research Center

2 Phylogeny Reconstruction of the evolutionary relationship of a collection of organisms, usually in the form of a tree.

3 Phylogenetic data Behavioral, morphological, metabolic, etc. Molecular data: sequence data, gene-order data, etc. This talk focuses on gene-order data.

4 Why gene-order data? Low error rate. Rearrangements are rare evolutionary events that are unlikely to cause "silent" changes, so they can help infer evolutionary history spanning millions of years.

5 Genome rearrangements
Original order: 1 2 3 4 5 6 7 8 9 10
Inversion: 1 2 3 -8 -7 -6 -5 -4 9 10
Transposition: 1 2 3 9 4 5 6 7 8 10
Inverted transposition: 1 2 3 9 -8 -7 -6 -5 -4 10
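The three rearrangements on this slide can be reproduced with a few list operations. The following is a minimal sketch, assuming a genome is stored as a Python list of signed integers; the function names and the half-open index convention are illustrative choices, not taken from the talk.

```python
from typing import List

Genome = List[int]  # signed gene order; a negative sign marks a reversed gene


def inversion(g: Genome, i: int, j: int) -> Genome:
    """Reverse the segment g[i:j] and flip the sign of every gene in it."""
    return g[:i] + [-x for x in reversed(g[i:j])] + g[j:]


def transposition(g: Genome, i: int, j: int, k: int) -> Genome:
    """Move the segment g[i:j] so it lands just before position k (k >= j)."""
    return g[:i] + g[j:k] + g[i:j] + g[k:]


def inverted_transposition(g: Genome, i: int, j: int, k: int) -> Genome:
    """Move the segment g[i:j] before position k and invert it on the way."""
    moved = [-x for x in reversed(g[i:j])]
    return g[:i] + g[j:k] + moved + g[k:]


original = list(range(1, 11))
print(inversion(original, 3, 8))               # segment 4..8 reversed and negated
print(transposition(original, 3, 8, 9))        # segment 4..8 moved after gene 9
print(inverted_transposition(original, 3, 8, 9))
```

Applied to the original order 1..10 with the segment 4..8, these calls give the three rearranged genomes listed on the slide.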

6 Breakpoint distance The breakpoint distance is the number of adjacencies present in one genome but not in the other. Example: 1 2 3 4 5 6 7 8 9 10 versus 1 -3 -2 4 5 9 6 7 8 10. For some datasets, a close-to-linear relationship between the number of breakpoints and the number of evolutionary events may exist. Can be used for building a phylogeny (Blanchette et al.).
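The breakpoint count for the two example genomes can be checked with a short sketch. It assumes linear genomes without artificial end markers and treats the adjacency (a, b) as identical to (-b, -a), since both describe the same neighbouring genes read from opposite strands. Some formulations add framing genes at both ends, which would change the count; the function names here are illustrative.

```python
def adjacencies(genome):
    """Return every adjacency of a signed genome, in both reading directions."""
    adj = set()
    for a, b in zip(genome, genome[1:]):
        adj.add((a, b))
        adj.add((-b, -a))  # same adjacency seen from the opposite strand
    return adj


def breakpoints(g1, g2):
    """Count adjacencies of g1 that do not occur in g2."""
    other = adjacencies(g2)
    return sum(1 for pair in zip(g1, g1[1:]) if pair not in other)


g1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
g2 = [1, -3, -2, 4, 5, 9, 6, 7, 8, 10]
print(breakpoints(g1, g2))  # 5
```

Under these assumptions the broken adjacencies of the first genome are (1,2), (3,4), (5,6), (8,9) and (9,10); the pair (2,3) survives as (-3,-2).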

7 Limitations of breakpoint distance The number of breakpoints created by a given number of inversions may vary. Also, transpositions generally create more breakpoints than inversions. Computing the breakpoint phylogeny is NP-hard.

8 MPBE (Maximum Parsimony on Binary Encoding) A heuristic for the breakpoint phylogeny (Cosner et al.). All ordered pairs of signed genes appearing consecutively are coded as binary features. Exponential time complexity, but much faster than BPAnalysis.
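The binary-encoding step can be illustrated as follows. This is only a sketch of the encoding, not of the maximum-parsimony search, and it is not the MPBE implementation itself. It assumes that the pair (a, b) and its reverse-strand reading (-b, -a) are merged into one feature; the function name and the toy genomes are hypothetical.

```python
def adjacency_features(genomes):
    """Encode each genome as a 0/1 vector over all adjacencies seen in any genome.

    `genomes` maps a genome name to its signed gene order.
    """
    def pairs(g):
        # Canonical form, so (a, b) and (-b, -a) collapse into a single feature.
        return {min((a, b), (-b, -a)) for a, b in zip(g, g[1:])}

    per_genome = {name: pairs(g) for name, g in genomes.items()}
    features = sorted(set().union(*per_genome.values()))
    matrix = {
        name: [int(f in present) for f in features]
        for name, present in per_genome.items()
    }
    return features, matrix


# Toy input, invented for illustration only.
features, matrix = adjacency_features({"A": [1, 2, 3], "B": [1, -3, -2]})
print(features)
print(matrix)
```

Each row of `matrix` could then be handed to a parsimony or distance method as a character vector.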

9 Limitations May fail to find feasible solutions to the breakpoint phylogeny problem.

10 Observation: the closer the evolutionary history, the more permutations (at different granularities) the genomes have in common.
1 2 3 4 5 6 7 8 9 10
1 2 3 -8 -7 -6 -5 -4 9 10
1 8 -3 -2 -7 -6 -5 -4 9 10

11 Maximal pi-pattern (Eres et al.) Matches permutations at different granularity. Polynomial time complexity.

12 pi-pattern A pattern that occurs as at least k permutations. Example: for S = acbcab and k = 2, all pi-patterns are: ac, bc, abc, abcc.
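The example can be checked by brute force. This sketch slides a window of every size over S and groups windows by their sorted characters, so two windows match when one is a permutation of the other. It is not the polynomial-time algorithm of Eres et al., and the `min_size=2` cutoff, which skips single-character patterns, is an assumption made to match the slide.

```python
from collections import defaultdict


def pi_patterns(s, k, min_size=2):
    """Map each pi-pattern (as sorted characters) to its occurrence intervals.

    An occurrence is a half-open interval (start, end) of s whose characters
    are a permutation of the pattern. Patterns with fewer than k occurrences
    are dropped.
    """
    occurrences = defaultdict(list)
    n = len(s)
    for size in range(min_size, n + 1):
        for start in range(n - size + 1):
            key = "".join(sorted(s[start:start + size]))
            occurrences[key].append((start, start + size))
    return {p: occ for p, occ in occurrences.items() if len(occ) >= k}


print(sorted(pi_patterns("acbcab", 2), key=len))  # ['ac', 'bc', 'abc', 'abcc']
```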

13 Cover P1 covers P2 if every occurrence of P1 contains an occurrence of P2, and every occurrence of P2 lies within an occurrence of P1. Example: in S = acbcab, abc covers ac.

14 Maximal pi-pattern A pi-pattern that is not covered by any other pi-pattern. Example: in S = acbcab, the pi-patterns are ac, bc, abc, abcc and the maximal pi-patterns are abc and abcc (abc is not covered by abcc).
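The cover relation and the maximality filter can be sketched on top of a brute-force occurrence table. As before, this is an illustrative quadratic enumeration rather than the algorithm of Eres et al.; the helper names and the `min_size=2` cutoff are assumptions.

```python
from collections import defaultdict


def occurrence_table(s, k, min_size=2):
    """Map each pi-pattern of s to its half-open occurrence intervals."""
    table = defaultdict(list)
    for size in range(min_size, len(s) + 1):
        for start in range(len(s) - size + 1):
            key = "".join(sorted(s[start:start + size]))
            table[key].append((start, start + size))
    return {p: occ for p, occ in table.items() if len(occ) >= k}


def covers(occ1, occ2):
    """True if every P1 occurrence contains a P2 one and every P2 lies in a P1."""
    def inside(inner, outer):
        return outer[0] <= inner[0] and inner[1] <= outer[1]

    every_p1_has_p2 = all(any(inside(o2, o1) for o2 in occ2) for o1 in occ1)
    every_p2_in_p1 = all(any(inside(o2, o1) for o1 in occ1) for o2 in occ2)
    return every_p1_has_p2 and every_p2_in_p1


def maximal_pi_patterns(s, k):
    """Keep only the pi-patterns that no other pi-pattern covers."""
    occ = occurrence_table(s, k)
    return {
        p2 for p2 in occ
        if not any(p1 != p2 and covers(occ[p1], occ[p2]) for p1 in occ)
    }


print(sorted(maximal_pi_patterns("acbcab", 2)))  # ['abc', 'abcc']
```

In this example ac is covered by abc and bc is covered by abcc. The occurrence of abc at the end of S (cab) does not fit inside any occurrence of abcc, so abc stays maximal.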

15 Results

16 Phylogeny for simulated evolution on synthetic data

17 12 genera of Campanulaceae and the outgroup tobacco

18 Tree1: MPBE tree

19 Tree2: Neighbor joining tree (using a few different distances)

20 Tree3: Neighbor joining tree using permutation patterns. 167 maximal pi-patterns (from 10769 pi-patterns) are used as binary features. XOR distance measure. A distance/similarity matrix is created to find the neighbor joining tree.
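The distance step can be sketched as follows, assuming the XOR distance between two genomes is the number of binary features on which they differ (the Hamming distance). The slide does not spell this out, so it is an assumption. The feature vectors below are invented toy data, not the Campanulaceae features, and the neighbor joining step itself is not shown.

```python
def xor_distance(u, v):
    """Number of positions where two equal-length 0/1 vectors differ."""
    return sum(a ^ b for a, b in zip(u, v))


def distance_matrix(features):
    """Build a symmetric distance matrix from a name -> 0/1 vector mapping."""
    names = list(features)
    matrix = [
        [xor_distance(features[a], features[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical presence/absence vectors for three genomes over five patterns.
toy = {
    "G1": [1, 1, 0, 1, 0],
    "G2": [1, 0, 0, 1, 1],
    "G3": [0, 0, 1, 1, 1],
}
names, matrix = distance_matrix(toy)
for name, row in zip(names, matrix):
    print(name, row)
```

The resulting matrix is the kind of input a neighbor joining routine expects.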

21 Tree3 vs Tree2

22 Conclusion Permutation patterns may preserve more evolutionary information. Evolutionary events could be counted within permuted segments to develop a hybrid scheme. Current approaches remain unable to handle unequal gene content, which could be solved using maximal pi-patterns.

