Published by Deonte Summerhays. Modified over 2 years ago.

1
Longest Common Rigid Subsequence
Bin Ma and Kaizhong Zhang
Department of Computer Science, University of Western Ontario, Ontario, Canada.

2
(Rigid) Subsequence
Subsequence:
    COMBINATORIALPATTERNMATCHING
    CPM
Rigid Subsequence:
    0123456789012345678901234567
    COMBINATORIALPATTERNMATCHING
    CPM, (13,7)
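The rigid-subsequence example above can be checked mechanically. The following is a minimal sketch, assuming that the tuple (13,7) lists the distances between consecutive matched positions (C at index 0, P at 13, M at 20). The function names `occurs_at` and `find_rigid` are illustrative and do not come from the slides.

```python
def occurs_at(text, letters, gaps, start):
    """Return True if `letters`, spaced by `gaps`, occurs in `text` starting at `start`."""
    pos = start
    for k, ch in enumerate(letters):
        if k > 0:
            pos += gaps[k - 1]  # jump by the fixed distance to the next letter
        if pos >= len(text) or text[pos] != ch:
            return False
    return True


def find_rigid(text, letters, gaps):
    """Return every start index at which the rigid pattern occurs in `text`."""
    if len(gaps) != len(letters) - 1:
        raise ValueError("need exactly one gap between each pair of consecutive letters")
    return [s for s in range(len(text)) if occurs_at(text, letters, gaps, s)]


text = "COMBINATORIALPATTERNMATCHING"
print(find_rigid(text, "CPM", (13, 7)))  # [0]
print(find_rigid(text, "CPM", (13, 6)))  # []
```

An ordinary subsequence only requires the letters to appear in order; the rigid variant additionally fixes the spacing, which is why changing a single gap makes the pattern disappear.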

3
Common (Rigid) Subsequence
Longest Common Subsequence (LCS)
– combinatorial pattern matching
– longest common rigid subsequence
    comnienc
Longest Common Rigid Subsequence (LCRS)
– combinatorial pattern matching
– longest common rigid subsequence
    comni, (1,1,3,5)
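For two strings, a common rigid subsequence keeps the same spacing in both, so its occurrences differ only by a constant shift. One simple way to compute an LCRS of two strings is therefore to try every relative shift and collect the matching positions of that ungapped alignment. The sketch below assumes this shift-based view and is not taken from the slides. The name `lcrs_two` is illustrative, and spaces are treated as ordinary characters.

```python
def lcrs_two(s, t):
    """Return (letters, gaps) of a longest common rigid subsequence of s and t.

    For each relative shift, position i of s is aligned with position i - shift of t;
    the matching positions of one alignment form a common rigid subsequence.
    """
    best = ("", ())
    for shift in range(-(len(t) - 1), len(s)):
        lo = max(0, shift)
        hi = min(len(s), len(t) + shift)
        matches = [i for i in range(lo, hi) if s[i] == t[i - shift]]
        if len(matches) > len(best[0]):
            letters = "".join(s[i] for i in matches)
            gaps = tuple(b - a for a, b in zip(matches, matches[1:]))
            best = (letters, gaps)
    return best


print(lcrs_two("abcab", "xbxab"))  # ('bab', (2, 1))
```

This takes time proportional to the product of the two lengths, in line with the polynomial-time solvability of the two-string case. Ties between equally long candidates are broken by whichever shift is tried first.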

4
Previous Results
LCS and LCRS of two strings:
– Polynomial-time solvable.
LCS of many strings:
– Cannot be approximated within ratio n^δ, for some constant δ > 0, in polynomial time unless P = NP (Jiang and Li 1995, SIAM J COMP).
– For random instances, a simple greedy algorithm gives an almost optimal solution with only a small error.
LCRS of many strings:
– Exponential time algorithms.
– Our CPM paper tries to answer the time complexity.

5
Motivation in Bioinformatics
– In biochemistry, a motif is a recurring pattern in DNA/protein sequences.
– A protein motif (SH3 domain binding motif) in J. Biological Chemistry 269:24034-9.
– Many motifs can be found in the PROSITE database of ExPASy.

6
Motivation
Rigoutsos and Floratos proposed the following problem (Bioinformatics 14:55-67, 1998).
– Given n strings and a positive number K, find a longest “rigid pattern” (rigid subsequence) that occurs in at least K of the n strings.
When K = n, it is LCRS.
Exponential time algorithms were studied. NP-hardness unknown.

7
Our Results
LCRS is MAX-SNP hard.
– Therefore, Rigoutsos and Floratos’ problem is also MAX-SNP hard.
For random instances, there is an algorithm that solves LCRS with quasi-polynomial average running time.
– The algorithm also works for Rigoutsos and Floratos’ problem with simple modifications.

8
MAX-SNP hard
L-reduction from Max-Cut.
[Figure: the constructed strings, with parts labeled vertex, edge, and delimiter.]

9
The construction of each edge
Three possible configurations in an ungapped alignment:
– aaa, aba, bab: contributes 0
– aaa, aba, bab: contributes 1
– aaa, aba, bab: contributes 1

10
The Algorithm
Let S_i be the set of length-i common rigid subsequences. We only need to prove that

11
Sketch of Proof
– For each rigid subsequence in S_i, the probability that it occurs in one random string of length n
– The probability that it occurs in every input string
– There are in total length-i rigid subsequences.
– This can be done by two cases, split at i = 2 log n.
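The formulas for the two probabilities are missing from this transcript. The following is a standard union-bound estimate, offered as a sketch under stated assumptions and not as the slide's own expressions. Assume every character is drawn independently and uniformly from an alphabet of size σ, and that there are m input strings, each of length n (σ and m are symbols introduced here). Consider a fixed rigid pattern with i letters. It can start at no more than n positions, and each start matches with probability σ^{-i}; the m strings are independent.

```latex
\Pr[\text{pattern occurs in one random string}] \;\le\; n\,\sigma^{-i},
\qquad
\Pr[\text{pattern occurs in every input string}] \;\le\; \bigl(n\,\sigma^{-i}\bigr)^{m}.
```

Once i exceeds \log_\sigma n, the single-string bound drops below 1, so the all-strings bound shrinks geometrically in m.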

12
Acknowledgement Supported by NSERC, PREA and CRC.
