Published by Abbie Blong. Modified over 3 years ago.

1
Fixed-Parameter Algorithms for CLOSEST STRING and Related Problems
Algorithmica (2003)
Jens Gramm, Rolf Niedermeier, Peter Rossmanith

2
Outline
Introduction
Preliminaries
Linear-Time solution for constant d
Related Problems
Linear-Time solution for fixed k
Conclusion

3
Intro: Problem Definition
Input: Strings s_1, s_2, …, s_k over alphabet Σ, each of length L, and a nonnegative integer d.
Question: Is there a string s of length L such that d_H(s, s_i) ≤ d for all i = 1, …, k?
Hamming distance: d_H(s_1, s_2) = |{i | s_1[i] ≠ s_2[i]}|, where |s_1| = |s_2|.
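The definition above can be checked directly. The following is a minimal sketch, not code from the paper: the function names `hamming` and `is_center` are made up for illustration, and the four example strings are only sample input.

```python
def hamming(s, t):
    """Hamming distance d_H(s, t) between two equal-length strings."""
    if len(s) != len(t):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(s, t))


def is_center(s, strings, d):
    """True if d_H(s, s_i) <= d for every input string s_i."""
    return all(hamming(s, t) <= d for t in strings)


# Illustrative input (an assumption, chosen only as sample data).
strings = ["abcd", "aadb", "bcda", "accc"]
print(hamming("abcd", "aadb"))
print(is_center("acca", strings, 3))
```

Checking one candidate costs O(kL); the hard part of CLOSEST STRING is finding a candidate, not verifying it.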

4
NP-completeness
CLOSEST STRING is NP-complete
d is usually small in biological applications
O(kL + kd·d^d) result in this paper
PTAS by Li et al.

5
Extended problems
d-MISMATCH
DISTINGUISHING STRING SELECTION
DISTINGUISHING SUBSTRING SELECTION

6
Preliminaries
Given a set of strings S = {s_1, …, s_k}, each of length L.
s is an optimal center string iff there is no string s' such that max_{i=1,…,k} d_H(s', s_i) < max_{i=1,…,k} d_H(s, s_i).

7
Given a set of k strings of length L, think of these strings as a k × L matrix.
s1  a b c d
s2  a a d b
s3  b c d a
s4  a c c c
Optimal median string: a c c a

8
Main idea
Search!
Fixed-parameter tractability
Reduction to problem kernel

9
LEMMA 1. Given a set of strings S = {s_1, …, s_k}, each of length L, and a permutation σ: {1, …, L} → {1, …, L}. Then s is an optimal center string for {s_1, …, s_k} iff σ(s) is an optimal center string for {σ(s_1), σ(s_2), …, σ(s_k)}.

10
LEMMA 2. To compute an optimal center string, it is sufficient to solve a normalized and reordered instance. From this, the solution of the original instance can be derived in linear time.
Original instance:
s1  a b c d
s2  a a d b
s3  b c d a
s4  a c c c
After normalizing each column:
s1  a b a a
s2  a c b b
s3  b a b c
s4  a a a d
After reordering the columns:
s1  b a a a
s2  c a b b
s3  a b b c
s4  a a a d
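A minimal sketch of column normalization follows. The renaming rule is an assumption, not something the slide states: within each column the most frequent symbol becomes a, the next most frequent b, and so on, with ties broken by first occurrence. This rule reproduces the normalized matrix in the example. The function name `normalize_columns` is hypothetical.

```python
from collections import Counter


def normalize_columns(strings):
    """Rename symbols column by column (assumed rule: by decreasing frequency,
    ties broken by first occurrence). Supports at most 26 symbols per column."""
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    k, length = len(strings), len(strings[0])
    columns = []
    for p in range(length):
        col = [s[p] for s in strings]
        counts = Counter(col)
        first_seen = {}
        for idx, ch in enumerate(col):
            first_seen.setdefault(ch, idx)
        # Rank symbols: most frequent first, earlier occurrence wins ties.
        ranked = sorted(counts, key=lambda ch: (-counts[ch], first_seen[ch]))
        rename = {ch: alphabet[r] for r, ch in enumerate(ranked)}
        columns.append([rename[ch] for ch in col])
    # Transpose the renamed columns back into k row strings.
    return ["".join(columns[p][i] for p in range(length)) for i in range(k)]


print(normalize_columns(["abcd", "aadb", "bcda", "accc"]))
```

Because the renaming is a bijection within each column, Hamming distances between rows are unchanged, which is why a solution of the normalized instance maps back to the original one.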

11
LEMMA 3. A CLOSEST STRING instance with arbitrary alphabet Σ, |Σ| > k, is isomorphic to a CLOSEST STRING instance with alphabet Σ', |Σ'| = k.
By normalization.

12
LEMMA 4. Given a CLOSEST STRING instance s_1, …, s_k of length L and d. If the resulting k × L matrix has more than kd dirty columns, then there is no string s with max_{i=1,…,k} d_H(s, s_i) ≤ d.
A column is dirty iff it contains at least two different symbols from the alphabet Σ.
By the pigeonhole principle.
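Lemma 4 gives a simple preprocessing step. The sketch below is illustrative only; the names `dirty_columns` and `kernel_check` are hypothetical, and the decision to return the dirty-column projection is an assumption about how one might use the lemma.

```python
def dirty_columns(strings):
    """Indices of columns that contain at least two different symbols."""
    length = len(strings[0])
    return [p for p in range(length) if len({s[p] for s in strings}) > 1]


def kernel_check(strings, d):
    """Apply Lemma 4. Return None if the instance can be rejected outright,
    otherwise the strings restricted to their dirty columns."""
    k = len(strings)
    dirty = dirty_columns(strings)
    if len(dirty) > k * d:
        return None  # more than k*d dirty columns: no center string exists
    return ["".join(s[p] for p in dirty) for s in strings]


print(kernel_check(["aaab", "aaac", "aaad"], 1))
print(kernel_check(["ab", "ba"], 0))
```

Clean columns can be dropped because every input string agrees there, so the center string simply copies that symbol. What remains has at most kd columns.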

13
A Linear-Time solution for constant d
Bounded search tree algorithm
LEMMA 5. Given a set of strings S = {s_1, …, s_k} and a positive integer d. If there are i, j ∈ {1, …, k} with d_H(s_i, s_j) > 2d, then there is no string s with max_{i=1,…,k} d_H(s, s_i) ≤ d.
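Lemma 5 follows from the triangle inequality for Hamming distance. A one-line derivation, assuming a string s with d_H(s, s_i) ≤ d for all i exists:

```latex
d_H(s_i, s_j) \;\le\; d_H(s_i, s) + d_H(s, s_j) \;\le\; d + d \;=\; 2d
```

So any pair with d_H(s_i, s_j) > 2d rules out such an s.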

15
Theorem 1. Given a set of strings S = {s_1, …, s_k} and an integer d, Algorithm D determines in O(kL + kd·d^d) time whether there is a string s with max_{i=1,…,k} d_H(s, s_i) ≤ d.
By Lemma 4, the input instance is reduced to O(kd) columns in O(kL) time.
Search tree depth = d.
Time(D0 + D1 + D2 + D3) = O(kd), by building a table containing the distances of the candidate s_1 to all other given strings.

16
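Correctness
Show only the correctness of the first step.
Suppose s_1 is not a solution, but there exists a center string s. Then some s_i has d_H(s_1, s_i) > d.
P := a set of d+1 positions p with s_1[p] ≠ s_i[p], so |P| = d+1.
P_{s1≠s=si} := {p | s_1[p] ≠ s[p] = s_i[p]}. Goal: hit a position in this set.
Every p ∈ P lies either in P_{s1≠s=si} or in P_{s≠si} := {p | s[p] ≠ s_i[p]} (the two sets are disjoint), and |P_{s≠si}| ≤ d.
Since |P| = d+1, at least one p ∈ P lies in P_{s1≠s=si}. So d+1 subcases are sufficient.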
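The argument above can be turned into a small recursive search. This is a sketch of the standard bounded search tree idea under stated assumptions, not a transcription of Algorithm D (its steps D0 to D3 are not included in this transcript). The names `closest_string` and `search` are hypothetical, and the sketch omits the distance table and the kernel reduction.

```python
def hamming(s, t):
    """Hamming distance between two equal-length strings."""
    return sum(a != b for a, b in zip(s, t))


def closest_string(strings, d):
    """Return a string within distance d of every input string, or None."""

    def search(candidate, budget):
        # Look for an input string the candidate is still too far from.
        for t in strings:
            if hamming(candidate, t) > d:
                if budget == 0:
                    return None  # no modifications left on this branch
                diff = [p for p in range(len(t)) if candidate[p] != t[p]]
                # Branching on d + 1 of the differing positions suffices.
                for p in diff[: d + 1]:
                    moved = candidate[:p] + t[p] + candidate[p + 1:]
                    found = search(moved, budget - 1)
                    if found is not None:
                        return found
                return None
        return candidate  # every input string is within distance d

    if not strings:
        return None
    return search(strings[0], d)


print(closest_string(["abcd", "aadb", "bcda", "accc"], 3))
print(closest_string(["abcd", "aadb", "bcda", "accc"], 1))
```

The tree has depth at most d and branching factor at most d+1, which is where the d^d-type factor in the running time comes from.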

17
Related Problems
d-MISMATCH problem
s_{i,p,L} denotes the length-L substring of a given string s_i (of length n) starting at position p.
Question: Is there a string s of length L and a position p with 1 ≤ p ≤ n−L+1 such that d_H(s, s_{i,p,L}) ≤ d for all i?
Stojanovic et al. give a linear-time algorithm for 1-MISMATCH.
Theorem 2. d-MISMATCH is solvable in O(kL + (n−L)·kd·d^d) time, which is O(n·k) for fixed d.
Naively: O(n·(kL + kd·d^d))
Maintain the queue of dirty columns.
Considering only the first L columns, we can build a FIFO queue in O(kL).
Update at each position in O(k) time.
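The queue maintenance can be sketched as a sliding window over columns. This is an illustrative sketch only: the name `candidate_windows` is hypothetical, positions are 0-based here (the slide uses 1-based p), and the function only applies the Lemma 4 filter. Each surviving window would still need the bounded search tree step.

```python
from collections import deque


def candidate_windows(strings, L, d):
    """Yield 0-based start positions whose length-L window has at most
    k*d dirty columns, so Lemma 4 does not rule the window out."""
    k = len(strings)
    n = len(strings[0])

    def is_dirty(col):
        # O(k) check of a single column.
        return len({s[col] for s in strings}) > 1

    queue = deque()  # dirty column indices in the current window, in order
    for col in range(n):
        if is_dirty(col):
            queue.append(col)
        start = col - L + 1
        if start < 0:
            continue  # the first window is not complete yet
        # Drop dirty columns that slid out of the window on the left.
        while queue and queue[0] < start:
            queue.popleft()
        if len(queue) <= k * d:
            yield start


print(list(candidate_windows(["aabca", "aacba"], 2, 0)))
```

Each column enters and leaves the queue once, so moving the window by one position costs O(k) for the dirtiness check plus amortized constant queue work.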

18
DSS problem
DISTINGUISHING STRING SELECTION
Given S = {s_1, …, s_k1} and S' = {s'_1, …, s'_k2}, all of the same length L, and d_1, d_2 ≥ 0, is there a string s such that …
LEMMA 6. Given two sets of strings S_1 = {s_1, …, s_k1} and S_2 = {s'_1, …, s'_k2} and positive d_1, d_2. If there are i ∈ {1, …, k_1} and j ∈ {1, …, k_2} with d_H(s_i, s'_j) …

19
A Linear-Time Solution for Fixed k
Is CLOSEST STRING fixed-parameter tractable?
Use integer linear programming (ILP)
Lenstra: ILP with a fixed number of variables can be solved in linear time (exponential space)

20
CLOSEST STRING in ILP
Column types for k
For k = 3: (a,a,a)^t, (a,a,b)^t, (a,b,a)^t, (b,a,a)^t, (a,b,c)^t
|column types| = B(k) ≤ k!
x_{t,φ}, with t a column type and φ ∈ Σ: the number of columns of type t whose corresponding character in the desired solution string of CLOSEST STRING is set to φ
B(k)·k variables needed
Minimize …
φ_{t,i} denotes the alphabet symbol at the i-th entry of column type t
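The count of column types can be checked by enumeration. The sketch below assumes that two columns have the same type when one is a renaming of the other, and it represents each type by the tuple in which new symbols appear in the order a, b, c, … (so the slide's (b,a,a) appears here as (a,b,b)). The name `column_types` is hypothetical.

```python
def column_types(k):
    """All column types for k strings, each as a tuple normalized so that
    new symbols appear in the order a, b, c, ... (supports k <= 26)."""
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    types = []

    def extend(prefix, used):
        if len(prefix) == k:
            types.append(tuple(prefix))
            return
        # The next entry is a symbol already used or the next fresh one.
        for idx in range(used + 1):
            extend(prefix + [alphabet[idx]], max(used, idx + 1))

    extend([], 0)
    return types


for t in column_types(3):
    print(t)
print(len(column_types(4)))
```

The number of tuples produced is the Bell number B(k), the number of ways to partition the k rows into groups that share a symbol.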

21
Conclusion
Fixed-parameter tractability for CLOSEST STRING in d, k
Improves previous work on d-MISMATCH
DSS
CLOSEST SUBSTRING?
