String matching.


1 String matching

2 Exact String Matching
Input: two strings T[1…n] and P[1…m], containing symbols from an alphabet Σ.
Example: Σ = {A,C,G,T}, T[1…12] = “CAGTACATCGAT”, P[1…3] = “AGT”.
Goal: find all “shifts” 0 ≤ s ≤ n-m such that T[s+1…s+m] = P.

3 Simple Algorithm
for s ← 0 to n-m
    Match ← 1
    for j ← 1 to m
        if T[s+j] ≠ P[j] then
            Match ← 0
            exit loop
    if Match = 1 then output s
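The simple algorithm above could be sketched in Python roughly as follows. This is a minimal illustration, not code from the slides: the function name naive_match is hypothetical, and Python strings are 0-indexed, so text[s + j] with j starting at 0 plays the role of T[s+j] with j starting at 1. The reported shift s has the same value in both conventions.

```python
def naive_match(text: str, pattern: str) -> list[int]:
    """Return every shift s (0 <= s <= n-m) where pattern occurs in text."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):          # s = 0 .. n-m (empty range if m > n)
        match = True
        for j in range(m):              # compare pattern[j] with text[s+j]
            if text[s + j] != pattern[j]:
                match = False
                break                   # stop at the first mismatch
        if match:
            shifts.append(s)
    return shifts


# The example from the slides: "AGT" occurs at shift s = 1.
print(naive_match("CAGTACATCGAT", "AGT"))  # [1]
```

Stopping at the first mismatch is what makes the average-case behaviour much better than the O(nm) worst case.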

4 Analysis
Running time of the simple algorithm:
Worst-case: O(nm)
Average-case (random text): O(n) (in expectation)
T_s = time spent checking shift s (the number of comparisons up to and including the first mismatch)
E[T_s] < 2 (why?)
E[Σ_s T_s] = Σ_s E[T_s] = O(n)
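One way to answer the "why" is the short derivation below. It rests on assumptions the slide does not spell out: each text symbol is drawn independently and uniformly from Σ, and |Σ| ≥ 2. Under those assumptions the k-th comparison at shift s is made only if the first k-1 comparisons all matched, which happens with probability |Σ|^{-(k-1)}.

```latex
\begin{align*}
\Pr[T_s \ge k] &= \lvert\Sigma\rvert^{-(k-1)}, \qquad 1 \le k \le m \\
E[T_s] &= \sum_{k=1}^{m} \Pr[T_s \ge k]
        = \sum_{k=1}^{m} \lvert\Sigma\rvert^{-(k-1)}
        \le \sum_{k=0}^{m-1} 2^{-k}
        = 2 - 2^{-(m-1)} < 2 \\
E\Bigl[\sum_{s=0}^{n-m} T_s\Bigr] &= \sum_{s=0}^{n-m} E[T_s] < 2\,(n-m+1) = O(n)
\end{align*}
```

The last line uses only linearity of expectation, so it holds even though the T_s for overlapping shifts are not independent.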

5 Approximate String Matching
Input: two strings T[1…n] and P[1…m], containing symbols from an alphabet Σ.
Goal: find all “shifts” 0 ≤ s ≤ n-m such that T[s+1…s+m] is “highly similar” to P.

6 Two common metrics for comparing strings
Given two strings T[1…n] and P[1…m]:
Hamming distance: the number of positions at which the two strings differ, i.e. the number of substitutions needed to turn one into the other. Defined only when n = m.
Edit distance: the minimum number of edit operations (substitutions, insertions, and deletions) needed to transform one string into the other.
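To make the two metrics concrete, here is a small Python sketch. The function names hamming_distance and edit_distance are illustrative only. The edit distance is computed with the standard dynamic-programming recurrence over prefixes, which the slides do not cover, and every operation is assumed to cost 1.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions where a and b differ; requires equal lengths."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions (unit cost)."""
    # prev[j] = edit distance between the current prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]                                # a[:i] -> "" needs i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                      # delete x
                curr[j - 1] + 1,                  # insert y
                prev[j - 1] + (x != y),           # substitute, or free match
            ))
        prev = curr
    return prev[-1]


print(hamming_distance("AGT", "ACT"))       # 1
print(edit_distance("kitten", "sitting"))   # 3
```

Unlike the Hamming distance, the edit distance is defined for strings of different lengths, because insertions and deletions can absorb the difference.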

7 Simple Algorithm for Hamming Distance
for s ← 0 to n-m
    Mismatch ← 0
    for j ← 1 to m
        if T[s+j] ≠ P[j] then
            Mismatch ← Mismatch + 1
            if Mismatch > threshold then exit loop
    if Mismatch ≤ threshold then output s
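The threshold version could be sketched in Python along the same lines. Again this is only an illustration under the same 0-indexing convention; hamming_match is a hypothetical name, and threshold is the largest number of mismatches that still counts as a match.

```python
def hamming_match(text: str, pattern: str, threshold: int) -> list[int]:
    """Return every shift s where text[s:s+m] is within `threshold`
    mismatches (Hamming distance) of pattern."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if text[s + j] != pattern[j]:
                mismatches += 1
                if mismatches > threshold:
                    break               # this shift can no longer qualify
        if mismatches <= threshold:
            shifts.append(s)
    return shifts


# With threshold 0 this reduces to exact matching.
print(hamming_match("CAGTACATCGAT", "AGT", 0))  # [1]
print(hamming_match("ACTAGT", "AGT", 1))        # [0, 3]
```

The early exit keeps the work per shift small when the threshold is low, but the worst case remains O(nm).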

