S C A L E D Pattern Matching Amihood Amir Ayelet Butman Bar-Ilan University Moshe Lewenstein and Johns Hopkins University Bar-Ilan University.


1 S C A L E D Pattern Matching Amihood Amir Ayelet Butman Bar-Ilan University Moshe Lewenstein and Johns Hopkins University Bar-Ilan University

2 Motivation Searching for Templates in Aerial Photographs Input: Aerial photo Template Task: Search for all locations where the template appears in the image.

3

4 Model Low level (pixel level) avoid costly processing Asymptotically efficient solutions. Serial, exact algorithms.

5 Types of Approximations Local errors: level of detail, occlusion, noise. Results: O(n² log m) mismatches; O(n²k²) edit distance, k errors, rectangular patterns; O(n²k√(m log m)√(k log k)) edit distance, k errors, half rectangular patterns. (AL-88, AF-95)

6 Types of Approximation Orientation. Results: O(n²m⁵) FU-98; O(n²m³) ACL-98. Scaling: Natural scales. Results: O(n) 1-d EV-88; O(n² log |Σ|) 2-d ALV-92; O(n²) dictionary AC-96. Real scales: this result: O(n) 1-d, truncation.

7 It seems daunting, but…

8 CPM 2003: Morelia, Mexico

9 Problem inherently inexact What if occurrence is 1½ times bigger? What is the meaning of “½ a pixel”? Solutions until now: Natural Scales - Consider only discrete scales: 1, 2, 3, 4, 5,...

10 Definition: Text: n × n array. Pattern: m × m array. Find all occurrences of the pattern in the text in all discrete sizes.

11 Discrete exact Scaled Matching T P A A A A A A A A A A A A A A A A A A C C A A C A A A A C C A A A A A C C A A A A A A A C C A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A C C A A A A A C C C A A A A A A A A A A C C C A A A A C A A A A A A A A A A A A A A A A A A A A A A A A C C A C A A A A A A A A A A A A A

12 Discrete exact Scaled Matching P Z U Y K V S X E T P³ Z Z Z U U U Y Y Y K K K V V V S S S X X X E E E T T T
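Discrete scaling as pictured on slide 12 can be sketched in a few lines. This is only an illustration: the function name `scale_up` is hypothetical, and reading P as the 3 × 3 array with rows ZUY, KVS, XET is an assumption about the slide's flattened layout.

```python
def scale_up(pattern, k):
    """Discrete scaling: replace every cell by a k-by-k block of that cell."""
    scaled = []
    for row in pattern:
        # Repeat each symbol k times horizontally ...
        wide = [cell for cell in row for _ in range(k)]
        # ... and repeat the widened row k times vertically.
        scaled.extend(list(wide) for _ in range(k))
    return scaled


# Assumed reading of slide 12: P is a 3 x 3 array.
P = [list("ZUY"), list("KVS"), list("XET")]
for row in scale_up(P, 3):
    print("".join(row))
```

An m × m pattern scaled to k becomes a km × km array, which is why only scales up to n/m need to be considered.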

13 Idea: Fix a scale s. Constant amount of work for each s × s square (s-block) of the n × n text; there are n/s blocks along each side.

14 Algorithm time Time for scale s: O(n²/s²). Total time: Σ_s n²/s² = n² · Σ_s 1/s², where Σ_s 1/s² converges to a constant, making the total time O(n²).

15 Problem: Real scales Was open even for strings … How do we define? Example: aabcccbb. Scaled to 2: aaaabbccccccbbbb. Scaled to 1½: aaa b cccc bbb (1½ b truncates to 1 b, 4½ c truncates to 4 c).
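The truncation rule in this example can be checked mechanically. The sketch below is illustrative only (the name `scale_string` is hypothetical); it scales every run of equal symbols to the floor of its scaled length, using exact fractions to avoid floating-point surprises.

```python
from fractions import Fraction
from itertools import groupby
from math import floor


def scale_string(s, alpha):
    """Scale each run of equal symbols in s to floor(alpha * run_length)."""
    alpha = Fraction(alpha)  # exact arithmetic, so 1.5 * 3 is exactly 9/2
    pieces = []
    for symbol, run in groupby(s):
        run_length = sum(1 for _ in run)
        pieces.append(symbol * floor(alpha * run_length))
    return "".join(pieces)


print(scale_string("aabcccbb", 2))               # aaaabbccccccbbbb
print(scale_string("aabcccbb", Fraction(3, 2)))  # aaabccccbbb
```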

16 Formally: Denote: a^r = aaa...a (r times). Problem Definition 1: Input: Pattern P = a_1^{r_1} a_2^{r_2} ... a_h^{r_h}; Text T. Output: All text locations where a_1^{⌊αr_1⌋} a_2^{⌊αr_2⌋} ... a_h^{⌊αr_h⌋} appears for some α ≥ 1.

17 Remark α ≥ 1 means we only scale “up”. Reasons: Avoid the conceptual problem of loss of resolution. From “far enough” away everything looks the same: by our definition, for α < 1/m every run truncates to length 0, so there is a match at every text location.

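18 Simplify definition Definition 2: Look for the scaled pattern in the text as a sequence of whole runs, i.e. the occurrence is not extended by further copies of its first or last symbol. Example: P = aabcccbbbb. Match by definition 2: daaabccccbbbbbbe. Match by definition 1 but not by def 2: daaaabccccbbbbbbbe.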

19 Why are definitions equivalent? Split text and pattern into a symbol part T_s, P_s and a length part T_L, P_L. Example: P = aabcccbbbb, P_s = abcb, P_L = 2 1 3 4. T = daaabccccbbbbbbe, T_s = dabcbe, T_L = 1 3 1 4 6 1.
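The split into a symbol part and a length part is a run-length encoding. A minimal sketch follows (the name `split_runs` is hypothetical); it runs in time linear in the string length.

```python
from itertools import groupby


def split_runs(s):
    """Return (symbol part, length part) of the run-length encoding of s."""
    symbols, lengths = [], []
    for symbol, run in groupby(s):
        symbols.append(symbol)
        lengths.append(sum(1 for _ in run))
    return "".join(symbols), lengths


print(split_runs("aabcccbbbb"))        # pattern from the slide
print(split_runs("daaabccccbbbbbbe"))  # text from the slide
```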

20 Time Time for split: O(n+m). Finding P_s in T_s: O(n+m) (e.g. KMP). HARD PART: Finding P_L in T_L.

21 Definitions are Equivalent Claim: Solving def 2 in time O(f(n)) ⟹ solving def 1 in time O(f(n)). Why? - Find the def 2 matches of the pattern without its first and last runs in time O(f(n)). - For each match verify the 1st and last symbol in constant time in T_s and T_L. Total time: O(f(n)+n) = O(f(n)).

22 Naïve algorithm for matching P_L in T_L For each text location, position the pattern starting at that location and calculate the interval [t/p, (t+1)/p) for each resulting pair (p, t) of a pattern number and the text number aligned with it. This is the interval of possible scales α, since (t/p)·p = t, so for every α < t/p, ⌊αp⌋ < t; and ((t+1)/p)·p = t+1, so for every α ≥ (t+1)/p, ⌊αp⌋ > t.

23 Check intersection If the intersection of all intervals is not empty then there is a match. Time: O(nm). Example: P_L: 2 1 2 3 2, T_L: 2 4 2 4 7 4 5 3. At location 1 the first two pairs give [1, 3/2) and [4, 5). The intersection is empty, thus there is no scaled match in location 1. But…

24 Check intersection If the intersection of all intervals is not empty then there is a match. Time: O(nm). Example: P_L: 2 1 2 3 2, T_L: 2 4 2 4 7 4 5 3. At location 2 the intervals are [2, 5/2), [2, 3), [2, 5/2), [7/3, 8/3), [2, 5/2). The intersection is [7/3, 5/2), thus there is a scaled match in location 2.
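The naïve O(nm) check can be written down directly from the intervals [t/p, (t+1)/p). The following is a sketch under stated assumptions: the function name `naive_scaled_matches` is hypothetical, locations are reported 0-based (so the slide's location 2 is index 1), the pattern is assumed non-empty, and the requirement α ≥ 1 is enforced by starting the lower bound at 1.

```python
from fractions import Fraction


def naive_scaled_matches(PL, TL):
    """Return (index, lo, hi) for every 0-based text index whose intervals
    [t/p, (t+1)/p) have a non-empty intersection [lo, hi) with alpha >= 1."""
    m = len(PL)
    matches = []
    for i in range(len(TL) - m + 1):
        lo, hi = Fraction(1), None  # alpha >= 1: only scaling "up"
        for p, t in zip(PL, TL[i:i + m]):
            lo = max(lo, Fraction(t, p))
            upper = Fraction(t + 1, p)
            hi = upper if hi is None else min(hi, upper)
            if lo >= hi:  # intersection became empty, stop early
                break
        else:
            matches.append((i, lo, hi))
    return matches


for index, lo, hi in naive_scaled_matches([2, 1, 2, 3, 2], [2, 4, 2, 4, 7, 4, 5, 3]):
    print(index, lo, hi)
```

Exact fractions are used instead of floats so that the half-open interval endpoints compare correctly.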

25 Improvement – Parameterized Matching Introduced: Baker 1994. Motivation: “copying” code.

26 Parameterized Matching Input: two strings s and t, |s|=|t|, over alphabets Σ_s and Σ_t. s parameterize-matches t if there exists a bijection π: Σ_s → Σ_t such that π(s) = t. Example: s = aabbb, t = xxyyy, with π(a)=x, π(b)=y.
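For two strings of equal length, the definition can be tested with two dictionaries, one per direction of the bijection. This is only a sketch of the definition (the name `p_match` is hypothetical); it is not the linear-time parameterized text-search algorithm cited on the next slide.

```python
def p_match(s, t):
    """True if some bijection between the symbols of s and of t maps s onto t."""
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        # setdefault records the first pairing and returns the recorded one,
        # so any later disagreement in either direction is detected.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


print(p_match("aabbb", "xxyyy"))  # True
print(p_match("aabbb", "xxxxx"))  # False
```

Checking both directions matters: without the `backward` map, "aabbb" would wrongly match "xxxxx", where the mapping is a function but not a bijection.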

27 Parameterized Matching Claim (AFM-94): For Σ that can be sorted in linear time (e.g. Σ={1,..., n}) Parameterized matching can be done in time O(n).

28 The reduction Lemma: There exists α ≥ 1 for which P_L matches T_L at location i scaled to α only if P_L p-matches T_L at i. Proof: Assume P_L does not p-match T_L at location i. The possible situations are:

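29 Possibility 1 The same number b appears twice in P_L and is aligned with two different numbers a and c ≠ a in T_L. W.l.o.g. c ≥ a+1. For c = a+1 (smallest possible): the intervals [a/b, (a+1)/b) and [(a+1)/b, (a+2)/b) do not intersect.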

30 Possibility 2 Two different numbers b and c ≠ b in P_L are aligned with the same number a in T_L. W.l.o.g. c ≥ b+1. Intersection not empty only if: (a+1)/(b+1) > a/b, i.e. ab+b > ab+a, i.e. b > a. But this can never happen if α ≥ 1.

31 Algorithm for Real Scaled String Matching Let { P_i1, P_i2, ..., P_ij } be the different numbers in P_L. 1. P-match P_L in T_L. 2. For each match, check the intersection of the intervals between P_i1, ..., P_ij and the corresponding symbols in T_L. End Algorithm

32 Example: P_L = 2 3 2 3 2, so P_i1 = 2, P_i2 = 3. T_L = 5 6 5 6 5 6 10 6 10 6 10 7. P_L p-matches T_L at several locations (e.g. at 5 6 5 6 5); at 6 10 6 10 6 the p-match is also a scaled match.
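The two steps of the algorithm can be sketched as follows. Assumptions are labeled: the names `p_match` and `real_scaled_matches` are hypothetical, indices are 0-based, and step 1 uses a simple quadratic p-match filter as a stand-in for the linear-time parameterized matching the slides rely on. Step 2 intersects intervals only for the first appearance of each different number in P_L.

```python
from fractions import Fraction


def p_match(s, t):
    """True if a bijection between the symbols of s and of t maps s onto t."""
    forward, backward = {}, {}
    for a, b in zip(s, t):
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return len(s) == len(t)


def real_scaled_matches(PL, TL):
    """Return (p-match indices, scaled-match indices), both 0-based."""
    m = len(PL)
    first = {}
    for k, p in enumerate(PL):
        first.setdefault(p, k)      # offset of the 1st appearance of each number
    offsets = sorted(first.values())
    p_hits, scaled_hits = [], []
    for i in range(len(TL) - m + 1):
        # Step 1 (stand-in, O(m) per location): parameterized match filter.
        if not p_match(PL, TL[i:i + m]):
            continue
        p_hits.append(i)
        # Step 2: intersect [t/p, (t+1)/p) over the different numbers only.
        lo = max([Fraction(1)] + [Fraction(TL[i + k], PL[k]) for k in offsets])
        hi = min(Fraction(TL[i + k] + 1, PL[k]) for k in offsets)
        if lo < hi:
            scaled_hits.append(i)
    return p_hits, scaled_hits


print(real_scaled_matches([2, 3, 2, 3, 2], [5, 6, 5, 6, 5, 6, 10, 6, 10, 6, 10, 7]))
```

Verifying only the different numbers is sound after a p-match, because equal pattern numbers are then aligned with equal text numbers and contribute identical intervals.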

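33 Important Fact: The numbers in P_L sum to m, and j different positive numbers sum to at least 1 + 2 + ... + j = j(j+1)/2. So there are at most O(√m) different P_ik’s. Time: O(n) for parameterized matching (Σ={1,2,…,n}). O(√m) verification for each location. Total: O(n√m).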

34 Tighter analysis Upper bound the number of possible p-matches. Lemma: Let |P|=m, |T|=n, and let { P_i1, P_i2, ..., P_ij } be the different numbers in P_L. Then there are at most 2n/j p-matches of P_L in T_L. Meaning: Since verification time is O(j) per p-match, the lemma implies that total verification time is: O((2n/j) · j) = O(n)

35 Proof of Lemma: Consider the 1st appearances of P_i1, ..., P_ij in P_L. In a p-match they are aligned with j different numbers a_1, a_2, ..., a_j of T_L.

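36 Lemma’s proof (cont.) Let x be the total number of p-matches in the text. Each p-match aligns the 1st occurrences of the P_ik’s with j different positive numbers, whose sum is at least j(j+1)/2 ≥ j²/2. So the sum of all text elements that match 1st occurrences of P_ik’s in the pattern ≥ (xj²)/2. But: There are overlaps! How many?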

37 Lemma’s proof (cont.) For each text location, at most j matches will count it. Therefore: total count without overlaps ≥ (xj²/2)/j = x·j/2. Clearly x·j/2 ≤ n, since the numbers in T_L sum to at most n; thus x ≤ (2n)/j.

38 Open Problem: Give 1-d algorithm linear in run-length compressed text and pattern.

