Faster 2-Dimensional Scaled Matching Amihood Amir and Eran Chencinski.


1 Faster 2-Dimensional Scaled Matching Amihood Amir and Eran Chencinski

2 Real Scaling Given an n x n text T and an m x m pattern P, find all occurrences of P in T, scaled to any real scale. Best known algorithm [Amir et al.]: Time: O(nm^3 + n^2 m log m), Space: O(nm^3 + n^2). Our algorithm: Time: O(n^2 m), Space: O(n^2).

3 Scaling – Geometric Definition

4 Scaling – Algebraic Definition Rounding Function:

5 Scaling – Algebraic Definition Given a pattern P of size m x m and a scale r: the first row is scaled to ||1*r|| rows, the first 2 rows are scaled to ||2*r|| rows, …, the first m rows are scaled to ||m*r|| rows. Similarly for the columns.
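The row and column rule on this slide can be sketched in a few lines of Python. The slide's rounding formula did not survive in the transcript, so this sketch assumes ||x|| means rounding to the nearest integer with ties rounded up; the names `round_half_up` and `scale_pattern` are illustrative and not taken from the talk.

```python
import math

def round_half_up(x):
    """Assumed rounding function ||x||: nearest integer, ties rounded up."""
    return math.floor(x + 0.5)

def scale_pattern(pattern, r):
    """Scale a square pattern (list of rows) by a real factor r.

    The first k rows of the pattern occupy ||k*r|| rows of the result,
    so row k alone is repeated ||k*r|| - ||(k-1)*r|| times.
    Columns are handled the same way.
    """
    m = len(pattern)
    # bounds[k] = ||k*r||, the scaled extent of the first k rows/columns
    bounds = [round_half_up(k * r) for k in range(m + 1)]
    scaled = []
    for i in range(m):
        row_copies = bounds[i + 1] - bounds[i]
        scaled_row = []
        for j in range(m):
            col_copies = bounds[j + 1] - bounds[j]
            scaled_row.extend([pattern[i][j]] * col_copies)
        # each copy is a fresh list so rows can be modified independently
        scaled.extend(list(scaled_row) for _ in range(row_copies))
    return scaled
```

With r = 1.5 and a 2 x 2 pattern, the first row and column are doubled (||1.5|| = 2) and the second ones appear once (||3|| - ||1.5|| = 1), giving a 3 x 3 result.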

6 Scaling – Algebraic Definition Rounding Function: Inverse Rounding Function: suppose we know that K rows were scaled to L rows:
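The inverse-rounding formula is missing from the transcript. One way to read the idea, assuming ||x|| rounds to the nearest integer with ties rounded up: if K rows were scaled to L rows, then ||K*r|| = L, which holds exactly when L - 1/2 <= K*r < L + 1/2. The sketch below returns that half-open interval of scales; `scale_interval` and `is_consistent` are hypothetical names, and exact fractions are used to avoid floating-point boundary issues.

```python
from fractions import Fraction

def scale_interval(k, l):
    """Return (lo, hi) such that ||k*r|| == l exactly when lo <= r < hi.

    Assumes ||x|| is round-to-nearest with ties rounded up.
    """
    half = Fraction(1, 2)
    return (l - half) / k, (l + half) / k

def is_consistent(r, k, l):
    """Check whether scale r maps k rows to exactly l rows."""
    lo, hi = scale_interval(k, l)
    return lo <= r < hi
```

For example, 2 rows scaled to 3 rows pins the scale to the interval [5/4, 7/4).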

7 Subrow/column Repetition Query Query time: O(1), preprocessing time: O(n^2)
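The slide gives only the bounds for this query. As a minimal sketch of how O(1) queries after O(n^2) preprocessing can be achieved, assume the query asks whether k consecutive cells of a row, starting at a given position, all hold the same symbol. A table of run lengths answers that with a single lookup. The function names are illustrative, and the column version is the same construction applied to the transposed text.

```python
def preprocess_runs(text):
    """run[i][j] = length of the maximal run of equal symbols in row i
    that starts at column j. Takes time linear in the size of the text."""
    run = []
    for row in text:
        n = len(row)
        lengths = [1] * n
        # scan right to left so each cell can extend the run to its right
        for j in range(n - 2, -1, -1):
            if row[j] == row[j + 1]:
                lengths[j] = lengths[j + 1] + 1
        run.append(lengths)
    return run

def is_subrow_repetition(run, i, j, k):
    """O(1) query: do the k cells of row i starting at column j
    all hold the same symbol?"""
    return run[i][j] >= k
```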

8 Algorithm Layout The algorithm consists of 4 stages: 1. Scale Elimination 2. Candidate Consistency 3. Candidate Verification 4. Occurrence Recognition. Each stage takes O(n^2 m) time and O(n^2) space.

9 Scale Elimination Stage Pivot

10 (i,j)

11 (i,j) O(m) time for each location, O(n^2 m) total, O(n^2) space

12 Candidate Consistency Stage

13 Case (a) Case (b)

14 Witness Table Construction For each suffix: O(m^2) time and O(m) space

15 Pre-Dueling Step For each candidate c in T: for each suffix s of P: compare c's borders with the witness table borders of suffix s. If the borders are not the same, c is eliminated. Can be done in O(m) time for each candidate.

16 Performing a Duel

17 The Dueling Order Each candidate performs at most O(m) successful duels

18 Candidate Consistency Stage Witness table construction: O(m^3) time, O(m^2) space. Pre-dueling step: O(n^2 m) time, O(m^2) space. Number of duels: at most O(n) unsuccessful and at most O(n^2 m) successful, where each duel takes O(1) time. Total: O(n^2 m) time, O(n^2) space.

19 Candidate Verification Stage

20 For each location, find the maximal containing interval. Can be solved in O(n) time per row using a solution to the Maximal Interval Problem.

21 Candidate Verification Stage Once we find the largest interval we: verify each row in O(m) time, using subcolumn repetition queries; save the longest matching length; for each candidate, run a Range Minimum Query (RMQ) on the lengths. The pattern appears iff pattern size >= RMQ.
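To illustrate the Range Minimum Query step, here is a sketch using a sparse table. This is a substitute technique: it needs O(n log n) preprocessing per row of lengths rather than the O(n) preprocessing the slides quote, but it still answers each query in O(1). The helper `covers` is hypothetical and encodes one reading of the acceptance test, namely that a candidate survives when the smallest saved matching length over its rows is at least the required size; the slide's exact condition may be stated differently.

```python
def build_sparse_table(values):
    """table[p][i] = min(values[i : i + 2**p]) for every valid i."""
    n = len(values)
    table = [list(values)]
    span = 1
    while 2 * span <= n:
        prev = table[-1]
        # combine two adjacent blocks of length `span` into one of length 2*span
        table.append([min(prev[i], prev[i + span])
                      for i in range(n - 2 * span + 1)])
        span *= 2
    return table

def range_min(table, lo, hi):
    """Minimum of values[lo..hi] (inclusive) in O(1):
    two overlapping power-of-two blocks cover the range."""
    level = (hi - lo + 1).bit_length() - 1
    row = table[level]
    return min(row[lo], row[hi - (1 << level) + 1])

def covers(table, top, height, required):
    """Assumed acceptance test: every one of the `height` entries
    starting at index `top` is at least `required`."""
    return range_min(table, top, top + height - 1) >= required
```

A short usage example: with saved lengths `[5, 3, 4, 6, 2, 7]`, `range_min(table, 0, 2)` is 3 and `covers(table, 2, 2, 4)` holds because entries 2 and 3 are both at least 4.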

22 Candidate Verification Stage Finding largest intervals: O(n) time per row, O(n^2) total. Verifying columns: O(nm) time per row, O(n^2 m) total. RMQ: preprocessing: O(n) time per row, O(n^2) total; querying: O(1) time per candidate, O(n^2) total. Total: O(n^2 m) time, O(n^2) space.

23 Occurrence Recognition Stage Recall: Scale elimination stage returned At most O(m) steps per candidate Total: O(n^2 m) time

24 Conclusions The algorithm consists of 4 stages: 1. Scale Elimination 2. Candidate Consistency 3. Candidate Verification 4. Occurrence Recognition. Each stage takes O(n^2 m) time and O(n^2) space.

