2-Dimensional Pattern Matching


2-Dimensional Pattern Matching Amihood Amir, Dina Sokol, Shoshana Neuburger UWSL 2006

2-Dimensional Pattern Matching Perform pattern matching on images (e.g. MRI, FAX).

Searching Aerial Photographs

Historic Two Dimensional Model:

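2D Pattern Matching - Example Input: alphabet Σ = {A,B}, a 2D pattern and a 2D text over Σ. Output: the text locations where the pattern occurs, here {(1,4),(2,2),(4,3)}. (Figure: pattern and text grids over {A,B}.)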
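To make the input and output of the problem concrete, here is a minimal brute-force sketch. The function name `naive_2d_match` and the sample grids are illustrative assumptions, not taken from the slides, and locations are reported as 1-based (row, column) of the top-left corner, assuming that is the convention of the example above.

```python
def naive_2d_match(text, pattern):
    """Brute-force baseline: try every top-left corner.

    text and pattern are lists of equal-length strings (hypothetical
    representation). Returns 1-based (row, col) corners of all occurrences.
    """
    n_rows, n_cols = len(text), len(text[0])
    m_rows, m_cols = len(pattern), len(pattern[0])
    hits = []
    for r in range(n_rows - m_rows + 1):
        for c in range(n_cols - m_cols + 1):
            # Compare the pattern row by row against the text window.
            if all(text[r + i][c:c + m_cols] == pattern[i]
                   for i in range(m_rows)):
                hits.append((r + 1, c + 1))
    return hits


# Illustrative input (not the grid from the slide).
text = ["ABAB",
        "BABA",
        "ABAB"]
pattern = ["AB",
           "BA"]
print(naive_2d_match(text, pattern))  # [(1, 1), (1, 3), (2, 2)]
```

This baseline compares up to m^2 symbols per text location, which is the cost the algorithms on the following slides avoid.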

Bird-Baker Algorithm (1976) Time: O(n^2) for bounded fixed alphabets, O(n^2 log m) for infinite alphabets (n x n text, m x m pattern). Technique: linearization.

Bird / Baker First linear-time 2D pattern matching algorithm. View each pattern row as a metacharacter to linearize problem. Convert 2D pattern matching to 1D.

Linearization Concatenate rows of Text and use string matching tools. In this case – The Aho and Corasick algorithm for a dictionary of patterns. The dictionary consists of all pattern rows.

Find all pattern rows… then align them.

Bird / Baker Preprocess pattern: Name rows of pattern using AC automaton. Using names, pattern has 1D representation. Construct KMP automaton of pattern. Identical rows receive identical names.
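The preprocessing step can be sketched as follows. This is a simplified illustration: rows are named with a plain dictionary rather than an Aho-Corasick automaton, and the helper names `name_rows` and `kmp_failure` are assumptions made for this example.

```python
def name_rows(pattern):
    """Give each distinct pattern row a name 1, 2, ...; identical rows share a name.

    A dictionary stands in for the Aho-Corasick automaton of the slides.
    """
    names = {}
    for row in pattern:
        if row not in names:
            names[row] = len(names) + 1
    return names, [names[row] for row in pattern]


def kmp_failure(seq):
    """Standard KMP failure function over a sequence of row names."""
    fail = [0] * len(seq)
    k = 0
    for i in range(1, len(seq)):
        while k > 0 and seq[i] != seq[k]:
            k = fail[k - 1]
        if seq[i] == seq[k]:
            k += 1
        fail[i] = k
    return fail


pattern = ["AB", "BA", "AB"]           # illustrative pattern
names, named = name_rows(pattern)
print(named)                           # [1, 2, 1]: the 1D representation
print(kmp_failure(named))              # [0, 0, 1]
```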

Bird / Baker - Example Preprocess pattern: Name rows of pattern using AC automaton. Using names, pattern has 1D representation. Construct KMP automaton of pattern. (Figure: a pattern over {A,B} whose rows are named 1 and 2.)

Bird / Baker Scan text: Name positions of text that match a row of pattern, using AC automaton within each row. Run KMP on named columns of text. Since the 1D names are unique, only one name can be given to a text location.

Bird / Baker - Example Scan text: Name positions of text that match a row of pattern, using AC automaton within each row. Run KMP on named columns of text. (Figure: a text over {A,B} with matching positions named 1 and 2.)
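Putting the two phases together, a rough sketch of the whole scan might look like the code below. It keeps the structure described on the slides (name text positions, then run KMP down each column of names) but, as a simplifying assumption, names text positions by slicing and a dictionary lookup instead of an Aho-Corasick automaton, so it does not achieve the linear running time. The function name `bird_baker_sketch` is hypothetical.

```python
def bird_baker_sketch(text, pattern):
    """Return 1-based (row, col) top-left corners of pattern occurrences."""
    m_rows, m_cols = len(pattern), len(pattern[0])
    n_rows, n_cols = len(text), len(text[0])

    # Preprocess: name pattern rows (identical rows get identical names).
    names = {}
    for row in pattern:
        names.setdefault(row, len(names) + 1)
    named = [names[row] for row in pattern]

    # KMP failure function of the 1D pattern of names.
    fail = [0] * m_rows
    k = 0
    for i in range(1, m_rows):
        while k > 0 and named[i] != named[k]:
            k = fail[k - 1]
        if named[i] == named[k]:
            k += 1
        fail[i] = k

    hits = []
    state = [0] * n_cols  # one KMP state per text column
    for r in range(n_rows):
        for c in range(m_cols - 1, n_cols):
            # Name of the pattern row ending at (r, c); 0 means "no row".
            # (Dictionary lookup replaces the AC automaton here.)
            label = names.get(text[r][c - m_cols + 1:c + 1], 0)
            k = state[c]
            while k > 0 and label != named[k]:
                k = fail[k - 1]
            if label == named[k]:
                k += 1
            if k == m_rows:
                # Full column match: report the top-left corner, 1-based.
                hits.append((r - m_rows + 2, c - m_cols + 2))
                k = fail[k - 1]
            state[c] = k
    return hits


text = ["ABAB",
        "BABA",
        "ABAB"]
print(bird_baker_sketch(text, ["AB", "BA"]))  # [(1, 1), (1, 3), (2, 2)]
```

Because each text position receives at most one name, a single KMP state per column is enough, which is what keeps the column pass linear in the real algorithm.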

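Another linearization: pad with “don’t cares”. Each pattern row of length m is followed by n-m don’t-care symbols, so that the padded pattern lines up with the concatenated text rows. Time: that of string matching with don’t cares, Fischer-Paterson (1972).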
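A small sketch of this padding idea follows. It only illustrates the linearization: the don't-care matching is done by direct comparison rather than by the Fischer-Paterson convolution method, and `WILD`, `linearize_pattern` and `match_with_dont_cares` are names invented for this example.

```python
WILD = None  # hypothetical marker for a "don't care" symbol


def linearize_pattern(pattern, n):
    """Concatenate pattern rows, padding each row but the last with n-m wildcards."""
    m = len(pattern[0])
    flat = []
    for i, row in enumerate(pattern):
        flat.extend(row)
        if i < len(pattern) - 1:
            flat.extend([WILD] * (n - m))
    return flat


def match_with_dont_cares(text, pattern):
    """Match the padded pattern against the concatenated text rows (naively)."""
    n, m = len(text[0]), len(pattern[0])
    flat_text = "".join(text)
    flat_pat = linearize_pattern(pattern, n)
    hits = []
    for p in range(len(flat_text) - len(flat_pat) + 1):
        r, c = divmod(p, n)
        if c > n - m:
            continue  # the pattern would wrap around the end of a text row
        if all(q is WILD or q == flat_text[p + k]
               for k, q in enumerate(flat_pat)):
            hits.append((r + 1, c + 1))  # 1-based top-left corner
    return hits


text = ["ABAB",
        "BABA",
        "ABAB"]
print(linearize_pattern(["AB", "BA"], 4))      # ['A', 'B', None, None, 'B', 'A']
print(match_with_dont_cares(text, ["AB", "BA"]))  # [(1, 1), (1, 3), (2, 2)]
```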

Witnesses Popular paradigm in pattern matching: (1) find consistent candidates; (2) verify candidates. Consistent candidates → verification is linear.

Dueling Algorithm

Data Structure List of potential candidates. R = rightmost element of that list. N = new element.

Case 1: N dies.

Case 2: R dies.

Case 3: no one dies. Add N to the list of consistent candidates. Since N is consistent with R, and R is consistent with the rest of the list, by transitivity N is consistent with the whole list.

Witnesses Vishkin introduced the duel to choose between two candidates by checking the value of a witness. Alphabet-independent method.

Dueling Paradigm [Vishkin 1985] A duel chooses between two possible candidates by checking the value of a ‘witness.’ (Figure: text T with two overlapping candidates at positions i and j and a witness location.)

Witness Table A witness table is a table of size |P| that stores, for each location of P, the location of a conflict (with respect to the left candidate). (Figure: pattern P, its witness table, and text T with candidates i and j.)
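The candidate list, the three duel cases and the witness table can be combined into a one-dimensional sketch like the one below. It is an illustration under simplifying assumptions: the witness table is computed by brute force and the surviving candidates are verified naively, so only the dueling phase reflects the linear-time idea. The names `witness_table` and `duel_match` are hypothetical, and indices are 0-based.

```python
def witness_table(pattern):
    """wit[d] = index k with pattern[k] != pattern[k + d], or None if
    two copies of the pattern at distance d agree on their overlap."""
    m = len(pattern)
    wit = [None] * m
    for d in range(1, m):
        for k in range(m - d):
            if pattern[k] != pattern[k + d]:
                wit[d] = k
                break
    return wit


def duel_match(text, pattern):
    n, m = len(text), len(pattern)
    wit = witness_table(pattern)
    candidates = []  # consistent candidates, in increasing order
    for new in range(n - m + 1):
        alive = True
        while candidates and alive:
            right = candidates[-1]      # R = rightmost element of the list
            d = new - right
            if d >= m or wit[d] is None:
                break                   # Case 3: no one dies
            pos = new + wit[d]          # the single text location to check
            # R expects pattern[wit[d] + d] there, N expects pattern[wit[d]].
            if text[pos] != pattern[wit[d] + d]:
                candidates.pop()        # Case 2: R dies, N duels the next R
            if text[pos] != pattern[wit[d]]:
                alive = False           # Case 1: N dies
        if alive:
            candidates.append(new)
    # Naive verification of the surviving candidates (simplification).
    return [c for c in candidates if text[c:c + m] == pattern]


print(witness_table("aba"))             # [None, 0, None]
print(duel_match("abaababaab", "aba"))  # [0, 3, 5]
```

Since the two expected symbols differ at a witness location, every duel removes at least one of the two candidates with a single text comparison.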

Dueling Method in 2D How do we arrange for candidates to agree on overlap? Duel! When there is a conflict between two candidates, a single text check eliminates at least one candidate. The text location can be pre-computed because of transitivity. The dueling phase is thus linear time.

A duel in 2-dimensions Witness[3,3] = (4,3). (Figure: two overlapping copies of a 4 x 4 pattern over {a,b}.)

2-D Witness Table A 2-D witness table is a table of size m^2, storing a witness for each location of P. (Figure: a pattern P over {a,b} and its witness table, with entries such as (4,3), (4,2), (4,1) and * where no witness exists.)
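As an illustration, a brute-force construction of such a table might look like this. It is a sketch under stated assumptions, not the linear-time construction: it uses 0-based indices (the slides use 1-based), covers only shifts down and to the right, and stores `None` where the slide shows `*`. The names `first_conflict` and `witness_table_2d` are invented for this example.

```python
def first_conflict(pattern, dr, dc):
    """Find (i, j) with pattern[i][j] != pattern[i + dr][j + dc], or None
    if two copies of the pattern shifted by (dr, dc) agree on their overlap."""
    rows, cols = len(pattern), len(pattern[0])
    for i in range(rows - dr):
        for j in range(cols - dc):
            if pattern[i][j] != pattern[i + dr][j + dc]:
                return (i, j)
    return None


def witness_table_2d(pattern):
    """Map each non-zero shift (dr, dc) to a witness location or None."""
    rows, cols = len(pattern), len(pattern[0])
    return {(dr, dc): first_conflict(pattern, dr, dc)
            for dr in range(rows)
            for dc in range(cols)
            if (dr, dc) != (0, 0)}


pattern = ["ab",
           "ba"]  # illustrative pattern
print(witness_table_2d(pattern))
# {(0, 1): (0, 0), (1, 0): (0, 0), (1, 1): None}
```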

2D Witnesses Amir et al.: a 2D witness table can be used for linear time and space alphabet-independent 2D matching. The order of duels is significant. It is done in 2 waves: 1: duel within each column, bottom to top. 2: duel between columns, from right to left.

First Truly 2D Algorithm: The Dueling Method (Amir-Benson-Farach 1991) Once duels are over, the situation is: all potential pattern “starts” agree on overlap, i.e. all want to see the same symbol in every text location.

Verification Do a forward wave down the columns to label starts of pattern rows. Do a forward wave on each row, beginning anew for each new row. Label positions of mismatch. Kill all candidates that contain a mismatch (using 2 similar backward waves).

Dueling Method … Time for checking every text element’s correctness: linear. Every candidate with an incorrect element in its range is eliminated. Method: the “wave”. Total time: O(n^2), linear in the size of the text.

2D Dictionary Matching Suppose we are given a set of 2D patterns, called a dictionary. Goal: search for all patterns in the text simultaneously, in linear time. Bird/Baker can be extended if all patterns have uniform width. (How?)