Quantum Leap Pattern Matching: A New High-Performance Quick Search-Style Algorithm
Bruce W. Watson, Derrick Kourie, Loek Cleophas
Stellenbosch University



Aim and contents
- Problem
- Solution sketch and code
- Examples
- z choices
- Benchmarking
- Conclusions
- Future work

Problem
- Single keyword exact pattern matching
- Given text t and pattern p (lengths n, m resp.), find all occurrences of p in t
- Recall Sunday's Quick Search (QS)
- Shift bounded above by m+1
- This is a family of algorithms (could've been Horspool)
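As a reminder of the baseline, here is a minimal C sketch of Sunday's Quick Search. The function name qs_count and the counting interface are assumptions made for illustration; they are not taken from the slides. The shift table is indexed by the text character just past the current window, which is why no shift exceeds m+1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define ALPHABET (UCHAR_MAX + 1)

/* Count occurrences of p[0..m-1] in t[0..n-1] with Quick Search.
   qs_count is a hypothetical name used only for this sketch. */
size_t qs_count(const char *t, size_t n, const char *p, size_t m)
{
    size_t shift[ALPHABET];
    size_t count = 0;

    assert(m > 0);
    if (n < m)
        return 0;

    /* A character that does not occur in p lets the window jump
       completely past it: the maximum shift, m + 1. */
    for (size_t c = 0; c < ALPHABET; c++)
        shift[c] = m + 1;
    /* Later (rightmost) occurrences overwrite earlier ones, which
       leaves the smallest safe shift for each character. */
    for (size_t i = 0; i < m; i++)
        shift[(unsigned char)p[i]] = m - i;

    size_t j = 0;                      /* start of the current window */
    while (j + m <= n) {
        if (memcmp(t + j, p, m) == 0)
            count++;
        if (j + m >= n)                /* no character after the window */
            break;
        j += shift[(unsigned char)t[j + m]];
    }
    return count;
}
```

For example, qs_count("abracadabra", 11, "abra", 4) finds both occurrences using three window attempts.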

Ensuring z is worthwhile

Possible values for z

Consequences of z choices

Benchmarking
- 17-inch MacBook Pro, Intel Core i7 quad-core
- C code, g++ LLVM, -O3
- Bible and Ecoli (each approx. 4 MB) from SMART
- Random p taken from t
- Per m = 1, ..., 32, 256, 1024:
  - 30 randomly selected patterns
  - 5 runs over the same data
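The protocol above (patterns drawn from the text, several runs per pattern) could be driven by a harness along the following lines. This is only a sketch: the slides do not say how time was measured, so the use of clock(), the names bench, bench_result and search_fn, and the seeding scheme are all assumptions. A naive matcher stands in for the algorithm under test so that the block is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical interface: any matcher that returns a match count. */
typedef size_t (*search_fn)(const char *t, size_t n,
                            const char *p, size_t m);

/* Stand-in matcher; replace with the algorithm being measured. */
size_t naive_count(const char *t, size_t n, const char *p, size_t m)
{
    size_t count = 0;
    for (size_t j = 0; j + m <= n; j++)
        if (memcmp(t + j, p, m) == 0)
            count++;
    return count;
}

typedef struct {
    double seconds;   /* total CPU time spent inside search */
    size_t matches;   /* total matches reported, as a sanity check */
} bench_result;

/* Pick `patterns` random substrings of length m from t and run the
   search `runs` times for each, accumulating CPU time. */
bench_result bench(search_fn search, const char *t, size_t n, size_t m,
                   int patterns, int runs, unsigned seed)
{
    bench_result r = {0.0, 0};

    assert(m > 0 && m <= n);
    srand(seed);                       /* fixed seed: repeatable choice */
    for (int k = 0; k < patterns; k++) {
        size_t start = (size_t)rand() % (n - m + 1);
        const char *p = t + start;     /* pattern taken from the text */
        for (int run = 0; run < runs; run++) {
            clock_t begin = clock();
            r.matches += search(t, n, p, m);
            r.seconds += (double)(clock() - begin) / CLOCKS_PER_SEC;
        }
    }
    return r;
}
```

Because every pattern is a substring of t, each run must report at least one match, which gives a cheap correctness check alongside the timing.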

Best case QLQS versus QS

Conclusions
- QLQS outperforms QS in most cases with an appropriate choice of z
- QLQS significantly outperforms when p and t alphabets are disjoint
- Large z choices appear to violate m+1 principle but QLQS does same table lookups as QS
- Significant instruction-level parallelism
- QLQS is as simple as QS
- Shift tables are easily computed
- First left to right algorithm using backward shifts?
- QLQS is speculative execution (take a Quantum Leap/shift, then check if it was valid)
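The slides that defined the shift tables and the bounds on z did not survive in this transcript, so the following C sketch is only one possible reading of "take a leap, then check if it was valid", not the authors' code. The names qlqs_count, fwd and bwd, the validity test fwd + bwd > z, and the limit z <= 2m+1 are assumptions. The forward table is the usual Quick Search table; the backward table is its mirror image, keyed on the character just left of the window that a leap of z would produce. If the two shifts together rule out every start position strictly between the current window and the leaped window, the leap is taken; otherwise the ordinary Quick Search shift is used.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define ALPHABET (UCHAR_MAX + 1)

/* Hypothetical sketch of a leap-then-check Quick Search variant.
   Counts occurrences of p[0..m-1] in t[0..n-1], trying a leap of z. */
size_t qlqs_count(const char *t, size_t n,
                  const char *p, size_t m, size_t z)
{
    size_t fwd[ALPHABET];   /* Quick Search shift, keyed on t[j + m]   */
    size_t bwd[ALPHABET];   /* backward shift, keyed on t[j + z - 1]   */
    size_t count = 0;

    /* Assumed bound: fwd + bwd is at most 2m + 2, so a larger z
       could never pass the validity check below. */
    assert(m > 0 && z >= 1 && z <= 2 * m + 1);
    if (n < m)
        return 0;

    for (size_t c = 0; c < ALPHABET; c++)
        fwd[c] = bwd[c] = m + 1;
    for (size_t i = 0; i < m; i++)
        fwd[(unsigned char)p[i]] = m - i;   /* rightmost occurrence wins */
    for (size_t i = m; i-- > 0; )
        bwd[(unsigned char)p[i]] = i + 1;   /* leftmost occurrence wins */

    size_t j = 0;
    while (j + m <= n) {
        if (memcmp(t + j, p, m) == 0)
            count++;
        if (j + m >= n)
            break;
        size_t sf = fwd[(unsigned char)t[j + m]];
        size_t step = sf;                   /* always-safe fallback */
        if (z > sf && j + z - 1 < n) {
            size_t sb = bwd[(unsigned char)t[j + z - 1]];
            /* Starts j+1 .. j+sf-1 are excluded by sf, and starts
               j+z-sb+1 .. j+z-1 by sb; no gap remains iff sf+sb > z. */
            if (sf + sb > z)
                step = z;
        }
        j += step;
    }
    return count;
}
```

With the text "abracadabra", the pattern "abra" and z = 7, this sketch moves directly from the first match at position 0 to the second at position 7, a shift larger than m+1 = 5, while still consulting only single-character tables.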

Future work
- Probabilistic QLQS: validity of a z shift not checked
- Coarse-grained parallelism
- Benchmark QLQS using two-dimensional shift tables (ZT and BR)
- Characterize QLQS on CPUs with little ILP
- Use Quantum Leap principle in other Boyer-Moore style algorithms: multiple keyword, regex, tree, …
- Shift tables in QLQS formally derived in a correctness-by-construction formalism

Thanks!