Data-Parallel Finite-State Machines Todd Mytkowicz, Madanlal Musuvathi, and Wolfram Schulte Microsoft Research.

A new method to break data dependences:
- Preserves program semantics
- Does not use speculation
- Generalizes to other domains, but this talk focuses on FSMs

FSMs underlie an important class of algorithms:
- Unstructured text processing (e.g., regex matching or lexing)
- Natural language processing (e.g., speech recognition)
- Dictionary-based decoding (e.g., Huffman decoding)
- Text encoding/decoding (e.g., UTF-8)
We want parallel versions of all of these, particularly in the context of large amounts of data.

[Figure: transition table T over the alphabet {/, *, x} for an FSM that tracks C-style comments]
Data dependence limits ILP, SIMD, and multicore parallelism: each FSM transition depends on the state produced by the previous one.
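The loop-carried dependence is easiest to see in code. Below is a minimal sketch of the talk's running example: a 4-state FSM that tracks C-style /* ... */ comments over the reduced alphabet '/', '*', 'x' ('x' stands for any other character). The transition table is an assumed reconstruction of the T/*x table from the slides, not the paper's exact encoding.

```python
# States: 0 = outside comment, 1 = seen '/', 2 = inside, 3 = seen '*' inside.
# Rows = states, columns = input characters '/', '*', 'x'.
T = [[1, 0, 0],
     [1, 2, 0],
     [2, 3, 2],
     [0, 3, 2]]
C = {'/': 0, '*': 1, 'x': 2}

def run_fsm(start, inp):
    state = start
    for ch in inp:               # each iteration needs the previous state:
        state = T[state][C[ch]]  # a loop-carried dependence blocks SIMD/ILP
    return state

print(run_fsm(0, "/*xxx**/xx"))  # -> 0 (the comment closed; back outside)
```

Because `state` feeds the very next table lookup, the loop cannot be vectorized or split across cores directly; this is the dependence the talk sets out to break.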

Demo: UTF-8 encoding

Breaking data dependences with enumeration
[Figure: the FSM runs over the input /*XXX**/XX from every possible start state, using the transition table T]
Enumeration breaks data dependences, but how do we make it scale?
- Overhead is proportional to the number of states
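Enumeration can be sketched in a few lines (same assumed toy FSM as above): a worker that does not know its chunk's true start state simply runs the FSM from every state at once, paying one transition per state per character.

```python
# Assumed reconstruction of the comment-tracking FSM's transition table.
T = [[1, 0, 0], [1, 2, 0], [2, 3, 2], [0, 3, 2]]
C = {'/': 0, '*': 1, 'x': 2}

def enumerate_chunk(chunk):
    states = list(range(len(T)))                # one candidate per start state
    for ch in chunk:
        states = [T[s][C[ch]] for s in states]  # |states| transitions per char
    return states  # states[s] = final state if the chunk began in state s

print(enumerate_chunk("xx*"))  # -> [0, 0, 3, 3]
```

The result is a complete state-to-state mapping for the chunk, so no speculation is needed and program semantics are preserved; the cost is the factor-of-|states| overhead the slide mentions.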

Intuition: exploit convergence in enumeration
[Figure: enumerated runs over the input /*XXX**/XX]
After 2 characters of input, the FSM converges to 2 unique states.
- Overhead is proportional to the number of unique states
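The convergence effect is easy to observe on the same assumed toy FSM: once two candidate start states map to the same current state, they stay merged for the rest of the input, so the effective per-character cost drops to the number of unique surviving states.

```python
# Assumed reconstruction of the comment-tracking FSM's transition table.
T = [[1, 0, 0], [1, 2, 0], [2, 3, 2], [0, 3, 2]]
C = {'/': 0, '*': 1, 'x': 2}

states = list(range(len(T)))  # start from all 4 candidate states
for ch in "/*xxx**/xx":
    states = [T[s][C[ch]] for s in states]
    print(ch, "unique survivors:", sorted(set(states)))
```

On this input the 4 candidates collapse to 2 unique states within a few characters and eventually to a single state, which is exactly the redundancy the paper's optimizations exploit.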

Convergence for worst-case inputs
Almost all (90%) of FSMs converge to <= 16 states after 10 steps on adversarial inputs. However, many FSMs take thousands of steps to converge to <= 4 states.

Convergence for real inputs
All FSMs converge to fewer than 16 states after 20 steps on real inputs.

Why convergence happens
FSMs have structure:
- Many states transition to an error state on a given character.
- FSMs often transition to "homing" states after reading a sequence of characters, e.g., after reading */ the FSM is very likely, though not guaranteed, to reach the "end-of-comment" state.

Contributions
- Enumeration, a method to break data dependences
- Enumeration for FSMs is a gather
  - Gather is a common hardware primitive
  - Our approach should scale with faster hardware support for gather
- The paper introduces two optimizations, both in terms of gather, which exploit convergence
  - They reduce the overhead of the enumerative approach
  - See the paper for details

How do we implement enumerative FSMs with gather?

Implementing enumeration with gather
[Figure: enumerated runs over /*XXX**/XX, stepping through copies of the transition table T]
Current states are addresses used to gather from T[input].
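The gather formulation can be sketched as follows (an assumed layout, using the same toy FSM): flatten the transition table row-major so that the enumerated states become addresses, and one FSM step for all candidates is a single gather from the row selected by the input character.

```python
# Row-major flattening of the assumed comment-tracking transition table.
flatT = [1, 0, 0,   # state 0 on '/', '*', 'x'
         1, 2, 0,   # state 1
         2, 3, 2,   # state 2
         0, 3, 2]   # state 3
C = {'/': 0, '*': 1, 'x': 2}

def gather(table, idx):
    # software stand-in for a hardware gather instruction
    return [table[i] for i in idx]

states = [0, 1, 2, 3]
for ch in "/*x":
    addrs = [s * 3 + C[ch] for s in states]  # address = state*|alphabet| + char
    states = gather(flatT, addrs)
print(states)  # -> [2, 2, 2, 0]
```

Since every candidate's step is just an indexed load with no cross-candidate dependence, this inner loop maps directly onto hardware gather (or a shuffle-based emulation of it).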

Enumeration makes FSMs embarrassingly parallel
- Some hardware has gather as a primitive; our approach will scale with that hardware.
- Some hardware lacks gather. The paper shows how to use:
  - _mm_shuffle_epi8 to implement gather in x86 SIMD
  - ILP, because gather is associative
  - Multicore with OpenMP
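Why this is embarrassingly parallel can be sketched with the same assumed toy FSM: each input chunk independently yields a state-to-state mapping, and those mappings compose associatively, so chunks can run on separate cores (or SIMD lanes) and be stitched together afterwards.

```python
# Assumed reconstruction of the comment-tracking FSM's transition table.
T = [[1, 0, 0], [1, 2, 0], [2, 3, 2], [0, 3, 2]]
C = {'/': 0, '*': 1, 'x': 2}

def chunk_map(chunk):
    # enumerative run: the chunk's full state -> state mapping
    states = list(range(len(T)))
    for ch in chunk:
        states = [T[s][C[ch]] for s in states]
    return states

def compose(f, g):
    # apply f's chunk, then g's chunk
    return [g[f[s]] for s in range(len(f))]

# in a real implementation these two calls would run on different cores
left, right = chunk_map("/*xxx"), chunk_map("**/xx")
combined = compose(left, right)
assert combined[0] == chunk_map("/*xxx**/xx")[0]  # matches the sequential run
print(combined[0])  # the true start state is 0 -> final state 0
```

Associativity of mapping composition is what lets both multicore decomposition and ILP (reassociating the chain of gathers) apply without changing the result.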

Single-core performance
[Figure: benchmarks split into a good-performance group and a not-so-good-performance group]
The poorly scaling cases need more hardware help: hardware gather or multicore parallelism.

Bing Tokenization

Case studies
- SNORT regular expressions
- Huffman decoding

Related work
Prior parallel approaches:
- Ladner and Fischer (1980): cubic in the number of states
- Hillis and Steele (1986): linear in the number of states
Bit parallelism:
- Parabix: compiles an FSM to a sequence of bit operations
Speculation:
- Prakash and Vaswani (2010): "safe speculation" as a programming construct

Conclusion
- Enumeration: a new method to break data dependences
  - Not speculation; preserves semantics
  - Exploits redundancy (convergence) in the computation to scale
  - Generalizes to other domains (dynamic programming, PPoPP 2014)
- Enumeration for FSMs is a gather
  - Scales with new hardware implementations of gather
  - The paper demonstrates how to use SIMD, ILP, and multicore on machines that lack intrinsic support for gather