Learning Automata and Grammars
Peter Černo

• The problem of learning or inferring automata and grammars has been studied for decades and has connections to many disciplines:
  • Bioinformatics.
  • Computational linguistics.
  • Pattern recognition.

• In this presentation we:
  • Introduce formal language theory (FLT).
  • Emphasize the importance of learnability.
  • Explain identification in the limit.
  • Give the algorithm LARS as an example.

Formal Language Theory

• The central notion in FLT is a (formal) language, which is a finite or infinite set of words.
• A word is a finite sequence of zero or more letters; the same letter may occur several times.
  • The sequence of zero letters is the empty word λ.
• We restrict ourselves to a specific alphabet Σ, which is a finite nonempty set of letters.
• The set of all words over Σ is denoted Σ*.
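
As a small aside (not on the slides), these definitions are easy to make concrete. The sketch below, with an invented helper name `words`, enumerates Σ* for Σ = {a, b} by increasing length, starting with the empty word:

```python
# Illustration only: enumerate Sigma* for Sigma = {a, b},
# shortest words first (the empty word comes first).
from itertools import count, product

def words(sigma):
    """Yield every word over the alphabet `sigma`, by increasing length."""
    for n in count(0):
        for w in product(sigma, repeat=n):
            yield "".join(w)

gen = words("ab")
print([next(gen) for _ in range(7)])  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```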

• How do we define a language?
  • Acceptors: usually automata. They are given an input word and, after some processing, either accept or reject it.
  • Generators: usually grammars. They generate the language using some finite set of rules.
• We want to learn automata and grammars under a suitable learning regime.

Induction / Inference

• Grammar induction: finding a grammar (or automaton) that can explain the data.
• Grammatical inference: relies on the fact that there is a (true) target grammar (or automaton), and the quality of the learning process is measured relative to this target.

Importance of Learnability

• Alexander Clark, in [Clark, A.: Three Learnable Models for the Description of Language], emphasized the importance of learnability.
• He proposed that one way to build learnable representations is to make them objective or empiricist: the structure of the representation should be based on the structure of the language.

• In defining these representation classes, the author followed a simple slogan: “Put learnability first!”
• In his concluding remarks, the author suggested that representations which are both efficiently learnable and capable of representing mildly context-sensitive languages seem to be good candidates for models of human linguistic competence.

General Setting

• In a typical grammatical inference scenario we are concerned with learning language representations based on some source of information:
  • Text.
  • Examples and counter-examples.
  • Etc.
• We assume a perfect source of information.

• Let us fix the alphabet Σ.
• Let ℒ be a language class.
• Let ℛ be a class of representations for ℒ.
• Let L: ℛ → ℒ be the naming function, i.e. L(R) is the language denoted, accepted, recognized or represented by the object R ∊ ℛ.

• There are two important problems:
  • Membership problem: given w ∊ Σ* and R ∊ ℛ, is w ∊ L(R) decidable?
  • Equivalence problem: given R₁, R₂ ∊ ℛ, is L(R₁) = L(R₂) decidable?

Identification in the Limit

• A presentation is an enumeration of elements that represents a source of information about some specific language L ∊ ℒ.
  • For instance, an enumeration of all positive and negative samples of L (in some order).
• A learning algorithm A is a program that takes the first n elements of a presentation f (denoted fₙ) and returns some object R ∊ ℛ.

• We say that ℛ is identifiable in the limit if there exists a learning algorithm A such that for any target object R ∊ ℛ and any presentation f of L(R) there exists a rank m such that for all n ≥ m, A(fₙ) does not change and L(A(fₙ)) = L(R).
• The above definition does not force us to learn the target object itself, but only an object equivalent to the target.
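
As a toy illustration of this definition (not from the slides): the class of all finite languages is identifiable in the limit from positive data alone, because the learner that conjectures exactly the set of examples seen so far stops changing once every word of the target has appeared. A minimal sketch with invented names:

```python
# Toy example: identification in the limit of finite languages
# from positive data.  The learner conjectures the set of examples
# seen so far; all names below are invented for this sketch.

def learner(prefix):
    """Map the first n elements of a presentation to a hypothesis."""
    return frozenset(prefix)

target = {"ab", "aabb", "aaabbb"}
# A fair presentation: every word of the target appears eventually.
presentation = ["ab", "aabb", "ab", "aaabbb", "aabb", "ab"]

hypotheses = [learner(presentation[:n]) for n in range(1, len(presentation) + 1)]
# From rank m = 4 on, the hypothesis no longer changes and equals the target.
print(all(h == target for h in hypotheses[3:]))  # True
```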

• However, there are some complexity issues with identification in the limit:
  • It tells us neither how we know when we have found what we are looking for, nor how long it is going to take.
• We illustrate this methodology on the so-called delimited string-rewriting systems.
• The learning algorithm is called LARS.

String Rewriting Systems

• String rewriting systems are usually specified by:
  • some rewriting mechanism, and
  • some base of simple (accepted) words.
• Let us introduce two special symbols that do not belong to our alphabet Σ:
  • ¢, the left sentinel,
  • $, the right sentinel.

• A term is a string from T(Σ) = {λ, ¢}·Σ*·{λ, $}.
• A term can be of one of the following types:
  • Type 1: w ∊ Σ* (substring),
  • Type 2: w ∊ ¢·Σ* (prefix),
  • Type 3: w ∊ Σ*·$ (suffix),
  • Type 4: w ∊ ¢·Σ*·$ (whole string).
• Given a term w, the root of w is w without its sentinels.
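
A minimal sketch of terms, roots and types in Python (the helper names are our own; `removeprefix`/`removesuffix` need Python 3.9+):

```python
# Terms over Sigma with sentinels; helper names are ours.
LSENT, RSENT = "¢", "$"

def term_type(w):
    """Type 1: substring, 2: prefix, 3: suffix, 4: whole string."""
    return 1 + w.startswith(LSENT) + 2 * w.endswith(RSENT)

def root(w):
    """The term without its sentinels."""
    return w.removeprefix(LSENT).removesuffix(RSENT)

print([term_type(w) for w in ("ab", "¢ab", "ab$", "¢ab$")])  # [1, 2, 3, 4]
print(root("¢ab$"))  # ab
```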

• We define an order relation < over T(Σ): u < v if and only if
  • root(u) < root(v) in the length-lexicographic order, or
  • root(u) = root(v) and type(u) < type(v).
• For instance, for Σ = {a, b}:
  • ab < ¢ab < ab$ < ¢ab$ < ba
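
This order can be realized as a sort key: compare roots length-lexicographically, then break ties by type. The sketch below (helper names as before) reproduces the example from this slide:

```python
# Sort key realizing the order on terms: length-lexicographic order
# on roots, ties broken by type (helpers repeated from the previous sketch).
LSENT, RSENT = "¢", "$"

def term_type(w):
    return 1 + w.startswith(LSENT) + 2 * w.endswith(RSENT)

def root(w):
    return w.removeprefix(LSENT).removesuffix(RSENT)

def term_key(w):
    r = root(w)
    return (len(r), r, term_type(w))

terms = ["ba", "¢ab$", "ab", "ab$", "¢ab"]
print(sorted(terms, key=term_key))
# ['ab', '¢ab', 'ab$', '¢ab$', 'ba'] -- matching the example above
```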

• A rewrite rule is an ordered pair ρ = (l, r), generally written as ρ = l ⊢ r, where:
  • l is the left-hand side of ρ, and
  • r is the right-hand side of ρ.
• We say that ρ = l ⊢ r is a delimited rewrite rule if l and r are of the same type.
• A delimited string-rewriting system (DSRS) ℛ is a finite set of delimited rewrite rules.

• The order extends to rules: (l₁, r₁) < (l₂, r₂) if and only if
  • l₁ < l₂, or
  • l₁ = l₂ and r₁ < r₂.
• A system is deterministic if no two rules share a common left-hand side.
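
A short sketch of the order lifted to rules, together with the determinism check (again with our own helper names):

```python
# The order lifted to rules, plus the determinism check.
LSENT, RSENT = "¢", "$"

def root(w):
    return w.removeprefix(LSENT).removesuffix(RSENT)

def term_key(w):
    return (len(root(w)), root(w), 1 + w.startswith(LSENT) + 2 * w.endswith(RSENT))

def rule_key(rule):
    """(l1, r1) < (l2, r2) iff l1 < l2, or l1 = l2 and r1 < r2."""
    l, r = rule
    return (term_key(l), term_key(r))

def is_deterministic(rules):
    """No two rules may share a common left-hand side."""
    lefts = [l for l, _ in rules]
    return len(lefts) == len(set(lefts))

rules = [("¢ab$", "¢$"), ("ab", "")]
print(sorted(rules, key=rule_key))   # [('ab', ''), ('¢ab$', '¢$')]
print(is_deterministic(rules))       # True
```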

• Given a DSRS ℛ and a string w, there may be several applicable rules.
• Nevertheless, only one rule is eligible: the rule having the smallest left-hand side.
• This rule might be applicable at different positions; we privilege the leftmost one.

• Given a DSRS ℛ and strings w₁, w₂ ∊ T(Σ), we say that w₁ rewrites in one step into w₂, written w₁ ⊢ℛ w₂ (or w₁ ⊢ w₂), if there exists an eligible rule (l ⊢ r) ∊ ℛ such that:
  • w₁ = ulv, w₂ = urv, and
  • u is shortest for this rule (the leftmost position).
• A string w is reducible if there exists a string w’ such that w ⊢ w’, and irreducible otherwise.
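
A sketch of one rewriting step on a delimited string: collect the applicable rules, pick the eligible one (smallest left-hand side in the term order), and apply it at the leftmost position. Rules are modeled as plain (l, r) string pairs:

```python
# One rewriting step; self-contained, helpers repeated from earlier sketches.
LSENT, RSENT = "¢", "$"

def root(w):
    return w.removeprefix(LSENT).removesuffix(RSENT)

def term_key(w):
    return (len(root(w)), root(w), 1 + w.startswith(LSENT) + 2 * w.endswith(RSENT))

def rewrite_once(rules, w):
    """Return the one-step successor of the delimited string w,
    or None if w is irreducible."""
    applicable = [(l, r) for (l, r) in rules if l in w]
    if not applicable:
        return None
    l, r = min(applicable, key=lambda rule: term_key(rule[0]))  # eligible rule
    i = w.find(l)                          # leftmost occurrence: w = u.l.v
    return w[:i] + r + w[i + len(l):]      # rewritten to u.r.v

print(rewrite_once([("ab", "")], "¢abaabb$"))  # ¢aabb$
```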

• We denote by ⊢*ℛ the reflexive and transitive closure of ⊢ℛ. We say that w₁ reduces to w₂, or that w₂ is derivable from w₁, if w₁ ⊢*ℛ w₂.
• Given a system ℛ and an irreducible string e ∊ Σ*, we define the language:
  L(ℛ, e) = {w ∊ Σ* | ¢w$ ⊢*ℛ ¢e$}.

• Examples:
  • L({ab ⊢ λ}, λ) is the Dyck language, e.g.: ¢abaabb$ ⊢ ¢aabb$ ⊢ ¢ab$ ⊢ ¢λ$.
  • L({aabb ⊢ ab, ¢ab$ ⊢ ¢λ$}, λ) = {aⁿbⁿ | n ≥ 0}, e.g.: ¢aaabbb$ ⊢ ¢aabb$ ⊢ ¢ab$ ⊢ ¢λ$.
  • L({¢ab ⊢ ¢}, λ) is the regular language (ab)*.
• It can be shown that any regular language can be represented in this way.

• Deciding whether a string w belongs to a language L(ℛ, e) consists of trying to obtain e from w.
• We denote by APPLY(ℛ, w) the string obtained by applying the rules of ℛ until no more rules can be applied.
• This extends naturally to sets:
  APPLY(ℛ, S) = { APPLY(ℛ, w) | w ∊ S }.
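
A self-contained sketch of APPLY and the induced membership test, replayed on the Dyck and aⁿbⁿ systems from the example above. Note that the loop assumes the system terminates on the given input:

```python
# APPLY(R, w): rewrite until irreducible; w is in L(R, e) iff
# ¢w$ reduces to ¢e$.  Helpers repeated so the block runs on its own.
LSENT, RSENT = "¢", "$"

def root(w):
    return w.removeprefix(LSENT).removesuffix(RSENT)

def term_key(w):
    return (len(root(w)), root(w), 1 + w.startswith(LSENT) + 2 * w.endswith(RSENT))

def rewrite_once(rules, w):
    applicable = [(l, r) for (l, r) in rules if l in w]
    if not applicable:
        return None
    l, r = min(applicable, key=lambda rule: term_key(rule[0]))
    i = w.find(l)
    return w[:i] + r + w[i + len(l):]

def apply_system(rules, w):
    """APPLY(R, w): keep rewriting until no rule applies (assumes termination)."""
    while (nxt := rewrite_once(rules, w)) is not None:
        w = nxt
    return w

def member(rules, e, w):
    """w is in L(R, e) iff ¢w$ reduces to ¢e$."""
    return apply_system(rules, LSENT + w + RSENT) == LSENT + e + RSENT

dyck = [("ab", "")]                       # L(., λ) = Dyck language
anbn = [("aabb", "ab"), ("¢ab$", "¢$")]   # L(., λ) = {a^n b^n | n >= 0}
print(member(dyck, "", "abaabb"), member(anbn, "", "aaabbb"))  # True True
print(member(anbn, "", "aabbb"))                               # False
```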

Algorithm LARS

• LARS stands for Learning Algorithm for Rewriting Systems.
• It generates the possible rules that can be applied over the positive data S⁺.
• It tries these rules and keeps them if they do not create an inconsistency (using the negative data S⁻ for that).
• The algorithm calls the function NEWRULE, which generates the next possible rule.

• One should choose useful rules, i.e. those that can be applied to at least one string from the positive data S⁺.
• Moreover, a rule should diminish the size of the set S⁺ (i.e. two different strings rewrite into an identical string).
• The function CONSISTENT checks the consistency of the system; one possible reading is sketched below.
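
The slides do not spell out CONSISTENT; one plausible reading, sketched here under that assumption, is that after reducing all samples no reduced negative sample may coincide with a reduced positive one. The parameter `normal_form` stands for the APPLY(ℛ, ·) function from the previous sketch:

```python
# One plausible reading of CONSISTENT (an assumption, not defined on
# the slides): the reduced positive and negative samples must not overlap.
def consistent(s_plus, s_minus, rules, normal_form):
    plus = {normal_form(rules, w) for w in s_plus}
    minus = {normal_form(rules, w) for w in s_minus}
    return plus.isdisjoint(minus)
```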

• The goal is to be able to learn any DSRS with LARS.
• The simplified version presented here identifies any DSRS in the limit.
• A formal study of the algorithm is beyond the scope of this presentation.

Input: S⁺, S⁻.
Output: ℛ.

ℛ := ∅; ρ := (λ ⊢ λ)
while |S⁺| > 1 do
    ρ := NEWRULE(S⁺, ρ)
    if CONSISTENT(S⁺, S⁻, ℛ ∪ {ρ}) then
        ℛ := ℛ ∪ {ρ}
        S⁺ := APPLY(ℛ, S⁺); S⁻ := APPLY(ℛ, S⁻)
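
Below is a rough, runnable end-to-end rendering of this loop, a simplification rather than the authors' exact algorithm: NEWRULE is modeled by enumerating candidate delimited rules in increasing rule order from substrings of the positive data, and a rule is kept when it both shrinks the reduced S⁺ (usefulness) and passes the disjointness consistency check. On the small sample at the bottom it recovers the Dyck system {ab ⊢ λ}:

```python
# A rough, runnable rendering of the LARS loop (a simplification,
# not the authors' exact algorithm).
LSENT, RSENT = "¢", "$"

def root(w):
    return w.removeprefix(LSENT).removesuffix(RSENT)

def term_key(w):
    return (len(root(w)), root(w), 1 + w.startswith(LSENT) + 2 * w.endswith(RSENT))

def rewrite_once(rules, w):
    applicable = [(l, r) for (l, r) in rules if l in w]
    if not applicable:
        return None
    l, r = min(applicable, key=lambda rule: term_key(rule[0]))
    i = w.find(l)
    return w[:i] + r + w[i + len(l):]

def apply_system(rules, w):
    while (nxt := rewrite_once(rules, w)) is not None:
        w = nxt
    return w

def apply_all(rules, strings):
    return {apply_system(rules, w) for w in strings}

def candidate_rules(s_plus, max_len=4):
    """All delimited rules l ⊢ r with root(l) a substring of some positive
    sample and r < l, in increasing rule order (plays the role of NEWRULE)."""
    terms = set()
    for w in s_plus:
        core = root(w)
        for i in range(len(core)):
            for j in range(i + 1, min(i + max_len, len(core)) + 1):
                sub = core[i:j]
                terms |= {sub, LSENT + sub, sub + RSENT, LSENT + sub + RSENT}
    rhs = terms | {"", LSENT, RSENT, LSENT + RSENT}
    rules = [(l, r) for l in terms for r in rhs
             if term_key(r) < term_key(l)
             and l.startswith(LSENT) == r.startswith(LSENT)   # same type,
             and l.endswith(RSENT) == r.endswith(RSENT)]      # i.e. delimited
    return sorted(rules, key=lambda lr: (term_key(lr[0]), term_key(lr[1])))

def lars(s_plus, s_minus):
    s_plus = {LSENT + w + RSENT for w in s_plus}
    s_minus = {LSENT + w + RSENT for w in s_minus}
    system = []
    for rule in candidate_rules(s_plus):
        if len(s_plus) <= 1:
            break
        trial = system + [rule]
        plus, minus = apply_all(trial, s_plus), apply_all(trial, s_minus)
        if len(plus) < len(s_plus) and plus.isdisjoint(minus):  # useful + consistent
            system, s_plus, s_minus = trial, plus, minus
    return system, sorted(s_plus)

# Positive and negative samples of the Dyck language over {a, b}:
print(lars({"", "ab", "abab", "aabb"}, {"a", "b", "ba", "aab"}))
# -> ([('ab', '')], ['¢$'])
```

The returned pair is the learned rule set together with the reduced positive data; the single surviving string ¢$ plays the role of ¢e$ with e = λ.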

References

• Alexander Clark (2010): Three Learnable Models for the Description of Language.
• Colin de la Higuera (2010): Grammatical Inference: Learning Automata and Grammars.