GRAMMATICAL INFERENCE OF LAMBDA-CONFLUENT CONTEXT REWRITING SYSTEMS Peter Černo Department of Computer Science Charles University in Prague, Faculty of Mathematics and Physics

Table of Contents: Part I: Motivation, Part II: Definitions, Part III: Learning Algorithm, Part IV: Results, Part V: Concluding Remarks.

Part I Motivation

Part I: Motivation Let's say that we have the following sentence: Andrej, Monika and Peter like kitesurfing. We would like to verify the syntactic correctness of this sentence. One way to do this is to use Analysis by Reduction.

Part I: Motivation Analysis by Reduction – Step-wise simplifications. Andrej, Monika and Peter like kitesurfing. Andrej and Peter like kitesurfing. They like kitesurfing.

Part I: Motivation But how can we learn these reductions?

Part I: Motivation Let’s say that we are lucky and have the following two sentences in our database: Andrej, Monika and Peter like kitesurfing. Andrej and Peter like kitesurfing.

Part I: Motivation From these two samples we can, for instance, infer the following instruction: Andrej, Monika and Peter like kitesurfing. Andrej and Peter like kitesurfing. Instruction: (, Monika → λ)

Part I: Motivation But is the instruction (, Monika → λ) correct?

Part I: Motivation But is the instruction (, Monika → λ) correct? Probably not: Peter goes with Andrej, Monika stays at home, and … Peter goes with Andrej stays at home, and …

Part I: Motivation What we need to do is to capture a context in which the instruction (, Monika → λ) is applicable, e.g. between the left context "¢ Andrej" and the right context "and Peter": Andrej, Monika and Peter like kitesurfing. Andrej and Peter like kitesurfing.

Part II Definitions

Part II: Definitions A Context Rewriting System (CRS) is a triple M = (Σ, Γ, I), where: Σ … input alphabet, Γ … working alphabet, Γ ⊇ Σ, ¢ and $ … sentinels, ¢, $ ∉ Γ, I … finite set of instructions (x, z → t, y) with: x ∊ {λ, ¢}.Γ* (left context), y ∊ Γ*.{λ, $} (right context), z ∊ Γ+, t ∊ Γ*, z ≠ t. The width of an instruction φ = (x, z → t, y) is |φ| = |xzty|.

Part II: Definitions – Rewriting uzv ⊢_M utv iff there is an instruction (x, z → t, y) ∊ I such that x is a suffix of ¢.u and y is a prefix of v.$. The language recognized by M is L(M) = {w ∊ Σ* | w ⊢*_M λ}.

Part II: Definitions – Empty Word Note: For every CRS M: λ ⊢*_M λ, hence λ ∊ L(M). Whenever we say that a CRS M recognizes a language L, we always mean that L(M) = L ∪ {λ}. We simply ignore the empty word in this setting.

Part II: Definitions – Example L = {a^n c b^n | n > 0} ∪ {λ}: take the CRS M = ({a, b, c}, {a, b, c}, I), where the instructions I are: R1 = (a, acb → c, b), R2 = (¢, acb → λ, $).
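To make the example concrete, here is a minimal Python sketch (ours, not part of the slides) of this particular CRS: an instruction is a 4-tuple (x, z, t, y), and membership is decided by exhaustively applying rewrites between the sentinels until λ is reached. The function names are illustrative.

# Instructions of M for L = {a^n c b^n | n > 0} ∪ {λ}: (x, z, t, y) means
# "rewrite z to t when the left context ends with x and the right starts with y".
INSTRUCTIONS = [
    ("a", "acb", "c", "b"),   # R1 = (a, acb → c, b)
    ("¢", "acb", "",  "$"),   # R2 = (¢, acb → λ, $)
]

def one_step(word):
    """All words reachable from `word` by one rewrite (uzv ⊢_M utv)."""
    s = "¢" + word + "$"                       # add the sentinels
    for x, z, t, y in INSTRUCTIONS:
        i = s.find(z, 1)
        while i != -1:
            if s[:i].endswith(x) and s[i + len(z):].startswith(y):
                yield s[1:i] + t + s[i + len(z):-1]
            i = s.find(z, i + 1)

def accepts(word):
    """True iff word ⊢*_M λ, i.e. word ∊ L(M); the rules here shorten the word."""
    return word == "" or any(accepts(w) for w in one_step(word))

print(accepts("aacbb"), accepts("acbb"))       # True False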

Part II: Definitions – Restrictions Context Rewriting Systems are too powerful. We consider the following restrictions: 1. Length of contexts = constant k. All instructions φ = (x, z → t, y) satisfy: x ∊ LC_k := Γ^k ∪ {¢}.Γ^{≤ k-1} (left context), y ∊ RC_k := Γ^k ∪ Γ^{≤ k-1}.{$} (right context). In the case k = 0 we use LC_k = RC_k = {λ}. We use the notation k-CRS. 2. Width of instructions ≤ constant l. All instructions φ = (x, z → t, y) satisfy |φ| = |xzty| ≤ l. We use the notation (k, l)-CRS.
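As an illustration of the context sets (again ours, with hypothetical helper names), the following sketch enumerates LC_k and RC_k for a given working alphabet.

from itertools import product

def words_up_to(gamma, n):
    # All words over gamma of length ≤ n, including the empty word λ.
    return {"".join(p) for i in range(n + 1) for p in product(gamma, repeat=i)}

def left_contexts(gamma, k):
    # LC_k = Γ^k ∪ {¢}.Γ^{≤ k-1}; for k = 0 only the empty context λ.
    if k == 0:
        return {""}
    return {"".join(p) for p in product(gamma, repeat=k)} | \
           {"¢" + w for w in words_up_to(gamma, k - 1)}

def right_contexts(gamma, k):
    # RC_k = Γ^k ∪ Γ^{≤ k-1}.{$}; for k = 0 only the empty context λ.
    if k == 0:
        return {""}
    return {"".join(p) for p in product(gamma, repeat=k)} | \
           {w + "$" for w in words_up_to(gamma, k - 1)}

# Example: Γ = {a, b}, k = 1 gives LC_1 = {a, b, ¢} and RC_1 = {a, b, $}.
print(sorted(left_contexts("ab", 1)), sorted(right_contexts("ab", 1)))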

Part II: Definitions – Restrictions Context Rewriting Systems are too powerful. We consider the following restrictions: 3. Restrict the instruction rules z → t (otherwise there are too many possibilities). All instructions φ = (x, z → t, y) satisfy one of: a) t = λ (Clearing Restarting Automata), b) t is a subword of z (Subword-Clearing Restarting Automata), c) |t| ≤ … 4. Lambda-confluence: we restrict the whole model to be lambda-confluent. This gives fast membership queries, but verifying lambda-confluence is undecidable. In addition, we assume no auxiliary symbols: Γ = Σ.

Part III Learning Algorithm

Part III: Learning Algorithm Consider a class ℳ of restricted CRS. Goal: learning ℒ(ℳ) from informant, i.e. identify any hidden target CRS from ℳ in the limit from positive and negative samples. Input: a set of positive samples S+ and a set of negative samples S-; we assume that S+ ∩ S- = ∅ and λ ∊ S+. Output: a CRS M from ℳ such that S+ ⊆ L(M) and L(M) ∩ S- = ∅.

Part III: Learning Restrictions Without restrictions the task is trivial even for Clearing Restarting Automata. Consider I = { (¢, w → λ, $) | w ∊ S+, w ≠ λ }: clearly L(M) = S+, where M = (Σ, Σ, I) (see the sketch below). Therefore, we impose an upper limit l ≥ 1 on the width of instructions.
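The trivial construction can be written down directly; a short Python sketch (the function name is ours):

def trivial_instructions(s_plus):
    # I = { (¢, w → λ, $) | w ∊ S+, w ≠ λ }: every positive sample is
    # cleared to λ in one step, so L(M) = S+ (with λ ∊ S+ by assumption).
    return {("¢", w, "", "$") for w in s_plus if w != ""}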

Part III: Learning Algorithm Outline: look at the positive samples and infer instruction candidates; look also at the negative samples and remove bad instructions; finally, simplify and check consistency.

Part III: Learning Algorithm Infer_ℳ Input: positive samples S+, negative samples S-, with S+ ∩ S- = ∅ and λ ∊ S+; maximal width of instructions l ≥ 1; specific length of contexts k ≥ 0.

Part III: Learning Algorithm – Step 1/5 Step 1: First, we obtain some set of instruction candidates. Let us assume, for a moment, that this set already contains all instructions of the hidden target CRS.

Part III: Learning Algorithm – Step 2/5 Step 2: We gradually remove all instructions that allow a single-step reduction from a negative sample to a positive sample. Such instructions violate the so-called error-preserving property.

Part III: Learning Algorithm – Step 3/5 Step 3: If the target class ℳ consists of lambda-confluent CRS : We gradually remove all instructions that allow a single-step reduction from a positive sample to a negative sample. Such instructions violate the so-called correctness-preserving property.

Part III: Learning Algorithm – Step 4/5 Step 4: We remove the redundant instructions. This step is optional and can be omitted – it does not affect the properties or the correctness of the Learning Algorithm.

Part III: Learning Algorithm – Step 5/5 Step 5: We check the consistency of the remaining set of instructions with the given input set of positive and negative samples.
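Putting Steps 2-5 together, the following is a hedged Python sketch of the whole pipeline, not the authors' implementation: it takes the candidate set of Step 1 as a parameter (playing the role of Assumptions), assumes length-reducing rules so that the reachability search terminates, and returns None in place of Fail.

def applicable(instr, word):
    """All results of one application of instr = (x, z, t, y) to word."""
    x, z, t, y = instr
    s, out = "¢" + word + "$", []
    i = s.find(z, 1)
    while i != -1:
        if s[:i].endswith(x) and s[i + len(z):].startswith(y):
            out.append(s[1:i] + t + s[i + len(z):-1])
        i = s.find(z, i + 1)
    return out

def infer(s_plus, s_minus, candidates, lambda_confluent=True):
    """Steps 2-5; `candidates` plays the role of Assumptions(S+, l, k)."""
    instructions = set(candidates)

    # Step 2: drop instructions allowing a one-step reduction from a
    # negative to a positive sample (they violate error preservation).
    instructions = {phi for phi in instructions
                    if not any(w in s_plus for neg in s_minus
                               for w in applicable(phi, neg))}

    # Step 3 (lambda-confluent classes only): drop instructions allowing a
    # one-step reduction from a positive to a negative sample
    # (they violate correctness preservation).
    if lambda_confluent:
        instructions = {phi for phi in instructions
                        if not any(w in s_minus for pos in s_plus
                                   for w in applicable(phi, pos))}

    # Step 4 (removing redundant instructions) is optional and omitted here.

    # Step 5: consistency check: every positive sample must reduce to λ,
    # and no negative sample may reduce to λ.
    def to_empty(word):
        # assumes length-reducing rules (true for the classes above),
        # so the recursion terminates
        return word == "" or any(to_empty(w) for phi in instructions
                                 for w in applicable(phi, word))

    consistent = all(to_empty(w) for w in s_plus) and \
                 not any(to_empty(w) for w in s_minus)
    return instructions if consistent else None   # None plays the role of Fail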

Part III: Complexity Time complexity of the Algorithm depends on: Time complexity of the function Assumptions, Time complexity of the simplification, Time complexity of the consistency check. There are correct implementations of the function Assumptions that run in polynomial time. The simplification and the consistency check can be done in polynomial time when using lambda-confluent CRS. Otherwise, it is an open problem.

Part III: Assumptions We call the function Assumptions correct if it is possible to obtain all instructions of any hidden target CRS in the limit by using this function. More precisely: for every minimal (k, l)-CRS M there exists a finite set S0+ ⊆ L(M) such that for every S+ ⊇ S0+ the set Assumptions(S+, l, k) contains all instructions of M.

Part III: Example – Assumptions_weak Assumptions_weak(S+, l, k) := all instructions (x, z → t, y) such that: the length of contexts is k, i.e. x ∊ Σ^k ∪ {¢}.Σ^{≤ k-1} (left context) and y ∊ Σ^k ∪ Σ^{≤ k-1}.{$} (right context); the width is bounded by l, i.e. |xzty| ≤ l; the rule z → t satisfies all rule restrictions; and there are two words w1, w2 ∊ S+ such that xzy is a subword of ¢w1$ and xty is a subword of ¢w2$. This function is correct and runs in polynomial time.
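For concreteness, here is a Python sketch of Assumptions_weak restricted to the subword-clearing case (t a proper subword of z); the factor-splitting approach and all names are ours, so treat it as an illustration of the definition rather than a reference implementation.

def assumptions_weak(s_plus, l, k):
    """Candidate instructions (x, z → t, y): contexts of length k, width
    |xzty| ≤ l, t a proper subword of z (subword-clearing restriction),
    and both xzy and xty occur as factors of some ¢w$ with w ∊ S+."""
    sentenced = ["¢" + w + "$" for w in s_plus]
    factors = {s[i:j] for s in sentenced
               for i in range(len(s)) for j in range(i, len(s) + 1)}

    def over_sigma(w):                     # w contains no sentinel
        return "¢" not in w and "$" not in w

    def is_left(x):                        # x ∊ Σ^k ∪ {¢}.Σ^{≤ k-1}
        if k == 0:
            return x == ""
        return (len(x) == k and over_sigma(x)) or \
               (x[:1] == "¢" and len(x) <= k and over_sigma(x[1:]))

    def is_right(y):                       # y ∊ Σ^k ∪ Σ^{≤ k-1}.{$}
        if k == 0:
            return y == ""
        return (len(y) == k and over_sigma(y)) or \
               (y[-1:] == "$" and len(y) <= k and over_sigma(y[:-1]))

    def subwords(z):                       # all factors of z except z itself
        return {z[a:b] for a in range(len(z))
                for b in range(a, len(z) + 1)} - {z}

    candidates = set()
    for f in factors:                      # any split f = x.z.y has xzy ∊ factors
        for i in range(len(f) + 1):
            for j in range(i + 1, len(f) + 1):
                x, z, y = f[:i], f[i:j], f[j:]
                if not (over_sigma(z) and is_left(x) and is_right(y)):
                    continue
                for t in subwords(z):
                    if len(x + z + t + y) <= l and (x + t + y) in factors:
                        candidates.add((x, z, t, y))
    return candidates

# With S+ = {"", "acb", "aacbb"}, l = 6, k = 1 the candidates include the two
# instructions R1 = ("a", "acb", "c", "b") and R2 = ("¢", "acb", "", "$") of Part II.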

Part IV Results

Part IV: Results Let ℳ be a class of restricted (k, l)-CRS and M a model from ℳ. Then there exist finite sets S0+, S0- of positive and negative samples such that for every S+ ⊇ S0+ and S- ⊇ S0- consistent with M: Infer_ℳ(S+, S-, k, l) = N with L(N) = L(M). Positive side: the class ℒ(ℳ) is learnable in the limit from informant. Negative side: size(S0+, S0-) can be exponentially large w.r.t. size(M); we do not know k, l; and if l is specified, ℒ(ℳ) is finite!

Part IV: Unconstrained Learning UnconstrainedInfer_ℳ Input: positive samples S+, negative samples S-, with S+ ∩ S- = ∅ and λ ∊ S+; specific length of contexts k ≥ 0 (no width limit l is given).

Part IV: Results Let ℳ be a class of restricted k-CRS and M a model from ℳ. Then there exist finite sets S0+, S0- of positive and negative samples such that for every S+ ⊇ S0+ and S- ⊇ S0- consistent with M: UnconstrainedInfer_ℳ(S+, S-, k) = N with L(N) = L(M), and N has minimal width! Positive side: the infinite class ℒ(ℳ) is learnable in the limit from informant. Negative side: size(S0+, S0-) can be exponentially large w.r.t. size(M), and we do not know k.

Part V Concluding Remarks

Part V: Concluding Remarks Remarks: We have shown that ℒ(ℳ) is learnable in the limit from informant for any class ℳ of restricted k-CRS. UnconstrainedInfer_ℳ(S+, S-, k) always returns a model consistent with the given input S+, S-; in the worst case it returns I = { (¢, w → λ, $) | w ∊ S+, w ≠ λ }. This is not true for Infer_ℳ(S+, S-, k, l) (it can fail). In some cases, finding a consistent model with maximal width l is NP-hard. If ℳ is a class of lambda-confluent k-CRS, then UnconstrainedInfer runs in polynomial time w.r.t. size(S+, S-). But in most cases it is not possible to verify lambda-confluence; the property is often not even recursively enumerable. If ℳ is a class of ordinary k-CRS, the time complexity of UnconstrainedInfer is an open problem.

Selected References
M. Beaudry, M. Holzer, G. Niemann, and F. Otto. McNaughton families of languages. Theoretical Computer Science, 290(3).
Ronald V. Book and Friedrich Otto. String-Rewriting Systems. Springer-Verlag, New York, NY, USA.
Peter Černo. Clearing restarting automata and grammatical inference. In: J. Heinz, C. de la Higuera, T. Oates (eds.), Proceedings of the Eleventh International Conference on Grammatical Inference, JMLR Workshop and Conference Proceedings 21, 2012.
Peter Černo and František Mráz. Clearing restarting automata. Fundamenta Informaticae, 104(1):17-54.
C. de la Higuera. Grammatical Inference: Learning Automata and Grammars. Cambridge University Press, New York, NY, USA.
R. Eyraud, C. de la Higuera, and J.-C. Janodet. LARS: A learning algorithm for rewriting systems. Machine Learning, 66:7-31.
E. Mark Gold. Complexity of automaton identification from given data. Information and Control, 37.
John E. Hopcroft and J. D. Ullman. Formal Languages and their Relation to Automata. Addison-Wesley, Reading.
S. Lange, T. Zeugmann, and S. Zilles. Learning indexed families of recursive languages from positive data: a survey. Theoretical Computer Science, 397(1-3).
R. McNaughton. Algebraic decision procedures for local testability. Theory of Computing Systems, 8:60-76.
F. Otto. Restarting automata. In: Zoltán Ésik, Carlos Martín-Vide, and Victor Mitrana (eds.), Recent Advances in Formal Languages and Applications, volume 25 of Studies in Computational Intelligence. Springer, Berlin.
F. Otto and F. Mráz. Lambda-Confluence is Undecidable for Clearing Restarting Automata. In: CIAA 2013, Proceedings, LNCS 7982, Berlin, 2013.

Thank You! This presentation is available on: An implementation of the algorithms can be found on: