Peter Černo and František Mráz. Introduction Δ -Clearing Restarting Automata: Restricted model of Restarting Automata. In one step (based on a limited.

Slides:



Advertisements
Similar presentations
Chapter 5 Pushdown Automata
Advertisements

C O N T E X T - F R E E LANGUAGES ( use a grammar to describe a language) 1.
1 1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 3 School of Innovation, Design and Engineering Mälardalen University 2012.
Closure Properties of CFL's
Pushdown Automata Chapter 12. Recognizing Context-Free Languages Two notions of recognition: (1) Say yes or no, just like with FSMs (2) Say yes or no,
CFGs and PDAs Sipser 2 (pages ). Long long ago…
CSCI 4325 / 6339 Theory of Computation Zhixiang Chen Department of Computer Science University of Texas-Pan American.
Simplifying CFGs There are several ways in which context-free grammars can be simplified. One natural way is to eliminate useless symbols those that cannot.
Chapter 4 Normal Forms for CFGs Chomsky Normal Form n Defn A CFG G = (V, , P, S) is in chomsky normal form if each rule in G has one of.
1 Introduction to Computability Theory Lecture12: Decidable Languages Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture12: Reductions Prof. Amos Israeli.
CFGs and PDAs Sipser 2 (pages ). Last time…
Introduction to Computability Theory
1 Introduction to Computability Theory Lecture7: PushDown Automata (Part 1) Prof. Amos Israeli.
CS5371 Theory of Computation
CS Master – Introduction to the Theory of Computation Jan Maluszynski - HT Lecture 4 Context-free grammars Jan Maluszynski, IDA, 2007
Monotone Deterministic RL-Automata Don’t Need Auxiliary Symbols Supported by grants from DFG and the Grant Agency of the Czech Republic Tomasz Jurdziński.
104 Closure Properties of Regular Languages Regular languages are closed under many set operations. Let L 1 and L 2 be regular languages. (1) L 1  L 2.
79 Regular Expression Regular expressions over an alphabet  are defined recursively as follows. (1) Ø, which denotes the empty set, is a regular expression.
Automata & Formal Languages, Feodor F. Dragan, Kent State University 1 CHAPTER 5 Reducibility Contents Undecidable Problems from Language Theory.
Normal forms for Context-Free Grammars
Context-Free Grammars Chapter 3. 2 Context-Free Grammars and Languages n Defn A context-free grammar is a quadruple (V, , P, S), where  V is.
CS5371 Theory of Computation Lecture 12: Computability III (Decidable Languages relating to DFA, NFA, and CFG)
Finite State Machines Data Structures and Algorithms for Information Processing 1.
Chapter 12: Context-Free Languages and Pushdown Automata
LIMITED CONTEXT RESTARTING AUTOMATA AND MCNAUGHTON FAMILIES OF LANGUAGES Friedrich Otto Peter Černo, František Mráz.
Diploma Thesis Clearing Restarting Automata Peter Černo RNDr. František Mráz, CSc.
Formal Language Finite set of alphabets Σ: e.g., {0, 1}, {a, b, c}, { ‘{‘, ‘}’ } Language L is a subset of strings on Σ, e.g., {00, 110, 01} a finite language,
1 Introduction to Parsing Lecture 5. 2 Outline Regular languages revisited Parser overview Context-free grammars (CFG’s) Derivations.
DIPLOMA THESIS Peter Černo Clearing Restarting Automata Supervised by RNDr. František Mráz, CSc.
::ICS 804:: Theory of Computation - Ibrahim Otieno SCI/ICT Building Rm. G15.
Pushdown Automata (PDA) Intro
CLEARING RESTARTING AUTOMATA AND GRAMMATICAL INFERENCE Peter Černo Department of Computer Science Charles University in Prague, Faculty of Mathematics.
Context-Free Grammars Normal Forms Chapter 11. Normal Forms A normal form F for a set C of data objects is a form, i.e., a set of syntactically valid.
The Pumping Lemma for Context Free Grammars. Chomsky Normal Form Chomsky Normal Form (CNF) is a simple and useful form of a CFG Every rule of a CNF grammar.
Introduction to CS Theory Lecture 3 – Regular Languages Piotr Faliszewski
Pushdown Automata (PDAs)
Learning Automata and Grammars Peter Černo.  The problem of learning or inferring automata and grammars has been studied for decades and has connections.
GRAMMATICAL INFERENCE OF LAMBDA-CONFLUENT CONTEXT REWRITING SYSTEMS Peter Černo Department of Computer Science Charles University in Prague, Faculty of.
Peter Černo.  Suppose we have a sample computation for (delta) clearing restarting automata.  Suppose that the inferred automaton accepts some wrong.
Computability Construct TMs. Decidability. Preview: next class: diagonalization and Halting theorem.
Counting CSC-2259 Discrete Structures Konstantin Busch - LSU1.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 3 Mälardalen University 2010.
Section 12.4 Context-Free Language Topics
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 11 Midterm Exam 2 -Context-Free Languages Mälardalen University 2005.
Overview of Previous Lesson(s) Over View  Symbol tables are data structures that are used by compilers to hold information about source-program constructs.
1 Simplification of Context-Free Grammars Some useful substitution rules. Removing useless productions. Removing -productions. Removing unit-productions.
1 Turing’s Thesis. 2 Turing’s thesis: Any computation carried out by mechanical means can be performed by a Turing Machine (1930)
 2005 SDU Lecture13 Reducibility — A methodology for proving un- decidability.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
1 Chapter 6 Simplification of CFGs and Normal Forms.
Grammars A grammar is a 4-tuple G = (V, T, P, S) where 1)V is a set of nonterminal symbols (also called variables or syntactic categories) 2)T is a finite.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 3 Mälardalen University 2007.
1 Alphabets: An Alphabet is a finite set of symbols. We will usually use  to denote the alphabet of input symbols or “terminal characters.” String: A.
98 Nondeterministic Automata vs Deterministic Automata We learned that NFA is a convenient model for showing the relationships among regular grammars,
CSCI 4325 / 6339 Theory of Computation Zhixiang Chen Department of Computer Science University of Texas-Pan American.
Chapter 5 Finite Automata Finite State Automata n Capable of recognizing numerous symbol patterns, the class of regular languages n Suitable for.
Unrestricted Grammars
1 A well-parenthesized string is a string with the same number of (‘s as )’s which has the property that every prefix of the string has at least as many.
Pushdown Automata Chapter 12. Recognizing Context-Free Languages Two notions of recognition: (1) Say yes or no, just like with FSMs (2) Say yes or no,
1 Chapter Pushdown Automata. 2 Section 12.2 Pushdown Automata A pushdown automaton (PDA) is a finite automaton with a stack that has stack operations.
1 Section 12.2 Pushdown Automata A pushdown automaton (PDA) is a finite automaton with a stack that has stack operations pop, push, and nop. PDAs always.
 2005 SDU Lecture11 Decidability.  2005 SDU 2 Topics Discuss the power of algorithms to solve problems. Demonstrate that some problems can be solved.
Lecture 14: Theory of Automata:2014 Finite Automata with Output.
Lecture 11  2004 SDU Lecture7 Pushdown Automaton.
Context-Free Grammars: an overview
Programming Languages Translator
Complexity and Computability Theory I
Theorem 29 Given any PDA, there is another PDA that accepts exactly the same language with the additional property that whenever a path leads to ACCEPT,
CHAPTER 2 Context-Free Languages
Presentation transcript:

Peter Černo and František Mráz

Introduction Δ -Clearing Restarting Automata: Restricted model of Restarting Automata. In one step (based on a limited context): Delete a substring, Replace a substring by Δ. The main result: Δ - clearing restarting automata recognize all context-free languages.

Example

Restarting Automata Restarting Automata: Tool for modeling some techniques for natural language processing. Analysis by Reduction: Method for checking [non-]correctness of a sentence. Iterative application of simplifications. Until the input cannot be simplified anymore.

Organization 1. Δ -clearing restarting automata. 2. Δ* -clearing restarting automata. 3. Δ* -clearing restarting automata recognize CFL. 4. Special coding. 5. Reduction: Δ* - to Δ -clearing restarting automata.

Δ -Clearing Restarting Automata Let k be a positive integer. k- Δ -clearing restarting automaton ( k-Δcl-RA ) Is a couple M = (Σ, I) : Σ … input alphabet, ¢, $, Δ ∉ Σ, Γ … working alphabet, Γ = Σ ∪ {Δ} I … finite set of instructions (x, z → t, y) : x ∊ {¢, λ}.Γ *, |x|≤k (left context) y ∊ Γ *.{λ, $}, |y|≤k (right context) z ∊ Γ +, t ∊ {λ, Δ}. ¢ and $ … sentinels.

Rewriting uzv ⊢ M utv iff ∃ φ = (x, z → t, y) ∊ I : x is a suffix of ¢.u and y is a prefix of v.$. L(M) = {w ∊ Σ* | w ⊢ * M λ}. L C (M) = {w ∊ Γ* | w ⊢ * M λ}.

Empty Word Note: For every Δcl-RA M: λ ⊢ * M λ hence λ ∊ L(M). Whenever we say that Δcl-RA M recognizes a language L, we always mean that L(M) = L ∪ {λ}.

Example 1 L 1 = {a n b n | n > 0} ∪ {λ} : 1-Δcl-RA M = ({a, b}, I), Instructions I are: R1 = (a, ab → λ, b), R2 = (¢, ab → λ, $). Note: We did not use Δ.

Example 2 L 2 = {a n cb n | n > 0} ∪ {λ} : 1-Δcl-RA M = ({a, b, c}, I), Instructions I are: R1 = (a, c → Δ, b), R2 = (a, aΔb → Δ, b), R3 = (¢, aΔb → λ, $). Note: We must use Δ.

Organization 1. Δ -clearing restarting automata. 2. Δ* -clearing restarting automata. 3. Δ* -clearing restarting automata recognize CFL. 4. Special coding. 5. Reduction: Δ* - to Δ -clearing restarting automata.

Δ* -Clearing Restarting Automata Δ* -clearing restarting automata Similar to Δ -clearing restarting automata. We allow instructions (x, z → Δ k, y), where k ≤ |z|.

Organization 1. Δ -clearing restarting automata. 2. Δ* -clearing restarting automata. 3. Δ* -clearing restarting automata recognize CFL. 4. Special coding. 5. Reduction: Δ* - to Δ -clearing restarting automata.

Δ*cl-RA and CFL Theorem: For each context-free language L there exists a 1-Δ*cl-RA M recognizing L. Idea. M works in a bottom-up manner. 1. If the input is small, M may “clear” the whole input. 2. If the input is long, M may “replace” some subword by the “code” of nonterminal.

Δ*cl-RA and CFL If the input is small:

Δ*cl-RA and CFL If the input is long:

Δ*cl-RA and CFL Proof. L … context-free language over Σ. G = (V N, V T, S, P) … context-free grammar : G is in: Chomsky normal form, G generates: L(G) = L – {λ}, Nonterminals: V N = {N 1, N 2, …, N m }, Terminals: V T = Σ, Γ = Σ ∪ {Δ}, Start: S = N 1, ¢, $, Δ ∉ V N ∪ V T.

Δ*cl-RA and CFL Proof (Continued). Auxiliary G’ = (V N, V’ T, S, P’) obtained from G : 1. By adding symbol Δ to V T, V’ T = V T ∪ {Δ} = Γ, 2. By adding productions N i → aΔ i b to P, P’ = P ∪ { N i → aΔ i b | i = 1, …, m; a, b ∊ V T }. Our goal: 1-Δ*cl-RA M : L C (M) = L(G’) ∪ {λ}. Then: L(M) = L C (M) ∩ Σ* = L(G) ∪ {λ}.

Δ*cl-RA and CFL Proof (Continued). aΔ i b code for N i ( ∀ a, b ∊ V T ). a, b ∊ V T separators (between codes).

Δ*cl-RA and CFL Proof (Continued). Idea: If z can be derived from N i (in G’ ), Then M can replace z by a “code” for N i. M replaces only the inner part of z by Δ i. M leaves first and last letter of z as separator.

Δ*cl-RA and CFL Idea:

Δ*cl-RA and CFL Proof (Continued). Two problems: 1. | Inner part | ≥ |Δ i |. 2. Finite many instructions. Proposition: For any w ∊ L(G’) : If |w| > c = |V N | + 2, then w = x z y : 1. c < |z| ≤ 2c, 2. S ⇒ * x N i y ⇒ * x z y for some N i.

Δ*cl-RA and CFL Proof (Continued). Construction: I 1 … set of all instructions: (¢, w → λ, $) Where w ∊ L(G’) and |w| ≤ c. This resolves the “small” inputs.

Δ*cl-RA and CFL Proof (Continued). Construction: For every N i ⇒ * z 1 … z s, where c < s ≤ 2c : (z 1, z 2 … z s-1 → Δ i, z s ) I 2 … set of all such instructions. I 1, I 2 … finite sets of instructions. M = (Σ, I 1 ∪ I 2 ) … required automaton. Q.E.D. ∎

Generalization We can choose:

Generalization Observation: For every t ≥ 1 … ∃ m 1, k : z contains v ∊ Σ ≥ t … empty space.

Trivial Reduction Why empty space? Trivial simulation:

Trivial Reduction Why empty space? Partial Δ -instructions: φ 1 = (x, z 1 → Δ, z 2 z 3 … z s y), φ 2 = (x Δ, z 2 → Δ, z 3 … z s y), … φ r = (x Δ r-1, z r … z s → Δ, y). Problem: The equivalence is not guaranteed.

Avoiding Conflicts How to avoid conflicts? We can encode some extra information into z.

Organization 1. Δ -clearing restarting automata. 2. Δ* -clearing restarting automata. 3. Δ* -clearing restarting automata recognize CFL. 4. Special coding. 5. Reduction: Δ* - to Δ -clearing restarting automata.

Coding We require: Coding by means of Δ -clearing restarting automata. Ability to recover the original word at any time.

Alice and Bob Consider the following game:

Alice and Bob

Protocol We assume: fixed alphabet Σ, fixed length of all initial messages given to Alice. Is there any such protocol? Yes. Basic intuition: Alice adds information by choosing a position of Δ. Alice loses information by deleting one letter.

Coding Idea: Length of | messages | = |Σ|. For Σ = {a, b, c} :

Example – 3-Letter Alphabet Perfect matching for Σ = {a, b, c} : aaa ↔ Δaabaa ↔ bΔacaa ↔ cΔa aab ↔ Δabbab ↔ bΔbcab ↔ caΔ aac ↔ aaΔbac ↔ baΔcac ↔ Δac aba ↔ Δbabba ↔ bbΔcba ↔ cbΔ abb ↔ aΔbbbb ↔ Δbbcbb ↔ cΔb abc ↔ abΔbbc ↔ Δbccbc ↔ cΔc aca ↔ aΔabca ↔ Δcacca ↔ ccΔ acb ↔ acΔbcb ↔ bcΔccb ↔ Δcb acc ↔ aΔcbcc ↔ bΔcccc ↔ Δcc

Coding – Encoding Example Consider word w over Σ = {a, b, c} : w = accbabccacaabbcabcbcacaa. Let us factorize w into groups of |Σ| = 3 letters: w = acc | bab | cca | caa | bbc | abc | bca | caa. We want to encode information i into w : i =

Coding – Encoding Example i = : w = acc | bab | cca | caa | bbc | abc | bca | caa, w' = aΔc | bΔb | cca | caa | Δbc | abc | bca | caa. aaa ↔ Δaabaa ↔ bΔacaa ↔ cΔa aab ↔ Δabbab ↔ bΔbcab ↔ caΔ aac ↔ aaΔbac ↔ baΔcac ↔ Δac aba ↔ Δbabba ↔ bbΔcba ↔ cbΔ abb ↔ aΔbbbb ↔ Δbbcbb ↔ cΔb abc ↔ abΔbbc ↔ Δbccbc ↔ cΔc aca ↔ aΔabca ↔ Δcacca ↔ ccΔ acb ↔ acΔbcb ↔ bcΔccb ↔ Δcb acc ↔ aΔcbcc ↔ bΔcccc ↔ Δcc

Coding – Major Drawback The major drawback: If w does not start with left sentinel ¢ then we cannot factorize w into groups of |Σ| letters. Word w can be factorized as: 1. acc | bab | cca | caa | bbc | abc | bca | caa, 2. ac | cba | bcc | aca | abb | cab | cbc | aca | a, 3. a | ccb | abc | cac | aab | bca | bcb | cac | aa.

Coding – Fixed Points Simple trick: To factorize w we need some “fixed point”. The left sentinel ¢ is one example. In the first phase we distribute fixed points throughout the whole input tape.

Coding – Fixed Points Suppose that we have: w = abaccΔaccbabccacaabbcabcbcacaa. The symbol Δ in w is our fixed point: w = abaccΔ | acc | bab | cca | caa | bbc | abc | bca | caa. Now we can place the next fixed point: w = abaccΔ | acc | bab | cca | caa | bbc | abc | bca | cΔa.

Organization 1. Δ -clearing restarting automata. 2. Δ* -clearing restarting automata. 3. Δ* -clearing restarting automata recognize CFL. 4. Special coding. 5. Reduction: Δ* - to Δ -clearing restarting automata.

Algorithmic Viewpoint Imagine a Δ -clearing restarting automaton as a nondeterministic machine N, which repeatedly executes the following two steps: 1. “Choosing Step”: N chooses a subword w of the input ¢u$, |w| ≤ K. ( K is a fixed constant) 2. “Solving Step”: N runs a computation on w, which either rejects, or replaces a subword of w by λ or Δ. N accepts u iff it can “clear” the whole word u.

Algorithmic Viewpoint Illustration:

Algorithmic Viewpoint To define any Δ -clearing restarting automaton: 1. Define the solving algorithm S, called the solver. 2. Show the existence of a suitable limit K. We put no resource limits on the solving algorithm.

Idea of the Algorithm Consider Δ*cl-RA M whose [generalized] construction was based on a context-free grammar G in ChNF. We want the solving algorithm S imitating M. We do not preserve the original representation of M.

Idea of the Algorithm First, we distribute fixed points throughout the whole input tape in approximately equal distances:

Idea of the Algorithm Suppose that w already contains fixed points. 1. We [internally] translate Δ symbols occurring in w. 2. We find an instruction φ = (x, z → Δ r, y) of the original Δ*cl-RA M applicable inside w. 3. If there is no such instruction, reject.

Idea of the Algorithm Suppose φ = (x, z → Δ r, y) is applicable inside w.

Idea of the Algorithm φ = (x, z → Δ r, y) is applicable inside w Our goal: replace z by Δ r. To avoid conflicts we encode information into z. z contains a long enough empty space v ∊ Σ*. v may be interrupted by fixed points. Space between fixed points is long enough. We choose one such space … working space. We reserve this space … reference point Δ.

Idea of the Algorithm Illustration: φ = (x, z → Δ r, y)

Idea of the Algorithm Working space: φ = (x, z → Δ r, y)

Idea of the Algorithm Cleaning: φ = (x, z → Δ r, y)

Conclusion

References Černo, P., Mráz, F.: Clearing restarting automata. Fundamenta Informaticae 104(1), 17 – 54 (2010) Černo, P., Mráz, F.: Delta-clearing restarting automata and CFL. Tech. rep., Charles University, Faculty of Mathematics and Physics, Prague (2011)