1 Polynomial Time Probabilistic Learning of a Subclass of Linear Languages with Queries
Yasuhiro TAJIMA, Yoshiyuki KOTANI
Tokyo Univ. of Agri. & Tech.

2 This talk…
A probabilistic learning algorithm for a subclass of linear languages with membership queries.
Learning via queries + special examples → probabilistic learning.
We use two translation algorithms:
representative sample → random examples
equivalence query → random examples

3 Motivations
A simple deterministic grammar (SDG) has at most one rule for every pair of a nonterminal and a terminal
⇒ a learning algorithm for SDGs from membership queries and a representative sample (Tajima et al. 2004)
⇒ extend this approach to linear languages
(Diagram: language class inclusions among regular languages, linear languages, SDLs, and CFLs.)

4 Linear grammar
A context-free grammar is a linear grammar if every rule is of the form A → uBv or A → w
(A, B: nonterminal; u, v, w: terminal strings).
Any linear grammar can be written as an equivalent RL-linear grammar, in which every rule is right linear or left linear and each nonterminal has only right linear rules or only left linear rules.
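For illustration (an example added here, not taken from the slides): the grammar with the two rules S → aSb and S → ab is linear and generates {a^n b^n | n ≥ 1}; a purely right linear grammar such as S → aS, S → a is the special case in which the right terminal string is always empty.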

5 Strict-deterministic linear grammar
An RL-linear grammar is strict-det linear if any two distinct rules with the same left-hand side are distinguished by their terminal symbols.
Ex) A → aB and A → cD (or A → Ba and A → Dc) with a ≠ c, for some a, B, c, D.

6 Deterministic linear grammar
A linear grammar is deterministic linear (DL) if every rule is of the form A → aBu or A → λ
(a: terminal; B: nonterminal; u: terminal string), and A → aBu, A → aB′u′ ∈ P implies B = B′ and u = u′.
Theorem (de la Higuera, Oncina 2002): DL languages are identifiable in the limit from polynomial time and data.

7 MAT learning (Angluin 1987)
The learner interacts with a teacher who holds the target language:
membership query → answer yes or no
equivalence query (the learner's hypothesis) → answer yes, or return a counterexample

8 PAC learning (Valiant 1984)
PAC: Probably Approximately Correct.
Examples of the target concept are drawn according to a probability distribution; from them the learning algorithm outputs a hypothesis that is PAC, i.e. close to the target with high probability.
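In symbols (the standard Valiant-style criterion; G_*, H, D, ε, δ are reconstructed names for the target grammar, the hypothesis, the distribution, the error parameter, and the confidence parameter, used consistently in the rest of this transcript):

  \Pr\bigl[\, P_D\bigl(L(G_*) \,\triangle\, L(H)\bigr) \le \varepsilon \,\bigr] \ge 1 - \delta,

where \triangle denotes symmetric difference and P_D(X) is the probability mass that D assigns to the set X.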

9 If every equivalence query is answered with random examples and the returned hypothesis is consistent with them, the result is a PAC learning algorithm (Angluin 1987): as long as the algorithm can always produce a hypothesis consistent with the examples drawn so far, the class is PAC learnable.
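A sketch of the standard simulation argument behind this claim (the constants below are the usual ones and are an assumption, not copied from the slide): replace the i-th equivalence query by a batch of m_i random examples and answer "yes" only if the current hypothesis H_i agrees with all of them. If H_i in fact has error greater than ε, then

  \Pr[H_i \text{ survives the batch}] \le (1-\varepsilon)^{m_i} \le e^{-\varepsilon m_i} \le \delta \, 2^{-i}
  \quad \text{for} \quad m_i \ge \frac{1}{\varepsilon}\Bigl(\ln\frac{1}{\delta} + i \ln 2\Bigr),

so the probability of ever accepting a bad hypothesis is at most \sum_{i \ge 1} \delta \, 2^{-i} \le \delta.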

10 Probabilistic learning with queries
The learning algorithm may ask membership queries about the target language (answered yes or no) and may draw random examples from an example oracle; it finally outputs a hypothesis.

11 Representative sample for a Strict-det
G: a Strict-det linear grammar. A finite set Q ⊆ L(G) is a representative sample (RS) for G if all rules of G are used to generate Q, i.e. every rule appears in the derivation of some string of Q.
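Written out (a reconstruction of the usual definition; the symbols S, P, ⇒ are assumed, not preserved in the transcript): Q \subseteq L(G) is an RS iff for every rule r = A \to \alpha \in P there exist w \in Q and terminal strings x, y such that

  S \Rightarrow^{*} xAy \Rightarrow x\alpha y \Rightarrow^{*} w.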

12 Example
For a concrete grammar, any finite subset of its language whose derivations together use every rule of the grammar is a representative sample (RS).
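For instance (an illustrative example, not the one shown on the slide): for the linear grammar with rules S → aSb and S → ab, the derivation S ⇒ aSb ⇒ aabb of the single string aabb uses both rules, so Q = {aabb} is a representative sample.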

13 Rule occurring probability
G_*: the target grammar; D: a probability distribution on the strings from which examples are drawn; ε: error parameter; δ: confidence parameter; |P|: the number of the target grammar's rules.
For every rule r ∈ P, define the following quantity.

14 The rule occurring probability p_r of a rule r is the probability that r appears in the derivation of an example, i.e. the probability that an example w is drawn according to D and r is used in the derivation of w.
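In symbols (with G_*, D, p_r as the reconstructed names introduced above):

  p_r = \Pr_{w \sim D}\bigl[\, r \text{ is used in the derivation of } w \text{ in } G_* \,\bigr].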

15 Suppose every rule of the target grammar has a rule occurring probability bounded away from zero, and let m be sufficiently large. Then the set of m random examples contains a representative sample (RS) with high probability.
Proof idea: bound the probability of the event "some rule does not appear in the derivations of the m examples".
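A sketch of the calculation (the exact bound on the slide is not preserved; assume every rule r satisfies p_r ≥ p_min > 0): by a union bound,

  \Pr[\exists\, r \in P \text{ unused by all } m \text{ examples}] \le \sum_{r \in P} (1 - p_r)^{m} \le |P| \, e^{-p_{\min} m} \le \delta
  \quad \text{for} \quad m \ge \frac{1}{p_{\min}} \ln\frac{|P|}{\delta},

so with probability at least 1 − δ every rule is used by some example, and the m examples contain an RS.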

16 We can conclude that:
1. Equivalence queries can be replaced by random examples.
2. A representative sample can be replaced by random examples.

17 (Diagram: the probabilistic learning algorithm with queries wraps the exact learner. Membership queries go directly to the membership oracle; the equivalence query is simulated by m random examples (positive and negative) from the example oracle together with a consistency check; the representative sample is replaced by n random positive examples.)
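A minimal sketch of the equivalence-query simulation in the diagram, written in Python (the function names, the oracle interfaces, and the sample sizes are illustrative assumptions, not the paper's actual interface):

import math, random

def simulate_equivalence_query(hypothesis, draw_example, member, i, epsilon, delta):
    # Replace the i-th equivalence query by m_i random labeled examples (consistency check).
    m_i = math.ceil((1.0 / epsilon) * (math.log(1.0 / delta) + (i + 1) * math.log(2.0)))
    for _ in range(m_i):
        w = draw_example()                  # example drawn from the unknown distribution D
        if hypothesis(w) != member(w):      # label obtained from the membership oracle
            return w                        # counterexample: hypothesis and target disagree on w
    return None                             # consistent with the whole batch: treat the answer as "yes"

# Toy illustration with a target language {a^n b^n} and a deliberately wrong hypothesis.
target = lambda w: len(w) > 0 and len(w) % 2 == 0 and w == "a" * (len(w) // 2) + "b" * (len(w) // 2)
hyp = lambda w: w.startswith("a") and w.endswith("b")
draw = lambda: "".join(random.choice("ab") for _ in range(random.randint(1, 6)))
print(simulate_equivalence_query(hyp, draw, target, i=0, epsilon=0.1, delta=0.05))  # prints a counterexample (or None)

The batch size m_i matches the bound sketched after slide 9, so accepting a hypothesis that survives every batch gives the PAC guarantee.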

18 Learning algorithm via queries and RS
while (finish == 0) begin
  make nonterminals from the current set of examples
  make rules and a hypothesis grammar
  if (the equivalence query for the hypothesis responds “yes”)
    output the hypothesis; finish = 1
  else
    update the example set with the counterexample
end

19 Making nonterminals
A nonterminal of the hypothesis is an equivalence class of triples (u, v, w) built from the examples; triples that satisfy the equivalence condition are placed in the same class.

20 Making rules
Make all candidate rules over these nonterminals, except for those that are not consistent with the membership-query results.
Select a hypothesis randomly from the remaining candidates.

21 Exact learning of strict-det
Strict-det linear languages are polynomial time exactly learnable via
–membership queries, and
–a representative sample (RS)
c.f. [Angluin (1980)] for regular sets.
Learning algorithm overview: from the RS and membership queries, build the possible rules; these induce a set of candidate strict-det grammars (not bounded by a polynomial). Choose one randomly and ask an equivalence query; counterexamples serve as witnesses to delete incorrect rules, until the correct hypothesis is reached.

22 Conclusions
Strict-det linear languages are probabilistically learnable with queries in polynomial time.
Future work:
–identification in the limit from polynomial time and data (teachability)
–replacing the RS with correction queries

23

24 Theorem
Strict-det linear languages are polynomial time probabilistically learnable with membership queries.

25 Simple Deterministic Languages
A context-free grammar (CFG) in 2-standard Greibach normal form is a Simple Deterministic Grammar (SDG) iff the rule is unique for every pair of a nonterminal and a terminal.
A Simple Deterministic Language (SDL) is the language generated by an SDG.
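In the usual notation (a reconstruction of the standard definition, not copied verbatim from the slide): every rule has the form

  A \to a\alpha, \qquad a \in \Sigma,\ \alpha \in N^{*},\ |\alpha| \le 2,

and for every pair (A, a) there is at most one such rule, so the next rule to apply is always determined by the current nonterminal and the next input symbol.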

26 Representative sample for an SDG
G: an SDG. A finite set Q ⊆ L(G) is a representative sample (RS) for G if all rules of G are used to generate Q, i.e. every rule appears in the derivation of some string of Q.

27 Example
For a concrete SDG, any finite subset of its language whose derivations together use every rule is a representative sample (RS).
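For instance (an illustrative example, not the one shown on the slide): take the SDG with rules S → aSB, S → b, B → b. The derivation S ⇒ aSB ⇒ abB ⇒ abb of the single string abb uses all three rules, so Q = {abb} is a representative sample.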

28 PAC learning (Valiant 1984)
Target language: L(G_*); hypothesis language: L(H); probability distribution: D on the set of strings.
A PAC learning algorithm outputs H such that P_D(L(G_*) △ L(H)) ≤ ε holds with probability at least 1 − δ.

29 Query learning of SDLs (Tajima 2000)
SDLs are polynomial time learnable via membership queries and a representative sample.
(Diagram: the learner asks membership queries and the teacher answers yes / no; the representative sample, a special finite subset of the target language, is given to the learner at the beginning.)

30 Learning model
(Diagram: the learner asks membership queries and the teacher answers yes / no; a representative sample, a special finite subset of the target language, is given at the beginning.)