MPS 2016. ELI SHAMIR, HEBREW UNIVERSITY, JERUSALEM. PARENTAL VIEW OF CONTEXT-FREE BIRTH AND EVOLUTION.



Formal Languages Theory: Mid-1950s
Confluence of several directions:
- Natural languages [NLP], syntax specifications
- Early programming languages, syntax specifications
- Automata and machines, formal specifications
- Combinatorial mathematics: sets of strings
- Biological: L systems
- …

Formal Languages: Generative Hierarchy (Chomsky+)
- Recursively enumerable
- Context-sensitive (LBA)
- Mildly context-sensitive
- Context-free*
- Regular = FA definable
Subsequently integrated into the space/time complexity hierarchy, the backbone of theoretical computer science.
* Several sub-models studied, related to compiler construction for programming languages.

Context-free [CF]: central position due to the equivalence of several distinct models:
- Algebraic equations [MPS]
- Production rules and trees
  (these two give a DUAL APPROACH IN ARGUMENT AND PROOFS)
- BNF (Backus Normal Form), syntax of early programming languages
- Categorial grammars
- Dependency structures
- Lambek algebraic calculus
- Pushdown automata…
Rich algebraic, combinatorial and algorithmic properties and problems; significant applications.

Boston-Jerusalem Correspondence
- Linguists: N. Chomsky (MIT); Y. Bar-Hillel (HUJI)
- Mathematicians: MPS (Marco) (Harvard, Paris); H. Gaifman, M. Perles, E. Shamir (HUJI) [Math PhD students]
Main articles and monographs, mainly on CF [listed next: 2-5, 19].
Up to 1969, many other researchers and groups in the USA, Europe and Japan joined. See publication lists [next few slides].
Inclusion as a basic topic in CS education.

Central Publications up to 1969
1. J. Hopcroft and J. Ullman, Formal Languages and Their Relation to Automata, Addison-Wesley. [Extensive reference list]
2. Y. Bar-Hillel, H. Gaifman and E. Shamir, On categorial and phrase structure grammars, Bulletin of the Research Council of Israel, vol. 9F (1960).
3. Y. Bar-Hillel, M. Perles and E. Shamir, On formal properties of simple phrase structure grammars, Z. Phonetik, Sprachwiss. Kommun., 14 (1961).
   2 & 3 reproduced in Y. Bar-Hillel, Language and Information, Addison-Wesley; appeared as a monograph in Russian.
4. N. Chomsky, On certain formal properties of grammars, Inf. and Control, 2:2 (1959).
5. N. Chomsky and M. P. Schützenberger, The algebraic theory of context-free languages, in Computer Programming and Formal Systems, North-Holland. [Appeared as a monograph]
6. J. Evey, The theory and application of pushdown store machines, Doctoral Thesis, Harvard University.
R. W. Floyd, The syntax of programming languages: a survey, Professional Group Electronic Computers [PGEC], 13:4 (1964).

9. S. Ginsburg and H. G. Rice, Two families of languages related to ALGOL, JACM, 9:3.
10. S. Ginsburg, The Mathematical Theory of Context-Free Languages.
11. S. Greibach, A new normal form theorem for context-free grammars, JACM, 12:1, 42-52.
12. D. E. Knuth, A characterization of parenthesis languages, Inf. and Control, 11:3.
13. P. S. Landweber, Three theorems on phrase structure grammars of type 1, Inf. and Control, 6:2.
14. M. Nivat, Transduction des langages de Chomsky, PhD Thesis, Univ. de Paris. [Also in Annales de l'Institut Fourier, 18, 1968]
15. R. J. Parikh, On context-free languages, JACM, 13.
16. D. J. Rosenkrantz, Matrix equations and normal forms for context-free grammars, JACM, 14:3.
17. J. Rhodes and E. Shamir, Complexity of grammars by group-theoretic methods, Journal of Combinatorial Theory.
18. E. Shamir, A representation theorem for algebraic and context-free power series in noncommuting variables, Inf. and Control, 11.
19. M. P. Schützenberger. [Several articles]
20. D. H. Younger, Recognition and parsing of context-free languages in time n^3, Inf. and Control, 10:2.

Chosen Books & Publications After 1969
1. J. Autebert, J. Berstel and L. Boasson, Context-free languages and pushdown automata, Chap. 3 in: Handbook of Formal Languages, Vol. 1, G. Rozenberg and A. Salomaa (eds.), Springer-Verlag. [Extensive reference list]
2. M. Droste, W. Kuich, H. Vogler (eds.), Handbook of Weighted Automata, Springer.
3. S. Greibach, The hardest context-free language, SIAM J. on Computing 3 (1973).
4. M. Harrison, Introduction to Formal Language Theory, Addison-Wesley.
5. L. Kallmeyer, Parsing Beyond Context-Free Grammars, Springer.
6. E. Shamir, Some inherently ambiguous context-free languages, Inf. and Control 18 (1971).
7. J. Berstel, Transductions and Context-Free Languages, Teubner Verlag.
8. A. Salomaa, Formal Languages, Academic Press.
9. J. Sakarovitch, Pushdown automata with terminal languages, 421 in Publication RIMS, Kyoto University, 1981.
10. S. Eilenberg, Automata, Languages and Machines, Vol. A & B, Academic Press.
11. G. Rozenberg and A. Salomaa, The Mathematical Theory of L Systems, Springer.
12. P. Flajolet, Analytic models and ambiguity of context-free languages, Theor. Comp. Sci. 49.

Hindsight of Central CF Results
Chomsky-Schützenberger theorems and their impact:
- Each CFL L = h(Dyck ∩ R), where Dyck = {well-bracketed strings} and R is a regular language.
- An unambiguous L has an algebraic generating function.
(Sh 1967): Each CFL maps into the nondeterministic lifting of the one-sided Dyck language, which is hence a universal CFL and thus a "hardest CFL".
- Map a → φ(a) = [… + … + …], extended by φ(a1 a2 … an) = φ(a1) … φ(an) = [… + … + …] [… + … + …] … [… + … + …] (multinomial product).
- w ∈ L(G) iff expanding the multinomial product gives a term in Dyck.
(BGS 1960): The nondeterministic lifting of CAT is also a universal (hardest) CFL.
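The representation of a CFL as a homomorphic image of well-bracketed strings intersected with a regular language can be tried on a small case. The following is only an illustrative sketch, not taken from the slides: it assumes a single bracket pair, the regular set R = (*)* and the homomorphism h with h("(") = a, h(")") = b, and it enumerates short words by brute force. The function names are hypothetical.

```python
import re
from itertools import product


def is_dyck(word, pairs=(("(", ")"),)):
    """Return True if `word` is well bracketed over the given bracket pairs."""
    closer_of = dict(pairs)            # opening bracket -> matching closing bracket
    closers = set(closer_of.values())
    stack = []                         # closing brackets we still expect
    for symbol in word:
        if symbol in closer_of:
            stack.append(closer_of[symbol])
        elif symbol in closers:
            if not stack or stack.pop() != symbol:
                return False
        else:
            return False               # symbol outside the bracket alphabet
    return not stack


def apply_homomorphism(word, h):
    """Apply a letter-to-string homomorphism h to every symbol of `word`."""
    return "".join(h[symbol] for symbol in word)


def language_up_to(max_len):
    """Images under h of all words of length <= max_len in Dyck ∩ R (assumed R, h)."""
    regular = re.compile(r"\(*\)*")    # assumed regular set R: openers, then closers
    h = {"(": "a", ")": "b"}           # assumed homomorphism
    images = set()
    for n in range(max_len + 1):
        for letters in product("()", repeat=n):
            word = "".join(letters)
            if is_dyck(word) and regular.fullmatch(word):
                images.add(apply_homomorphism(word, h))
    return images


print(sorted(language_up_to(6), key=len))
```

Under these assumptions the enumeration yields exactly the short words of {a^n b^n}, which is the standard textbook instance of the theorem; the general construction uses more bracket pairs and a grammar-dependent R and h.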

Hindsight (continued)
- DYCK-j: all well-bracketed strings with j pairs.
- CAT: well-cancelled category strings: a → a/b b, a/b → a/b/c c, a → a/b/c b/c.
- They are deterministic CFLs; their nondeterministic liftings are "hardest CFLs".
- Algebraic path: Gauss elimination → Greibach NF → Sh. Thm. and pushdown automata.
- Derivation path: triplets (p, A, q) [in BPS 1960] → pushdown automata → Greibach Normal Form and Sh. Thm.
- Algorithms and complexity: impact of the undecidability results (BPS 1960). Membership and parsing: tabular dynamic programming algorithms (CYK, Earley, …). Time complexity reduced to multiplication of Boolean matrices (L. Valiant, L. Lee).
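The tabular dynamic programming idea behind CYK membership testing can be sketched as follows. This is a minimal illustration, assuming a grammar already in Chomsky Normal Form and given as two hypothetical rule lists; the example grammar for {a^n b^n, n ≥ 1} is chosen here for illustration and does not come from the slides.

```python
def cyk(word, unary_rules, binary_rules, start="S"):
    """CYK membership test for a grammar in Chomsky Normal Form.

    unary_rules:  list of (A, a) for productions A -> a
    binary_rules: list of (A, (B, C)) for productions A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # the empty word is not handled in this sketch
    # table[i][length] = set of nonterminals deriving word[i:i + length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, terminal in enumerate(word):
        for lhs, rhs in unary_rules:
            if rhs == terminal:
                table[i][1].add(lhs)
    for length in range(2, n + 1):               # substring length
        for i in range(n - length + 1):          # substring start
            for split in range(1, length):       # split point: the cubic factor
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, (b, c) in binary_rules:
                    if b in left and c in right:
                        table[i][length].add(lhs)
    return start in table[0][n]


# Assumed example grammar in CNF for {a^n b^n : n >= 1}:
#   S -> A B | A T,  T -> S B,  A -> a,  B -> b
UNARY = [("A", "a"), ("B", "b")]
BINARY = [("S", ("A", "B")), ("S", ("A", "T")), ("T", ("S", "B"))]

print(cyk("aabb", UNARY, BINARY), cyk("aab", UNARY, BINARY))
```

The three nested loops over length, start and split point give the cubic running time; the reduction to Boolean matrix multiplication mentioned above replaces the innermost combination step and is not shown here.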

Ambiguity: Complex Issues
- In (linear) CFG, in transductions, in algebraic equations.
- Inherent ambiguity proofs using pumping in D-trees and by the generating function method (Ph. Flajolet).
- Effect of transformations on ambiguity.
- Effects on parsing of product ambiguity degree.
- Inherently 1 or infinite? Open question.
- Eilenberg problem: decomposition of a bounded-degree language into a union of degree-1 languages. Open.
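The ambiguity degree of a word under a given grammar is its number of derivation trees, and this can be counted with the same tabular scheme used for membership. The sketch below is an illustration under assumptions: the grammar is in Chomsky Normal Form, the rule lists and function name are hypothetical, and the example grammar S → S S | a is picked here because its expansive rule makes the tree count grow quickly.

```python
from functools import lru_cache


def count_trees(word, unary_rules, binary_rules, start="S"):
    """Count the derivation trees of `word` from `start` in a CNF grammar.

    unary_rules:  list of (A, a) for productions A -> a
    binary_rules: list of (A, (B, C)) for productions A -> B C
    """
    if not word:
        return 0  # the empty word is not handled in this sketch

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of trees rooted at `symbol` whose yield is word[i:j].
        if j - i == 1:
            return sum(1 for lhs, rhs in unary_rules
                       if lhs == symbol and rhs == word[i])
        total = 0
        for lhs, (b, c) in binary_rules:
            if lhs != symbol:
                continue
            for k in range(i + 1, j):            # every split point
                total += count(b, i, k) * count(c, k, j)
        return total

    return count(start, 0, len(word))


# Assumed example: the expansive grammar S -> S S | a
EXP_UNARY = [("S", "a")]
EXP_BINARY = [("S", ("S", "S"))]

print([count_trees("a" * n, EXP_UNARY, EXP_BINARY) for n in range(1, 6)])
```

For this grammar the counts for a, aa, aaa, … follow the Catalan numbers (1, 1, 2, 5, 14, …), so the ambiguity degree is unbounded even though the language a+ is regular and has an unambiguous grammar. The ambiguity here belongs to the grammar, not to the language.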

Ambiguity in NLP: Ambiguity in natural languages can be resolved (or created) by cyclic rotation of the sentence. Example: Bible, Book of Job, chapter 6, verse 14 (six Hebrew words), translated: "a friend should extend # mercy to the sufferer $, even if he abandons God's fear." Anaphoric ambiguity: does the pronoun "he" refer to the sufferer or to the friend? A beautiful poetic answer: to both. The cyclically rotated sentences, starting at the symbols # and $, resolve the ambiguity one way or the other. A politically loaded example: the policeman shot # the boy $ with the gun.

SRT: SPREAD-ROTATE TRANSFORMATION
Of a grammar G, its trees and derived strings.
- SRT tree: internal nodes labelled by products of grammars; root label = #G; leaf labels = H(i), linear grammars.
- Thm (invariance claim): the D-trees of #G are mapped 1-1 onto ∪ {D-trees of H(i)}, modulo cyclic rotations (of trees and derived strings).
- Works perfectly for non-expansive CF grammars (quasi-rational), but also for mildly context-sensitive grammars with a CF skeleton (e.g. LIG grammars).
- SRT: enhances parsing algorithms, property tests and applications.
Cosmetics of the CFG model to enhance its NLP adequacy:
* Avoid expansive pumping B → BB
* BUT ADD GENERATIVE POWER BY LOCAL STACKS (AS IN INDEXED GRAMMARS)

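Top Trunk Rotation of MN to (M*N^)
[Figure: tree M with EXIT node N^ and top-trunk branches x1, x2 and y1, y2, shown before and after rotation into M*.]
For trees: 180° rotation of the top trunk, giving M*.
For derived strings: cyclic rotation of m x1 x2 … n^ … y2 y1 into … y2 y1 m x1 x2 … n^.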

SRT for grammars:
N grammar (top trunk)      M* grammar
B → B'C                    B' → CB
B → DB'                    B' → BD
B → B^, B^ → α             B^ = root(M)
All productions not involving [B] carry over from N to M*; those of M are unchanged.
Note: since M may contain symbols of [B], duplicate symbols of [B] are needed, but only for the new top trunk of M*.
The TTR rotation is invertible, one-to-one and onto for the derivation trees, preserving weights and ambiguity degree in the "cyclically rotated" sense.

Example (from [Sh., 1971]): (M)(N) = (u $ Ju)(v J$ v), with u, v ∈ {0,1}* and Ju = reversal of u. It has unbounded "direct (product) ambiguity", which increases the time of the CYK algorithm to n^3. In one TTR step, MN is rotated to (M*)(N^) = (v u $ Ju v)(J$), which has a linear grammar with 3 pump classes. All (product ambiguity) trees are rotated to (union ambiguity) trees for M*N^. Each derived terminal string is cyclically rotated as well!

MILD Context-Sensitive Models & SRT
- Many models have been proposed, including 4 equivalent ones: Linear Indexed [LIG], Tree-Adjoining [TAG], ….
- They should satisfy some formal requirements: proper extension of CFG, polynomial-time parsing algorithm, …
We define NE-LIG as follows:
- It has an NE-CFG skeleton with auxiliary symbols A, B, …
- Each pump class [B] maintains a stack (pushdown) index; the stack is empty at entry to and exit from several consecutive pump blocks.
- THUS it can, with skeleton symbols as "states", simulate any PDM, hence any CFG.
The form of the production rules is: B[index] → C B'[index'] (index changed to index' by Push or Pop), B^[ ] → D[ ] E[ ].

Glossary
- CFG/L: Context-Free Grammar/Language
- LIG: Linear Indexed Grammar
- TAG: Tree-Adjoining Grammar
- NLP: Natural Language Processing
- CYK: Cocke, Younger, Kasami
- CNF: Chomsky Normal Form
- GNF: Greibach Normal Form
- SRT: Spread Rotate Tree
- D-Tree: Derivation Tree
- EPOS: Epoch Semi-Order
- TTR: Top Trunk Rotation
- DP: Dynamic Programming
- NE: Non-Expansive
- POS: Parts of Speech
- PDM: Pushdown Machine
- NT: Nonterminals (symbols)