Context-Free Grammars Julia Hirschberg CS 4705 Slides with contributions from Owen Rambow, Kathy McKeown, Dan Jurafsky and James Martin

Today
–Context-free grammars
–Sentence types and constituents
–Recursion and context-freeness
–Grammar equivalence
–Normal forms
–Ambiguity
–The Penn Treebank
–Some types of syntactic constructions

Context-Free Grammars
Defined in formal language theory
–Terminals, nonterminals, start symbol, rules
–A string-rewriting system
–Start with the start symbol, rewrite using the rules, done when only terminals are left
Not a linguistic theory, just a formal device

CFG Example
Many CFGs for English are possible; here is an example (fragment):
–S → NP VP
–VP → V NP
–NP → DetP NP | AdjP NP
–AdjP → Adj | Adv AdjP
–NP → N
–N → boy | girl
–V → sees | likes
–Adj → big | small
–Adv → very
–DetP → a | the
Example string: the very small boy likes a girl
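
To make this concrete, here is a minimal sketch of the fragment in NLTK (my own illustration, not from the slides; it assumes NLTK is installed and uses NLTK's quoted-terminal grammar notation):

    import nltk

    # The example fragment above, in NLTK's CFG notation (terminals are quoted)
    grammar = nltk.CFG.fromstring("""
        S -> NP VP
        VP -> V NP
        NP -> DetP NP | AdjP NP | N
        AdjP -> Adj | Adv AdjP
        N -> 'boy' | 'girl'
        V -> 'sees' | 'likes'
        Adj -> 'big' | 'small'
        Adv -> 'very'
        DetP -> 'a' | 'the'
    """)

    # A chart parser recovers the phrase-structure tree(s) for a string
    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("the very small boy likes a girl".split()):
        tree.pretty_print()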

Derivations in a CFG
Grammar: S → NP VP; VP → V NP; NP → DetP N | AdjP NP; AdjP → Adj | Adv AdjP; N → boy | girl; V → sees | likes; Adj → big | small; Adv → very; DetP → a | the
The slides expand one subtree at a time, growing the phrase-structure tree for "the boy likes a girl":
S
⇒ NP VP
⇒ DetP N VP
⇒ the boy VP
⇒ the boy likes NP
⇒ the boy likes a girl

Derivations of CFGs
A string-rewriting system: we derive a string (= derived structure)
But the derivation history is represented by a phrase-structure tree (= derivation structure)
Dominance and immediate dominance
[Slide shows the full phrase-structure tree for "the boy likes a girl"]

Formal Definition of a CFG
G = (V, T, P, S)
V: finite set of nonterminal symbols
T: finite set of terminal symbols; V and T are disjoint
P: finite set of productions of the form A → α, with A ∈ V and α ∈ (T ∪ V)*
S ∈ V: the start symbol
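
As a quick illustration of the tuple (V, T, P, S) and of rewriting, here is a tiny self-contained Python sketch (the names and encoding are my own, not from the slides):

    import random

    # G = (V, T, P, S): P maps each nonterminal to its alternative right-hand sides
    V = {"S", "NP", "VP", "DetP", "N", "V"}
    T = {"the", "a", "boy", "girl", "sees", "likes"}
    P = {
        "S":    [["NP", "VP"]],
        "VP":   [["V", "NP"]],
        "NP":   [["DetP", "N"]],
        "DetP": [["the"], ["a"]],
        "N":    [["boy"], ["girl"]],
        "V":    [["sees"], ["likes"]],
    }
    S = "S"

    def generate(symbol=S):
        # Rewrite nonterminals until only terminals remain (one random derivation)
        if symbol in T:
            return [symbol]
        rhs = random.choice(P[symbol])
        return [tok for part in rhs for tok in generate(part)]

    print(" ".join(generate()))  # e.g. "a girl likes the boy"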

Context in CFGs
The notion of context in CFGs has nothing to do with the ordinary meaning of the word "context" in language
All it really means is that the nonterminal on the left-hand side of a rule is out there all by itself (free of context)
–A → B C
–I.e., I can rewrite an A as a B followed by a C regardless of the context in which A is found

Key Constituents (English)
–Sentences
–Noun phrases
–Verb phrases
–Prepositional phrases
How are these represented in CFGs?

Sentence Types or Modes
Declaratives: I do not. –S → NP VP
Imperatives: Go around again! –S → VP
Yes-no questions: Do you like my hat? –S → Aux NP VP
Wh-questions: What are they going to do? –S → WH Aux NP VP

NPs
NP → Pronoun –I came, you saw it, they conquered
NP → Proper-Noun –New Jersey is west of New York City; Lee Bollinger is the president of Columbia
NP → Det Noun –the president
NP → Nominal; Nominal → Noun Noun –a morning flight to Denver

PPs
PP → Prep(osition) NP
–over the house
–under the house
–to the tree
–at play
–at a party on a boat at night

Recursion
We have to deal with rules in which the nonterminal on the left also appears somewhere on the right:
–(directly) NP → NP PP: [[the flight] [to Boston]]; VP → VP PP: [[departed Miami] [at noon]]
–(indirectly) NP → NP Srel; Srel → NP VP: [[the dog] [[the cat] likes]]

Recursion
S[the dog the mouse the cat ate bit bites]
S → NP VP: NP[the dog the mouse the cat ate bit] VP[bites]
NP → NP Srel: NP[the dog] Srel[the mouse the cat ate bit]
Srel → NP VP: NP[the mouse the cat ate] VP[bit]
NP → NP Srel: NP[the mouse] Srel[the cat ate]
Srel → NP VP: NP[the cat] VP[ate]
Full structure: S[NP[NP[the dog] Srel[NP[NP[the mouse] Srel[NP[the cat] VP[ate]]] VP[bit]]] VP[bites]]

Recursion
[[Flights] [from Denver]]
[[[Flights] [from Denver]] [to Miami]]
[[[[Flights] [from Denver]] [to Miami]] [in February]]
[[[[[Flights] [from Denver]] [to Miami]] [in February]] [on a Friday]]
Which rule(s) should we apply?
–NP → NP PP
–NP → NP PPs; PPs → PPs PP; PPs → PP

Implications of Recursion and Context-Freeness
VP → V NP
(I) hate
–flights from Denver
–flights from Denver to Miami
–flights from Denver to Miami in February
–flights from Denver to Miami in February on a Friday
–flights from Denver to Miami in February on a Friday under $300
–flights from Denver to Miami in February on a Friday under $300 with lunch

This is why context-free grammars are appealing!
If you have a rule like VP → V NP
–The rule only cares that the thing after V is an NP
–It doesn't need to know about the internal structure of that NP
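
A short sketch of the point (again NLTK, with a hypothetical grammar fragment of my own): the object NP grows without bound through NP → NP PP, yet the two-symbol rule VP → V NP never has to change:

    import nltk

    g = nltk.CFG.fromstring("""
        VP -> V NP
        NP -> NP PP | N
        PP -> P NP
        V -> 'hate'
        N -> 'flights' | 'Denver' | 'Miami' | 'February'
        P -> 'from' | 'to' | 'in'
    """)
    parser = nltk.ChartParser(g)  # the start symbol is VP, the first rule's left-hand side
    for s in ["hate flights from Denver",
              "hate flights from Denver to Miami",
              "hate flights from Denver to Miami in February"]:
        # VP -> V NP covers every length of object NP without modification
        print(any(True for _ in parser.parse(s.split())), ":", s)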

Grammar Equivalence
Different grammars may generate the same set of strings (weak equivalence)
–Grammar 1: NP → DetP N and DetP → a | the
–Grammar 2: NP → a N | the N
Different grammars may generate the same set of strings and assign the same phrase structure to every sentence (strong equivalence)
–Grammar 2: NP → a N | the N
–Grammar 3: Nom → a N | the N
Strong equivalence implies weak equivalence
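
For finite fragments like these, weak equivalence can be checked by brute force; a small sketch using NLTK's string-enumeration helper (nltk.parse.generate):

    import nltk
    from nltk.parse.generate import generate

    g1 = nltk.CFG.fromstring("""
        NP -> DetP N
        DetP -> 'a' | 'the'
        N -> 'boy' | 'girl'
    """)
    g2 = nltk.CFG.fromstring("""
        NP -> 'a' N | 'the' N
        N -> 'boy' | 'girl'
    """)

    # Same string sets (weak equivalence), even though the trees differ
    print({" ".join(s) for s in generate(g1)} == {" ".join(s) for s in generate(g2)})  # True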

Chomsky Normal Form
Useful to specify normal forms for grammars
A CFG is in Chomsky Normal Form (CNF) if all productions are of one of two forms:
–A → B C, with A, B, C nonterminals
–A → a, with A a nonterminal and a a terminal
–No ε productions
Every CFG has a weakly equivalent CFG in CNF
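
One practical reason CNF matters: the CKY parsing algorithm assumes it, since every chart cell then combines exactly two smaller cells. A minimal CKY recognizer sketch in plain Python (illustrative; the grammar encoding is my own):

    def cky_recognize(words, binary, lexical, start="S"):
        # binary: {(A, B, C)} for rules A -> B C; lexical: {(A, w)} for rules A -> w
        n = len(words)
        # chart[i][j] holds the nonterminals that derive words[i:j]
        chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            chart[i][i + 1] = {A for (A, word) in lexical if word == w}
        for span in range(2, n + 1):              # widen spans bottom-up
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):         # try every split point
                    for (A, B, C) in binary:
                        if B in chart[i][k] and C in chart[k][j]:
                            chart[i][j].add(A)
        return start in chart[0][n]

    binary = {("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "DetP", "N")}
    lexical = {("DetP", "the"), ("DetP", "a"), ("N", "boy"), ("N", "girl"),
               ("V", "likes"), ("V", "sees")}
    print(cky_recognize("the boy likes a girl".split(), binary, lexical))  # True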

Generative Grammar
Formal languages: a formal device to generate a set of strings (such as a CFG)
Linguistics (Chomskyan linguistics in particular): an approach in which a linguistic theory enumerates all possible strings/structures in a language (= competence)
Chomskyan theories do not really use formal devices; they use CFG + informally defined transformations
–E.g., relating passive sentences to active structures

Nobody Uses Simple CFGs (Except Intro NLP Courses)
All major syntactic theories (Chomsky, LFG, HPSG, TAG-based theories) represent both phrase structure and dependency, in one way or another
All successful parsers currently use statistics about phrase structure and about dependency
Derive dependency through head percolation: for each rule, say which daughter is the head, and specify features on the head
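
A sketch of head percolation over a toy tree (the head table and helper names below are hypothetical stand-ins for real head-percolation tables; the tree uses NLTK's Tree class):

    import nltk

    HEADS = {"S": "VP", "VP": "V", "NP": "N"}  # parent label -> label of the head daughter

    def head_word(t):
        if isinstance(t, str):
            return t  # the head word of a leaf is the leaf itself
        for child in t:
            label = child if isinstance(child, str) else child.label()
            if label == HEADS.get(t.label()):
                return head_word(child)
        return head_word(t[0])  # fallback: leftmost daughter

    def dependencies(t, deps=None):
        # Each non-head daughter's head word depends on the constituent's head word
        if deps is None:
            deps = []
        if isinstance(t, str):
            return deps
        h = head_word(t)
        for child in t:
            if head_word(child) != h:
                deps.append((head_word(child), h))
            dependencies(child, deps)
        return deps

    t = nltk.Tree.fromstring(
        "(S (NP (DetP the) (N boy)) (VP (V likes) (NP (DetP a) (N girl))))")
    print(dependencies(t))
    # [('boy', 'likes'), ('the', 'boy'), ('girl', 'likes'), ('a', 'girl')]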

Massive Ambiguity of Syntax
For a standard sentence and a grammar with wide coverage, there are thousands of derivations!
Example:
–The large portrait painter told the delegation that he sent money orders in a letter on Wednesday
–Where do we find the likely sources of syntactic ambiguity?
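
The growth is easy to demonstrate. With a hypothetical NP/PP fragment like the one used earlier, the number of parses follows the Catalan numbers (1, 2, 5, 14, ...) as PPs are stacked:

    import nltk

    g = nltk.CFG.fromstring("""
        NP -> NP PP | N
        PP -> P NP
        N -> 'flights' | 'Denver' | 'Miami' | 'February' | 'Friday'
        P -> 'from' | 'to' | 'in' | 'on'
    """)
    parser = nltk.ChartParser(g)
    s = "flights"
    for pp in ["from Denver", "to Miami", "in February", "on Friday"]:
        s = s + " " + pp
        # One parse per way of attaching the PPs: 1, 2, 5, 14, ...
        print(len(list(parser.parse(s.split()))), "parses:", s)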

Penn Treebank (PTB)
A syntactically (phrase-structure) annotated corpus of news text
–The newspaper texts are naturally occurring data, but the PTB is not!
–PTB annotation represents a particular linguistic theory (a fairly "vanilla" one)
Particularities
–Very indirect representation of grammatical relations (need for head percolation tables)
–Completely flat structure in NP (brown bag lunch, pink-and-yellow child seat)
–Has flat Ss, flat VPs
–Neutral about PP attachment

PTB Example
( (S (NP-SBJ It)
     (VP 's
         (NP-PRD (NP (NP the latest investment craze)
                     (VP sweeping (NP Wall Street)))
                 :
                 (NP (NP a rash)
                     (PP of
                         (NP (NP new closed-end country funds) ,
                             (NP (NP those (ADJP publicly traded) portfolios)
                                 (SBAR (WHNP-37 that)
                                       (S (NP-SBJ *T*-37)
                                          (VP invest
                                              (PP-CLR in
                                                      (NP (NP stocks)
                                                          (PP of (NP a single foreign country)))))))))))))))
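
PTB-style bracketed trees can be loaded and explored programmatically; a small sketch with NLTK's Tree class on a piece of the tree above (function tags like -SBJ simply remain part of the node label):

    import nltk

    t = nltk.Tree.fromstring(
        "(S (NP-SBJ It) (VP 's (NP-PRD (NP the latest investment craze) "
        "(VP sweeping (NP Wall Street)))))")
    print(t.leaves())  # ['It', "'s", 'the', 'latest', 'investment', 'craze', ...]
    for sub in t.subtrees(lambda st: st.label().startswith("NP")):
        print(sub.label(), "->", " ".join(sub.leaves()))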

Not All Verbs Are Syntactically the Same
Is this the same construction?
–An elf decided to clean the kitchen
–An elf seemed to clean the kitchen
(cf. An elf cleaned the kitchen)
Is this the same construction?
–An elf decided to be in the kitchen
–An elf seemed to be in the kitchen
(cf. An elf was in the kitchen)

Is this the same construction?
There is an elf in the kitchen
–*There decided to be an elf in the kitchen
–There seemed to be an elf in the kitchen
Is this the same construction?
It is raining / it rains
–??It decided to rain/be raining
–It seemed to rain/be raining

Is this the same construction?
–An elf decided that he would clean the kitchen
–*An elf seemed that he would clean the kitchen
(cf. An elf cleaned the kitchen)

Types of Syntactic Constructions
Conclusion:
–to decide: only full, referential nouns can appear as the subject of the upper clause
–to seem: whatever is the embedded surface subject can appear in the upper clause
Two different types of verbs, represented differently in the syntax

Types of Syntactic Constructions: Analysis
[Four slides build up contrasting tree diagrams: "an elf decided [PRO to be in the kitchen]", where the control verb decided takes an NP subject plus an embedded S whose subject is the empty element PRO, versus "an elf_i seemed [t_i to be in the kitchen]", where the surface subject of the raising verb seemed has moved out of the embedded clause, leaving a coindexed trace t_i]

Control Verbs
to decide: a control verb
The main argument of the control verb is also the main argument of the non-finite verb
Syntactic solution: the subject is in the upper clause and co-refers with an empty subject in the lower clause
–an elf decided (an elf to clean the kitchen)
–an elf decided (PRO to clean the kitchen)
–an elf decided (he cleans/should clean the kitchen)
–*it decided (an elf cleans/should clean the kitchen)

Raising Verbs
to seem: a raising verb
An argument of the lower clause appears syntactically as the subject of seem
Syntactic solution: the lower surface subject raises to the upper clause
–seems (there to be an elf in the kitchen)
–there seems (t to be an elf in the kitchen)
–it seems (there is an elf in the kitchen)

Lessons Learned from the Raising/Control Issue
Use the distribution of data to group phenomena into classes
Use different underlying structures as the basis for explanations
Allow things to 'move' around from the underlying structure → transformational grammar
Check whether the explanation you give makes predictions

Examples from PTB
(S (NP-SBJ-1 The ropes)
   (VP seem
       (S (NP-SBJ *-1)
          (VP to (VP make (NP much sound))))))
(S (NP-SBJ-1 The ancient church vicar)
   (VP refuses
       (S (NP-SBJ *-1)
          (VP to (VP talk (PP-CLR about (NP it)))))))

The Big Picture
[Slide shows a diagram connecting Empirical Matter (e.g., "Maud expects there to be a riot" vs. *"Teri promised there to be a riot"; "Maud expects the shit to hit the fan" vs. *"Teri promised the shit to hit the fan"), Formalisms (data structures, algorithms), Distributional Models, and Linguistic Theory (content: relate morphology to semantics; surface representation, deep representation, correspondence), with the links labeled "uses", "descriptive theory is about", "explanatory theory is about", and "predicts"]

Summary
Context-free grammars: a useful (expository) grammar formalism, although not much used in practice except for very small applications
Problems for practical application
–Recursion
–Ambiguity
Grammar equivalence is useful to identify
Different verb types lead to different syntactic structures
Next: Parsing with CFGs