Context Free Grammars Reading: Chap 12-13, Jurafsky & Martin This slide set was adapted from J. Martin and Rada Mihalcea.



Syntax
Syntax = rules describing how words can connect to each other:
  * that and after year last
  I saw you yesterday
  colorless green ideas sleep furiously
It is the kind of implicit knowledge of your native language that you had mastered by the time you were 3 or 4 years old, without explicit instruction, and not necessarily the type of rules you were later taught in school.

Syntax: Why Should You Care?
  Grammar checkers
  Question answering
  Information extraction
  Machine translation

9/9/2015 Speech and Language Processing - Jurafsky and Martin

Constituency
The basic idea here is that groups of words within utterances can be shown to act as single units. And in a given language, these units form coherent classes that can be shown to behave in similar ways.

Constituency
For example, it makes sense to say that the following are all noun phrases in English:
  The person I ate dinner with yesterday
  The car that I drove in college

Grammars and Constituency
However, it isn't easy or obvious how we come up with the right set of constituents and the rules that govern how they combine. That's why there are so many different theories of grammar and competing analyses of the same data. The approach to grammar, and the analyses, adopted here are very generic (and don't correspond to any modern linguistic theory of grammar).

Context-Free Grammars
Context-free grammars (CFGs) are also known as:
  Phrase structure grammars
  Backus-Naur form
They consist of:
  Rules
  Terminals
  Non-terminals

Context-Free Grammars
Terminals: we'll take these to be words.
Non-terminals: the constituents in a language, like noun phrase, verb phrase, and sentence.
Rules: equations that consist of a single non-terminal on the left and any number of terminals and non-terminals on the right.

CFG Example
  S -> NP VP
  NP -> Det NOMINAL
  NOMINAL -> Noun
  VP -> Verb
  Det -> a
  Noun -> flight
  Verb -> left
These rules are defined independent of the context where they might occur, hence the name context-free grammar.
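
A minimal sketch of how this toy grammar can be encoded and used as a synthesis machine: the dict-of-rule-lists representation below is just one possible encoding, and the leftmost-derivation generator always takes each non-terminal's first rule, which happens to yield this grammar's single sentence.

```python
# Toy CFG from the slide, encoded as a dict mapping each non-terminal
# to a list of right-hand sides (each RHS is a list of symbols).
GRAMMAR = {
    "S":       [["NP", "VP"]],
    "NP":      [["Det", "NOMINAL"]],
    "NOMINAL": [["Noun"]],
    "VP":      [["Verb"]],
    "Det":     [["a"]],
    "Noun":    [["flight"]],
    "Verb":    [["left"]],
}

def generate(symbol):
    """Expand a symbol, always taking the first rule (a leftmost derivation)."""
    if symbol not in GRAMMAR:      # no rule for it: it's a terminal (a word)
        return [symbol]
    words = []
    for sym in GRAMMAR[symbol][0]:
        words.extend(generate(sym))
    return words

print(" ".join(generate("S")))  # a flight left
```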

CFGs
S -> NP VP
This says that there are units called S, NP, and VP in this language, and that an S consists of an NP followed immediately by a VP. It doesn't say that that's the only kind of S, nor does it say that this is the only place that NPs and VPs occur.
Generativity: you can view these rules as either analysis or synthesis machines. They can:
  Generate strings in the language
  Reject strings not in the language
  Impose structures (trees) on strings in the language

Parsing
Parsing is the process of taking a string and a grammar and returning one (or many) parse tree(s) for that string.
Other options:
  Regular languages (expressions): too weak, not expressive enough
  Context-sensitive grammars: too powerful, parsing is not efficient
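
A sketch of the string-plus-grammar-in, tree-out idea, using the same dict encoding of the slide's toy grammar. This naive top-down parser with backtracking is for illustration only: real parsers must also handle ambiguity and left-recursive rules, which this one does not.

```python
# Toy grammar from the earlier CFG example.
GRAMMAR = {
    "S":       [["NP", "VP"]],
    "NP":      [["Det", "NOMINAL"]],
    "NOMINAL": [["Noun"]],
    "VP":      [["Verb"]],
    "Det":     [["a"]],
    "Noun":    [["flight"]],
    "Verb":    [["left"]],
}

def parse(symbol, words, i):
    """Try to expand `symbol` starting at position i.
    Returns (tree, next_position) on success, or None on failure.
    A tree is the terminal word itself, or (label, [child trees])."""
    if symbol not in GRAMMAR:                       # terminal symbol
        if i < len(words) and words[i] == symbol:
            return (symbol, i + 1)
        return None
    for rhs in GRAMMAR[symbol]:                     # try each rule in turn
        children, j, ok = [], i, True
        for sym in rhs:
            result = parse(sym, words, j)
            if result is None:
                ok = False
                break
            tree, j = result
            children.append(tree)
        if ok:
            return ((symbol, children), j)
    return None

result = parse("S", "a flight left".split(), 0)
print(result)   # nested (label, children) tree, ending at position 3
```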

Context?
The notion of context in CFGs is not the same as the ordinary meaning of the word context in language. All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself.
A -> B C
means that I can rewrite an A as a B followed by a C regardless of the context in which A is found.

Key Constituents (English)
  Sentences
  Noun phrases
  Verb phrases
  Prepositional phrases

Some NP Rules
Here are some rules for our noun phrases:
  NP -> Det Nominal
  NP -> ProperNoun
  Nominal -> Noun | Nominal Noun
Together, these describe two kinds of NPs: one that consists of a determiner followed by a nominal, and another that says that proper names are NPs. The third rule illustrates two things: an explicit disjunction (two kinds of nominals) and a recursive definition (the same non-terminal on both the right and left side of the rule).

Grammar
[Figure: full grammar table shown on the original slide]

Derivations
A derivation is a sequence of rules applied to a string that accounts for that string. It:
  Covers all the elements in the string
  Covers only the elements in the string

Practice: Another Recursion Example
The following are rules where the non-terminal on the left also appears (directly) somewhere on the right:
  NP -> NP PP    [[The flight] [to Boston]]
  VP -> VP PP    [[departed Miami] [at noon]]

Recursion Example
  flights from Denver
  flights from Denver to Miami
  flights from Denver to Miami in February
  flights from Denver to Miami in February on a Friday
  flights from Denver to Miami in February on a Friday under $300
  flights from Denver to Miami in February on a Friday under $300 with lunch

Sentence Types
  Declaratives: A plane left.              S -> NP VP
  Imperatives: Leave!                      S -> VP
  Yes-No Questions: Did the plane leave?   S -> Aux NP VP
  WH Questions: When did the plane leave?  S -> WH-NP Aux NP VP

Noun Phrases
Let's consider the following rule in more detail:
  NP -> Det Nominal
Consider the derivation for the following example:
  All the morning flights from Denver to Tampa leaving before 10

Noun Phrases
[Figure: derivation tree for the example NP shown on the original slide]

NP Structure
Clearly this NP is really about flights. That's the central, critical noun in this NP: it is the head. We can dissect this kind of NP into the stuff that can come before the head, and the stuff that can come after it.

Determiners
Noun phrases can start with determiners. Determiners can be:
  Simple lexical items: the, this, a, an, etc.
    a car
  Simple possessives:
    John's car
  Complex recursive versions of those:
    John's sister's husband's son's car

Nominals
The nominal contains the head and any pre- and post-modifiers of the head.
Pre-modifiers:
  Quantifiers, cardinals, ordinals: three cars
  Adjectives: large cars
Note that there are ordering constraints:
  three large cars
  ?large three cars

Postmodifiers
Three kinds:
  Prepositional phrases: from Seattle
  Non-finite clauses: arriving before noon
  Relative clauses: that serve breakfast
Recursive rules to handle these:
  Nominal -> Nominal PP
  Nominal -> Nominal GerundVP
  Nominal -> Nominal RelClause

Agreement
  This dog
  Those dogs
  This dog eats
  Those dogs eat
  *This dogs
  *Those dog
  *This dog eat
  *Those dogs eats

Verb Phrases
English VPs consist of a head verb along with 0 or more following constituents, which we'll call arguments.

Subcategorization
  Sneeze: John sneezed
  Find: Please find [a flight to NY]NP
  Give: Give [me]NP [a cheaper fare]NP
  Help: Can you help [me]NP [with a flight]PP
  Prefer: I prefer [to leave earlier]TO-VP
  Told: I was told [United has a flight]S
  ...
  *John sneezed the book
  *I prefer United has a flight
  *Give with a flight
Subcategorization expresses the constraints that a predicate places on the number and type of the arguments it wants to take.
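
A minimal sketch of subcategorization as data: each verb maps to the argument frames it licenses, following the slide's examples. The frame inventory and category names here are illustrative simplifications, not a complete lexicon.

```python
# Verb -> list of allowed argument frames (each frame is a list of
# category labels for the arguments following the verb).
SUBCAT = {
    "sneeze": [[]],              # intransitive: John sneezed
    "find":   [["NP"]],          # find [a flight to NY]NP
    "give":   [["NP", "NP"]],    # give [me]NP [a cheaper fare]NP
    "help":   [["NP", "PP"]],    # help [me]NP [with a flight]PP
    "prefer": [["TO-VP"]],       # prefer [to leave earlier]TO-VP
    "tell":   [["S"]],           # told [United has a flight]S
}

def licensed(verb, arg_cats):
    """Does this verb allow this sequence of argument categories?"""
    return list(arg_cats) in SUBCAT.get(verb, [])

print(licensed("find", ["NP"]))    # True
print(licensed("sneeze", ["NP"]))  # False: *John sneezed the book
```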

Overgeneration
The various rules for VPs overgenerate: they permit strings containing verbs and arguments that don't go together. For example, VP -> V NP allows "sneezed the book" as a VP, since "sneeze" is a verb and "the book" is a valid NP.
In lecture: go over the grammar for assignment 3.

Possible CFG Solution
A possible solution for agreement: split the non-terminals by number. We can use the same trick for all the verb/VP classes. (This is like propositionalizing a first-order knowledge base: the KB gets very large, but the inference algorithms are very efficient.)
  SgS -> SgNP SgVP
  PlS -> PlNP PlVP
  SgNP -> SgDet SgNom
  PlNP -> PlDet PlNom
  PlVP -> PlV NP
  SgVP -> SgV NP
  ...
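
The symbol-splitting trick can be sketched mechanically: cross every rule with each number feature and watch the grammar double. This is a simplification of the slide's rules (for instance, it propagates the feature to every agreeing symbol, whereas an object NP like the one in SgVP -> SgV NP need not agree with the verb); the rule set below is a made-up fragment.

```python
# Base rules whose symbols all carry a number feature in this sketch.
BASE_RULES = [
    ("S",  ["NP", "VP"]),
    ("NP", ["Det", "Nom"]),
    ("VP", ["V", "NP"]),
]
AGREEING = {"S", "NP", "VP", "Det", "Nom", "V"}

def split_by_number(rules, features=("Sg", "Pl")):
    """Produce one feature-annotated copy of each rule per feature value."""
    out = []
    for lhs, rhs in rules:
        for f in features:
            out.append((f + lhs,
                        [f + s if s in AGREEING else s for s in rhs]))
    return out

split = split_by_number(BASE_RULES)
print(len(split))   # 6: each of the 3 rules doubled
```

With more features (person, gender, case) the multiplication compounds, which is exactly why feature structures and unification (Chapter 15) are the preferred fix.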

Movement
Core example (no movement yet):
  [[My travel agent]NP [booked [the flight]NP]VP]S
I.e., "book" is a straightforward transitive verb: it expects a single NP argument within the VP, and a single NP argument as the subject.

Movement
What about:
  Which flight do you want me to have the travel agent book?
The direct object argument to "book" isn't appearing in the right place. It is in fact a long way from where it's supposed to appear. And note that it's separated from its verb by two other verbs.

Formally...
To put all the previous discussion and examples into a formal definition: a context-free grammar has four parameters:
  1. A set of non-terminal symbols N
  2. A set of terminal symbols T
  3. A set of production rules P, each of the form A -> a, where A is a non-terminal and a is a string of symbols from the infinite set of strings (T ∪ N)*
  4. A designated start symbol S
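
The 4-tuple (N, T, P, S) translates directly into a data structure. A minimal sketch, with a checker that enforces the definition's constraints on every production (the particular class and field names are made up here):

```python
from typing import NamedTuple

class CFG(NamedTuple):
    """A context-free grammar as the 4-tuple (N, T, P, S)."""
    nonterminals: frozenset   # N
    terminals: frozenset      # T
    productions: tuple        # P: pairs (A, rhs) with A in N, rhs in (T ∪ N)*
    start: str                # S

def well_formed(g):
    """Check that S is in N and every production obeys A -> a with a in (T ∪ N)*."""
    if g.start not in g.nonterminals:
        return False
    symbols = g.nonterminals | g.terminals
    return all(lhs in g.nonterminals and all(s in symbols for s in rhs)
               for lhs, rhs in g.productions)

g = CFG(nonterminals=frozenset({"S", "NP", "VP"}),
        terminals=frozenset({"a", "flight", "left"}),
        productions=(("S", ("NP", "VP")),),
        start="S")
print(well_formed(g))  # True
```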

Grammar Equivalence and Normal Form
Strong equivalence: two grammars are strongly equivalent if they generate/accept the same set of strings and they assign the same phrase structure to each sentence.
Weak equivalence: two grammars are weakly equivalent if they generate/accept the same set of strings, but do not necessarily assign the same phrase structure to each sentence.
Normal form: restrict the form of productions.
Chomsky Normal Form (CNF): the right-hand side of each production is either two non-terminals or a single terminal, e.g.
  A -> B C
  A -> a
Any grammar can be translated into a weakly equivalent CNF grammar:
  A -> B C D  becomes  A -> B X, X -> C D
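
The A -> B C D step above is right-binarization, and it can be sketched in a few lines. This handles only the long-rule case of CNF conversion (full conversion also needs to deal with terminals in long rules, unit productions, and empty productions); the fresh symbol names X1, X2, ... are made up.

```python
def binarize(rules):
    """Replace each rule with RHS longer than 2 by a chain of binary rules,
    introducing fresh non-terminals; yields a weakly equivalent grammar."""
    out, fresh = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            fresh += 1
            new = f"X{fresh}"
            out.append((lhs, [rhs[0], new]))  # peel off the first symbol
            lhs, rhs = new, rhs[1:]
        out.append((lhs, list(rhs)))
    return out

print(binarize([("A", ["B", "C", "D"])]))
# [('A', ['B', 'X1']), ('X1', ['C', 'D'])]
```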

Treebanks
Treebanks are corpora in which each sentence has been paired with a parse tree (presumably the right one). These are generally created by first parsing the collection with an automatic parser and then having human annotators correct each parse as necessary. This generally requires detailed annotation guidelines that provide a POS tagset, a grammar, and instructions for how to deal with particular grammatical constructions.

Penn Treebank
The Penn Treebank is a widely used treebank. The most well known part is the Wall Street Journal section: 1M words from the Wall Street Journal.

Treebank Grammars
Treebanks implicitly define a grammar for the language covered in the treebank. Simply take the local rules that make up the sub-trees in all the trees in the collection and you have a grammar. It won't be complete, but if you have a decent-sized corpus, you'll have a grammar with decent coverage.

Treebank Grammars
Such grammars tend to be very flat, because they tend to avoid recursion. For example, the Penn Treebank has 4500 different rules for VPs. Among them...
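
"Take the local rules that make up the sub-trees" can be sketched concretely: read a Penn-style bracketed tree and count the rule each internal node implies. The bracketed sentence below is a made-up one-tree "treebank" for illustration; a real extraction would run over every tree in the corpus and would have to cope with traces, functional tags, and the like.

```python
from collections import Counter

def tokenize(s):
    return s.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens, i=0):
    """Parse one ( label child... ) span; returns ([label, *children], next_i)."""
    assert tokens[i] == "("
    label = tokens[i + 1]
    i += 2
    children = []
    while tokens[i] != ")":
        if tokens[i] == "(":
            child, i = read(tokens, i)
        else:
            child, i = tokens[i], i + 1
        children.append(child)
    return [label] + children, i + 1

def rules(tree, out):
    """Record the local rule at this node, then recurse into subtrees."""
    label, children = tree[0], tree[1:]
    rhs = tuple(c[0] if isinstance(c, list) else c for c in children)
    out[(label, rhs)] += 1
    for c in children:
        if isinstance(c, list):
            rules(c, out)
    return out

tree, _ = read(tokenize("(S (NP (DT the) (NN flight)) (VP (VBD left)))"))
counts = rules(tree, Counter())
print(counts[("S", ("NP", "VP"))])  # 1
```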

Heads in Trees
Finding heads in treebank trees is a task that arises frequently in many applications, and it is particularly important in statistical parsing. We can visualize this task by annotating the nodes of a parse tree with the heads of each corresponding node.

Lexically Decorated Tree
[Figure: head-annotated parse tree shown on the original slide]

Head Finding
The standard way to do head finding is to use a simple set of tree traversal rules specific to each non-terminal in the grammar.
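
A sketch of such traversal rules: for each non-terminal, a priority list of child labels and a scan direction. The table below is a tiny made-up fragment in the spirit of standard head-rule tables (e.g. Collins-style), not the real thing.

```python
# Non-terminal -> (scan direction, child labels in priority order).
HEAD_RULES = {
    "NP": ("right", ["NN", "NNS", "NNP", "NP"]),
    "VP": ("left",  ["VBD", "VBZ", "VB", "VP"]),
    "S":  ("left",  ["VP", "S"]),
}

def head_child(label, child_labels):
    """Return the index of the head child among child_labels."""
    direction, priorities = HEAD_RULES[label]
    indices = list(range(len(child_labels)))
    if direction == "right":
        indices.reverse()                 # scan right-to-left
    for want in priorities:
        for i in indices:
            if child_labels[i] == want:
                return i
    return indices[0]                     # fallback: first child in scan order

print(head_child("NP", ["DT", "JJ", "NN"]))  # 2: "NN" heads the NP
print(head_child("S", ["NP", "VP"]))         # 1: "VP" heads the S
```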

Noun Phrases
[Figure: noun-phrase head-finding examples shown on the original slide]

Treebank Uses
Treebanks (and head finding) are particularly critical to the development of statistical parsers (Chapter 14). They are also valuable to corpus linguistics, e.g. investigating the empirical details of various constructions in a given language.

Dependency Grammars
In CFG-style phrase-structure grammars the main focus is on constituents. But it turns out you can get a lot done with just binary relations among the words in an utterance. In a dependency grammar framework, a parse is a tree where the nodes stand for the words in an utterance, and the links between the words represent dependency relations between pairs of words. Relations may be typed (labeled), or not.

Dependency Relations
[Figure: table of dependency relations shown on the original slide]

Dependency Parse
They hid the letter on the shelf
See the Stanford parser online.
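
A sketch of how such a parse can be represented: each word points at its head, with 0 standing for the root. The particular arcs below (e.g. attaching "on" to "hid") are one plausible analysis of the slide's sentence, not the definitive one, and the checker verifies the tree properties (every word has exactly one head, one root, no cycles).

```python
WORDS = ["They", "hid", "the", "letter", "on", "the", "shelf"]
# dependent (1-based word index) -> head index; 0 = ROOT.
HEADS = {1: 2,   # They    <- hid
         2: 0,   # hid     <- ROOT
         3: 4,   # the     <- letter
         4: 2,   # letter  <- hid
         5: 2,   # on      <- hid
         6: 7,   # the     <- shelf
         7: 5}   # shelf   <- on

def is_tree(heads, n):
    """Check single-head, single-root, and acyclicity for words 1..n."""
    if sorted(heads) != list(range(1, n + 1)):
        return False
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    for start in heads:
        seen, node = set(), start
        while node != 0:          # follow heads up to the root
            if node in seen:
                return False      # cycle
            seen.add(node)
            node = heads[node]
    return True

print(is_tree(HEADS, len(WORDS)))  # True
```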

Dependency Parsing
The dependency approach has a number of advantages over full phrase-structure parsing:
  It deals well with free word order languages, where the constituent structure is quite fluid
  Parsing is much faster than with CFG-based parsers
  The dependency structure often captures the syntactic relations needed by later applications; CFG-based approaches often extract this same information from trees anyway

Dependency Parsing
There are two modern approaches to dependency parsing:
  Optimization-based approaches that search a space of trees for the tree that best matches some criteria
  Shift-reduce approaches that greedily take actions based on the current word and state
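
The shift-reduce family can be sketched with the arc-standard transition system. The sketch below just executes a given action sequence (a hand-written oracle for a made-up three-word sentence); a real parser would instead predict each action from the current stack/buffer state with a trained classifier.

```python
def arc_standard(n_words, actions):
    """Run arc-standard transitions over words 1..n (0 = ROOT).
    Returns a dict mapping each dependent to its head."""
    stack, buffer, heads = [0], list(range(1, n_words + 1)), {}
    for act in actions:
        if act == "SHIFT":                 # move next word onto the stack
            stack.append(buffer.pop(0))
        elif act == "LEFT-ARC":            # second-from-top depends on top
            dep = stack.pop(-2)
            heads[dep] = stack[-1]
        elif act == "RIGHT-ARC":           # top depends on second-from-top
            dep = stack.pop()
            heads[dep] = stack[-1]
    return heads

# Hypothetical sentence "They hid letters": They <- hid -> letters
heads = arc_standard(3, ["SHIFT", "SHIFT", "LEFT-ARC",
                         "SHIFT", "RIGHT-ARC", "RIGHT-ARC"])
print(heads)  # {1: 2, 3: 2, 2: 0}
```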

Summary
Context-free grammars can be used to model various facts about the syntax of a language. When paired with parsers, such grammars constitute a critical component in many applications. Constituency is a key phenomenon easily captured with CFG rules, but agreement and subcategorization do pose significant problems. Treebanks pair each sentence in a corpus with its corresponding tree.