PARSING David Kauchak CS159 – Spring 2011 some slides adapted from Ray Mooney

Admin  Updated slides/examples on backoff with absolute discounting (I’ll review them again here today)  Assignment 2  Watson vs. Humans (tonight-Wednesday)

Backoff models: absolute discounting  Subtract some absolute number from each of the counts (e.g. 0.75): this will have a large effect on low counts and a small effect on large counts.
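As a compact statement of the idea (the notation here is mine, not copied from the slides), the absolutely discounted trigram probability with backoff is:

\[ P_{\mathrm{abs}}(z \mid x\,y) = \frac{C(x\,y\,z) - D}{C(x\,y)} \quad \text{if } C(x\,y\,z) > 0, \qquad P_{\mathrm{abs}}(z \mid x\,y) = \alpha(x\,y)\, P_{\mathrm{abs}}(z \mid y) \quad \text{otherwise} \]

where D is the discount (e.g. 0.75) and α(xy) rescales the backed-off bigram model so that the conditional distribution still sums to one.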

Backoff models: absolute discounting  What is α(xy)?

Backoff models: absolute discounting  Counts: see the dog: 1, see the cat: 2, see the banana: 4, see the man: 1, see the woman: 1, see the car: 1, the Dow Jones: 10, the Dow rose: 5, the Dow fell: 5. p(cat | see the) = ? p(puppy | see the) = ? p(rose | the Dow) = ? p(jumped | the Dow) = ?

Backoff models: absolute discounting  Counts: see the dog: 1, see the cat: 2, see the banana: 4, see the man: 1, see the woman: 1, see the car: 1. p(cat | see the) = ?
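Working this one out, assuming the discount D = 0.75 from above: the trigram counts starting with "see the" sum to 1+2+4+1+1+1 = 10, so

\[ p(\text{cat} \mid \text{see the}) = \frac{2 - 0.75}{10} = 0.125 \]

whereas p(puppy | see the) has a zero trigram count and must back off to the bigram model, weighted by α(see the).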

Backoff models: absolute discounting (counts as above)  p(puppy | see the) = ? α(see the) = ? How much probability mass did we reserve/discount for the bigram model?

Backoff models: absolute discounting (counts as above)  p(puppy | see the) = ? α(see the) = ? Reserved mass = (# of types starting with "see the" × D) / count("see the"). For each of the unique trigrams, we subtracted D/count("see the") from the probability distribution.

Backoff models: absolute discounting (counts as above)  p(puppy | see the) = ? α(see the) = ? Reserved mass = (# of types starting with "see the" × D) / count("see the"); distribute this probability mass to all bigrams that we backed off to.
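Plugging in the toy numbers (again with D = 0.75): six distinct trigram types start with "see the" and count("see the") = 10, so

\[ \text{reserved\_mass}(\text{see the}) = \frac{6 \times 0.75}{10} = 0.45 \]

i.e. 45% of the mass of p(· | see the) has been set aside for the words never seen after "see the".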

Calculating α  We have some number of bigrams we're going to back off to, i.e. those X where C(see the X) = 0, that is, unseen trigrams starting with "see the". When we back off, for each of these we'll be including their probability in the model: P(X | the). α is the normalizing constant chosen so that the sum of these backed-off probabilities equals the reserved probability mass.

Calculating α  We can calculate α in two ways: based on the trigrams we haven't seen, or, more often, based on those we have seen.

Calculating α in general: trigrams  For a bigram "A B": (1) calculate the reserved mass: reserved_mass(A B) = (# of trigram types starting with "A B" × D) / count("A B"); (2) calculate the sum of the backed-off probabilities; either form is fine, but in practice the easier one is 1 – the sum of the bigram probabilities of those trigrams that we did see starting with "A B"; (3) calculate α as the reserved mass divided by that sum.
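Written out as formulas (notation mine, following the recipe above):

\[ \text{reserved\_mass}(A\,B) = \frac{\#\{w : C(A\,B\,w) > 0\} \times D}{C(A\,B)} \qquad \alpha(A\,B) = \frac{\text{reserved\_mass}(A\,B)}{1 - \sum_{w\,:\,C(A\,B\,w) > 0} P(w \mid B)} \]

where P(w | B) is the bigram model being backed off to.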

Calculating α in general: bigrams  For a single word "A": (1) calculate the reserved mass: reserved_mass(A) = (# of bigram types starting with "A" × D) / count("A"); (2) calculate the sum of the backed-off probabilities; either form is fine, but in practice the easier one is 1 – the sum of the unigram probabilities of those bigrams that we did see starting with word A; (3) calculate α as the reserved mass divided by that sum.

Calculating backoff models in practice  Store the αs in another table: if it's a trigram backed off to a bigram, the table is keyed by bigrams; if it's a bigram backed off to a unigram, the table is keyed by unigrams. Compute the αs during training: after calculating all of the probabilities of seen unigrams/bigrams/trigrams, go back through and calculate the αs (you should have all of the information you need). During testing, it should then be easy to apply the backoff model with the αs pre-calculated.
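A minimal sketch of this recipe in Python, assuming the raw counts are available as plain dictionaries; the class name, data layout, and the choice of α = 1.0 for histories never seen at all are my own, not code from the course:

from collections import defaultdict

D = 0.75  # absolute discount

class BackoffLM:
    def __init__(self, unigrams, bigrams, trigrams):
        # counts: unigrams[w], bigrams[(w1, w2)], trigrams[(w1, w2, w3)]
        self.uni, self.bi, self.tri = unigrams, bigrams, trigrams
        self.uni_total = sum(unigrams.values())

        self.bi_hist = defaultdict(int)    # w1 -> total count of bigrams starting with w1
        self.bi_seen = defaultdict(set)    # w1 -> words seen after w1
        for (w1, w2), c in bigrams.items():
            self.bi_hist[w1] += c
            self.bi_seen[w1].add(w2)

        self.tri_hist = defaultdict(int)   # (w1, w2) -> total count of trigrams starting with w1 w2
        self.tri_seen = defaultdict(set)   # (w1, w2) -> words seen after w1 w2
        for (w1, w2, w3), c in trigrams.items():
            self.tri_hist[(w1, w2)] += c
            self.tri_seen[(w1, w2)].add(w3)

        # "store the alphas in another table", keyed by the history we back off from
        self.alpha_bi = {}                 # keyed by unigram history w1
        for w1, seen in self.bi_seen.items():
            reserved = len(seen) * D / self.bi_hist[w1]
            self.alpha_bi[w1] = reserved / (1 - sum(self.p_uni(w) for w in seen))

        self.alpha_tri = {}                # keyed by bigram history (w1, w2)
        for (w1, w2), seen in self.tri_seen.items():
            reserved = len(seen) * D / self.tri_hist[(w1, w2)]
            self.alpha_tri[(w1, w2)] = reserved / (1 - sum(self.p_bi(w2, w) for w in seen))

    def p_uni(self, w):
        return self.uni.get(w, 0) / self.uni_total

    def p_bi(self, w1, w2):
        if (w1, w2) in self.bi:            # seen bigram: discounted estimate
            return (self.bi[(w1, w2)] - D) / self.bi_hist[w1]
        # unseen history defaults to alpha = 1.0 (pass the unigram through unscaled)
        return self.alpha_bi.get(w1, 1.0) * self.p_uni(w2)

    def p_tri(self, w1, w2, w3):
        if (w1, w2, w3) in self.tri:       # seen trigram: discounted estimate
            return (self.tri[(w1, w2, w3)] - D) / self.tri_hist[(w1, w2)]
        return self.alpha_tri.get((w1, w2), 1.0) * self.p_bi(w2, w3)  # back off to bigram

With the toy "see the" counts from the earlier slides supplied as the trigram dictionary, p_tri("see", "the", "cat") reproduces the 0.125 worked out above.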

Backoff models: absolute discounting  Two nice attributes of the reserved mass, reserved_mass = (# of types starting with bigram × D) / count(bigram): it decreases if we've seen the bigram more often (we should be more confident that the unseen trigram is no good), and it increases if the bigram tends to be followed by lots of different words (we will be more likely to see an unseen trigram).

Syntactic structure  The man in the hat ran to the park. (S (NP (NP (DT the) (NN man)) (PP (IN in) (NP (DT the) (NN hat)))) (VP (VBD ran) (PP (TO to) (NP (DT the) (NN park)))))

CFG: Example  Many possible CFGs for English, here is an example (fragment):
S -> NP VP
VP -> V NP
NP -> DetP N | AdjP NP
AdjP -> Adj | Adv AdjP
N -> boy | girl
V -> sees | likes
Adj -> big | small
Adv -> very
DetP -> a | the

Grammar questions  Can we determine if a sentence is grammatical?  Given a sentence, can we determine the syntactic structure?  Can we determine how likely a sentence is to be grammatical? to be an English sentence?  Can we generate candidate, grammatical sentences?
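The last question can be made concrete with a tiny sketch that generates candidate sentences by randomly expanding the grammar fragment above (the dictionary encoding and function name are my own, not from the slides):

import random

GRAMMAR = {
    "S":    [["NP", "VP"]],
    "VP":   [["V", "NP"]],
    "NP":   [["DetP", "N"], ["AdjP", "NP"]],
    "AdjP": [["Adj"], ["Adv", "AdjP"]],
    "N":    [["boy"], ["girl"]],
    "V":    [["sees"], ["likes"]],
    "Adj":  [["big"], ["small"]],
    "Adv":  [["very"]],
    "DetP": [["a"], ["the"]],
}

def generate(symbol="S"):
    if symbol not in GRAMMAR:             # terminal: a word
        return [symbol]
    rhs = random.choice(GRAMMAR[symbol])  # pick one production at random
    return [word for sym in rhs for word in generate(sym)]

print(" ".join(generate()))               # e.g. "the girl likes a boy"

Every run prints one random derivation licensed by the fragment.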

Parsing  Parsing is the subfield of NLP concerned with automatically determining the syntactic structure of a sentence. Parsing can also be thought of as determining which sentences are "valid" English sentences.

Parsing  Given a CFG and a sentence, determine the possible parse tree(s).
S -> NP VP
NP -> PRP
NP -> N PP
NP -> N
VP -> V NP
VP -> V NP PP
PP -> IN N
PRP -> I
V -> eat
N -> sushi
N -> tuna
IN -> with
Sentence: I eat sushi with tuna. What parse trees are possible for this sentence? How did you figure it out?

Parsing  Two parses of "I eat sushi with tuna" under the grammar above:
(S (NP (PRP I)) (VP (V eat) (NP (N sushi)) (PP (IN with) (N tuna))))   [PP attached under the VP]
(S (NP (PRP I)) (VP (V eat) (NP (N sushi) (PP (IN with) (N tuna)))))   [PP attached inside the NP]
What is the difference between these parses?

Parsing  Given a CFG and a sentence (the same grammar and sentence as above), determine the possible parse tree(s). Approaches? Algorithms?

Parsing  Top-down parsing: start at the top (usually S) and apply rules, matching left-hand sides and replacing them with right-hand sides. Bottom-up parsing: start at the bottom (i.e. the words) and build the parse tree up from there, matching right-hand sides and replacing them with left-hand sides.
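As a concrete illustration of the top-down strategy, here is a minimal recognizer over the sushi grammar from the earlier slide, treating parsing as search that always expands the leftmost symbol (the encoding and names are mine; it answers yes/no rather than building trees):

RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["PRP"], ["N", "PP"], ["N"]],
    "VP":  [["V", "NP"], ["V", "NP", "PP"]],
    "PP":  [["IN", "N"]],
    "PRP": [["I"]],
    "V":   [["eat"]],
    "N":   [["sushi"], ["tuna"]],
    "IN":  [["with"]],
}

def parse(symbols, words):
    """Can this sequence of symbols derive exactly these words?"""
    if not symbols:
        return not words                       # success only if all words are consumed
    first, rest = symbols[0], symbols[1:]
    if first not in RULES:                     # terminal: must match the next word
        return bool(words) and words[0] == first and parse(rest, words[1:])
    # non-terminal: try each production (replace LHS with RHS, leftmost first)
    return any(parse(list(rhs) + rest, words) for rhs in RULES[first])

print(parse(["S"], "I eat sushi with tuna".split()))  # True

A bottom-up parser would instead start from the words and replace right-hand sides with left-hand sides; the CKY algorithm later in the lecture does this systematically with a chart.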

Parsing Example  The target parse for "book that flight": (S (VP (Verb book) (NP (Det that) (Nominal (Noun flight)))))

Top Down Parsing  Search trace for "book that flight": starting from S, the parser tries S -> NP VP with NP -> Pronoun, NP -> ProperNoun, and NP -> Det Nominal, each of which fails against "book"; it then tries S -> Aux NP VP, which also fails; it then tries S -> VP, first with VP -> Verb (which matches "book" but cannot account for "that flight"), and finally with VP -> Verb NP, where NP -> Pronoun and NP -> ProperNoun fail on "that" before NP -> Det Nominal succeeds with Det = that and Nominal -> Noun = flight, completing the parse above.

Bottom Up Parsing  Search trace for "book that flight": starting from the words, the parser first tags "book" as a Noun and builds a Nominal over it, trying Nominal expansions (Nominal -> Nominal Noun, Nominal -> Nominal PP) that cannot be completed; tagging "that" as a Det and "flight" as a Noun/Nominal yields the NP "that flight", but none of the structures built over the Noun reading of "book" can be extended to an S, so they are abandoned; re-tagging "book" as a Verb gives VP -> Verb (which does not cover the rest of the sentence) and then VP -> Verb NP, which combines with S -> VP to yield the full parse.

Parsing  Pros/Cons?  Top-down: only examines parses that could be valid parses (i.e. with an S on top), but doesn't take into account the actual words! Bottom-up: only examines structures that have the actual words as the leaves, but examines sub-parses that may not result in a valid parse!

Why is parsing hard?  Actual grammars are large  Lots of ambiguity!  Most sentences have many parses  Some sentences have a lot of parses  Even for sentences that are not ambiguous, there is often ambiguity for subtrees (i.e. multiple ways to parse a phrase)

Why is parsing hard? I saw the man on the hill with the telescope What are some interpretations?

Structural Ambiguity Can Give Exponential Parses  "I saw the man on the hill with the telescope": "I was on the hill that has a telescope when I saw a man." "I saw a man who was on the hill that has a telescope on it." "I was on the hill when I used the telescope to see a man." "I saw a man who was on a hill and who had a telescope." "Using a telescope, I saw a man who was on a hill."...

Dynamic Programming Parsing  To avoid extensive repeated work, you must cache intermediate results, specifically found constituents. Caching (memoizing) is critical to obtaining a polynomial-time parsing (recognition) algorithm for CFGs. Dynamic programming algorithms based on both top-down and bottom-up search can achieve O(n³) recognition time, where n is the length of the input string.

Dynamic Programming Parsing Methods  The CKY (Cocke-Kasami-Younger) algorithm is based on bottom-up parsing and requires first normalizing the grammar. The Earley parser is based on top-down parsing and does not require normalizing the grammar, but is more complex. Both fall under the general category of chart parsers, which retain completed constituents in a chart.

CKY  First, the grammar must be converted to Chomsky normal form (CNF), in which productions have either exactly 2 non-terminal symbols on the RHS or 1 terminal symbol (lexicon rules). Then parse bottom-up, storing the phrases formed from all substrings in a triangular table (chart).
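Here is a minimal recognizer sketch of this idea (the data structures, rule encoding, and function name are my own; unary-rule handling is included because the example grammar on the next slide keeps unit productions like S -> VP):

from collections import defaultdict

def cky_recognize(words, lexicon, binary_rules, unary_rules=(), start="S"):
    """Minimal CKY recognizer sketch.
    lexicon:      (A, word) pairs for A -> word
    binary_rules: (A, B, C) triples for A -> B C
    unary_rules:  (A, B) pairs for unit productions such as S -> VP
    """
    n = len(words)
    chart = defaultdict(set)              # (i, j) -> non-terminals deriving words[i:j]

    def close_unary(cell):
        changed = True                    # apply unit productions until a fixed point
        while changed:
            changed = False
            for a, b in unary_rules:
                if b in cell and a not in cell:
                    cell.add(a)
                    changed = True

    for i, w in enumerate(words):         # width-1 spans come from the lexicon
        for lhs, word in lexicon:
            if word == w:
                chart[(i, i + 1)].add(lhs)
        close_unary(chart[(i, i + 1)])

    for width in range(2, n + 1):         # fill longer spans bottom-up
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):     # try every split point
                for lhs, b, c in binary_rules:
                    if b in chart[(i, k)] and c in chart[(k, j)]:
                        chart[(i, j)].add(lhs)
            close_unary(chart[(i, j)])

    return start in chart[(0, n)]

chart[(i, j)] plays the role of one cell of the triangular table; turning the recognizer into a parser would additionally store back-pointers recording which rule and split point produced each entry.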

CNF Grammar  The original grammar and its CNF version, in which the ternary rule VP -> VB NP PP is binarized using a new non-terminal VP2.
Original grammar:
S -> VP
VP -> VB NP
VP -> VB NP PP
NP -> DT NN
NP -> NN
NP -> NP PP
PP -> IN NP
DT -> the
IN -> with
VB -> film
VB -> trust
NN -> man
NN -> film
NN -> trust
CNF grammar:
S -> VP
VP -> VB NP
VP -> VP2 PP
VP2 -> VB NP
NP -> DT NN
NP -> NN
NP -> NP PP
PP -> IN NP
DT -> the
IN -> with
VB -> film
VB -> trust
NN -> man
NN -> film
NN -> trust
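Exercising the CKY sketch above on this CNF grammar (the test sentence is my own, chosen to fit the lexicon):

# Rule lists transcribed from the CNF grammar on this slide
lexicon = [("DT", "the"), ("IN", "with"), ("VB", "film"), ("VB", "trust"),
           ("NN", "man"), ("NN", "film"), ("NN", "trust")]
binary_rules = [("VP", "VB", "NP"), ("VP", "VP2", "PP"), ("VP2", "VB", "NP"),
                ("NP", "DT", "NN"), ("NP", "NP", "PP"), ("PP", "IN", "NP")]
unary_rules = [("S", "VP"), ("NP", "NN")]

print(cky_recognize("trust the man with the film".split(),
                    lexicon, binary_rules, unary_rules))   # True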