October 2005, csa3180: Parsing Algorithms 1
CSA350: NLP Algorithms
Sentence Parsing I
  – The Parsing Problem
  – Parsing as Search
  – Top Down/Bottom Up Parsing Strategies

References
This lecture is based on material found in
  – Jurafsky & Martin, chapter 10
Relevant material is available from Vince.

Why not use FS techniques for parsing NL sentences?
Descriptive Adequacy
  – Some NL phenomena cannot be described within the FS framework.
  – Example: central embedding
Notational Adequacy
  – Elegance with which the notation describes the real-world objects. Elegance implies
      notation which allows short descriptions;
      notation which exploits similarities between different structures and permits general properties to be stated;
      representation of dependency and hierarchy.

Central Embedding
The following sentences
  – The cat spat
  – The cat the boy saw spat
  – The cat the boy the girl liked saw spat
require at least a grammar of the form S → A^n B^n.
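The A^n B^n pattern can be made concrete with a small sketch. The word lists and the function names `embed` and `has_anbn_shape` are illustrative assumptions, not part of the lecture. The sketch builds the three example sentences and checks the A^n B^n shape by counting, which needs an unbounded counter and is therefore beyond a finite-state device.

```python
NOUN_PHRASES = ["the cat", "the boy", "the girl"]
VERBS = ["spat", "saw", "liked"]  # VERBS[i] is the verb belonging to NOUN_PHRASES[i]


def embed(n):
    """Build a centre-embedded sentence with n noun phrases followed by n verbs."""
    if not 1 <= n <= len(NOUN_PHRASES):
        raise ValueError("n must be between 1 and %d" % len(NOUN_PHRASES))
    nps = NOUN_PHRASES[:n]
    # The verbs come out in reverse order: the last noun phrase is closed first.
    verbs = list(reversed(VERBS[:n]))
    return " ".join(nps + verbs)


def has_anbn_shape(categories):
    """Check that a sequence of 'A'/'B' labels has the form A^n B^n with n >= 1.

    The check counts the A's and then matches them against the B's; the
    counter is unbounded, which is exactly what a finite-state machine lacks.
    """
    count = 0
    i = 0
    while i < len(categories) and categories[i] == "A":
        count += 1
        i += 1
    if count == 0:
        return False
    while i < len(categories) and categories[i] == "B":
        count -= 1
        i += 1
    return i == len(categories) and count == 0
```

Here `embed(3)` reproduces the third example sentence (in lower case), with noun phrases playing the role of A and verbs the role of B.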

DCG-style Grammar/Lexicon
% Grammar
s --> np, vp.
s --> aux, np, vp.
s --> vp.
np --> det, nom.
nom --> noun.
nom --> noun, nom.
nom --> nom, pp.
pp --> prep, np.
np --> pn.
vp --> v.
vp --> v, np.
% Lexicon
det --> [that] ; [this] ; [a].
noun --> [book] ; [flight] ; [meal] ; [money].
v --> [book] ; [include] ; [prefer].
aux --> [does].
prep --> [from] ; [to] ; [on].
pn --> ['Houston'] ; ['TWA'].

Parse Tree
A valid parse tree for a grammar G is a tree
  – whose root is the start symbol of G
  – whose interior nodes are nonterminals of G
  – in which the children of a node T (from left to right) correspond to the symbols on the right-hand side of some production for T in G
  – whose leaf nodes are terminal symbols of G.
Every sentence generated by a grammar has a corresponding valid parse tree.
Every valid parse tree exactly covers a sentence generated by the grammar.
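The four conditions above can be checked mechanically. The following is a minimal sketch under stated assumptions: trees are nested tuples `(label, child, ...)` with plain strings as leaves, the grammar is a dictionary with the lexicon folded in as ordinary productions, and the names `GRAMMAR`, `is_valid` and `leaves` are hypothetical.

```python
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom"), ("PN",)],
    "Nom": [("N",), ("N", "Nom"), ("Nom", "PP")],
    "PP": [("Prep", "NP")],
    "VP": [("V",), ("V", "NP")],
    # Lexicon entries, written as productions that rewrite to a single word.
    "Det": [("that",), ("this",), ("a",)],
    "N": [("book",), ("flight",), ("meal",), ("money",)],
    "V": [("book",), ("include",), ("prefer",)],
    "Aux": [("does",)],
    "Prep": [("from",), ("to",), ("on",)],
    "PN": [("Houston",), ("TWA",)],
}


def label(node):
    """A leaf is a plain string; an interior node is (label, child, child, ...)."""
    return node if isinstance(node, str) else node[0]


def is_valid(tree, grammar, start="S"):
    """Check the four conditions for a valid parse tree."""
    def check(node):
        if isinstance(node, str):
            return node not in grammar  # leaves must be terminal symbols
        head, children = node[0], node[1:]
        rhs = tuple(label(child) for child in children)
        # Interior nodes must be nonterminals whose children spell out one production.
        return head in grammar and rhs in grammar[head] and all(check(c) for c in children)
    return label(tree) == start and check(tree)


def leaves(tree):
    """Return the sentence that the tree exactly covers, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]
```

For example, the tree `("S", ("VP", ("V", "book"), ("NP", ("Det", "that"), ("Nom", ("N", "flight")))))` passes `is_valid`, and `leaves` returns the words it covers.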

Parsing Problem
Given a grammar G and a sentence A, find all valid parse trees for G that exactly cover A.
Example, for the sentence "book that flight":
[S [VP [V book] [NP [Det that] [Nom [N flight]]]]]

Soundness and Completeness
A parser is sound if every parse tree it returns is valid.
A parser is complete for grammar G if for all s ∈ L(G)
  – it terminates
  – it produces the corresponding parse tree
For many purposes, we settle for sound but incomplete parsers.

Parsing as Search
Search within a space defined by
  – Start state
  – Goal state
  – State-to-state transformations
Two distinct parsing strategies:
  – Top down
  – Bottom up
Different parsing strategy, different state space, different problem.
Parsing strategy ≠ search strategy

Top Down
Each state is a tree (which encodes the current state of the parse).
A top-down parser tries to build from the root node S down to the leaves by replacing nodes that carry non-terminal labels with the RHS of the corresponding grammar rules.
Nodes with pre-terminal (word class) labels are compared to the input words.

Top Down Search Space
[Figure: the top-down search space, from the start node to the goal node]

Bottom Up
Each state is a forest of trees.
The start node is a forest of nodes labelled with pre-terminal categories (word classes derived from the lexicon).
Transformations look for places where the RHS of a rule can fit.
Any such place is replaced with a node labelled with the LHS of the rule.
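The bottom-up state space described above can be sketched as follows. This is an illustration under simplifying assumptions, not the lecture's implementation: a state records only the root labels of the forest (not the trees themselves), the search is exhaustive rather than guided, and the names `RULES`, `LEXICON`, `start_states`, `successors` and `recognize` are hypothetical.

```python
from itertools import product

RULES = [
    ("S", ("NP", "VP")), ("S", ("Aux", "NP", "VP")), ("S", ("VP",)),
    ("NP", ("Det", "Nom")), ("NP", ("PN",)),
    ("Nom", ("N",)), ("Nom", ("N", "Nom")), ("Nom", ("Nom", "PP")),
    ("PP", ("Prep", "NP")),
    ("VP", ("V",)), ("VP", ("V", "NP")),
]
LEXICON = {
    "that": ["Det"], "this": ["Det"], "a": ["Det"],
    "book": ["N", "V"], "flight": ["N"], "meal": ["N"], "money": ["N"],
    "include": ["V"], "prefer": ["V"], "does": ["Aux"],
    "from": ["Prep"], "to": ["Prep"], "on": ["Prep"],
    "Houston": ["PN"], "TWA": ["PN"],
}


def start_states(words):
    """One forest of pre-terminal roots per way of resolving lexical ambiguity."""
    return [tuple(choice) for choice in product(*(LEXICON[w] for w in words))]


def successors(state):
    """All states reachable by replacing one RHS occurrence with its LHS."""
    result = []
    for lhs, rhs in RULES:
        for i in range(len(state) - len(rhs) + 1):
            if state[i:i + len(rhs)] == rhs:
                result.append(state[:i] + (lhs,) + state[i + len(rhs):])
    return result


def recognize(words):
    """Exhaustive bottom-up search: succeed if some reduction sequence reaches ('S',)."""
    agenda = start_states(words)
    seen = set(agenda)
    while agenda:
        state = agenda.pop()
        if state == ("S",):
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                agenda.append(nxt)
    return False
```

For "book that flight", the state `("V", "Det", "N")` has two successors: `("V", "Det", "Nom")`, which leads on to S, and `("VP", "Det", "N")`, a locally possible reduction that can never lead to an S node.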

Bottom Up Search Space
[Figure: the bottom-up search space]

Top Down vs Bottom Up: General
Top down
  – For: Never wastes time exploring trees that cannot be derived from S.
  – Against: Can generate trees that are not consistent with the input.
Bottom up
  – For: Never wastes time building trees that cannot lead to input text segments.
  – Against: Can generate subtrees that can never lead to an S node.

Top Down Parsing - Remarks
Top-down parsers do well if there is useful grammar-driven control: search can be directed by the grammar.
Left-recursive rules can cause problems.
A top-down parser will do badly if there are many different rules for the same LHS. Consider the case where there are 600 rules for S, 599 of which start with NP and one of which starts with V, and the sentence starts with V.
Top-down is unsuitable for rewriting parts of speech (preterminals) with words (terminals). In practice that is always done bottom-up, as lexical lookup.
Useless work: expands things that are possible top-down but not there.
Repeated work: anywhere there is common substructure.
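Left-recursive rules can be found before parsing starts. The following minimal sketch assumes a grammar without empty productions, stored as a dictionary of right-hand sides; the function name `left_recursive` is hypothetical. It computes, for each nonterminal, the symbols that can begin one of its derivations and reports the nonterminals that can begin with themselves, whether directly or through other rules.

```python
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom"), ("PN",)],
    "Nom": [("N",), ("N", "Nom"), ("Nom", "PP")],
    "PP": [("Prep", "NP")],
    "VP": [("V",), ("V", "NP")],
}


def left_recursive(grammar):
    """Return the nonterminals A that can derive a string of symbols starting with A."""
    # reach[A] starts as the set of first symbols of A's rules (no empty rules assumed).
    reach = {a: {rhs[0] for rhs in rules} for a, rules in grammar.items()}
    changed = True
    while changed:  # transitive closure by repeated passes until nothing new is added
        changed = False
        for a in reach:
            for b in list(reach[a]):
                new = reach.get(b, set()) - reach[a]
                if new:
                    reach[a] |= new
                    changed = True
    return {a for a in reach if a in reach[a]}
```

On the grammar used in this lecture the result is `{"Nom"}`, because of the rule Nom → Nom PP.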

Bottom Up Parsing - Remarks
Empty categories: termination problem unless rewriting of empty constituents is somehow restricted (but then it's generally incomplete).
Inefficient when there is great lexical ambiguity (grammar-driven control might help here).
Conversely, it is data-directed: it attempts to parse the words that are there.
Both TD (LL) and BU (LR) parsers can do work exponential in the sentence length on NLP problems.
Useless work: locally possible, but globally impossible.
Repeated work: anywhere there is common substructure.

Development of a Concrete Strategy
Combine the best features of both top down and bottom up strategies.
  – Top down, grammar-directed control.
  – Bottom up filtering.
Examination of alternatives in parallel uses too much memory.
Depth-first strategy using agenda-based control.
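One common way to realise bottom-up filtering is a left-corner table; the slide does not name a specific technique, so this is an assumption. The sketch below, with hypothetical names `left_corners` and `viable_rules`, precomputes which word classes can begin each nonterminal and then keeps only the rules whose first symbol is compatible with the next input word.

```python
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom"), ("PN",)],
    "Nom": [("N",), ("N", "Nom"), ("Nom", "PP")],
    "PP": [("Prep", "NP")],
    "VP": [("V",), ("V", "NP")],
}
LEXICON = {
    "that": ["Det"], "this": ["Det"], "a": ["Det"],
    "book": ["N", "V"], "flight": ["N"], "meal": ["N"], "money": ["N"],
    "include": ["V"], "prefer": ["V"], "does": ["Aux"],
    "from": ["Prep"], "to": ["Prep"], "on": ["Prep"],
    "Houston": ["PN"], "TWA": ["PN"],
}


def left_corners(grammar):
    """Map each nonterminal to the pre-terminals that can start one of its derivations."""
    table = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for a, rules in grammar.items():
            for rhs in rules:
                first = rhs[0]
                # A pre-terminal is its own left corner; a nonterminal contributes its entry.
                new = (table[first] if first in grammar else {first}) - table[a]
                if new:
                    table[a] |= new
                    changed = True
    return table


def viable_rules(category, word, grammar, table):
    """Keep only the rules for `category` whose first symbol can start with `word`."""
    word_cats = set(LEXICON[word])
    keep = []
    for rhs in grammar[category]:
        first = rhs[0]
        corners = table[first] if first in grammar else {first}
        if corners & word_cats:
            keep.append(rhs)
    return keep
```

With `table = left_corners(GRAMMAR)`, the call `viable_rules("S", "book", GRAMMAR, table)` keeps only S → VP, so the top-down parser never expands S → NP VP for a sentence that starts with a verb.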

Top Down Algorithm
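The algorithm shown on this slide is not reproduced in the transcript. As a stand-in, here is a minimal sketch of a top-down, left-to-right, depth-first recognizer with an explicit agenda. It is an assumption about what such an algorithm can look like, not the slide's own version; the state representation, the function name `recognize` and the length-based pruning are illustrative choices.

```python
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom"), ("PN",)],
    "Nom": [("N",), ("N", "Nom"), ("Nom", "PP")],
    "PP": [("Prep", "NP")],
    "VP": [("V",), ("V", "NP")],
}
LEXICON = {
    "that": ["Det"], "this": ["Det"], "a": ["Det"],
    "book": ["N", "V"], "flight": ["N"], "meal": ["N"], "money": ["N"],
    "include": ["V"], "prefer": ["V"], "does": ["Aux"],
    "from": ["Prep"], "to": ["Prep"], "on": ["Prep"],
    "Houston": ["PN"], "TWA": ["PN"],
}


def recognize(words, start="S"):
    """Depth-first, left-to-right, top-down recognition with an explicit agenda."""
    agenda = [((start,), 0)]  # each state: (symbols still to derive, input position)
    while agenda:
        symbols, pos = agenda.pop()  # a LIFO agenda gives depth-first search
        if not symbols:
            if pos == len(words):
                return True  # everything derived and all input consumed
            continue
        # Each symbol must cover at least one word (no empty productions assumed),
        # so a state that needs more words than remain is a dead end. This also
        # stops the left-recursive rule Nom -> Nom PP from expanding forever.
        if len(symbols) > len(words) - pos:
            continue
        first, rest = symbols[0], symbols[1:]
        if first in GRAMMAR:
            # Expand the leftmost nonterminal; push in reverse so rule 1 is tried first.
            for rhs in reversed(GRAMMAR[first]):
                agenda.append((rhs + rest, pos))
        elif first in LEXICON.get(words[pos], ()):
            # Pre-terminal: compare against the input word by lexical lookup.
            agenda.append((rest, pos + 1))
    return False
```

The pruning step is sound only because every symbol in this grammar derives at least one word; without it, the left-recursive Nom rule would make a failing parse run forever.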

Derivation: top down, left-to-right, depth first

A Problem with the Algorithm
Note that the first three steps of the parse involve a failed attempt to expand the first rule, S → NP VP.
The parser recursively expands the leftmost nonterminal of this rule (NP).
While all this work is going on, the input is not even consulted!
Only when a terminal symbol is encountered is the input compared and the failure discovered.
This is pretty inefficient.