
74.419 Artificial Intelligence 2004
Natural Language Processing - Syntax and Parsing

Natural Language - General

"Communication is the intentional exchange of information brought about by the production and perception of signs drawn from a shared system of conventional signs." [Russell & Norvig, p. 651]

(Natural) language is characterized by a sign system:
- a common or shared set of signs
- a systematic procedure to produce combinations of signs
- a shared meaning of signs and combinations of signs

Natural Language Processing

Areas in Natural Language Processing:
- Morphology (word stem + ending)
- Syntax, Grammar & Parsing (syntactic description & analysis)
- Semantics & Pragmatics (meaning; constructive; context-dependent; references; ambiguity)
- Intentions
- Pragmatic Theory of Language (Communication as Action)
- Discourse / Dialogue / Text
- Spoken Language Understanding
- Language Learning

Natural Language - Parsing

Natural language is syntactically described by a formal language, usually a (context-free) grammar:
- the start symbol S ≡ sentence
- non-terminals ≡ syntactic constituents
- terminals ≡ lexical entries / words
- rules ≡ grammar rules

Parsing:
- derive the syntactic structure of a sentence based on a language model (grammar)
- construct a parse tree, i.e. the derivation of the sentence based on the grammar (rewrite system)

Sample Grammar

Task: Parse "Does this flight include a meal?"

Grammar (S, NT, T, P): sentence symbol S ∈ NT, parts of speech ∈ NT, syntactic constituents ∈ NT, grammar rules P: NT → (NT ∪ T)*

S → NP VP            (statement)
S → Aux NP VP        (question)
S → VP               (command)
NP → Det Nominal
NP → Proper-Noun
Nominal → Noun | Noun Nominal | Nominal PP
VP → Verb | Verb NP | Verb PP | Verb NP PP
PP → Prep NP
Det → that | this | a
Noun → book | flight | meal | money
Proper-Noun → Houston | American Airlines | TWA
Verb → book | include | prefer
Aux → does
Prep → from | to | on
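As a rough sketch (the dict encoding and the helper name pos_tags are our own illustrative choices, not from the slides), the sample grammar and lexicon can be written down as plain Python data:

```python
# Sketch only: the slide's sample grammar as plain Python data.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"], ["Nominal", "PP"]],
    "VP": [["Verb"], ["Verb", "NP"], ["Verb", "PP"], ["Verb", "NP", "PP"]],
    "PP": [["Prep", "NP"]],
}
LEXICON = {
    "Det": {"that", "this", "a"},
    "Noun": {"book", "flight", "meal", "money"},
    "Proper-Noun": {"Houston", "American Airlines", "TWA"},
    "Verb": {"book", "include", "prefer"},
    "Aux": {"does"},
    "Prep": {"from", "to", "on"},
}

def pos_tags(word):
    """All word categories (POS) a word belongs to under this lexicon."""
    return {cat for cat, words in LEXICON.items() if word in words}
```

Note that pos_tags("book") yields both Noun and Verb under this lexicon, an instance of the lexical category ambiguity discussed below.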

Sample Parse Tree

Task: Parse "Does this flight include a meal?" (tree shown here in bracketed form)

(S (Aux does)
   (NP (Det this) (Nominal (Noun flight)))
   (VP (Verb include)
       (NP (Det a) (Nominal (Noun meal)))))
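The parse tree of "Does this flight include a meal?" can be encoded as nested tuples (a hypothetical encoding of our own: label first, children after); reading off the leaves recovers the sentence:

```python
# Sketch: the sample parse tree as nested (label, children...) tuples.
tree = ("S",
        ("Aux", "does"),
        ("NP", ("Det", "this"), ("Nominal", ("Noun", "flight"))),
        ("VP", ("Verb", "include"),
               ("NP", ("Det", "a"), ("Nominal", ("Noun", "meal")))))

def leaves(t):
    """Left-to-right leaf words of a tree; a bare string is a word."""
    if isinstance(t, str):
        return [t]
    label, *children = t
    return [word for child in children for word in leaves(child)]
```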

Bottom-up and Top-down Parsing

- Bottom-up: from the word nodes up to the sentence symbol S
- Top-down: from the sentence symbol S down to the words

[Figure: the parse tree of "does this flight include a meal", as on the previous slide]

Problems with Bottom-up and Top-down Parsing

- Problems with left-recursive rules like NP → NP PP: we don't know how many times the recursion is needed.
- Pure bottom-up or top-down parsing is inefficient, because it generates and explores too many structures which in the end turn out to be invalid (several grammar rules are applicable → 'interim' ambiguity).
- Combine the top-down and bottom-up approaches: start with the sentence; use rules top-down (look-ahead); read input; try to find the shortest path from the input to the highest unparsed constituent (from left to right). → Chart Parsing / Earley Parser
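A pure top-down recognizer can be sketched in a few lines (toy grammar and function names are our own, not the slides'). It illustrates the left-recursion problem directly: adding a rule like Nominal → Nominal PP would make the recursion below call itself forever without consuming input.

```python
# Sketch: a pure top-down (recursive-descent) recognizer over a tiny,
# non-left-recursive toy grammar of our own.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
}
LEXICON = {"Det": {"the", "a"}, "Noun": {"flight", "meal"},
           "Verb": {"includes"}}

def ends(sym, words, i):
    """All positions where a derivation of `sym` starting at i can end."""
    if sym in LEXICON:                       # POS: match one input word
        return {i + 1} if i < len(words) and words[i] in LEXICON[sym] else set()
    result = set()
    for rhs in GRAMMAR[sym]:                 # try every expansion of sym
        frontier = {i}
        for part in rhs:                     # thread positions through the RHS
            frontier = {e for j in frontier for e in ends(part, words, j)}
        result |= frontier
    return result

def recognize(words):
    """Words form a sentence iff some derivation of S consumes all of them."""
    return len(words) in ends("S", words, 0)
```

Because every alternative is tried at every position, this explores many dead ends, which is the inefficiency the slide points out and which chart parsing avoids by caching sub-results.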

Problems in Parsing - Ambiguity

"One morning, I shot an elephant in my pajamas. How he got into my pajamas, I don't know." (Groucho Marx)

- syntactic/structural ambiguity: several parse trees are possible, e.g. for the sentence above
- semantic/lexical ambiguity: several word meanings, e.g. bank (where you get money) vs. (river) bank
- even different word categories are possible (interim ambiguity), e.g. "He books the flight." vs. "The books are here." or "Fruit flies from the balcony" vs. "Fruit flies are on the balcony."

Problems in Parsing - Attachment

In particular PP (prepositional phrase) attachment; often referred to as the 'binding problem'.

"One morning, I shot an elephant in my pajamas."

Reading 1 - the PP attaches to the VP (rule VP → Verb NP PP):
(S (NP (PNoun I))
   (VP (Verb shot)
       (NP (Det an) (Nominal (Noun elephant)))
       (PP in my pajamas)))

Reading 2 - the PP attaches to the NP (rules VP → Verb NP, NP → Det Nominal, Nominal → Nominal PP, Nominal → Noun):
(S (NP (PNoun I))
   (VP (Verb shot)
       (NP (Det an)
           (Nominal (Nominal (Noun elephant))
                    (PP in my pajamas)))))

Chart Parsing / Earley Algorithm

The Earley parser is based on chart parsing. Essence: integrate top-down and bottom-up parsing; keep recognized sub-structures (sub-trees) for shared use during parsing.

- Top-down: start with the S symbol. Generate all applicable rules for S. Go further down with the left-most constituent in the rules and add rules for these constituents, until you encounter a left-most node on the RHS which is a word category (POS).
- Bottom-up: read the input word and compare. If the word matches, mark it as recognized and move parsing on to the next category in the rule(s).

Chart

- Sequence of n input words; n+1 nodes, marked 0 to n.
- Arcs indicate the recognized part of the RHS of a rule.
- The • (dot) indicates the recognized constituents in a rule.

(See Jurafsky & Martin, Figure 10.15, p. 380.)

Chart Parsing / Earley Parser 1

Sequence of input words; n+1 nodes, marked 0 to n. States in the chart represent possible rules and recognized constituents, shown as arcs.

Interim state S → • VP, [0,0]:
- top-down look at rule S → VP
- nothing of the RHS of the rule is recognized yet (the • is far left)
- the arc is at the beginning and covers no input word (the arc begins at node 0 and ends at node 0)

Chart Parsing / Earley Parser 2

Interim state NP → Det • Nominal, [1,2]:
- top-down look with rule NP → Det Nominal
- Det is recognized (the • is after Det)
- the arc covers one input word, between node 1 and node 2
- look next for Nominal

Interim state NP → Det Nominal •, [1,3]:
- Nominal was recognized; move the • after Nominal
- move the end of the arc to cover Nominal (change 2 to 3)
- the structure is completely recognized; the arc is inactive; mark NP as recognized in other rules (move their •)
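A dotted state like NP → Det • Nominal, [1,2] can be sketched as a small Python structure (the class and method names are ours, purely illustrative):

```python
# Sketch: an Earley state "LHS -> rhs[:dot] . rhs[dot:], [start, end]".
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    lhs: str     # left-hand side, e.g. "NP"
    rhs: tuple   # right-hand side, e.g. ("Det", "Nominal")
    dot: int     # number of RHS symbols recognized so far
    start: int   # chart node where the arc begins
    end: int     # chart node where the arc ends

    def complete(self):
        """The • is far right: the whole RHS is recognized."""
        return self.dot == len(self.rhs)

    def next_symbol(self):
        """The constituent right of the •, or None if complete."""
        return None if self.complete() else self.rhs[self.dot]

# NP -> Det . Nominal, [1,2]: Det recognized, looking next for Nominal.
s = State("NP", ("Det", "Nominal"), 1, 1, 2)
```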

Chart - 0 (input: Book this flight)
  S → • VP
  VP → • V NP

Chart - 1 (Book recognized as V)
  S → • VP
  VP → • V NP
  VP → V • NP
  NP → • Det Nom

Chart - 2 (this recognized as Det)
  S → • VP
  VP → V • NP
  NP → Det • Nom
  Nom → • Noun

Chart - 3a (flight recognized as Noun)
  S → • VP
  VP → V • NP
  NP → Det • Nom
  Nom → Noun •

Chart - 3b
  S → • VP
  VP → V • NP
  NP → Det Nom •
  Nom → Noun •

Chart - 3c
  S → • VP
  VP → V NP •
  NP → Det Nom •
  Nom → Noun •

Chart - 3d
  S → VP •
  VP → V NP •
  NP → Det Nom •
  Nom → Noun •

Chart - All States
  S → VP •
  VP → V NP •
  NP → Det Nom •
  Nom → • Noun
  NP → • Det Nom
  Nom → Noun •

Chart - Final States
  S → VP •
  VP → V NP •
  NP → Det Nom •
  Nom → Noun •

Chart 0 with two S-Rules (additional rule S → VP NP)
  S → • VP
  S → • VP NP
  VP → • V NP

Chart - 3 with two S-Rules
  S → • VP
  S → • VP NP
  VP → V NP •
  NP → Det Nom •
  Nom → Noun •

Final Chart - with two S-Rules
  S → VP •
  S → VP • NP
  VP → V NP •
  NP → Det Nom •
  Nom → Noun •

Chart 0 with two S- and two VP-Rules (additional rules S → VP NP and VP → V)
  S → • VP
  S → • VP NP
  VP → • V NP
  VP → • V

Chart 1a with two S- and two VP-Rules (Book recognized as V)
  S → • VP
  S → • VP NP
  VP → V •
  VP → V • NP
  NP → • Det Nom

Chart 1b with two S- and two VP-Rules
  S → VP •
  S → VP • NP
  VP → V •
  VP → V • NP
  NP → • Det Nom

Chart 2 with two S- and two VP-Rules (this recognized as Det)
  S → VP •
  S → VP • NP
  VP → V •
  VP → V • NP
  NP → Det • Nom
  Nom → • Noun

Chart 3 with two S- and two VP-Rules (flight recognized as Noun)
  S → VP •
  S → VP NP •
  VP → V •
  VP → V NP •
  NP → Det Nom •
  Nom → Noun •

Final Chart - with two S- and two VP-Rules
  S → VP •
  S → VP NP •
  VP → V •
  VP → V NP •
  NP → Det Nom •
  Nom → Noun •

Earley Algorithm - Functions

- Predictor: generates new rules for a partly recognized RHS with a constituent right of the • (top-down rule generation).
- Scanner: if a word category (POS) is found right of the •, the scanner reads the next input word and adds a rule for it to the chart (bottom-up mode).
- Completer: if a rule is completely recognized (the • is far right), the recognition state of earlier rules in the chart advances: the • is moved over the recognized constituent (bottom-up recognition).
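The three functions above can be sketched as a compact Earley recognizer (a minimal sketch, not the slides' implementation: the state encoding, the dummy GAMMA start state, and the trimmed grammar for "book this flight" are our own choices):

```python
# Sketch: Earley recognizer with predictor, scanner, and completer branches.
# Grammar/lexicon are trimmed to what "book this flight" needs.
GRAMMAR = {
    "S": [("VP",)],
    "VP": [("Verb", "NP")],
    "NP": [("Det", "Nominal")],
    "Nominal": [("Noun",)],
}
LEXICON = {"Verb": {"book"}, "Det": {"this"}, "Noun": {"flight"}}

def earley(words):
    # A state is (lhs, rhs, dot, start); chart[k] holds states ending at node k.
    chart = [[] for _ in range(len(words) + 1)]
    def add(k, state):
        if state not in chart[k]:
            chart[k].append(state)
    add(0, ("GAMMA", ("S",), 0, 0))          # dummy start state GAMMA -> . S
    for k in range(len(words) + 1):
        for state in chart[k]:               # chart[k] may grow as we iterate
            lhs, rhs, dot, start = state
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                # predictor: expand the constituent right of the dot, top-down
                for prod in GRAMMAR[rhs[dot]]:
                    add(k, (rhs[dot], prod, 0, k))
            elif dot < len(rhs):
                # scanner: a POS is right of the dot; try to read the next word
                if k < len(words) and words[k] in LEXICON.get(rhs[dot], ()):
                    add(k + 1, (lhs, rhs, dot + 1, start))
            else:
                # completer: rule done; advance the dot in waiting states
                for l2, r2, d2, s2 in chart[start]:
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(k, (l2, r2, d2 + 1, s2))
    return ("GAMMA", ("S",), 1, 0) in chart[len(words)]
```

The input is accepted iff the dummy start rule is complete over the whole input, i.e. GAMMA → S •, [0,n] appears in the final chart node.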

Additional References

- Jurafsky, D. & J. H. Martin: Speech and Language Processing. Prentice-Hall, 2000. (Chapters 9 and 10)
- Earley Algorithm: Jurafsky & Martin, Figure 10.16, p. 384
- Earley Algorithm - Examples: Jurafsky & Martin, Figures 10.17 and 10.18