Natural Language Processing Vasile Rus

Outline
– Announcements
– Syntax
– Context Free Grammars (CFG)
– English Phrases
– Issues in CFG

Announcements
– MIDTERM after the break!

Syntax
Syntax = rules describing how words can connect to each other
– *that and after year last [incorrect]
– I saw you yesterday
– colorless green ideas sleep furiously
The kind of implicit knowledge of your native language that you had mastered by the time you were 3 or 4 years old, without explicit instruction
Not necessarily the type of rules you were later taught in school

Syntax
Why should you care?
– Grammar checkers
– Question answering
– Information extraction
– Machine translation

Context-Free Grammars
Capture constituency and ordering
– Ordering is easy: what are the rules that govern the ordering of words and bigger units in the language?
– What’s constituency? How do words group into units, and what can we say about how the various kinds of units behave?

CFG Examples
S -> NP VP
NP -> Det NOMINAL
NOMINAL -> Noun
VP -> Verb
Det -> a
Noun -> flight
Verb -> left
These rules are defined independently of the context where they might occur, hence the name context-free grammar (CFG)
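As a rough illustration, the toy grammar above can be written down as data and expanded mechanically. The dictionary layout and the helper name `expand` are assumptions of this sketch, not something the slides prescribe.

```python
from itertools import product

# The seven rules from the slide. Storing them as a dict from a non-terminal
# to a list of right-hand sides is an assumption made for this sketch.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "NOMINAL"]],
    "NOMINAL": [["Noun"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"]],
}

def expand(symbol):
    """Return every word sequence derivable from `symbol` (hypothetical helper)."""
    if symbol not in GRAMMAR:  # a terminal: nothing left to rewrite
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each right-hand-side symbol, then combine the pieces in order.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

sentences = [" ".join(words) for words in expand("S")]
print(sentences)
```

This grammar has no recursion, so the expansion terminates and the language it generates is finite. A grammar with a rule like NP -> NP PP would need a depth limit.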

CFGs
S -> NP VP
– This says that there are units called S, NP, and VP in this language
– That an S consists of an NP followed immediately by a VP
– Doesn’t say that that’s the only kind of S
– Nor does it say that this is the only place that NPs and VPs occur
Recognition vs Generativity
– Generate strings in the language
– Reject strings not in the language
– Impose structures (trees) on strings in the language

Derivations
A derivation is a sequence of rules applied to a string that accounts for that string
– Covers all the elements in the string
– Covers only the elements in the string
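A derivation can be traced step by step. The sketch below reuses the toy grammar from the CFG Examples slide and always rewrites the leftmost non-terminal. The function name and the one-rule-per-non-terminal table are assumptions that only hold for this tiny grammar.

```python
# One rule per non-terminal is enough for the toy grammar; a real grammar
# would need a way to choose among alternative rules.
RULES = {
    "S": ["NP", "VP"],
    "NP": ["Det", "NOMINAL"],
    "NOMINAL": ["Noun"],
    "VP": ["Verb"],
    "Det": ["a"],
    "Noun": ["flight"],
    "Verb": ["left"],
}

def leftmost_derivation(start):
    """Rewrite the leftmost non-terminal until only terminals remain."""
    form = [start]
    steps = [" ".join(form)]
    while any(sym in RULES for sym in form):
        # Position of the leftmost symbol that still has a rule.
        i = next(k for k, sym in enumerate(form) if sym in RULES)
        form = form[:i] + RULES[form[i]] + form[i + 1:]
        steps.append(" ".join(form))
    return steps

steps = leftmost_derivation("S")
for step in steps:
    print(step)
```

The last line printed is the string itself, and every word in it is introduced by exactly one rule application, which is the "covers all and only the elements" condition.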

Derivations as Trees
[Figure: parse tree for “I prefer a morning flight”]
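Since the tree figure did not survive in this transcript, here is one plausible reconstruction as a nested structure. The node labels (Pro, Nom, and so on) follow a common textbook analysis and are an assumption, not a copy of the original figure.

```python
# A tree is (label, child, child, ...); a leaf is a plain string.
# The labels below are assumed, since the slide's figure is not available.
TREE = (
    "S",
    ("NP", ("Pro", "I")),
    ("VP",
     ("Verb", "prefer"),
     ("NP",
      ("Det", "a"),
      ("Nom",
       ("Nom", ("Noun", "morning")),
       ("Noun", "flight")))),
)

def leaves(tree):
    """Read the words off the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    _label, *children = tree
    return [word for child in children for word in leaves(child)]

def bracketed(tree):
    """Render the tree in labelled-bracket notation."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(bracketed(c) for c in children) + "]"

print(" ".join(leaves(TREE)))
print(bracketed(TREE))
```

Each internal node corresponds to one rule application in the derivation, so the tree records which rules were used without fixing the order in which they were applied.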

Parsing
Parsing is the process of taking a string and a grammar and returning one (or many?) parse tree(s) for that string
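To make the "(many?)" concrete, the following is a minimal exhaustive parser, not an efficient algorithm. The grammar fragment, the lexicon, and the names `parse_all`, `spans`, and `split` are all assumptions of this sketch. It requires every right-hand-side symbol to cover at least one word, which keeps left-recursive rules such as NP -> NP PP from looping.

```python
from functools import lru_cache

# Assumed grammar fragment and lexicon, chosen to show PP-attachment ambiguity.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pro",), ("PropN",), ("Det", "Noun"), ("NP", "PP")],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {"I": "Pro", "booked": "Verb", "a": "Det",
           "flight": "Noun", "to": "Prep", "Boston": "PropN"}

def parse_all(words):
    """Return every parse tree rooted in S that covers exactly `words`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def spans(symbol, i, j):
        """All trees labelled `symbol` covering words[i:j]."""
        trees = []
        if j - i == 1 and LEXICON.get(words[i]) == symbol:
            trees.append((symbol, words[i]))
        for rhs in GRAMMAR.get(symbol, []):
            trees.extend((symbol, *kids) for kids in split(rhs, i, j))
        return tuple(trees)

    def split(rhs, i, j):
        """All ways to share words[i:j] among the symbols of rhs (>= 1 word each)."""
        if len(rhs) == 1:
            return [(tree,) for tree in spans(rhs[0], i, j)]
        out = []
        for k in range(i + 1, j - len(rhs) + 2):
            for left in spans(rhs[0], i, k):
                for rest in split(rhs[1:], k, j):
                    out.append((left, *rest))
        return out

    return list(spans("S", 0, len(words)))

trees = parse_all("I booked a flight to Boston".split())
print(len(trees))
for tree in trees:
    print(tree)
```

The example sentence gets two trees because "to Boston" can attach to the noun phrase (NP -> NP PP) or to the verb phrase (VP -> VP PP). A string the grammar rejects simply yields an empty list.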

Other Options
Regular languages (expressions)
– Too weak (cannot deal with arbitrary recursion)
Context-sensitive or Turing-equivalent grammars
– Too powerful / computationally intractable

Context?
The notion of context in CFGs is not the same as the ordinary meaning of the word context in language
All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself
– A -> B C
– Means that I can rewrite an A as a B followed by a C regardless of the context in which A is found
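Stated a little more formally, as a sketch: the symbols T, N and P are the terminal set, non-terminal set and rule set from the formal definition later in the deck, while the Greek letters are notation introduced here for arbitrary surrounding strings.

```latex
% Context-free: a rule A -> B C applies whatever surrounds A.
\[
  A \to B\,C \in P
  \quad\Longrightarrow\quad
  \alpha\,A\,\beta \;\Rightarrow\; \alpha\,B\,C\,\beta
  \qquad \text{for all } \alpha, \beta \in (T \cup N)^{*}
\]
% Contrast: a context-sensitive rule licenses the rewrite only in a given context.
\[
  \alpha\,A\,\beta \to \alpha\,\gamma\,\beta, \qquad \gamma \neq \varepsilon
\]
```

In the first line the surrounding strings are universally quantified, so they play no role in whether the rule applies. In the second they are part of the rule itself.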

Key Constituents (English)
– Sentences
– Noun phrases
– Verb phrases
– Prepositional phrases

Sentence-Types
Declaratives: A plane left
– S -> NP VP
Imperatives: Leave!
– S -> VP
Yes-No Questions: Did the plane leave?
– S -> Aux NP VP
WH Questions: When did the plane leave?
– S -> WH Aux NP VP

Recursion
We’ll have to deal with rules such as the following, where the non-terminal on the left also appears somewhere on the right (directly).
– NP -> NP PP   [[The flight] [to Boston]]
– VP -> VP PP   [[departed Miami] [at noon]]

Recursion
Of course, this is what makes syntax interesting
– flights from Denver
– flights from Denver to Miami
– flights from Denver to Miami in February
– flights from Denver to Miami in February on a Friday
– flights from Denver to Miami in February on a Friday under $300
– flights from Denver to Miami in February on a Friday under $300 with lunch
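The growing list of noun phrases can be produced by applying the single recursive rule NP -> NP PP over and over. The helper name `nest` and the bracket rendering are assumptions of this sketch, and each prepositional phrase is treated as an unanalysed chunk.

```python
def nest(head, modifiers):
    """Apply NP -> NP PP once per modifier, bracketing each application."""
    np = f"[NP {head}]"
    for pp in modifiers:
        # The NP built so far becomes the left child of a larger NP.
        np = f"[NP {np} [PP {pp}]]"
    return np

pps = ["from Denver", "to Miami", "in February"]
for n in range(len(pps) + 1):
    print(nest("flights", pps[:n]))
```

One rule is enough for any number of modifiers, which is why a finite grammar can describe an unbounded set of phrases.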

The Point
If you have a rule like
– VP -> V NP
it only cares that the thing after the verb is an NP. It doesn’t have to know about the internal affairs of that NP

The Point
VP -> V NP
I hate
– flights from Denver
– flights from Denver to Miami
– flights from Denver to Miami in February
– flights from Denver to Miami in February on a Friday
– flights from Denver to Miami in February on a Friday under $300
– flights from Denver to Miami in February on a Friday under $300 with lunch

Conjunctive Constructions
S -> S and S
– John went to NY and Mary followed him
NP -> NP and NP
VP -> VP and VP
…
In fact the right rule for English is X -> X and X

Potential Problems in CFG
– Agreement
– Subcategorization
– Movement

Agreement
– This dog
– Those dogs
– This dog eats
– Those dogs eat
– *This dogs
– *Those dog
– *This dog eat
– *Those dogs eats
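One common way to see why agreement is a problem for plain CFGs is to attach a number feature to each word and require the features to match. This is roughly what splitting NP and VP into singular and plural copies would enforce. The tables and the function `agrees` below are assumptions of this sketch and cover only the eight examples on the slide.

```python
# Number feature for each word in the slide's examples (assumed encoding).
DET = {"this": "sg", "those": "pl"}
NOUN = {"dog": "sg", "dogs": "pl"}
VERB = {"eats": "sg", "eat": "pl"}

def agrees(phrase):
    """True if determiner, noun and (optional) verb share the same number."""
    words = phrase.lower().split()
    if len(words) == 2:
        det, noun = words
        return det in DET and noun in NOUN and DET[det] == NOUN[noun]
    if len(words) == 3:
        det, noun, verb = words
        return (det in DET and noun in NOUN and verb in VERB
                and DET[det] == NOUN[noun] == VERB[verb])
    return False

for phrase in ["This dog", "Those dogs eat", "This dogs", "Those dogs eats"]:
    print(phrase, agrees(phrase))
```

Doing the same inside the grammar means duplicating rules (for instance a singular and a plural version of NP -> Det Noun), and the duplication multiplies with every additional feature.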

Subcategorization
– Sneeze: John sneezed
– Find: Please find [a flight to NY]NP
– Give: Give [me]NP [a cheaper fare]NP
– Help: Can you help [me]NP [with a flight]PP
– Prefer: I prefer [to leave earlier]TO-VP
– Told: I was told [United has a flight]S
– …
*John sneezed the book
*I prefer United has a flight
*Give with a flight
Subcat expresses the constraints that a predicate (verb for now) places on the number and type of the arguments it wants to take

So?
So the various rules for VPs overgenerate
– They permit the presence of strings containing verbs and arguments that don’t go together
– For example: given VP -> V NP, “sneezed the book” is a VP, since “sneeze” is a verb and “the book” is a valid NP
Subcategorization frames can fix this problem (“slow down” overgeneration)
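A subcategorization frame can be pictured as a lookup table from a verb to the argument sequences it accepts. The table below copies the frames listed on the Subcategorization slide. The dictionary name, the function `licensed`, and the choice of one frame per verb are assumptions of this sketch.

```python
# Verb -> list of allowed argument-type sequences (one frame each, as on the slide).
SUBCAT = {
    "sneeze": [()],
    "find": [("NP",)],
    "give": [("NP", "NP")],
    "help": [("NP", "PP")],
    "prefer": [("TO-VP",)],
    "told": [("S",)],
}

def licensed(verb, args):
    """True if the verb accepts exactly this sequence of argument types."""
    return tuple(args) in SUBCAT.get(verb, [])

print(licensed("sneeze", []))          # John sneezed
print(licensed("sneeze", ["NP"]))      # *John sneezed the book
print(licensed("prefer", ["S"]))       # *I prefer United has a flight
print(licensed("give", ["PP"]))        # *Give with a flight
```

A parser could consult such a table before accepting VP -> V NP for a particular verb, which rules out "sneezed the book" without adding a separate VP rule per verb.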

Movement
Core example
– [[My travel agent]NP [booked [the flight]NP]VP]S
i.e. “book” is a straightforward transitive verb. It expects a single NP within the VP as an argument, and a single NP arg as the subject.

Movement
What about?
– Which flight do you want me to have the travel agent book?
The direct object argument to “book” isn’t appearing in the right place. It is in fact a long way from where it’s supposed to appear
And note that it is separated from its verb by 2 other verbs

Formally…
To put all previous discussions/examples in a formal definition for CFG, a context-free grammar has four parameters:
1. A set of non-terminal symbols N
2. A set of terminal symbols T
3. A set of production rules P, each of the form A -> α, where A is a non-terminal and α is a string of symbols from the infinite set of strings (T ∪ N)*
4. A designated start symbol S
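The four-part definition maps naturally onto a small record type. The class and field names below are assumptions of this sketch, and the check that N and T are disjoint is a standard requirement that the slide does not state explicitly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset   # N
    terminals: frozenset      # T
    productions: tuple        # P: pairs (A, alpha), alpha a tuple of symbols
    start: str                # S

    def validate(self):
        """Check the conditions of the definition; raise ValueError otherwise."""
        if self.nonterminals & self.terminals:
            raise ValueError("N and T must be disjoint (assumed requirement)")
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a non-terminal")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not a non-terminal")
            if any(sym not in symbols for sym in rhs):
                raise ValueError(f"unknown symbol in rule for {lhs!r}")
        return True

toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "NOMINAL", "Det", "Noun", "Verb"}),
    terminals=frozenset({"a", "flight", "left"}),
    productions=(
        ("S", ("NP", "VP")), ("NP", ("Det", "NOMINAL")), ("NOMINAL", ("Noun",)),
        ("VP", ("Verb",)), ("Det", ("a",)), ("Noun", ("flight",)), ("Verb", ("left",)),
    ),
    start="S",
)
print(toy.validate())
```

The left-hand side of every production is a single non-terminal with nothing around it, which is the "context-free" restriction expressed as a data constraint.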

Grammar equivalence and normal form
Strong and weak equivalence
– two grammars are strongly equivalent if:
  they generate the same set of strings
  they assign the same phrase structure to each sentence
– two grammars are weakly equivalent if:
  they generate the same set of strings
  they do not assign the same phrase structure to each sentence
Normal form
– Restrict the form of productions
– Chomsky Normal Form (CNF)
– Right-hand side of each production has either two non-terminals or one terminal
– e.g. A -> B C, A -> a
– Any context-free grammar (that does not generate the empty string) can be translated into a weakly equivalent CNF grammar
– e.g. A -> B C D becomes A -> B X, X -> C D
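The last bullet, splitting a long right-hand side, can be sketched as follows. This covers only that one step of a CNF conversion: unit productions and rules mixing terminals with non-terminals are not handled. The function name `binarize` and the fresh names X1, X2, … are assumptions, and the fresh names are assumed not to clash with existing symbols.

```python
def binarize(rules):
    """Replace each rule with more than two right-hand-side symbols by binary rules."""
    out = []
    fresh = 0
    for lhs, rhs in rules:
        rhs = list(rhs)
        while len(rhs) > 2:
            fresh += 1
            new = f"X{fresh}"            # hypothetical fresh non-terminal
            out.append((lhs, (rhs[0], new)))
            lhs, rhs = new, rhs[1:]      # continue with the remainder
        out.append((lhs, tuple(rhs)))
    return out

for lhs, rhs in binarize([("A", ("B", "C", "D"))]):
    print(lhs, "->", " ".join(rhs))
```

The new grammar generates the same strings but builds a deeper tree (an extra X node), so it is weakly rather than strongly equivalent to the original.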

Summary
– Syntax
– Context Free Grammars (CFG)
– English Phrases
– Issues in CFG

Next Time
– Context Free Grammars (CFG) and Parsing Algorithms