Natural Language Processing Slides adapted from Pedro Domingos

What's the problem?
Input: natural language sentences.
Output: a parse tree and a semantic interpretation (logical representation).

Example Applications
Enables great user interfaces!
- Spelling and grammar checkers.
- Web question answering: http://www.askjeeves.com/
- Document understanding on the WWW.
- Spoken language control systems: banking, shopping.
- Classification systems for messages, articles.
- Machine translation tools.

NLP Problem Areas
- Morphology: the structure of words.
- Syntactic interpretation (parsing): create a parse tree of a sentence.
- Semantic interpretation: translate a sentence into a representation language.
- Pragmatic interpretation: take the current situation into account.
- Disambiguation: there may be several interpretations; choose the most probable one.

Some Difficult Examples
From the newspapers:
- Squad helps dog bite victim.
- Helicopter powered by human flies.
- Levy won't hurt the poor.
- Once-sagging cloth diaper industry saved by full dumps.
Ambiguities:
- Lexical: meanings of 'hot', 'back'.
- Syntactic: I heard the music in my room.
- Referential: The cat ate the mouse. It was ugly.

Parsing
Context-free grammars, e.g. for arithmetic expressions:
EXPR -> NUMBER
EXPR -> VARIABLE
EXPR -> (EXPR + EXPR)
EXPR -> (EXPR * EXPR)
((2 + X) * (17 + Y)) is in the grammar; (2 + (X)) is not. Why do we call them context-free?
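As a quick membership check for this grammar, here is a minimal recursive-descent recognizer (a sketch; the tokenizer and function names are my own, not from the slides):

```python
import re

def tokenize(s):
    # numbers, variable names, and the grammar's four punctuation symbols
    return re.findall(r"\d+|[A-Za-z]+|[()+*]", s)

def parse_expr(tokens, i):
    """Try to derive EXPR starting at position i; return the position after it, or None."""
    if i < len(tokens) and re.fullmatch(r"\d+", tokens[i]):        # EXPR -> NUMBER
        return i + 1
    if i < len(tokens) and re.fullmatch(r"[A-Za-z]+", tokens[i]):  # EXPR -> VARIABLE
        return i + 1
    if i < len(tokens) and tokens[i] == "(":   # EXPR -> (EXPR + EXPR) | (EXPR * EXPR)
        j = parse_expr(tokens, i + 1)
        if j is not None and j < len(tokens) and tokens[j] in "+*":
            k = parse_expr(tokens, j + 1)
            if k is not None and k < len(tokens) and tokens[k] == ")":
                return k + 1
    return None

def in_grammar(s):
    tokens = tokenize(s)
    return parse_expr(tokens, 0) == len(tokens)

print(in_grammar("((2 + X) * (17 + Y))"))  # True
print(in_grammar("(2 + (X))"))             # False
```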

Using CFGs for Parsing
Can natural language syntax be captured using a context-free grammar? Yes, no, sort of, for the most part, maybe.
- Words: nouns, adjectives, verbs, adverbs.
- Determiners: the, a, this, that.
- Quantifiers: all, some, none.
- Prepositions: in, onto, by, through.
- Connectives: and, or, but, while.
Words combine into phrases: NP, VP.

An Example Grammar
S -> NP VP
VP -> V NP
NP -> NAME
NP -> ART N
ART -> a | the
V -> ate | saw
N -> cat | mouse
NAME -> Sue | Tom

Example Parse
"The mouse saw Sue." With the grammar above, the parse is (S (NP (ART the) (N mouse)) (VP (V saw) (NP (NAME Sue)))).

Ambiguity
"Sue bought the cat biscuits."
S -> NP VP
VP -> V NP
VP -> V NP NP
NP -> N
NP -> N N
NP -> Det NP
Det -> the
V -> ate | saw | bought
N -> cat | mouse | biscuits | Sue | Tom
This grammar gives the sentence two parses: Sue bought [the cat] [biscuits], and Sue bought [the cat biscuits] (see the sketch below).
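To see both parses concretely, here is a short sketch using NLTK's chart parser (assuming NLTK is installed; the grammar string is my transcription of the slide's rules):

```python
import nltk

# The slide's ambiguous grammar, transcribed into NLTK's CFG notation.
grammar = nltk.CFG.fromstring("""
S -> NP VP
VP -> V NP | V NP NP
NP -> N | N N | Det NP
Det -> 'the'
V -> 'ate' | 'saw' | 'bought'
N -> 'cat' | 'mouse' | 'biscuits' | 'Sue' | 'Tom'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("Sue bought the cat biscuits".split()):
    print(tree)  # two trees: bought [the cat] [biscuits] vs. bought [the cat biscuits]
```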

Example: Chart Parsing
Three main data structures: a chart, a key list, and a set of edges.
Chart: a table whose columns are starting points (1..4) and whose rows are lengths (1..4); each cell holds the name of a terminal or non-terminal spanning those words.

Key List and Edges
Key list: a pushdown stack of chart entries, e.g. "the", "box", "floats".
Edges: rules that can be applied to chart entries to build up larger entries. [The slide's chart shows "the" at start 1, "box" at start 2, and "floats" at start 3, each of length 1, plus the dotted edge Det -> the o.]

Chart Parsing Algorithm
Loop while there are entries in the key list:
1. Remove an entry from the key list.
2. If the entry is already in the chart, add its edge list and skip to the next entry.
3. Add the entry from the key list to the chart.
4. For all rules that begin with the entry's type, add an edge for that rule.
5. For all edges that need the entry next, add an extended edge (using the procedure below).
6. If an edge is finished, add an entry to the key list with its type, start point, length, and edge list.

To extend an edge e with chart entry c:
- Create a new edge e'.
- Set start(e') to start(e).
- Set end(e') to end(c).
- Set rule(e') to rule(e) with the dot "o" moved past c.
- Set righthandside(e') to righthandside(e) + c.

Try it
S -> NP VP
VP -> V
NP -> Det N
Det -> the
N -> box
V -> floats
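Below is a minimal runnable sketch of the chart-parsing loop above, applied to the "Try it" grammar; the data-structure details (a single agenda holding both entries and edges, spans stored as start/end pairs) are my own simplifications, not the slides':

```python
# Grammar from the "Try it" slide: lhs -> list of right-hand sides.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "VP":  [["V"]],
    "NP":  [["Det", "N"]],
    "Det": [["the"]],
    "N":   [["box"]],
    "V":   [["floats"]],
}

def parse(words):
    chart = set()    # completed constituents: (symbol, start, end)
    edges = set()    # dotted rules: (lhs, rhs, dot, start, end)
    agenda = [("entry", (w, i, i + 1)) for i, w in enumerate(words)]
    while agenda:
        kind, item = agenda.pop()
        if kind == "entry":
            sym, start, end = item
            if item in chart:
                continue
            chart.add(item)
            # predict: start an edge for every rule whose RHS begins with sym
            for lhs, alternatives in GRAMMAR.items():
                for rhs in alternatives:
                    if rhs[0] == sym:
                        agenda.append(("edge", (lhs, tuple(rhs), 1, start, end)))
            # complete: extend edges that were waiting for sym at this position
            for lhs, rhs, dot, es, ee in list(edges):
                if ee == start and dot < len(rhs) and rhs[dot] == sym:
                    agenda.append(("edge", (lhs, rhs, dot + 1, es, end)))
        else:
            lhs, rhs, dot, start, end = item
            if item in edges:
                continue
            edges.add(item)
            if dot == len(rhs):
                # finished edge: its left-hand side becomes a new chart entry
                agenda.append(("entry", (lhs, start, end)))
            else:
                # extend with constituents already in the chart
                for sym, s, e in list(chart):
                    if s == end and sym == rhs[dot]:
                        agenda.append(("edge", (lhs, rhs, dot + 1, start, e)))
    return ("S", 0, len(words)) in chart

print(parse("the box floats".split()))  # True
```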

Semantic Interpretation
Our goal: to translate sentences into a logical form. But sentences convey more than true/false:
- It will rain in Seattle tomorrow.
- Will it rain in Seattle tomorrow?
A sentence can be analyzed by its propositional content and its speech act: tell, ask, request, deny, suggest.

Propositional Content
We develop a logic-like language for representing propositional content, which must deal with word-sense ambiguity and scope ambiguity:
- Proper names --> objects (John, Alon)
- Nouns --> unary predicates (woman, house)
- Verbs --> transitive: binary predicates (find, go); intransitive: unary predicates (laugh, cry)
- Quantifiers: most, some
Example: "John loves Mary" --> Loves(John, Mary)
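A toy sketch of this word-to-predicate mapping; the lexicon format and function below are illustrative assumptions covering only flat Name-Verb(-Name) clauses, not a general semantic interpreter:

```python
# Toy lexicon following the mapping above: names -> constants, transitive
# verbs -> binary predicates, intransitive verbs -> unary predicates.
LEXICON = {
    "John":   ("name", "John"),
    "Mary":   ("name", "Mary"),
    "loves":  ("verb2", "Loves"),
    "laughs": ("verb1", "Laughs"),
}

def logical_form(words):
    """Translate a flat Name-Verb(-Name) clause into a predicate string."""
    entries = [LEXICON[w] for w in words]
    if len(entries) == 3 and entries[1][0] == "verb2":
        return f"{entries[1][1]}({entries[0][1]}, {entries[2][1]})"
    if len(entries) == 2 and entries[1][0] == "verb1":
        return f"{entries[1][1]}({entries[0][1]})"
    raise ValueError("clause shape not handled in this sketch")

print(logical_form("John loves Mary".split()))  # Loves(John, Mary)
print(logical_form("Mary laughs".split()))      # Laughs(Mary)
```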

Statistical NLP (see the book by Charniak, "Statistical Language Learning", MIT Press, 1993)
Consider the problem of part-of-speech tagging: "The box floats"
"The" -> Det; "box" -> N; "floats" -> V
Given a sentence w(1,n), where w(i) is the i-th word, we want to find the tags t(i) assigned to each word w(i).

The Equations
Find the t(1,n) that maximizes P[t(1,n)|w(1,n)].
By Bayes' rule, P[t(1,n)|w(1,n)] = P[w(1,n)|t(1,n)] * P[t(1,n)] / P[w(1,n)]; since P[w(1,n)] is the same for every tagging, we only need to maximize P[w(1,n)|t(1,n)] * P[t(1,n)].
Assume that:
- a word depends only on its own tag, and
- a tag depends only on the previous tag.
That is, P[w(j)|w(1,j-1),t(1,j)] = P[w(j)|t(j)] and P[t(j)|w(1,j-1),t(1,j-1)] = P[t(j)|t(j-1)].
Thus we want to maximize the product P[w(1)|t(1)] * P[t(1)|t(0)] * P[w(2)|t(2)] * P[t(2)|t(1)] * ... * P[w(n)|t(n)] * P[t(n)|t(n-1)].
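This product is exactly what the Viterbi algorithm maximizes. Here is a minimal sketch of a bigram HMM tagger; the transition and emission probabilities are made-up illustrations for "the box floats", not corpus estimates:

```python
import math

TAGS = ["Det", "N", "V"]
TRANS = {  # P(t(j) | t(j-1)), with <s> as the start tag t(0); made-up numbers
    ("<s>", "Det"): 0.8, ("<s>", "N"): 0.1, ("<s>", "V"): 0.1,
    ("Det", "N"): 0.9, ("Det", "V"): 0.1,
    ("N", "N"): 0.2, ("N", "V"): 0.8,
    ("V", "N"): 0.3, ("V", "V"): 0.1,
}
EMIT = {  # P(w(j) | t(j)); made-up numbers
    ("Det", "the"): 0.7,
    ("N", "box"): 0.01, ("V", "box"): 0.001,
    ("N", "floats"): 0.0005, ("V", "floats"): 0.005,
}

def viterbi(words):
    # best[t] = (log-probability of the best tag sequence ending in t, the sequence)
    best = {"<s>": (0.0, [])}
    for w in words:
        new = {}
        for t in TAGS:
            emit = math.log(EMIT.get((t, w), 1e-8))  # tiny floor for unseen pairs
            new[t] = max(
                (lp + math.log(TRANS.get((prev, t), 1e-8)) + emit, seq + [t])
                for prev, (lp, seq) in best.items()
            )
        best = new
    return max(best.values())[1]

print(viterbi("the box floats".split()))  # ['Det', 'N', 'V']
```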

Example
"The box floats": given a corpus (a training set).
Assignment one: t(1) = Det, t(2) = V, t(3) = V. P(V|Det) is rather low, and so is P(V|V), so this assignment is less likely than:
Assignment two: t(1) = Det, t(2) = N, t(3) = V. P(N|Det) is high, and P(V|N) is high, so this assignment is more likely!
In general, we can use Hidden Markov Models (HMMs) to find these probabilities. [The slide shows an HMM trellis over "the box floats" with states Det, N, V.]

Experiments
Charniak and colleagues ran experiments on a collection of documents called the "Brown Corpus", where tags are assigned by hand. 90% of the corpus was used for training and the other 10% for testing. They showed they can get 95% accuracy with HMMs.
A really simple algorithm, assigning to each word w the tag t with the highest probability P(t|w), already achieves 91% accuracy!
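A sketch of that simple baseline (the most-frequent-tag tagger); the tiny training set below is illustrative, not the Brown Corpus:

```python
from collections import Counter, defaultdict

def train_baseline(tagged_sentences):
    """Most-frequent-tag baseline: remember each word's most common training tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

# Tiny illustrative training set (not the Brown Corpus).
train = [[("the", "Det"), ("box", "N"), ("floats", "V")],
         [("the", "Det"), ("box", "N"), ("sinks", "V")]]
model = train_baseline(train)
print([model[w] for w in "the box floats".split()])  # ['Det', 'N', 'V']
```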

Natural Language Summary
- Parsing: context-free grammars with features.
- Semantic interpretation: translate sentences into a logic-like language; use additional domain knowledge for word-sense disambiguation; use context to disambiguate references.