CS 4100 Artificial Intelligence Prof. C. Hafner Class Notes April 3 and 5, 2012.


Why Natural Language Processing?
Huge amounts of data:
– Internet = at least 20 billion pages
– Intranet
Applications for processing large amounts of texts require NLP expertise:
– Classify text into categories
– Index and search large texts
– Automatic translation
– Speech understanding (e.g. understand phone conversations)
– Information extraction (e.g. extract useful information from resumes)
– Automatic summarization (condense 1 book into 1 page)
– Question answering
– Knowledge acquisition
– Text generation / dialogues

Natural? Natural Language?
– Refers to the language spoken by people, e.g. English, Japanese, Swahili, as opposed to artificial languages like C++, Java, etc.
Natural Language Processing
– Applications that deal with natural language in a useful way (beyond token/string matching)
Computational Linguistics
– Doing linguistics on computers
– More on the linguistic side than NLP, but closely related

Why Natural Language Processing? kJfmmfj mmmvvv nnnffn333 Uj iheale eleee mnster vensi credur Baboi oi cestnitze Coovoel2^ ekk; ldsllk lkdf vnnjfj? Fgmflmllk mlfm kfre xnnn!

Computers Lack Knowledge!
Computers “see” text in English the same way you saw the previous text!
People naturally have:
– “Common sense”
– Reasoning capacity
– Years of life experience
Computers naturally have:
– No common sense
– No reasoning capacity
– No life experience

Where does it fit in the CS taxonomy?
Computers
– Algorithms
– Databases
– Networking
– Artificial Intelligence
  – Robotics
  – Search
  – Natural Language Processing
    – Information Retrieval
    – Machine Translation
    – Language Analysis
      – Parsing
      – Semantics

Linguistics Levels of Analysis
Speech and text (and sign language)
Levels:
– Phonology: sounds / letters / pronunciation
– Morphology: the structure of words
– Syntax: how word sequences are structured
– Semantics: meaning of the strings
– Pragmatics: what we use language to accomplish
Interaction between levels

Issues in Syntax
Shallow parsing: “the dog chased the bear” → “the dog” / “chased the bear” (subject – predicate)
Identify basic structures: NP [the dog], VP [chased the bear]
Deeper analysis: “the dog ate my homework”
– Who did what? (literal meaning) – semantics
– The meaning in context – pragmatics
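
The shallow-parsing idea above can be sketched in a few lines: given POS-tagged input (the tags are supplied by hand here; a real system would run a tagger first), collect runs of determiner/noun tags as NP chunks. The tiny tag set and the single chunk rule are simplifying assumptions for illustration, not the course's chunker.

```python
# A sketch of shallow (chunk) parsing: group POS-tagged words into NP chunks.
# Hand-tagged input for "the dog chased the bear" (Penn-style tags).
TAGGED = [("the", "DT"), ("dog", "NN"), ("chased", "VBD"),
          ("the", "DT"), ("bear", "NN")]

def np_chunks(tagged):
    """Collect maximal runs of DT/NN tokens as noun-phrase chunks."""
    chunks, current = [], []
    for word, tag in tagged:
        if tag in ("DT", "NN"):
            current.append(word)
        else:
            if current:
                chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks

print(np_chunks(TAGGED))  # [['the', 'dog'], ['the', 'bear']]
```

A real chunker would use a richer pattern (e.g. optional determiner, adjectives, head noun), but the basic idea of recognizing flat, non-recursive phrases is the same.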

Issues in Syntax
Full parsing: John loves Mary
Full parses help in figuring out (automatically) questions like: Who did what, and when?

More Issues
Anaphora resolution (discourse): “The dog entered my room. It scared me”
Preposition attachment (syntax & semantics): “I saw the man in the park with a telescope”

Issues in Semantics
Understand language! How?
“plant” = industrial plant
“plant” = living organism
Words are ambiguous
Importance of semantics?
– Machine Translation: wrong translations
– Information Retrieval: wrong information
– Anaphora Resolution: wrong referents

Why Semantics?
The sea is home to millions of plants and animals
English → French [commercial MT system]: Le mer est a la maison de billion des usines et des animaux
French → English: The sea is at the home for billions of factories and animals

Issues in Semantics
How to learn the meaning of words? From dictionaries (word senses):
– plant, works, industrial plant – (buildings for carrying on industrial labor; "they built a large plant to manufacture automobiles")
– plant, flora, plant life – (a living organism lacking the power of locomotion)
Examples:
– They are producing about 1,000 automobiles in the new plant
– The sea flora consists of 1,000 different plant species
– The plant was close to the farm.

Issues in Semantics
Learn from annotated examples:
– Assume 100 examples containing “plant” previously tagged by a human
– Train a learning algorithm
– How to choose the learning algorithm?
– How to obtain the 100 tagged examples?

Issues in Pragmatics
Why? To modify the beliefs of other agents
Why? To change the actions of other agents

Issues in Learning Semantics
Learning?
– Assume a (large) amount of annotated data = training
– Assume a new text, not annotated = test
Learn from previous experience (training) to classify new data (test)
Algorithms: Bayes nets, decision trees, memory-based learning (e.g. nearest neighbor), neural networks
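
As a concrete illustration of the memory-based (nearest-neighbor) option mentioned above, here is a toy word-sense classifier for “plant”. The four sense-tagged training contexts are invented for illustration, and raw word overlap stands in for a real similarity measure; this is a sketch of the idea, not a usable WSD system.

```python
# Minimal nearest-neighbor word-sense disambiguation sketch:
# training = sense-tagged contexts, test = a new context.
TRAIN = [
    ("they built a large plant to manufacture automobiles", "factory"),
    ("the plant was producing a thousand cars a month", "factory"),
    ("the sea flora consists of many different plant species", "flora"),
    ("a plant is a living organism lacking locomotion", "flora"),
]

def classify(context):
    """Return the sense of the training example sharing the most words."""
    words = set(context.split())
    best = max(TRAIN, key=lambda ex: len(words & set(ex[0].split())))
    return best[1]

print(classify("the new plant will manufacture automobiles"))  # factory
print(classify("many species of plant grow in the sea"))       # flora
```

This also makes the slide's questions concrete: the choice of learning algorithm is the choice of `classify`, and obtaining the 100 tagged examples is the work of building `TRAIN`.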

Issues in Information Extraction
“There was a group of about 8-9 people close to the entrance on Highway 75”
Who? “8-9 people” Where? “Highway 75”
Extract information; detect new patterns:
– Detect hacking / hidden information / etc.
Government/military puts lots of money into IE research
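
A minimal sketch of pattern-based extraction over the slide's example sentence. The two regular expressions are ad-hoc assumptions written for this one sentence; a real IE system would use trained extractors, not hand-written patterns.

```python
import re

# The slide's example sentence.
TEXT = "There was a group of about 8-9 people close to the entrance on Highway 75"

# Ad-hoc patterns for the "Who?" and "Where?" slots.
who = re.search(r"\d+-\d+ people", TEXT)
where = re.search(r"[Hh]ighway \d+", TEXT)

print(who.group())    # 8-9 people
print(where.group())  # Highway 75
```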

Issues in Information Retrieval
General model:
– A huge collection of texts
– A query
Task: find documents that are relevant to the given query
How? Create an index, like the index in a book
More:
– Vector-space models
– Boolean models
Examples: Google, Yahoo, etc.
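
The vector-space model mentioned above can be sketched with raw term-frequency vectors and cosine similarity. The three "documents" are invented for illustration, and real engines add an index, TF-IDF weighting, and much more.

```python
import math
from collections import Counter

# A tiny invented document collection.
DOCS = [
    "the plane left on time",
    "mount everest is the highest mountain",
    "car rental requires a minimum age",
]

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def retrieve(query):
    """Return the document whose term vector is closest to the query."""
    qv = Counter(query.split())
    return max(DOCS, key=lambda d: cosine(qv, Counter(d.split())))

print(retrieve("height of mount everest"))
```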

Issues in Information Retrieval
Retrieve specific information: Question Answering
“What is the height of mount Everest?” → 29,035 feet

Issues in Information Retrieval
Find information across languages! Cross-Language Information Retrieval
“What is the minimum age requirement for car rental in Italy?”
Also search Italian texts for “eta minima per noleggio macchine”
Challenges: integrate a large number of languages; integrate into high-performance IR engines

Issues in Machine Translation
Text-to-text machine translation
Speech-to-speech machine translation
Most of the work has addressed pairs of widely spoken languages, like English-French or English-Chinese

Issues in Machine Translation
How to translate text?
– Learn from previously translated data → need parallel corpora
– French-English and Chinese-English have the Hansards (parliamentary proceedings) as parallel corpora → reasonable translations
– Chinese-Hindi: no such resources available today!

Speech Act Theory: some utterances are themselves actions
“I pronounce you husband & wife”
“I sentence you to 5 years”

Natural languages are NOT context free – but almost!

About 40% of the words in the NY Times are not in a (large) dictionary – natural language is “productive”

Example: “I SAW A MAN IN THE PARK WITH A TELESCOPE”

Parsing
Parsing with CFGs refers to the task of assigning correct trees to input strings
– Correct here means a tree that covers all and only the elements of the input and has an S at the top
– It doesn’t actually mean that the system can select the correct tree from among the possible trees
As with everything of interest, parsing involves a search that involves making choices
Example: “I SAW A MAN IN THE PARK WITH A TELESCOPE”
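
The ambiguity of the telescope sentence can be made concrete with a CKY-style parser that counts trees instead of building them. The small CNF grammar below is an assumption chosen to cover the sentence, not the course's official grammar; with it, the two PP-attachment sites interact to yield 5 distinct parses.

```python
from collections import defaultdict

# A toy CNF grammar (hypothetical rule set chosen to cover the example).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "NP", "PP"),
    ("NP", "DT", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "I": ["NP"], "saw": ["V"], "a": ["DT"], "the": ["DT"],
    "man": ["N"], "park": ["N"], "telescope": ["N"],
    "in": ["P"], "with": ["P"],
}

def count_parses(words):
    """CKY variant: chart[i][j][A] = number of trees rooted in A over words[i:j]."""
    n = len(words)
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for a in LEXICON.get(w, []):
            chart[i][i + 1][a] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for a, b, c in BINARY:
                    if chart[i][k][b] and chart[k][j][c]:
                        chart[i][j][a] += chart[i][k][b] * chart[k][j][c]
    return chart[0][n]["S"]

print(count_parses("I saw a man in the park with a telescope".split()))  # 5
```

The 5 parses correspond to the ways "in the park" and "with a telescope" can each attach to a noun phrase or to the verb phrase, which is exactly the search-among-choices problem the slide describes.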

The problem of “scaling up” – the same as in knowledge representation, planning, etc., but even more difficult

Sentence-Types
Declaratives: A plane left
– S -> NP VP
Imperatives: Leave!
– S -> VP
Yes-No Questions: Did the plane leave?
– S -> Aux NP VP
WH Questions: When did the plane leave?
– S -> WH Aux NP VP

Potential Problems in CFG Agreement Subcategorization Movement

Agreement
Grammatical: This dog / Those dogs / This dog eats / Those dogs eat
Ungrammatical: *This dogs / *Those dog / *This dog eat / *Those dogs eats
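
The agreement facts above can be checked mechanically once words carry number features. The tiny feature lexicon below is a hand-written assumption for illustration (full treatments use feature structures and unification, covered elsewhere in the course readings).

```python
# A hand-written feature lexicon (illustrative assumption).
FEATURES = {
    "this":  {"cat": "DT", "num": "sg"},
    "those": {"cat": "DT", "num": "pl"},
    "dog":   {"cat": "N",  "num": "sg"},
    "dogs":  {"cat": "N",  "num": "pl"},
}

def agrees(det, noun):
    """True iff determiner and noun carry the same number feature."""
    return FEATURES[det]["num"] == FEATURES[noun]["num"]

print(agrees("this", "dog"))    # this dog   -> True
print(agrees("those", "dog"))   # *those dog -> False
```

A bare CFG would need separate singular and plural rules (and categories) to capture this, which is why feature-based grammars are attractive.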

Subcategorization
Sneeze: John sneezed
Find: Please find [a flight to NY]NP
Give: Give [me]NP [a cheaper fare]NP
Help: Can you help [me]NP [with a flight]PP
Prefer: I prefer [to leave earlier]TO-VP
Told: I was told [United has a flight]S
Ungrammatical: *John sneezed the book / *I prefer United has a flight / *Give with a flight
Subcategorization expresses the constraints that a predicate (a verb, for now) places on the number and type of the arguments it wants to take
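
One way to see subcategorization concretely is to store, for each verb, the list of argument-type sequences (frames) it allows, and check a candidate against them. The verbs and frames come from the slide; the dictionary representation is an assumption for illustration.

```python
# Subcategorization frames: each verb maps to its allowed argument sequences.
SUBCAT = {
    "sneeze": [[]],             # intransitive: John sneezed
    "find":   [["NP"]],         # find [a flight to NY]NP
    "give":   [["NP", "NP"]],   # give [me]NP [a cheaper fare]NP
    "help":   [["NP", "PP"]],   # help [me]NP [with a flight]PP
    "prefer": [["TO-VP"]],      # prefer [to leave earlier]TO-VP
}

def licensed(verb, args):
    """True iff the verb has a frame matching the argument-type sequence."""
    return args in SUBCAT.get(verb, [])

print(licensed("find", ["NP"]))    # "find a flight"      -> True
print(licensed("sneeze", ["NP"]))  # *"sneezed the book"  -> False
```

This is exactly the check that rules out the starred examples while keeping the grammatical ones.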

So?
So the various rules for VPs overgenerate:
– They permit strings containing verbs and arguments that don’t go together
– For example, VP -> V NP licenses “sneezed the book” as a VP, since “sneeze” is a verb and “the book” is a valid NP
– Subcategorization frames can help with this problem (“slow down” overgeneration)

Movement
Core example: [[My travel agent]NP [booked [the flight]NP]VP]S
I.e. “book” is a straightforward transitive verb: it expects a single NP argument within the VP, and a single NP argument as the subject.

Movement
What about: Which flight did the travel agent book?
(“Which flight” is the object of the verb “book”; it was “moved” to the front of the sentence!)
– Which flight do you want me to have the travel agent book?
The direct-object argument of “book” can be a long way from where it’s supposed to appear; here it is separated from its verb by 2 other verbs.
Therefore NL cannot be a finite-state language.

Semantics: Fillmore’s Case Grammar
“Cases” here are semantically based, not grammatical cases like those of Latin or German
Charles Fillmore, “The Case for Case”, 1968
Fillmore produced more than one version of the theory

Case Grammar Semantics
Case grammar semantics:
– Treats the verb as a predicate, and the subject, objects, and other subordinate clauses as “arguments”
– Labels the arguments with their relationship to the verb-predicate (called “cases”) [uses subcategorization info]
Ex: John sold his car – agent and object cases
Ex: John sold his car to Mary – agent, object and recipient cases

Fillmore’s list of cases
Agentive (A): the case of the typically animate, perceived instigator of the action identified by the verb.
Instrumental (I): the case of the inanimate force or object causally involved in the action or state identified by the verb.

Fillmore’s list of cases
Dative (D) – later renamed Experiencer (E): the case of the animate being affected by the state or action identified by the verb.
Factitive (F) – later renamed Goal (G): the case of the object or being resulting from the action or state identified by the verb, or understood as a part of the meaning of the verb.

Fillmore’s list of cases
Locative (L): the case which identifies the location or spatial orientation of the state or action identified by the verb.
Objective (O): the semantically most neutral case; the case of anything representable by a noun whose role in the action or state identified by the verb is determined by the semantic interpretation of the verb itself.

Case grammar semantics
Semantic (case) roles don’t depend simply on syntactic roles:
Ex: John sold his car to Mary – agent, object and recipient cases
Ex: John sold Mary his car – same cases, different syntax
Ex: John broke the window – agent and object cases
Ex: John broke the window with a hammer – agent, object and instrument cases
Ex: A hammer broke the window – instrument and object cases (the syntactic subject is not the agent)

Informal quiz: Consider these sentences:
1. The burglar opened the door.
2. The door was opened by the burglar.
3. The burglar opened the door with a crowbar.
4. The door was opened by a crowbar.
5. The crowbar opened the door.
6. The door opened.

Case analysis:
1. The burglar opened the door. – S O
2. The door was opened by the burglar. – S A
3. The burglar opened the door with a crowbar. – S O A
4. The door was opened by a crowbar. – S A
5. The crowbar opened the door. – S O
6. The door opened. – S

Strengths
Only one Noun Phrase occupies each case role in relation to a particular verb
Therefore one could classify verbs in terms of which case roles they take, e.g.:
– “open” – O, {A} {I}
– “shout” – A, O, {E}
({} denotes optional elements)
This model has been used in Artificial Intelligence, along with the sub-categorization of verbs (described earlier)
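
The verb classification above lends itself to a small mechanical check: each verb lists its obligatory case roles and its optional ones (the braces on the slide). The frames for "open" and "shout" come from the slide; the dictionary-and-set representation is an assumption for illustration.

```python
# Fillmore-style case frames: required roles plus optional ones ({} on the slide).
CASE_FRAMES = {
    "open":  {"required": {"O"}, "optional": {"A", "I"}},
    "shout": {"required": {"A", "O"}, "optional": {"E"}},
}

def frame_ok(verb, roles):
    """True iff roles include all required cases and nothing outside the frame."""
    frame = CASE_FRAMES[verb]
    roles = set(roles)
    allowed = frame["required"] | frame["optional"]
    return frame["required"] <= roles and roles <= allowed

print(frame_ok("open", {"O"}))            # "The door opened."
print(frame_ok("open", {"A", "O", "I"}))  # "The burglar opened the door with a crowbar."
print(frame_ok("open", {"A"}))            # missing the obligatory O -> False
```

Note how this captures the slide's point: every acceptable use of "open" must realize the Objective role, while the Agentive and Instrumental roles may or may not appear.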

Weaknesses
– Researchers could not agree on a standard set of cases.
– It is not always easy in practice to allocate particular Noun Phrases to cases.
– When it gets difficult, there is a temptation to use the Objective (O) as a kind of “dustbin case” for all the NPs that don’t seem to fit anywhere else.