LING/C SC/PSYC 438/538 Lecture 18 Sandiway Fong

Administrivia Homework 7 out today – due Saturday by midnight

Another Grammar for { a^n b^n c^n | n > 0 }
– { … } embeds Prolog code inside grammar rules
– nonvar(X): true if X is not a variable, false otherwise
– var(X): true if X is a variable, false otherwise
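
The grammar itself appears as a screenshot on the slide. As a rough sketch of what a grammar along these lines might look like (an illustration of the technique, not necessarily the exact grammar from the slide), a count of the a's can be built up in the rules and then checked against the b's and c's:

    % sketch: { a^n b^n c^n | n > 0 } with embedded Prolog code in { ... }
    s --> as(N), bs(N), cs(N).

    % count the a's
    as(1) --> [a].
    as(N) --> [a], as(M), { N is M + 1 }.

    % consume exactly N b's; nonvar(N) guards the arithmetic
    % (in this sketch N is already bound by as//1 when the guard runs,
    %  so the guard mainly illustrates the built-in)
    bs(1) --> [b].
    bs(N) --> [b], { nonvar(N), N > 1, M is N - 1 }, bs(M).

    % and exactly N c's
    cs(1) --> [c].
    cs(N) --> [c], { nonvar(N), N > 1, M is N - 1 }, cs(M).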

Another Grammar for { a^n b^n c^n | n > 0 }
– Set membership:
– Enumeration:
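
The queries are shown as screenshots on the original slide. With a grammar like the sketch above, a session might look like this (hypothetical output; exact answers depend on the grammar actually used):

    % set membership: supply a concrete string
    ?- s([a,a,b,b,c,c], []).
    true.

    ?- s([a,a,b,c,c], []).
    false.

    % enumeration: leave the string unbound and backtrack with ;
    ?- s(S, []).
    S = [a, b, c] ;
    S = [a, a, b, b, c, c] ;
    S = [a, a, a, b, b, b, c, c, c] ...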

Homework 7 Question 1:
– Give a regular DCG grammar for the language whose strings contain an odd number of a's and an even number of b's (even: 0, 2, 4, …); assume the alphabet is {a, b}
– Examples: aaa, abb, babaabb; *aab, *ab, *ε
– Show your grammar working, i.e. it accepts and rejects the positive and negative examples, respectively
– Allow rules of the form: x --> [].

Homework 7 Question 2:
– Order your grammar rules so that the grammar enumerates strings in the language
– Show it working
– Is your grammar capable of enumerating all the strings of the language? Explain why or why not.

Homework 7 Question 3:
– (a) Determine how many strings of size 3 are in the language. List them.
– (b) Compute how many strings of size < 10 there are in the language.
– Examples: [b,b,a] is of size 3; [a,a,a,b,b,b,b] is of size 7

Writing natural language grammars
Need:
– ability to program with grammars (Prolog grammar rules) +
– knowledge of language: you have the knowledge, but it's not completely accessible to you…
Textbook:
– chapter 5, sections 1 and 2
– chapter 12

Natural Language Parsing
Syntax trees are a big deal in NLP
Stanford Parser (see also Berkeley Parser):
– uses probabilistic rules learnt from the Penn Treebank corpus
– produces parses in the same format (modulo empty categories and subtags)
– also produces dependency diagrams
We do a lot with Treebanks in the follow-on course to this one (LING 581, Spring semester)

Natural Language Parsing
Penn Treebank (WSJ section): parsed by human annotators
– Efforts by the Hong Kong Futures Exchange to introduce a new interest-rate futures contract continue to hit snags despite the support the proposed instrument enjoys in the colony's financial community.

Natural Language Parsing
Penn Treebank (WSJ section): parsed by human annotators
– Efforts by the Hong Kong Futures Exchange to introduce a new interest-rate futures contract continue to hit snags despite the support the proposed instrument enjoys in the colony's financial community.
Tree display software: tregex (from Stanford University)

Natural Language Parsing

Natural Language Parsing
Comparison between human parse and machine parse:
– empty categories not recovered by parsing; otherwise a good match

Natural Language Parsing

Part of Speech (POS)
JM Chapter 5: Parts of speech
– Classic eight parts of speech: e.g. englishclub.com
– traced back to Latin scholars, and further back to ancient Greek (Thrax)
– not everyone agrees on what they are… the textbook lists:
  open class: 4 (nouns, verbs, adjectives, adverbs)
  closed class: 7 (prepositions, determiners, pronouns, conjunctions, auxiliary verbs, particles, numerals)
– …or on what the subclasses are, e.g. what is a Proper Noun? (Textbook answer below…)

Part of Speech (POS)
Getting POS information about a word:
1. dictionary
2. pronunciation: e.g. are you conTENT with the CONtent of the slide?
3. possible n-gram sequences: e.g. *pronoun << common noun, but: the << common noun
4. structure of the sentence/phrase (syntax)
5. possible inflectional endings: e.g. V: -s/-ed/-en/-ing; N: -s
In computational linguistics, the Penn Treebank tagset is the most commonly used tagset (reprinted inside the front cover of your textbook): 45 tags listed in the textbook (36 POS + 9 punctuation)
Task: POS tagging
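
As a toy illustration (not from the slides) of cues 1 and 5, here is a small Prolog sketch with a hypothetical tag/2 lexicon and a suffix-based fallback:

    % 1. dictionary lookup: a tiny lexicon of word/tag facts (Penn tags)
    tag(the, dt).
    tag(walk, nn).
    tag(walk, vb).          % 'walk' is ambiguous: noun or verb
    tag(chocolate, nn).

    % 5. inflectional endings as a fallback heuristic
    guess_tag(Word, vbg) :- atom_concat(_, ing, Word), !.   % e.g. eating
    guess_tag(Word, vbd) :- atom_concat(_, ed, Word), !.    % e.g. walked
    guess_tag(_, nn).                                       % default guess

    % prefer the dictionary; otherwise guess from the ending
    pos(Word, Tag) :- tag(Word, Tag).
    pos(Word, Tag) :- \+ tag(Word, _), guess_tag(Word, Tag).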

Part of Speech (POS): NNP, NNPS (proper nouns, singular and plural)

Part of Speech (POS): PRP, PRP$ (personal and possessive pronouns)

Part of Speech (POS)

Part of Speech (POS) Stanford parser: walk noun/verb

Part of Speech (POS) Stanford parser: walk noun/verb

Part of Speech (POS)
Word sense disambiguation is more than POS tagging: the different senses of the word bank (e.g. financial institution vs. river bank) can share the same POS tag

Syntax
Words combine recursively with one another into phrases (aka constituents)
– usually when two words combine, one word will head (project) the phrase
  e.g. [VB/VBP eat] [NN chocolate]   (eat projects the phrase; chocolate is its object)
  e.g. [VB/VBP eat] [DT some] [NN chocolate]
Warning: terminology and parses in computational linguistics are not necessarily the same as those used in linguistics

Syntax
Words combine recursively with one another into phrases (aka constituents)
– e.g. [PRP we] [VB/VBP eat] [NN chocolate]   (we is the subject)
– e.g. [TO to] [VB/VBP eat] [NN chocolate]
(a small DCG sketch of this phrase structure follows below)
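
A minimal DCG sketch of this idea (hypothetical rules and lexicon, not the slides' own grammar), covering "eat chocolate", "eat some chocolate", and "we eat chocolate": the verb heads and projects the VP, and an NP subject combines with the VP to form S.

    % np: an optional determiner plus a noun, or a pronoun subject
    np --> nn.                  % 'chocolate'
    np --> dt, nn.              % 'some chocolate'
    np --> prp.                 % 'we'

    % vp: the verb projects the phrase; the np is its object
    vp --> vb, np.              % 'eat chocolate', 'eat some chocolate'

    % s: subject np plus vp
    s --> np, vp.               % 'we eat chocolate'

    % a small lexicon
    dt  --> [some].
    nn  --> [chocolate].
    prp --> [we].
    vb  --> [eat].

    % example query (same style as s(String,[]) from the last lecture):
    % ?- s([we,eat,some,chocolate], []).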

Syntax
Words combine recursively with one another into phrases (aka constituents)
– e.g. [NNP John] [VBD noticed] [IN/DT/WDT that] [PRP we] [VB/VBP eat] [NN chocolate]
– 'that' is a complementizer: it projects a CP, which the verb 'noticed' selects/subcategorizes for

Syntax
Words combine recursively with one another into phrases (aka constituents)
– How about SBAR? cf. John wanted me to eat chocolate

Syntax
Words combine recursively with one another into phrases (aka constituents)
1. John noticed that we eat chocolate
2. John noticed we eat chocolate
(the complementizer 'that' is optional; see the sketch below)
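
A hedged sketch of how the earlier rules could be extended to cover sentences 1 and 2 (again hypothetical, not the grammar from the slides): the verb 'noticed' selects a CP complement, and the complementizer 'that' is optional.

    % cp: a clause introduced by an optional complementizer
    cp --> comp, s.             % 'that we eat chocolate'
    cp --> s.                   % 'we eat chocolate' (complementizer omitted)
    comp --> [that].

    % verbs like 'noticed' take a cp complement
    vp --> vbd, cp.
    vbd --> [noticed].

    % proper-noun subjects
    np --> nnp.
    nnp --> [john].

    % example query: succeeds with or without 'that'
    % ?- s([john,noticed,we,eat,chocolate], []).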