CS224N Interactive Session: Competitive Grammar Writing. Chris Manning; Sida, Rush, Ankur, Frank, Kai Sheng.



Goals
This is an interactive, hands-on chance to learn about:
- generative probabilistic models: PCFGs, and simpler forms (an embedded HMM!)
- parsing algorithms
- modeling the structure of a language
- parameter tuning (by hand)
- quantitative evaluation

Grammar Writing
You will write a probabilistic context-free grammar (PCFG) for a small subset of English (several hundred words).
- No programming (code provided)
- Many interesting insights into grammar construction, parsing, linguistics, and machine learning
- Prizes :)

Starter PCFG
Only 6 initial part-of-speech (POS) tags:
- Noun: singular nouns
- Det: singular determiners (a, the, another, …)
- Prep: prepositions
- Proper: proper nouns
- VerbT: singular transitive verbs
- Misc: other words; see the commented sections in the Vocab.gr grammar file!

Starter PCFG (S1.gr)
1 S1 → NP VP .
2 VP → VerbT NP
20 NP → Det Nbar
1 NP → Proper
20 Nbar → Noun
1 Nbar → Nbar PP
1 PP → Prep NP
1 Noun → castle
1 Noun → king
…
1 Proper → Arthur
1 Proper → Guinevere
…
1 Det → a
1 Det → every
…
1 VerbT → covers
1 VerbT → rides
…
1 Misc → that
1 Misc → bloodier
1 Misc → does

Starter PCFG
20 NP → Det Nbar
1 NP → Proper
Weights are normalized per left-hand side: with p = 20/21 you use the rule NP → Det Nbar, and with p = 1/21 the rule NP → Proper.
Intuition: proper nouns are much less common.
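The weight-to-probability conversion can be sketched in a few lines of Python; the rule tuples below mirror the S1.gr excerpt above (the in-memory representation is an illustration, not the course's actual code):

```python
from collections import defaultdict

# Rules as (weight, lhs, rhs) tuples, as in the S1.gr excerpt.
rules = [
    (20, "NP", ("Det", "Nbar")),
    (1,  "NP", ("Proper",)),
    (20, "Nbar", ("Noun",)),
    (1,  "Nbar", ("Nbar", "PP")),
]

# Normalize weights separately for each left-hand side.
totals = defaultdict(float)
for w, lhs, _ in rules:
    totals[lhs] += w

probs = {(lhs, rhs): w / totals[lhs] for w, lhs, rhs in rules}
print(probs[("NP", ("Det", "Nbar"))])  # 20/21 ≈ 0.952
```

Because normalization is per left-hand side, raising one rule's weight lowers the probability of every sibling rule with the same left-hand side.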

Starter PCFG
How well can your grammar parse a set of sentences? At the beginning: not very well ;)
It parses only 2 sentences of the dev set we give you:
- Arthur is the king.
- Arthur rides the horse near the castle.
If the grammar fails entirely, that's really bad → backoff grammar!

Backoff Grammar S2!
# These two rules are required; choose their weights carefully!
99 START -> S1   # mixture of English and backoff grammars
1 START -> S2
The backoff grammar is capable of parsing anything (but not very intelligently!):
1 S2 -> Markov
1 Markov -> { Det | Misc | Noun | Prep | Proper | VerbT } (Markov)
The backoff grammar parses everything.
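The { … } (Markov) shorthand above stands for a right-branching chain over POS tags: each tag either continues the chain or ends it. A sketch of expanding the shorthand into explicit PCFG rules (the expansion scheme is an assumption based on the slide, not the course's actual expander):

```python
tags = ["Det", "Misc", "Noun", "Prep", "Proper", "VerbT"]

rules = []
for t in tags:
    # Continue the chain after tag t, or stop at t.
    rules.append((1, "Markov", (t, "Markov")))
    rules.append((1, "Markov", (t,)))

for w, lhs, rhs in rules[:4]:
    print(w, lhs, "->", " ".join(rhs))
```

Since every word belongs to some POS tag and the chain can have any length, this grammar assigns nonzero probability to every word string, which is exactly what makes it a safe backoff.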

Evaluation
We will mostly evaluate your grammar's recall (productivity), but you also get points for weighting grammatical sentences highly.
How well does your grammar anticipate unseen data that are truly grammatical? I.e., your grammar's ability to predict word strings.
Cross-entropy (log-perplexity) captures how close your grammar's distribution is to the true language distribution.

Evaluation: Details
Log-perplexity: lower is better.
P(s), the probability of the string s, is the sum of the probabilities of all trees T whose yield is s: P(s) = Σ_T P(T), where P(T) is the product of the probabilities of the rules used in T.
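Given P(s) for each dev sentence, cross-entropy in bits per word can be sketched as follows (the sentence lengths and probabilities here are made-up numbers for illustration; a real parser would supply them):

```python
import math

# Hypothetical (sentence_length_in_words, P(s)) pairs from a parser.
corpus = [(4, 1e-6), (7, 3e-9)]

total_log2 = sum(math.log2(p) for _, p in corpus)
total_words = sum(n for n, _ in corpus)

# Cross-entropy: average negative log2-probability per word. Lower is better.
cross_entropy = -total_log2 / total_words
print(f"{cross_entropy:.2f} bits per word")
```

If any sentence gets P(s) = 0, the log is minus infinity, which is why mixing in the always-succeeding backoff grammar matters so much.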

Evaluation Twist
The standard evaluation: an unseen test set.
Competition twist: the full test set comes from you, all the participants!
Your grammar should generate sentences that your opponents can't parse!
Only grammatical sentences will be considered (precision). No new words!

Let's get started! See handout!
1) Form teams of 2–3.
2) Get the code.
3) Make sure you can parse, generate, and validate.
4) Start lowering your grammar's cross-entropy on the provided dev set. (You can divide the grammar into more files if that's more convenient.)
5) 10 minutes before the end, submit your grammar with the submit script!
6) Profit.
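"Generate" in step 3 means sampling sentences top-down from the weighted grammar. A minimal sketch of that sampling, using a toy grammar in the spirit of S1.gr (the grammar contents and function names are illustrative, not the course's actual scripts):

```python
import random

# Toy weighted grammar: lhs -> list of (weight, rhs). Symbols without an
# entry in the dict (lowercase words and ".") are treated as terminals.
grammar = {
    "S1":     [(1, ["NP", "VP", "."])],
    "NP":     [(20, ["Det", "Noun"]), (1, ["Proper"])],
    "VP":     [(1, ["VerbT", "NP"])],
    "Det":    [(1, ["the"]), (1, ["a"])],
    "Noun":   [(1, ["castle"]), (1, ["king"])],
    "Proper": [(1, ["Arthur"])],
    "VerbT":  [(1, ["covers"]), (1, ["rides"])],
}

def generate(symbol="S1"):
    if symbol not in grammar:          # terminal: emit the word itself
        return [symbol]
    weights, expansions = zip(*grammar[symbol])
    rhs = random.choices(expansions, weights=weights)[0]
    return [word for s in rhs for word in generate(s)]

print(" ".join(generate()))
```

Sampling from your own grammar is the quickest way to spot weighting bugs: if most generated sentences look wrong, your opponents' parsers will happily reject them too.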

Questions? Let us know when you’re stuck.

Next Class
We will include some grammatical sentences from your grammars in a new dev set so you can all create better grammars.
Improve your grammar, push for the best perplexity on the new dev set, and try to generate even harder examples.

Suggestions (Penn Treebank POS tags)
1. CC Coordinating conjunction
2. CD Cardinal number
3. DT Determiner
4. EX Existential there
5. FW Foreign word
6. IN Preposition
7. JJ Adjective
8. JJR Adjective, comparative
9. JJS Adjective, superlative
10. LS List item marker
11. MD Modal
12. NN Noun, singular or mass
13. NNS Noun, plural
14. NP Proper noun, singular
15. NPS Proper noun, plural
16. PDT Predeterminer
17. POS Possessive ending
18. PP Personal pronoun
19. PP$ Possessive pronoun
20. RB Adverb
21. RBR Adverb, comparative
22. RBS Adverb, superlative
23. RP Particle
24. SYM Symbol
25. TO to
26. UH Interjection
27. VB Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund or present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd person singular present
32. VBZ Verb, 3rd person singular present
33. WDT Wh-determiner
34. WP Wh-pronoun
35. WP$ Possessive wh-pronoun
36. WRB Wh-adverb

Suggestions
- Fine-grained POS: coordinating conjunctions, modal verbs, number words, adverbs
- Base, past, and gerund verb forms
- Personal vs. possessive pronouns
- Negation
- Questions
- Subcategorization frames, intransitive verbs
- Appositives

Feel free to ask us for help!