CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 28 – Grammar; Constituency, Dependency)
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
21st March, 2011

Grammar: a finite set of rules that generates all and only the sentences of a language, and that assigns an appropriate structural description to each one.

Grammatical Analysis Techniques
Two main devices:
Breaking up a string: sequential, hierarchical, transformational
Labeling the constituents: morphological, categorial, functional

Breaking up and Labeling
Sequential breaking up
Sequential breaking up and morphological labeling
Sequential breaking up and categorial labeling
Sequential breaking up and functional labeling
Hierarchical breaking up
Hierarchical breaking up and categorial labeling
Hierarchical breaking up and functional labeling

Sequential Breaking up
That student solved the problems.
that + student + solve + ed + the + problem + s

Sequential Breaking up and Morphological Labeling
That student solved the problems.
that (word) + student (word) + solve (stem) + -ed (affix) + the (word) + problem (stem) + -s (affix)

Sequential Breaking up and Categorial Labeling
This boy can solve the problem.
this (Det) + boy (N) + can (Aux) + solve (V) + the (Det) + problem (N)
They called her a taxi.
they (Pron) + call (V) + -ed (Affix) + her (Pron) + a (Det) + taxi (N)

Sequential Breaking up and Functional Labeling
They called her a taxi.
Reading 1: they (Subject) + called (Verbal) + her (Direct Object) + a taxi (Indirect Object)
Reading 2: they (Subject) + called (Verbal) + her (Indirect Object) + a taxi (Direct Object)
The same string admits two different functional labelings.

Hierarchical Breaking up
old men and women
Bracketing 1: [[old men] and [women]]
Bracketing 2: [old [men and women]]

Hierarchical Breaking up and Categorial Labeling
Poor John ran away.
[S [NP [A Poor] [N John]] [VP [V ran] [Adv away]]]
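A small sketch of how such a categorially labelled tree can be represented with NLTK's Tree class (assuming NLTK is installed; the bracket string is just the slide's tree written linearly):

    from nltk import Tree

    # the slide's tree for "Poor John ran away.", written as a bracket string
    tree = Tree.fromstring("(S (NP (A Poor) (N John)) (VP (V ran) (Adv away)))")

    tree.pretty_print()   # draws the hierarchy as ASCII art
    print(tree.leaves())  # ['Poor', 'John', 'ran', 'away']
    print(tree.pos())     # [('Poor', 'A'), ('John', 'N'), ('ran', 'V'), ('away', 'Adv')]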

Hierarchical Breaking up and Functional Labeling: Immediate Constituent (IC) Analysis
Construction types in terms of the function of the constituents:
Predication (subject + predicate)
Modification (modifier + head)
Complementation (verbal + complement)
Subordination (subordinator + dependent unit)
Coordination (independent unit + coordinator)

Predication
[Birds]subject [fly]predicate
Tree: S → Subject (Birds) + Predicate (fly)

Modification
[A]modifier [flower]head
John [slept]head [in the room]modifier
Tree: S → Subject (John) + Predicate; Predicate → Head (slept) + Modifier (in the room)

Complementation
He [saw]verbal [a lake]complement
Tree: S → Subject (He) + Predicate; Predicate → Verbal (saw) + Complement (a lake)

Subordination
John slept [in]subordinator [the room]dependent unit
Tree: S → Subject (John) + Predicate; Predicate → Head (slept) + Modifier; Modifier → Subordinator (in) + Dependent Unit (the room)

Coordination
[John came in time]independent unit [but]coordinator [Mary was not ready]independent unit
Tree: S → Independent Unit (John came in time) + Coordinator (but) + Independent Unit (Mary was not ready)

An Example
In the morning, the sky looked much brighter.
[[In]subordinator [[the]modifier [morning]head]dependent unit]modifier [[[the]modifier [sky]head]subject [[looked]verbal [[much]modifier [brighter]head]complement]predicate]head

Hierarchical Breaking up and Categorial / Functional Labeling
Hierarchical breaking up coupled with categorial/functional labeling is a very powerful device, but there are ambiguities that demand something more powerful.
E.g., "love of God": someone loves God, or God loves someone.

Hierarchical Breaking up with Categorial vs. Functional Labeling
love of God
Categorial labeling: [love]Noun [of God]Prepositional Phrase, together a Noun Phrase
Functional labeling: [love]head [of God]modifier, with [of]subordinator [God]dependent unit
Neither labeling by itself resolves which of the two readings is intended.

Types of Generative Grammar
Finite State Model (sequential)
Phrase Structure Model (sequential + hierarchical) + (categorial)
Transformational Model (sequential + hierarchical + transformational) + (categorial + functional)

Phrase Structure Grammar (PSG)
A phrase-structure grammar G is a four-tuple (V, T, S, P), where
V is a finite set of symbols (the vocabulary), e.g., N, V, A, Adv, P, NP, VP, AP, AdvP, PP, student, sing, etc.
T is a finite set of terminal symbols, T ⊆ V, e.g., student, sing, etc.
S is a distinguished non-terminal symbol, also called the start symbol, S ∈ V
P is a set of productions.
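As a minimal sketch, the four-tuple can be written down directly in Python; the toy rule set below is an assumption for illustration, not a grammar taken from the lecture:

    from dataclasses import dataclass
    from typing import FrozenSet, Tuple

    @dataclass(frozen=True)
    class PSG:
        V: FrozenSet[str]                           # vocabulary (terminals and non-terminals)
        T: FrozenSet[str]                           # terminal symbols, a subset of V
        S: str                                      # start symbol, a member of V
        P: Tuple[Tuple[str, Tuple[str, ...]], ...]  # productions as (lhs, rhs) pairs

    G = PSG(
        V=frozenset({"S", "NP", "VP", "Det", "N", "V", "the", "boy", "ball", "hit"}),
        T=frozenset({"the", "boy", "ball", "hit"}),
        S="S",
        P=(
            ("S", ("NP", "VP")),
            ("NP", ("Det", "N")),
            ("VP", ("V", "NP")),
            ("Det", ("the",)),
            ("N", ("boy",)),
            ("N", ("ball",)),
            ("V", ("hit",)),
        ),
    )

    assert G.T <= G.V and G.S in G.V   # T ⊆ V and S ∈ V, as the definition requires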

Noun Phrases
John: NP → N (John)
the student: NP → Det N (the student)
the intelligent student: NP → Det AdjP N (the intelligent student)

Noun Phrase
his first five PhD students
NP → Det (his) Ord (first) Quant (five) N (PhD) N (students)

Noun Phrase
the five best students of my class
NP → Det (the) Quant (five) AP (best) N (students) PP (of my class)

Verb Phrases
can sing: VP → Aux (can) V (sing)
can hit the ball: VP → Aux (can) V (hit) NP (the ball)

Verb Phrase
can give a flower to Mary
VP → Aux (can) V (give) NP (a flower) PP (to Mary)

Verb Phrase
may make John the chairman
VP → Aux (may) V (make) NP (John) NP (the chairman)

Verb Phrase
may find the book very interesting
VP → Aux (may) V (find) NP (the book) AP (very interesting)

Prepositional Phrases
in the classroom: PP → P (in) NP (the classroom)
near the river: PP → P (near) NP (the river)

Adjective Phrases
intelligent: AP → A (intelligent)
very honest: AP → Degree (very) A (honest)
fond of sweets: AP → A (fond) PP (of sweets)

Adjective Phrase
very worried that she might have done badly in the assignment
AP → Degree (very) A (worried) S' (that she might have done badly in the assignment)
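The phrase-level patterns illustrated on the preceding slides can be collected into a partial, assumed NLTK grammar fragment; this is only a sketch of the rules shown above (SBAR stands in for the S' complement clause), not a complete grammar of English:

    import nltk

    # phrase-level patterns from the slides; lexical (word-level) rules
    # would still be needed before this fragment could parse real input
    phrase_rules = nltk.CFG.fromstring("""
        NP -> N | Det N | Det AP N | Det Ord Quant N N | Det Quant AP N PP
        VP -> Aux V | Aux V NP | Aux V NP PP | Aux V NP NP | Aux V NP AP
        PP -> P NP
        AP -> A | Degree A | A PP | Degree A SBAR
    """)

    for production in phrase_rules.productions():
        print(production)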

Phrase Structure Rules
The boy hit the ball.
Rewrite rules:
(i) S → NP VP
(ii) NP → Det N
(iii) VP → V NP
(iv) Det → the
(v) N → boy, ball
(vi) V → hit
We interpret each rule X → Y as the instruction "rewrite X as Y".
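A minimal sketch of these rewrite rules as an NLTK context-free grammar, parsed with a chart parser (one of several parsing algorithms NLTK offers, chosen here only for illustration):

    import nltk

    grammar = nltk.CFG.fromstring("""
        S -> NP VP
        NP -> Det N
        VP -> V NP
        Det -> 'the'
        N -> 'boy' | 'ball'
        V -> 'hit'
    """)

    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("the boy hit the ball".split()):
        tree.pretty_print()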

Derivation
The boy hit the ball.
Sentence
NP + VP                        (i)
Det + N + VP                   (ii)
Det + N + V + NP               (iii)
The + N + V + NP               (iv)
The + boy + V + NP             (v)
The + boy + hit + NP           (vi)
The + boy + hit + Det + N      (ii)
The + boy + hit + the + N      (iv)
The + boy + hit + the + ball   (v)
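The derivation can be replayed mechanically. Here is a sketch that rewrites the leftmost matching symbol at each step, with the rule sequence hard-coded to follow the slide:

    # each step names the symbol to rewrite and what it becomes
    steps = [
        ("S",   ["NP", "VP"]),    # (i)
        ("NP",  ["Det", "N"]),    # (ii)
        ("VP",  ["V", "NP"]),     # (iii)
        ("Det", ["the"]),         # (iv)
        ("N",   ["boy"]),         # (v)
        ("V",   ["hit"]),         # (vi)
        ("NP",  ["Det", "N"]),    # (ii)
        ("Det", ["the"]),         # (iv)
        ("N",   ["ball"]),        # (v)
    ]

    form = ["S"]
    print(" + ".join(form))
    for lhs, rhs in steps:
        i = form.index(lhs)       # leftmost occurrence of the symbol
        form[i:i + 1] = rhs       # rewrite it in place
        print(" + ".join(form))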

PSG Parse Tree
The boy hit the ball.
[S [NP [Det the] [N boy]] [VP [V hit] [NP [Det the] [N ball]]]]

PSG Parse Tree
John wrote those words in the Book of Proverbs.
[S [NP [PropN John]] [VP [V wrote] [NP those words] [PP [P in] [NP [NP the Book] [PP of [NP Proverbs]]]]]]

Penn POS Tags
John wrote those words in the Book of Proverbs.
[ John/NNP ] wrote/VBD [ those/DT words/NNS ] in/IN [ the/DT Book/NN ] of/IN [ Proverbs/NNS ]

Penn Treebank
John wrote those words in the Book of Proverbs.
(S (NP-SBJ (NP John))
   (VP wrote
       (NP those words)
       (PP-LOC in
               (NP (NP-TTL (NP the Book)
                           (PP of (NP Proverbs)))))))

PSG Parse Tree
Official trading in the shares will start in Paris on Nov 6.
[S [NP [NP [AP [A official]] [N trading]] [PP [P in] [NP the shares]]] [VP [Aux will] [V start] [PP in Paris] [PP on Nov 6]]]

Penn POS Tags
Official trading in the shares will start in Paris on Nov. 6.
[ Official/JJ trading/NN ] in/IN [ the/DT shares/NNS ] will/MD start/VB in/IN [ Paris/NNP ] on/IN [ Nov./NNP 6/CD ]

Penn Treebank
Official trading in the shares will start in Paris on Nov. 6.
( (S (NP-SBJ (NP Official trading)
             (PP in (NP the shares)))
     (VP will
         (VP start
             (PP-LOC in (NP Paris))
             (PP-TMP on (NP (NP Nov 6)))))))
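A small sketch reading the bracketing above with NLTK and listing its noun phrases; the bracket closure is my reconstruction of the truncated slide, so treat the exact structure as indicative:

    from nltk import Tree

    bracketing = """
    (S (NP-SBJ (NP Official trading) (PP in (NP the shares)))
       (VP will (VP start (PP-LOC in (NP Paris))
                          (PP-TMP on (NP (NP Nov 6))))))
    """

    tree = Tree.fromstring(bracketing)
    for np in tree.subtrees(lambda t: t.label().startswith("NP")):
        print(np.label(), "->", " ".join(np.leaves()))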

Penn POS Tag Set
Adjective: JJ
Adverb: RB
Cardinal Number: CD
Determiner: DT
Preposition: IN
Coordinating Conjunction: CC
Subordinating Conjunction: IN
Singular Noun: NN
Plural Noun: NNS
Personal Pronoun: PRP
Proper Noun: NNP
Verb base form: VB
Modal verb: MD
Verb (3sg Pres): VBZ
Wh-determiner: WDT
Wh-pronoun: WP
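A minimal sketch of automatic tagging with this tag set using NLTK; it assumes the 'punkt' tokenizer and 'averaged_perceptron_tagger' models have been fetched with nltk.download(), and the tags in the comment are indicative output only:

    import nltk

    tokens = nltk.word_tokenize("Official trading in the shares will start in Paris on Nov. 6.")
    print(nltk.pos_tag(tokens))
    # e.g. [('Official', 'JJ'), ('trading', 'NN'), ('in', 'IN'), ('the', 'DT'),
    #       ('shares', 'NNS'), ('will', 'MD'), ('start', 'VB'), ...]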

Difference between constituency and dependency

Constituency Grammar
Categorial: uses parts of speech
Context Free Grammar (CFG)
Basic elements → phrases

Dependency Grammar
Functional
Context Free Grammar
Basic elements → units of predication / modification / complementation / subordination / coordination

Bridge between Constituency and Dependency Parse
Constituency uses phrases; dependencies consist of head-modifier combinations.
This is a cricket bat.
cricket (category: N, function: Adj)
bat (category: N, function: N)
For free-word-order languages we use a dependency parser to uncover the relations between the words:
Raam ne Shaam ko dekha. (Ram saw Shyam)
Shaam ko Raam ne dekha. (Ram saw Shyam)
Case markers cling to the nouns they subordinate.

Example of CG and DG output
Birds fly.
Constituency (CG): [S [NP [N Birds]] [VP [V fly]]]
Dependency (DG): [Birds]subject [fly]predicate
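A small sketch of what a dependency analysis of the same sentence looks like in practice, using spaCy (assuming spaCy and its small English model en_core_web_sm are installed); the relation names follow spaCy's scheme, not the functional labels of these slides:

    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("Birds fly.")

    for token in doc:
        # every word points at its head word with a grammatical relation,
        # instead of being grouped into phrases
        print(f"{token.text:6s} --{token.dep_}--> {token.head.text}")
    # e.g.  Birds --nsubj--> fly,  fly --ROOT--> fly,  . --punct--> fly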

Some Probabilistic Parsers and Why They Are Used
Examples: Stanford, Collins, Charniak, RASP.
Why probabilistic parsers? A single sentence can have multiple parses; a probability is computed for each parse, and the parse with the highest probability is selected. This is needed in many NLP applications that require parsing.
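A minimal sketch of the idea with a toy PCFG in NLTK: the rule probabilities are invented for illustration, and the Viterbi parser returns the single most probable parse of an ambiguous sentence (this shows the mechanism, not any of the named parsers above):

    import nltk

    pcfg = nltk.PCFG.fromstring("""
        S   -> NP VP     [1.0]
        NP  -> Det N     [0.6]
        NP  -> NP PP     [0.4]
        VP  -> V NP      [0.7]
        VP  -> VP PP     [0.3]
        PP  -> P NP      [1.0]
        Det -> 'the'     [1.0]
        N   -> 'boy' [0.4] | 'ball' [0.3] | 'bat' [0.3]
        V   -> 'hit'     [1.0]
        P   -> 'with'    [1.0]
    """)

    parser = nltk.ViterbiParser(pcfg)
    # "with the bat" can attach to the NP or to the VP; the parser returns
    # whichever attachment has the higher probability under the grammar
    for tree in parser.parse("the boy hit the ball with the bat".split()):
        print(tree.prob())
        tree.pretty_print()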