Presentation is loading. Please wait.

Presentation is loading. Please wait.

Pushpak Bhattacharyya CSE Dept., IIT Bombay 21st March, 2011

Similar presentations


Presentation on theme: "Pushpak Bhattacharyya CSE Dept., IIT Bombay 21st March, 2011"— Presentation transcript:

1 Pushpak Bhattacharyya CSE Dept., IIT Bombay 21st March, 2011
CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 28– Grammar; Constituency, Dependency) Pushpak Bhattacharyya CSE Dept., IIT Bombay 21st March, 2011

2 Grammar A finite set of rules
that generates only and all sentences of a language. that assigns an appropriate structural description to each one.

3 Grammatical Analysis Techniques
Two main devices Labeling the Constituents Breaking up a String Sequential Hierarchical Transformational Morphological Categorial Functional

4 Breaking up and Labeling
Sequential Breaking up Sequential Breaking up and Morphological Labeling Sequential Breaking up and Categorial Labeling Sequential Breaking up and Functional Labeling Hierarchical Breaking up Hierarchical Breaking up and Categorial Labeling Hierarchical Breaking up and Functional Labeling

5 Sequential Breaking up
That student solved the problems. that + student + solve + ed + the + problem + s

6 Sequential Breaking up and Morphological Labeling
That student solved the problems. that student solve ed the problem s word word stem affix word stem affix

7 Sequential Breaking up and Categorial Labeling
This boy can solve the problem. this boy can solve the problem Det N Aux V Det N They called her a taxi. They call ed her a taxi Pron V Affix Pron Det N

8 Sequential Breaking up and Functional Labeling
They called her a taxi Subject Verbal Direct Object Indirect Object They called her a taxi Indirect Object Direct Object Subject Verbal

9 Hierarchical Breaking up
Old men and women Old men and women Old men and women men and women Old men and women Old men and women Old men

10 Hierarchical Breaking up and Categorial Labeling
Poor John ran away. S NP VP A N V Adv Poor John ran away

11 Hierarchical Breaking up and Functional Labeling
Immediate Constituent (IC) Analysis Construction types in terms of the function of the constituents: Predication (subject + predicate) Modification (modifier + head) Complementation (verbal + complement) Subordination (subordinator + dependent unit) Coordination (independent unit + coordinator)

12 Predication [Birds]subject [fly]predicate S Subject Predicate Birds

13 Modification [A]modifier [flower]head
John [slept]head [in the room]modifier S Subject Predicate John Head Modifier slept In the room

14 Complementation He [saw]verbal [a lake]complement S Subject Predicate
19/12/2004 ICON 2004 Tutorial

15 Subordination John slept [in]subordinator [the room]dependent unit S
Subject Predicate John Head Modifier slept Subordinator Dependent Unit in the room

16 Coordination [John came in time] independent unit [but]coordinator [Mary was not ready] independent unit S Independent Unit Coordinator Independent Unit John came in time but Mary was not ready

17 An Example In the morning, the sky looked much brighter. S In the
Modifier Head Subject Predicate Subordinator DU Modifier Modifier Head Head Verbal Complement Modifier Head In the morning, the sky looked much brighter

18 Hierarchical Breaking up and Categorial / Functional Labeling
Hierarchical Breaking up coupled with Categorial /Functional Labeling is a very powerful device. But there are ambiguities which demand something more powerful. E.g., Love of God Someone loves God God loves someone

19 Hierarchical Breaking up
Categorial Labeling Functional Labeling Love of God Love of God Head Modifier Noun Phrase Prepositional Phrase Sub DU love of God love of God

20 Types of Generative Grammar
Finite State Model (sequential) Phrase Structure Model (sequential + hierarchical) + (categorial) Transformational Model (sequential + hierarchical + transformational) + (categorial + functional)

21 Phrase Structure Grammar (PSG)
A phrase-structure grammar G consists of a four tuple (V, T, S, P), where V is a finite set of alphabets (or vocabulary) E.g., N, V, A, Adv, P, NP, VP, AP, AdvP, PP, student, sing, etc. T is a finite set of terminal symbols: T  V E.g., student, sing, etc. S is a distinguished non-terminal symbol, also called start symbol: S  V P is a set of productions.

22 Noun Phrases John the student the intelligent student NP NP NP N Det N
AdjP N John the student the intelligent student

23 Noun Phrase his first five PhD students NP Det Ord Quant N N his first

24 Noun Phrase The five best students of my class NP Det Quant PP AP N

25 Verb Phrases can sing can hit the ball VP VP Aux V Aux V NP can sing

26 Verb Phrase Can give a flower to Mary VP Aux V NP PP can give a flower

27 Verb Phrase may make John the chairman VP Aux V NP NP may make John

28 Verb Phrase may find the book very interesting VP Aux V NP AP may find

29 Prepositional Phrases
in the classroom near the river PP PP P NP P NP in the classroom near the river

30 Adjective Phrases intelligent very honest fond of sweets AP AP AP A
Degree A A PP intelligent very honest fond of sweets

31 that she might have done badly in the assignment
Adjective Phrase very worried that she might have done badly in the assignment AP Degree A S’ very worried that she might have done badly in the assignment

32 Phrase Structure Rules
The boy hit the ball. Rewrite Rules: (i) S  NP VP (ii) NP  Det N (iii) VP  V NP Det  the N  man, ball V  hit We interpret each rule X  Y as the instruction rewrite X as Y.

33 Derivation The boy hit the ball. Sentence NP + VP (i)
Det + N + VP (ii) Det + N + V + NP (iii) The + N + V + NP (iv) The + boy + V + NP (v) The + boy + hit + NP (vi) The + boy + hit + Det + N (ii) The + boy + hit + the + N (iv) The + boy + hit + the + ball (v)

34 PSG Parse Tree The boy hit the ball. S NP VP Det N V NP the Det N boy

35 PSG Parse Tree John wrote those words in the Book of Proverbs. S NP VP
PropN V NP PP NP P NP PP John wrote those words in the book of proverbs

36 Penn POS Tags [John/NNP ] wrote/VBD [ those/DT words/NNS ] in/IN
John wrote those words in the Book of Proverbs. [John/NNP ] wrote/VBD [ those/DT words/NNS ] in/IN [ the/DT Book/NN ] of/IN [ Proverbs/NNS ]

37 Penn Treebank (S (NP-SBJ (NP John)) (VP wrote (NP those words)
John wrote those words in the Book of Proverbs. (S (NP-SBJ (NP John)) (VP wrote (NP those words) (PP-LOC in (NP (NP-TTL (NP the Book) (PP of (NP Proverbs)))

38 PSG Parse Tree Official trading in the shares will start in Paris on Nov 6. S NP VP NP PP Aux V PP PP NP AP N P A official trading in the shares will start in Paris on Nov 6

39 Penn POS Tags [ Official/JJ trading/NN ] in/IN [ the/DT shares/NNS ]
Official trading in the shares will start in Paris on Nov 6. [ Official/JJ trading/NN ] in/IN [ the/DT shares/NNS ] will/MD start/VB in/IN [ Paris/NNP ] on/IN [ Nov./NNP 6/CD ]

40 Penn Treebank Official trading in the shares will start in Paris on Nov 6. ( (S (NP-SBJ (NP Official trading) (PP in (NP the shares))) (VP will (VP start (PP-LOC in (NP Paris)) (PP-TMP on (NP (NP Nov 6)

41 Penn POS Tag Sset Adjective: JJ Adverb: RB Cardinal Number: CD
Determiner: DT Preposition: IN Coordinating Conjunction CC Subordinating Conjunction: IN Singular Noun: NN Plural Noun: NNS Personal Pronoun: PP Proper Noun: NP Verb base form: VB Modal verb: MD Verb (3sg Pres): VBZ Wh-determiner: WDT Wh-pronoun: WP

42 Difference between constituency and dependency

43 Constituency Grammar Categorical Uses part of speech
Context Free Grammar (CFG) Basic elements  Phrases

44 Dependency Grammar Functional Context Free Grammar
Basic elements  Units of Predication/ Modification/ Complementation/ Subordination/ Co-ordination

45 Bridge between Constituency and Dependency parse
Constituency uses phrases Dependencies consist of Head-modifier combination This is a cricket bat. Cricket (Category: N, Functional: Adj) Bat (Category: N, Functional: N) For languages which are free word order we use dependency parser to uncover the relations between the words. Raam ne Shaam ko dekha . (Ram saw Shyam) Shaam ko Ram ne dekha. (Ram saw Shyam) Case markers cling to the nouns they subordinate

46 Example of CG and DG output
Birds Fly. S S NP VP Subject Predicate N V Birds fly Birds fly

47 Some probabilistic parsers and why they are used
Stanford, Collins, Charniack, RASP Why Probabilistic parsers For a single sentence we can have multiple parses. Probability for the parse is calculated and then the parse with the highest probability is selected. This is needed in many applications of NLP, that need parsing.


Download ppt "Pushpak Bhattacharyya CSE Dept., IIT Bombay 21st March, 2011"

Similar presentations


Ads by Google