
1
Probabilistic Parsing
Ling 571, Fei Xia
Week 5: 10/25-10/27/05

2
Outline
– Lexicalized CFG (recap)
– Hw5 and Project 2
– Parsing evaluation measures: ParseVal
– Collins’ parser
– TAG
– Parsing summary

3
Lexicalized CFG recap

4
Important equations

5
Lexicalized CFG
Lexicalized rules: sparse data problem
– First generate the head
– Then generate the unlexicalized rule
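The two-step generation above can be written out as a product. This is a sketch in generic notation (the symbols here are illustrative and not necessarily the exact notation used in class):

```latex
% X = parent label, h = its head word/tag, r = the unlexicalized rule
% expanding X. Generating the head first, and then the rule conditioned
% on it, factors one sparse lexicalized-rule table into two smaller ones:
P(r, h \mid X) \;=\;
  \underbrace{P(h \mid X)}_{\text{generate the head}}
  \times
  \underbrace{P(r \mid X, h)}_{\text{generate the unlexicalized rule}}
```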

6
Lexicalized models

7
An example: he likes her

8
An example: he likes her

9
Head-head probability

10
Head-rule probability

11
Estimate parameters

12
Building a statistical tool
Design a model:
– Objective function: generative model vs. discriminative model
– Decomposition: independence assumptions
– The types of parameters and parameter size
Training: estimate model parameters
– Supervised vs. unsupervised
– Smoothing methods
Decoding

13
Team Project 1 (Hw5)
– Form a team: programming language, schedule, expertise, etc.
– Understand the lexicalized model
– Design the training algorithm
– Work out the decoding (parsing) algorithm: augment the CYK algorithm
– Illustrate the algorithms with a real example

14
Team Project 2
– Task: parse real data with a real grammar extracted from a treebank
– Parser: PCFG or lexicalized PCFG
– Training data: English Penn Treebank, Sections 02-21
– Development data: Section 00

15
Team Project 2 (cont)
– Hw6: extract a PCFG from the treebank
– Hw7: make sure your parser works given a real grammar and real sentences; measure parsing performance
– Hw8: improve parsing results
– Hw10: write a report and give a presentation

16
Parsing evaluation measures

17
Evaluation of parsers: ParseVal
– Labeled recall: # of correct constituents in parser output / # of constituents in the gold standard
– Labeled precision: # of correct constituents in parser output / # of constituents in parser output
– Labeled F-measure: harmonic mean of labeled precision and recall
– Complete match: % of sents where recall and precision are 100%
– Average crossing: # of crossing brackets per sent
– No crossing: % of sents which have no crossing

18
An example
Gold standard: (VP (V saw) (NP (Det the) (N man)) (PP (P with) (NP (Det a) (N telescope))))
Parser output: (VP (V saw) (NP (NP (Det the) (N man)) (PP (P with) (NP (Det a) (N telescope)))))

19
ParseVal measures
Gold standard: (VP, 1, 6), (NP, 2, 3), (PP, 4, 6), (NP, 5, 6)
System output: (VP, 1, 6), (NP, 2, 6), (NP, 2, 3), (PP, 4, 6), (NP, 5, 6)
Recall = 4/4, Prec = 4/5, crossing = 0
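The spans above can be scored mechanically. A minimal sketch, assuming constituents are (label, start, end) tuples as on this slide (the function names are mine):

```python
def parseval(gold, system):
    """Labeled recall, labeled precision, and F-measure over constituent sets."""
    gold_set, sys_set = set(gold), set(system)
    correct = len(gold_set & sys_set)       # constituents with matching label and span
    recall = correct / len(gold_set)
    precision = correct / len(sys_set)
    f = 2 * precision * recall / (precision + recall)
    return recall, precision, f

def crossing_brackets(gold, system):
    """Count system constituents that cross some gold constituent
    (overlap without either span containing the other)."""
    def crosses(a, b):
        (_, s1, e1), (_, s2, e2) = a, b
        return (s1 < s2 <= e1 < e2) or (s2 < s1 <= e2 < e1)
    return sum(any(crosses(c, g) for g in gold) for c in system)

# The PP-attachment example from this slide:
gold = [("VP", 1, 6), ("NP", 2, 3), ("PP", 4, 6), ("NP", 5, 6)]
system = [("VP", 1, 6), ("NP", 2, 6), ("NP", 2, 3), ("PP", 4, 6), ("NP", 5, 6)]
print(parseval(gold, system))            # recall 4/4, precision 4/5
print(crossing_brackets(gold, system))   # 0
```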

20
A different annotation
Gold standard: (VP (V saw) (NP (Det the) (N’ (N man))) (PP (P with) (NP (Det a) (N’ (N telescope)))))
Parser output: (VP (V saw) (NP (Det the) (N’ (N man) (PP (P with) (NP (Det a) (N’ (N telescope)))))))

21
ParseVal measures (cont)
Gold standard: (VP, 1, 6), (NP, 2, 3), (N’, 3, 3), (PP, 4, 6), (NP, 5, 6), (N’, 6, 6)
System output: (VP, 1, 6), (NP, 2, 6), (N’, 3, 6), (PP, 4, 6), (NP, 5, 6), (N’, 6, 6)
Recall = 4/6, Prec = 4/6, crossing = 1

22
EVALB
– A tool that calculates the ParseVal measures
– To run it: evalb -p parameter_file gold_file system_output
– A copy is available in my dropbox
– You will need it for Team Project 2

23
Summary of parsing evaluation measures
– ParseVal is the most widely used; F-measure is the most important
– The results depend on the annotation style
– EVALB is a tool that calculates the ParseVal measures
– Other measures are used too: e.g., accuracy of dependency links

24
History-based models

25
History-based approaches
A history-based approach maps (T, S) into a decision sequence d_1, …, d_n. The probability of tree T for sentence S is then the product of the decision probabilities, each conditioned on the history of earlier decisions:
P(T, S) = ∏_i P(d_i | d_1, …, d_{i-1})

26
History-based models (cont)
PCFGs can be viewed as a history-based model.
There are other history-based models:
– Magerman’s parser (1995)
– Collins’ parsers (1996, 1997, …)
– Charniak’s parsers (1996, 1997, …)
– Ratnaparkhi’s parser (1997)

27
Collins’ models
– Model 1: generative model of (Collins, 1996)
– Model 2: add complement/adjunct distinction
– Model 3: add wh-movement

28
Model 1
– First generate the head constituent label
– Then generate left and right dependents
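The two generation steps above can be sketched as one factored probability, roughly following Collins (1997); refinements such as the distance features are omitted here:

```latex
% P = parent label, h = its head word/tag, H = head child label;
% L_i(l_i), R_j(r_j) = left/right modifier nonterminals with their
% head words; L_{n+1} = R_{m+1} = STOP terminates each side.
P(L_n(l_n) \ldots L_1(l_1)\; H\; R_1(r_1) \ldots R_m(r_m) \mid P, h)
  = P_h(H \mid P, h)
    \times \prod_{i=1}^{n+1} P_l(L_i(l_i) \mid P, H, h)
    \times \prod_{j=1}^{m+1} P_r(R_j(r_j) \mid P, H, h)
```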

29
Model 1(cont)

30
An example
Sentence: Last week Marks bought Brooks.

31
Model 2
– Generate a head label H
– Choose left and right subcat frames
– Generate left and right arguments
– Generate left and right modifiers

32
An example

33
Model 3
Add trace and wh-movement. Given that the LHS of a rule has a gap, there are three ways to pass down the gap:
– Head: S(+gap) → NP VP(+gap)
– Left: S(+gap) → NP(+gap) VP
– Right: SBAR(that)(+gap) → WHNP(that) S(+gap)

34
Parsing results
         LR     LP
Model 1  87.4%  88.1%
Model 2  88.1%  88.6%
Model 3  88.1%  88.6%

35
Tree Adjoining Grammar (TAG)

36
TAG
– TAG basics
– Extensions of TAG:
  Lexicalized TAG (LTAG)
  Synchronous TAG (STAG)
  Multi-component TAG (MCTAG)
  …

37
TAG basics
– A tree-rewriting formalism (Joshi et al., 1975)
– It can generate mildly context-sensitive languages
– The primitive elements of a TAG are elementary trees
– Elementary trees are combined by two operations: substitution and adjoining
– TAG has been used in:
  parsing, semantics, discourse, etc.
  machine translation, summarization, generation, etc.

38
Two types of elementary trees
Initial tree: (S NP (VP (V draft) NP))
Auxiliary tree: (VP (ADVP (ADV still)) VP*)

39
Substitution operation

40
They draft policies

41
Adjoining operation: an auxiliary tree rooted in Y, with foot node Y*, is spliced in at a Y node

42
They still draft policies
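The two operations just illustrated can be sketched on toy tree structures. This assumes trees are nested lists [label, child, …], with substitution sites and foot nodes marked as leaf strings; all names here are mine, not from the slides:

```python
def substitute(tree, site_label, initial_tree):
    """Replace the first leaf labeled `site_label` (a substitution site)
    with an initial tree rooted in that label."""
    done = [False]
    def walk(t):
        if isinstance(t, str):
            if t == site_label and not done[0]:
                done[0] = True
                return initial_tree
            return t
        return [t[0]] + [walk(c) for c in t[1:]]
    return walk(tree)

def adjoin(tree, site_label, aux_tree, foot_label):
    """Splice an auxiliary tree in at the node labeled `site_label`;
    the old subtree takes the place of the foot node."""
    if isinstance(tree, str):
        return tree
    if tree[0] == site_label:
        return substitute(aux_tree, foot_label, tree)
    return [tree[0]] + [adjoin(c, site_label, aux_tree, foot_label) for c in tree[1:]]

def words(tree):
    """The yield (fringe) of a tree."""
    if isinstance(tree, str):
        return [tree]
    return [w for c in tree[1:] for w in words(c)]

# "They still draft policies": substitute the two NPs, then adjoin "still".
draft = ["S", "NP", ["VP", ["V", "draft"], "NP"]]
t = substitute(draft, "NP", ["NP", ["PN", "they"]])
t = substitute(t, "NP", ["NP", ["N", "policies"]])
t = adjoin(t, "VP", ["VP", ["ADVP", ["ADV", "still"]], "VP*"], "VP*")
print(words(t))  # ['they', 'still', 'draft', 'policies']
```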

43
Derivation tree
(Figure: the elementary trees, the derived tree, and the corresponding derivation tree)

44
Derived tree vs. derivation tree
– The mapping is not 1-to-1.
– Finding the best derivation is not the same as finding the best derived tree.

45
Wh-movement: What do they draft?
(Figure: elementary trees for “what”, “do”, and “draft”, and the derived tree with a co-indexed NP trace)

46
Long-distance wh-movement: What does John think they draft?
(Figure: the auxiliary trees for “does” and “think” adjoined into the “draft” tree, and the derived tree)

47
Who did you have dinner with?
(Figure: elementary trees for “who”, “with”, and “have”, and the derived tree)

48
TAG extensions
– Lexicalized TAG (LTAG)
– Synchronous TAG (STAG)
– Multi-component TAG (MCTAG)
– …

49
STAG
– The primitive elements in STAG are elementary tree pairs
– Used for machine translation (MT)

50
Summary of TAG
– A formalism beyond CFG
– Primitive elements are trees, not rules
– Extended domain of locality
– Two operations: substitution and adjoining
– Parsing algorithms
– Statistical parsers for TAG
– Algorithms for extracting TAGs from treebanks

51
Parsing summary

52
Types of parsers
– Phrase structure vs. dependency tree
– Statistical vs. rule-based
– Grammar-based or not
– Supervised vs. unsupervised
Our focus:
– Phrase structure
– Mainly statistical
– Mainly grammar-based: CFG, TAG
– Supervised

53
Grammars
Chomsky hierarchy:
– Unrestricted grammar (type 0)
– Context-sensitive grammar
– Context-free grammar
– Regular grammar
Human languages are beyond context-free.
Other formalisms:
– HPSG, LFG
– TAG
– Dependency grammars

54
Parsing algorithms for CFG
– Top-down
– Bottom-up
– Top-down with bottom-up filtering
– Earley algorithm
– CYK algorithm:
  Requires the CFG to be in CNF
  Can be augmented to deal with PCFG, lexicalized CFG, etc.
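The CYK augmentation for PCFG can be sketched as a Viterbi-style chart; a CNF grammar is assumed, and the toy grammar below for “he likes her” is illustrative, not from the slides:

```python
def pcyk(tokens, lexical, binary, start="S"):
    """Viterbi CYK for a PCFG in Chomsky Normal Form.
    lexical: {(A, word): prob}  for rules A -> word
    binary:  {(A, B, C): prob}  for rules A -> B C
    Returns the probability of the best parse rooted in `start`."""
    n = len(tokens)
    best = {}  # (i, j, A) -> prob of the best parse of tokens[i:j] labeled A
    for i, w in enumerate(tokens):
        for (A, word), p in lexical.items():
            if word == w and p > best.get((i, i + 1, A), 0.0):
                best[(i, i + 1, A)] = p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                  # split point
                for (A, B, C), p in binary.items():
                    cand = p * best.get((i, k, B), 0.0) * best.get((k, j, C), 0.0)
                    if cand > best.get((i, j, A), 0.0):
                        best[(i, j, A)] = cand
    return best.get((0, n, start), 0.0)

lexical = {("NP", "he"): 0.5, ("NP", "her"): 0.5, ("V", "likes"): 1.0}
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
print(pcyk(["he", "likes", "her"], lexical, binary))  # 0.25
```

A real parser would also store backpointers at each cell to recover the best tree, not just its probability.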

55
Extensions of CFG
– PCFG: find the most likely parse tree
– Lexicalized CFG:
  Uses less strong independence assumptions
  Accounts for certain types of lexical and structural dependencies

56
Beyond CFG
History-based models
– Collins’ parsers
TAG
– A tree-rewriting formalism
– Mildly context-sensitive grammar
– Many extensions: LTAG, STAG, …

57
Statistical approach
Modeling:
– Choose the objective function
– Decompose the function: common equations (joint, conditional, marginal probabilities); independence assumptions
Training:
– Supervised vs. unsupervised
– Smoothing
Decoding:
– Dynamic programming
– Pruning
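The supervised training step above can be sketched as relative-frequency estimation, with add-one smoothing standing in for the smoothing methods mentioned on the slide (the rule encoding and names are mine):

```python
from collections import Counter

def estimate_rule_probs(rule_counts, rhs_inventory):
    """P(rhs | lhs) by relative frequency with add-one smoothing, so
    (lhs, rhs) combinations unseen in training still get nonzero mass."""
    lhs_totals = Counter()
    for (lhs, _), c in rule_counts.items():
        lhs_totals[lhs] += c
    V = len(rhs_inventory)  # size of the smoothed event space per lhs
    return {(lhs, rhs): (rule_counts.get((lhs, rhs), 0) + 1) / (total + V)
            for lhs, total in lhs_totals.items()
            for rhs in rhs_inventory}

# Toy treebank counts: NP -> Det N seen 3 times, NP -> PN once.
counts = Counter({("NP", ("Det", "N")): 3, ("NP", ("PN",)): 1})
probs = estimate_rule_probs(counts, [("Det", "N"), ("PN",), ("NP", "PP")])
# V = 3, total(NP) = 4:
# P(Det N | NP) = 4/7,  P(PN | NP) = 2/7,  P(NP PP | NP) = 1/7 (unseen)
```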

58
Evaluation of parsers
– Accuracy: ParseVal
– Robustness
– Resources needed
– Efficiency
– Richness

59
Other things
Converting into CNF:
– CFG
– PCFG
– Lexicalized CFG
Treebank annotation:
– Tagset: syntactic labels, POS tags, function tags, empty categories
– Format: indentation, brackets
