Probabilistic Parsing Ling 571 Fei Xia Week 5: 10/25-10/27/05.


1 Probabilistic Parsing Ling 571 Fei Xia Week 5: 10/25-10/27/05

2 Outline Lexicalized CFG (recap) Hw5 and Project 2 Parsing evaluation measures: ParseVal Collins' parser TAG Parsing summary

3 Lexicalized CFG recap

4 Important equations

5 Lexicalized CFG Lexicalized rules suffer from a sparse data problem. To combat it, decompose the rule probability: –First generate the head –Then generate the unlexicalized rule
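The head-then-rule decomposition can be sketched with a minimal MLE estimator (the event format and counts below are invented toy data; the point is the factorization, not any particular parser's tables):

```python
from collections import Counter

# Toy treebank events (invented): (parent, head word, unlexicalized rule)
events = [
    ("VP", "likes", "VP -> V NP"),
    ("VP", "likes", "VP -> V NP"),
    ("VP", "likes", "VP -> V"),
    ("VP", "sleeps", "VP -> V"),
]

rule_counts = Counter(events)
head_counts = Counter((p, h) for p, h, _ in events)

def p_rule(parent, head, rule):
    """MLE of P(unlexicalized rule | parent, head word) --
    the second factor of the head-then-rule decomposition."""
    return rule_counts[(parent, head, rule)] / head_counts[(parent, head)]

print(p_rule("VP", "likes", "VP -> V NP"))  # 2/3
```

Conditioning the rule on the head word (rather than storing fully lexicalized rules) is what keeps the parameter tables small enough to estimate.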

6 Lexicalized models

7 An example he likes her

8 An example he likes her

9 Head-head probability

10 Head-rule probability

11 Estimate parameters

12 Building a statistical tool Design a model: –Objective function: generative model vs. discriminative model –Decomposition: independence assumptions –The types of parameters and parameter size Training: estimate model parameters –Supervised vs. unsupervised –Smoothing methods Decoding: find the best structure given the trained model

13 Team Project 1 (Hw5) Form a team: programming language, schedule, expertise, etc. Understand the lexicalized model Design the training algorithm Work out the decoding (parsing) algorithm: augment the CYK algorithm Illustrate the algorithms with a real example

14 Team Project 2 Task: parse real data with a real grammar extracted from a treebank. Parser: PCFG or lexicalized PCFG Training data: English Penn Treebank Section 02-21 Development data: section 00

15 Team Project 2 (cont) Hw6: extract PCFG from the treebank Hw7: make sure your parser works given real grammar and real sentences; measure parsing performance Hw8: improve parsing results Hw10: write a report and give a presentation

16 Parsing evaluation measures

17 Evaluation of parsers: ParseVal Labeled recall: # of correct constituents in the parser's output / # of constituents in the gold standard Labeled precision: # of correct constituents in the parser's output / # of constituents in the parser's output Labeled F-measure: 2PR/(P+R), the harmonic mean of labeled precision and recall Complete match: % of sents where recall and precision are both 100% Average crossing: # of crossing brackets per sent No crossing: % of sents which have no crossing brackets

18 An example Gold standard: (VP (V saw) (NP (Det the) (N man)) (PP (P with) (NP (Det a) (N telescope)))) Parser output: (VP (V saw) (NP (NP (Det the) (N man)) (PP (P with) (NP (Det a) (N telescope)))))

19 ParseVal measures Gold standard: (VP, 1, 6), (NP, 2, 3), (PP, 4, 6), (NP, 5, 6) System output: (VP, 1, 6), (NP, 2, 6), (NP, 2, 3), (PP, 4, 6), (NP, 5, 6) Recall=4/4, Prec=4/5, crossing=0

20 A different annotation Gold standard: (VP (V saw) (NP (Det the) (N’ (N man))) (PP (P with) (NP (Det a) (N’ (N telescope))))) Parser output: (VP (V saw) (NP (Det the) (N’ (N man) (PP (P with) (NP (Det a) (N’ (N telescope)))))))

21 ParseVal measures (cont) Gold standard: (VP, 1, 6), (NP, 2, 3), (N’, 3, 3), (PP, 4, 6), (NP, 5, 6), (N’, 6,6) System output: (VP, 1, 6), (NP, 2, 6), (N’, 3, 6), (PP, 4, 6), (NP, 5, 6), (N’, 6, 6) Recall=4/6, Prec=4/6, crossing=1
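The ParseVal arithmetic on slides 19 and 21 can be sketched directly over (label, start, end) triples with inclusive spans, as on the slides (a minimal version; real EVALB additionally handles punctuation, empty elements, and other parameter-file options):

```python
def parseval(gold, system):
    """Labeled precision, labeled recall, and crossing-bracket count,
    given constituent lists of (label, start, end) with inclusive spans."""
    gold_set, sys_set = set(gold), set(system)
    match = len(gold_set & sys_set)
    recall = match / len(gold_set)
    precision = match / len(sys_set)

    def crosses(a, b):
        # Two spans cross if they overlap but neither contains the other.
        (s1, e1), (s2, e2) = a, b
        return (s1 < s2 <= e1 < e2) or (s2 < s1 <= e2 < e1)

    crossing = sum(
        1 for _, s, e in system
        if any(crosses((s, e), (gs, ge)) for _, gs, ge in gold)
    )
    return precision, recall, crossing

# Slide 21's example:
gold = [("VP",1,6), ("NP",2,3), ("N'",3,3), ("PP",4,6), ("NP",5,6), ("N'",6,6)]
out  = [("VP",1,6), ("NP",2,6), ("N'",3,6), ("PP",4,6), ("NP",5,6), ("N'",6,6)]
print(parseval(gold, out))  # precision 4/6, recall 4/6, crossing 1
```

Here the system's N'(3,6) crosses the gold NP(2,3), which is why the crossing count is 1 even though four constituents match exactly.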

22 EVALB A tool that calculates the ParseVal measures To run it: evalb -p parameter_file gold_file system_output A copy is available in my dropbox You will need it for Team Project 2

23 Summary of parsing evaluation measures ParseVal is the most widely used: –F-measure is the most important –The results depend on the annotation style EVALB is a tool that calculates the ParseVal measures Other measures are used too: e.g., accuracy of dependency links

24 History-based models

25 History-based approaches A history-based approach maps (T, S) into a decision sequence (d_1, …, d_n) The probability of tree T for sentence S is: P(T, S) = ∏_{i=1..n} P(d_i | d_1, …, d_{i-1})

26 History-based models (cont) PCFGs can be viewed as history-based models There are other history-based models –Magerman’s parser (1995) –Collins’ parsers (1996, 1997, …) –Charniak’s parsers (1996, 1997, …) –Ratnaparkhi’s parser (1997)

27 Collins’ models Model 1: Generative model of (Collins, 1996) Model 2: Add complement/adjunct distinction Model 3: Add wh-movement

28 Model 1 First generate the head constituent label Then generate left and right dependents
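The head-outward generation of Model 1 can be sketched as follows (a deliberately simplified version: the real model also conditions on distance features and uses smoothed estimates, and the probability tables here are invented toy values):

```python
import math

STOP = "STOP"

def rule_logprob(parent, head_label, left_deps, right_deps, p_head, p_dep):
    """Log-probability of a rule's RHS under a head-outward decomposition:
    generate the head child given the parent, then the left and right
    dependents outward, ending each direction with STOP."""
    lp = math.log(p_head[(head_label, parent)])
    for side, deps in (("L", left_deps), ("R", right_deps)):
        for d in deps + [STOP]:
            lp += math.log(p_dep[(d, side, parent, head_label)])
    return lp

# Toy tables (invented numbers) for the rule S -> NP VP with head child VP:
p_head = {("VP", "S"): 0.6}
p_dep = {("NP", "L", "S", "VP"): 0.5,
         (STOP, "L", "S", "VP"): 0.4,
         (STOP, "R", "S", "VP"): 0.9}

lp = rule_logprob("S", "VP", ["NP"], [], p_head, p_dep)
print(math.exp(lp))  # 0.6 * 0.5 * 0.4 * 0.9 ≈ 0.108
```

The STOP symbols are what let the model decide how many dependents to generate on each side, instead of storing a separate parameter per rule length.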

29 Model 1(cont)

30 An example Sentence: Last week Marks bought Brooks.

31 Model 2 Generate a head label H Choose left and right subcat frames Generate left and right arguments Generate left and right modifiers

32 An example

33 Model 3 Add Trace and wh-movement Given that the LHS of a rule has a gap, there are three ways to pass down the gap –Head: S(+gap)  NP VP(+gap) –Left: S(+gap)  NP(+gap) VP –Right: SBAR(that)(+gap)  WHNP(that) S(+gap)

34 Parsing results
          LR      LP
Model 1   87.4%   88.1%
Model 2   88.1%   88.6%
Model 3   88.1%   88.6%

35 Tree Adjoining Grammar (TAG)

36 TAG TAG basics Extensions of TAG: –Lexicalized TAG (LTAG) –Synchronous TAG (STAG) –Multi-component TAG (MCTAG) –…

37 TAG basics A tree-rewriting formalism (Joshi et al., 1975) It can generate mildly context-sensitive languages. The primitive elements of a TAG are elementary trees. Elementary trees are combined by two operations: substitution and adjoining. TAG has been used in –parsing, semantics, discourse, etc. –machine translation, summarization, generation, etc.

38 Two types of elementary trees Initial tree (anchored by “draft”): S  NP VP, VP  V NP Auxiliary tree (anchored by “still”): VP  ADVP VP*, ADVP  ADV still (the starred VP* is the foot node)

39 Substitution operation

40 They draft policies

41 Adjoining operation (figure: an auxiliary tree rooted in Y, with foot node Y*, is spliced into a tree at a Y node)

42 They still draft policies
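The two operations on the “they still draft policies” example can be sketched on nested-list trees; the `!` marker for substitution sites and `*` for foot nodes are ad-hoc notation for this sketch, not standard TAG syntax:

```python
def substitute(tree, address, initial):
    """Replace the substitution node at `address` (a path of child
    indices) with the initial tree `initial`; root labels must match."""
    if not address:
        label = tree if isinstance(tree, str) else tree[0]
        assert label.rstrip("!") == initial[0]
        return initial
    i, rest = address[0], address[1:]
    return [tree[0]] + [substitute(c, rest, initial) if j == i else c
                        for j, c in enumerate(tree[1:])]

def adjoin(tree, address, aux):
    """Splice auxiliary tree `aux` in at `address`: the subtree there is
    detached, `aux` takes its place, and the detached subtree fills the
    foot node of `aux` (the leaf whose label ends in '*')."""
    def fill_foot(t, sub):
        if isinstance(t, str):
            return sub if t.endswith("*") else t
        return [t[0]] + [fill_foot(c, sub) for c in t[1:]]
    if not address:
        return fill_foot(aux, tree)
    i, rest = address[0], address[1:]
    return [tree[0]] + [adjoin(c, rest, aux) if j == i else c
                        for j, c in enumerate(tree[1:])]

# Build "they draft policies", then adjoin "still" at the VP node:
draft = ["S", "NP!", ["VP", ["V", "draft"], "NP!"]]
they = ["NP", ["PN", "they"]]
policies = ["NP", ["N", "policies"]]
still = ["VP", ["ADVP", ["ADV", "still"]], "VP*"]

t = substitute(draft, [0], they)
t = substitute(t, [1, 1], policies)
final = adjoin(t, [1], still)
print(final)
```

Substitution only ever fills a frontier node, while adjoining inserts material at an internal node; this is what gives TAG its extended domain of locality.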

43 Derivation tree (figure: the elementary trees, the derived tree, and the derivation tree that records how the elementary trees were combined)

44 Derived tree vs. derivation tree The mapping is not 1-to-1. Finding the best derivation is not the same as finding the best derived tree.

45 Wh-movement What do they draft? (figure: the elementary trees for “what”, “do”, “they”, and “draft”, and the derived tree with the wh-phrase co-indexed with its trace)

46 Long-distance wh-movement What does John think they draft? (figure: the auxiliary tree for “think” adjoins into the “draft” tree, stretching the wh-dependency across the embedded clause)

47 Who did you have dinner with? (figure: the elementary trees, including one for the stranded preposition “with”, and the derived tree with “who” co-indexed with its trace)

48 TAG extensions Lexicalized TAG (LTAG) Synchronous TAG (STAG) Multi-component TAG (MCTAG) …

49 STAG The primitive elements in STAG are elementary tree pairs STAG has been used for MT

50 Summary of TAG A formalism beyond CFG Primitive elements are trees, not rules Extended domain of locality Two operations: substitution and adjoining Parsing algorithms Statistical parsers for TAG Algorithms for extracting TAGs from treebanks

51 Parsing summary

52 Types of parsers Phrase structure vs. dependency tree Statistical vs. rule-based Grammar-based or not Supervised vs. unsupervised Our focus:  Phrase structure  Mainly statistical  Mainly grammar-based: CFG, TAG  Supervised

53 Grammars Chomsky hierarchy: –Unrestricted grammar (type 0) –Context-sensitive grammar (type 1) –Context-free grammar (type 2) –Regular grammar (type 3)  Human languages are beyond context-free Other formalisms –HPSG, LFG –TAG –Dependency grammars

54 Parsing algorithms for CFG Top-down Bottom-up Top-down with bottom-up filtering Earley algorithm CYK algorithm –Requires the CFG to be in CNF –Can be augmented to handle PCFG, lexicalized CFG, etc.
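A minimal CYK recognizer for a CNF grammar, applied to the “he likes her” example from the earlier slides (a sketch: a full parser would also store backpointers to recover the trees, and the PCFG version would keep max probabilities per cell instead of a set of labels):

```python
def cyk(words, grammar, start="S"):
    """CYK recognizer for a CFG in Chomsky Normal Form.
    grammar maps each LHS to a list of RHSs: (B, C) for binary rules,
    (terminal,) for lexical rules."""
    n = len(words)
    # chart[i][j] holds the set of labels deriving words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for lhs, rhss in grammar.items():
            if (w,) in rhss:
                chart[i][i + 1].add(lhs)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for lhs, rhss in grammar.items():
                    for rhs in rhss:
                        if (len(rhs) == 2 and rhs[0] in chart[i][k]
                                and rhs[1] in chart[k][j]):
                            chart[i][j].add(lhs)
    return start in chart[0][n]

# Toy CNF grammar (invented) for the running example:
grammar = {
    "S":  [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("he",), ("her",)],
    "V":  [("likes",)],
}
print(cyk("he likes her".split(), grammar))  # True
```

The triple loop over span, start position, and split point is what gives CYK its O(n^3) time in the sentence length.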

55 Extensions of CFG PCFG: find the most likely parse tree Lexicalized CFG: –Uses weaker independence assumptions –Accounts for certain types of lexical and structural dependencies

56 Beyond CFG History-based models –Collins’ parsers TAG –Tree-rewriting formalism –Mildly context-sensitive –Many extensions: LTAG, STAG, …

57 Statistical approach Modeling –Choose the objective function –Decompose the function: Common equations: joint, conditional, marginal probabilities Independence assumptions Training –Supervised vs. unsupervised –Smoothing Decoding –Dynamic programming –Pruning

58 Evaluation of parsers Accuracy: ParseVal Robustness Resources needed Efficiency Richness

59 Other things Converting into CNF: –CFG –PCFG –Lexicalized CFG Treebank annotation –Tagset: syntactic labels, POS tag, function tag, empty categories –Format: indentation, brackets
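The binarization step of CNF conversion can be sketched as follows (one step only: unit-production and epsilon removal, and the PCFG/lexicalized variants mentioned above, are not shown; the intermediate-symbol naming scheme is an arbitrary choice for this sketch):

```python
def binarize(rules):
    """Binarize rules whose RHS is longer than 2 by introducing fresh
    intermediate symbols, right-branching: A -> B C D becomes
    A -> B A|C_D and A|C_D -> C D."""
    out = []
    for lhs, rhs in rules:
        while len(rhs) > 2:
            new = f"{lhs}|{'_'.join(rhs[1:])}"   # fresh intermediate symbol
            out.append((lhs, (rhs[0], new)))
            lhs, rhs = new, rhs[1:]
        out.append((lhs, rhs))
    return out

print(binarize([("VP", ("V", "NP", "PP"))]))
```

For a PCFG, the original rule's probability would go on the first binary rule and the intermediate rules would get probability 1, so the tree probabilities are preserved.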

