1 In Search of a More Probable Parse: Experiments with DOP* and the Penn Chinese Treebank Aaron Meyers Linguistics 490 Winter 2009

2 Syntax 101
Given a sentence, produce a syntax tree (parse)
Example: ‘Mary likes books’
Software which does this is known as a parser

3 Grammars
Context-Free Grammar (CFG)
▫Simple rules describing potential configurations
▫From the example:
 S → NP VP
 NP → Mary
 VP → V NP
 V → likes
 NP → books
Problems with ambiguity: a bare CFG cannot choose among multiple valid parses of one sentence
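A minimal sketch of this grammar as runnable code; NLTK and its chart parser are assumptions here, since the slides name no tool:

import nltk

# The toy grammar from the slide, in NLTK's CFG notation.
grammar = nltk.CFG.fromstring('''
    S  -> NP VP
    NP -> 'Mary' | 'books'
    VP -> V NP
    V  -> 'likes'
''')

parser = nltk.ChartParser(grammar)
for tree in parser.parse(['Mary', 'likes', 'books']):
    print(tree)  # (S (NP Mary) (VP (V likes) (NP books)))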

4 Tree Substitution Grammar (TSG)
Incorporates larger tree fragments
Substitution operator (◦) combines fragments
A context-free grammar is a trivial TSG
[Figure: fragments combined by substitution (◦) into a complete parse tree]
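A hand-rolled sketch of the substitution operator; the slides do not show the dopdis internals, so treating bare frontier nonterminals as open substitution sites is an assumption:

from nltk import Tree

def substitute(fragment, piece):
    # Fill the leftmost open site whose label matches piece's root (the ◦ operator).
    for pos in fragment.treepositions('leaves'):
        if fragment[pos] == piece.label():   # bare nonterminal leaf = open site
            result = fragment.copy(deep=True)
            result[pos] = piece
            return result
    raise ValueError('no matching substitution site')

f = Tree.fromstring('(S NP (VP (V likes) NP))')      # two open NP sites
tree = substitute(substitute(f, Tree.fromstring('(NP Mary)')),
                  Tree.fromstring('(NP books)'))
print(tree)  # (S (NP Mary) (VP (V likes) (NP books)))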

5 Treebanks
Database of sentences and corresponding syntax trees
▫Trees are hand-annotated
Penn Treebanks among the most commonly used
Grammars can be created automatically from a treebank (training)
▫Extract rules (CFG) or fragments (TSG) directly from the trees
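For the CFG case, extraction can be read straight off the trees; the sketch below uses the small Penn (English) Treebank sample that ships with NLTK as a stand-in corpus, an assumption for illustration only:

from nltk.corpus import treebank   # requires nltk.download('treebank')

rules = set()
for tree in treebank.parsed_sents()[:100]:
    rules.update(tree.productions())   # one CFG rule per internal node

for rule in sorted(rules, key=str)[:5]:
    print(rule)   # e.g. ADJP -> JJ PP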

6 Learning Grammar from Treebank
Many rules or fragments will occur repeatedly
▫Incorporate frequencies into the grammar
▫Probabilistic Context-Free Grammar (PCFG), Stochastic Tree Substitution Grammar (STSG)
Data-Oriented Parsing (DOP) model
▫DOP1 (1992): a type of STSG
▫Describes how to extract fragments from a treebank for inclusion in the grammar (model)
▫Fragments are generally limited to a certain maximum depth
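A sketch of the frequency step for the PCFG case, using NLTK's relative-frequency estimator; DOP1 applies the same idea to multi-level fragments rather than single rules. The corpus is again the NLTK sample, an assumption:

import nltk
from nltk.corpus import treebank

productions = []
for tree in treebank.parsed_sents()[:200]:
    productions += tree.productions()

# Relative frequency: P(rule) = count(rule) / count(all rules with the same left-hand side)
pcfg = nltk.induce_pcfg(nltk.Nonterminal('S'), productions)
for prod in pcfg.productions()[:5]:
    print(prod)   # e.g. NP -> DT NN [0.2313]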

7 Penn Chinese Treebank
Latest version 6.0 (2007)
▫Xinhua newswire (7339 sentences)
▫Sinorama news magazine (7106 sentences)
▫Hong Kong news (519 sentences)
▫ACE Chinese broadcast news (9246 sentences)

9 Penn Chinese Treebank and DOP
Corpus: Penn Chinese Treebank 6.0 (composition as on slide 7)
Previous experiments (Hearne & Way 2004) with the Penn Chinese Treebank and DOP1
▫1473 trees selected from Xinhua newswire
▫Fragment depth limited to three levels or less

10 An improved DOP model: DOP*
Challenges with the DOP1 model
▫Computationally inefficient (exponential increase in the number of fragments extracted)
▫Statistically inconsistent
A new estimator: DOP* (2005)
▫Limits fragment extraction by estimating the optimal fragments using subsets of the training corpus
 Linear rather than exponential increase in fragments
▫Statistically consistent (accuracy increases as the size of the training corpus increases)

11 Research Question & Hypothesis
Will a DOP* parser applied to the Penn Chinese Treebank show a significant improvement in accuracy for a model incorporating fragments up to depth five, compared to a model incorporating only fragments up to depth three?
Hypothesis: Yes, accuracy will significantly increase
▫Deeper fragments allow the parser to capture non-local dependencies in syntactic usage and preference

12 Selecting training and testing data
Subset of Xinhua newswire (2402 sentences)
▫Includes only IP trees (no headlines or fragments)
Excluded sentences of average or greater length, leaving 1402 sentences
The remaining 1402 sentences were divided three times into random training/test splits (sketched below)
▫Each test split has 140 sentences
▫The other 1262 sentences are used for training
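A sketch of the splitting procedure; the trees variable and the seed values are assumptions, since the slides specify only the split sizes:

import random

def make_split(trees, test_size=140, seed=0):
    # One random split of the 1402 retained trees: 140 test, 1262 training.
    rng = random.Random(seed)
    shuffled = list(trees)
    rng.shuffle(shuffled)
    return shuffled[test_size:], shuffled[:test_size]   # (training, test)

# Three independent splits, one per experimental run:
# splits = [make_split(trees, seed=s) for s in range(3)]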

13 Preparing the trees
Penn Treebank trees converted to dopdis format
Chinese characters converted to alphanumeric codes
Standard tree normalizations (sketched below)
▫Removed empty nodes
▫Removed A-over-A and X-over-A unaries
▫Stripped functional tags
Original: (IP (NP-PN-SBJ (NR 上海 ) (NR 浦东 )) (VP …
Converted: (ip,[(np,[(nr,[(hmeiahodpp_,[])]),(nr,[(hodoohmejc_,[])])]),(vp, …
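A sketch of the listed normalizations on an nltk.Tree; the -NONE- label for empty elements and the dash convention for functional tags follow standard Penn Treebank annotation, and the dopdis conversion and character re-encoding are not reproduced:

from nltk import Tree

def normalize(tree):
    # Drop empty nodes, strip functional tags, collapse A-over-A unaries.
    if isinstance(tree, str):                    # lexical leaf: keep as-is
        return tree
    kids = []
    for child in tree:
        if isinstance(child, Tree) and child.label() == '-NONE-':
            continue                             # remove empty node
        kid = normalize(child)
        if kid is not None:
            kids.append(kid)
    if not kids:
        return None                              # node emptied out: remove it too
    label = tree.label().split('-')[0].split('=')[0]   # NP-PN-SBJ -> NP
    if len(kids) == 1 and isinstance(kids[0], Tree) and kids[0].label() == label:
        return kids[0]                           # collapse A-over-A unary
    return Tree(label, kids)

print(normalize(Tree.fromstring(
    '(IP (NP-PN-SBJ (NR Shanghai) (NR Pudong)) (VP (VV develops)))')))
# (IP (NP (NR Shanghai) (NR Pudong)) (VP (VV develops)))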

14 Training & testing the parser
The DOP* parser is created by training a model on the training trees
The parser is then tested by processing the test sentences
▫Parse trees returned by the parser are compared with the original parse trees from the treebank
Standard evaluation metrics computed: labeled recall, labeled precision, and F-score (the harmonic mean of the two); see the sketch below
Repeated for each depth level and each test/training split
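A hand-rolled sketch of the labeled-bracket (PARSEVAL-style) metrics; the actual evaluation used the dopdis tooling, so the details here are assumptions. Sets are used for brevity, whereas strict PARSEVAL matches duplicate brackets as multisets:

from nltk import Tree

def labeled_spans(tree, start=0):
    # Return (set of (label, start, end) constituents, end position of the span).
    if isinstance(tree, str):
        return set(), start + 1                  # a terminal covers one position
    spans, end = set(), start
    for child in tree:
        child_spans, end = labeled_spans(child, end)
        spans |= child_spans
    spans.add((tree.label(), start, end))
    return spans, end

def parseval(gold, candidate):
    g, _ = labeled_spans(gold)
    c, _ = labeled_spans(candidate)
    matched = len(g & c)
    recall, precision = matched / len(g), matched / len(c)
    fscore = 2 * precision * recall / (precision + recall)
    return recall, precision, fscore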

15 Parsing Results

Depth | Labeled Recall | Labeled Precision | F-score
  1   |     59.01%     |      58.14%       | 58.57%
  3   |     71.64%     |      67.42%       | 69.47%
  5   |     72.27%     |      67.80%       | 69.96%

16 Other interesting statistics

Depth | Fragments Extracted | Training Time (h) | Testing Time (h) | Seconds/Sentence
  1   |        6,687        |       1.659       |      0.336       |       8.64
  3   |       50,533        |       3.342       |      0.605       |      15.56
  5   |      166,760        |       4.099       |      6.069       |     156.06

Training time at depth 3 and depth 5 is similar, even though depth 5 extracts more than three times as many fragments
Testing time at depth 5, however, is ten times higher than at depth 3!

17 Conclusion
Parsing results for the other two test/training splits are still to be obtained; if they are similar:
Increasing fragment extraction depth from three to five does not significantly improve accuracy for a DOP* parser over the Penn Chinese Treebank
▫Statistical significance remains to be determined
▫Any practical benefit is negated by the increased parsing time

18 Future Work
Increase the size of the training corpus
▫DOP* estimation is consistent: accuracy should increase as a larger training corpus is used
Perform the experiment with the DOP1 model
▫Accuracy obtained with DOP* is lower than in previous experiments using DOP1 (Hearne & Way 2004)
Qualitative analysis
▫Which constructions are captured more accurately?

19 Future Work
Perform experiments with other corpora
▫Other sections of the Chinese Treebank
▫Other treebanks: Penn Arabic Treebank, …
Increase the capacity and stability of the dopdis system
▫Encountered various failures on larger runs, with crashes after as long as 36 hours
▫Efficiency could be improved by larger memory support (64-bit architecture) and by storing and indexing fragments in a relational database system

