Fast Full Parsing by Linear-Chain Conditional Random Fields Yoshimasa Tsuruoka, Jun’ichi Tsujii, and Sophia Ananiadou The University of Manchester.



Presentation on theme: "Fast Full Parsing by Linear-Chain Conditional Random Fields Yoshimasa Tsuruoka, Jun’ichi Tsujii, and Sophia Ananiadou The University of Manchester."— Presentation transcript:

1 Fast Full Parsing by Linear-Chain Conditional Random Fields Yoshimasa Tsuruoka, Jun’ichi Tsujii, and Sophia Ananiadou The University of Manchester

2 Outline
Motivation
Parsing algorithm
– Chunking with conditional random fields
– Searching for the best parse
Experiments
– Penn Treebank
Conclusions

3 Motivation
Parsers are useful in many NLP applications
– Information extraction, summarization, MT, etc.
But parsing is often the most computationally expensive component in the NLP pipeline
Fast parsing is useful when
– The document collection is large (e.g. the MEDLINE corpus: 70 million sentences)
– Real-time processing is required (e.g. web applications)

4 Parsing algorithms
History-based approaches
– Bottom-up & left-to-right (Ratnaparkhi, 1997)
– Shift-reduce (Sagae & Lavie, 2006)
Global modeling
– Tree CRFs (Finkel et al., 2008; Petrov & Klein, 2008)
– Reranking (Collins, 2000; Charniak & Johnson, 2005)
– Forest (Huang, 2008)

5 Chunk parsing
Parsing algorithm:
1. Identify phrases in the sequence.
2. Convert the recognized phrases into new non-terminal symbols.
3. Go back to 1.
Previous work
– Memory-based learning (Tjong Kim Sang, 2001), F-score:
– Maximum entropy (Tsuruoka and Tsujii, 2005), F-score: 85.9
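The iterate-and-replace loop above can be sketched in Python. The CRF chunker is replaced here by a hypothetical hand-written rule table (`RULES` and `find_chunks` are illustrative names, not the authors' code), so only the cascaded control flow is faithful to the talk:

```python
# Toy stand-in for the CRF chunker: fixed symbol patterns -> phrase labels,
# covering the "Estimated volume ..." example from slides 6-12.
RULES = {
    ("VBN", "NN"): "NP",
    ("CD", "CD"): "QP",
    ("DT", "JJ", "QP", "NNS"): "NP",
    ("VBD", "NP"): "VP",
    ("NP", "VP", "."): "S",
}

def find_chunks(symbols):
    """Step 1: identify phrases. Returns (start, end, label) spans,
    taking the longest rule match at each leftmost position."""
    spans, i = [], 0
    while i < len(symbols):
        match = None
        for pattern, label in RULES.items():
            j = i + len(pattern)
            if tuple(symbols[i:j]) == pattern and (match is None or j > match[1]):
                match = (i, j, label)
        if match:
            spans.append(match)
            i = match[1]
        else:
            i += 1
    return spans

def parse(symbols):
    """Steps 2-3: replace each recognized phrase with its new non-terminal
    symbol and repeat until no more chunks are found."""
    while True:
        spans = find_chunks(symbols)
        if not spans:
            return symbols
        out, i = [], 0
        for start, end, label in spans:
            out.extend(symbols[i:start])
            out.append(label)
            i = end
        out.extend(symbols[i:])
        symbols = out

# POS sequence of "Estimated volume was a light 2.4 million ounces ."
print(parse(["VBN", "NN", "VBD", "DT", "JJ", "CD", "CD", "NNS", "."]))  # reduces to ['S']
```

Because every rule replaces two or more symbols with one, the sequence strictly shrinks each pass, so the loop terminates; the real system stops when the CRF proposes no further chunks.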

6 Parsing a sentence Estimated volume was a light 2.4 million ounces. VBN NN VBD DT JJ CD CD NNS. QP NP VP NP S

7 1st iteration: Estimated volume was a light 2.4 million ounces. VBN NN VBD DT JJ CD CD NNS. QP NP

8 2nd iteration: volume was a light million ounces. NP VBD DT JJ QP NNS. NP

9 3rd iteration: volume was ounces. NP VBD NP. VP

10 4th iteration: volume was. NP VP. S

11 5th iteration: was S

12 Estimated volume was a light 2.4 million ounces. VBN NN VBD DT JJ CD CD NNS. QP NP VP NP S Complete parse tree

13 Chunking with CRFs
Conditional random fields (CRFs): features are defined on states and state transitions.
p(y|x) = (1/Z(x)) exp( Σ_t Σ_k λ_k f_k(y_{t-1}, y_t, x) )
where f_k is a feature function and λ_k its feature weight.
Estimated volume was a light 2.4 million ounces. VBN NN VBD DT JJ CD CD NNS. QP NP
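The CRF model on this slide can be made concrete with a toy implementation. The feature weights below are invented for illustration, and Z(x) is computed by brute-force enumeration over all tag sequences (feasible only at toy sizes; real CRFs use the forward-backward algorithm):

```python
from itertools import product
import math

LABELS = ["B", "I", "O"]

def score(tags, words, weights):
    """Weighted feature sum over states and state transitions."""
    s, prev = 0.0, "<s>"
    for word, tag in zip(words, tags):
        s += weights.get(("word", word, tag), 0.0)   # state feature
        s += weights.get(("trans", prev, tag), 0.0)  # transition feature
        prev = tag
    return s

def probability(tags, words, weights):
    """p(y|x) = exp(score(y, x)) / Z(x), with Z(x) summed over every
    possible tag sequence."""
    z = sum(math.exp(score(y, words, weights))
            for y in product(LABELS, repeat=len(words)))
    return math.exp(score(tags, words, weights)) / z

# Invented weights favouring the B-I analysis of "Estimated volume".
weights = {("word", "Estimated", "B"): 2.0,
           ("word", "volume", "I"): 1.5,
           ("trans", "B", "I"): 1.0,
           ("trans", "O", "I"): -2.0}
words = ["Estimated", "volume"]
p = probability(("B", "I"), words, weights)
```

With these weights, ("B", "I") gets the highest probability, and the probabilities of all 3² sequences sum to one by construction.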

14 Chunking with "IOB" tagging
Estimated volume was a light 2.4 million ounces. VBN NN VBD DT JJ CD CD NNS.
B-NP I-NP O O O B-QP I-QP O O  (chunks: NP, QP)
B: Beginning of a chunk
I: Inside (continuation) of the chunk
O: Outside of chunks
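The IOB encoding is straightforward to implement. `to_iob` and `from_iob` are hypothetical helper names; the spans follow the slide's example (NP = "Estimated volume", QP = "2.4 million"):

```python
def to_iob(n_tokens, chunks):
    """Encode (start, end, label) chunk spans as one IOB tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in chunks:
        tags[start] = "B-" + label                 # Beginning of the chunk
        for i in range(start + 1, end):
            tags[i] = "I-" + label                 # Inside (continuation)
    return tags

def from_iob(tags):
    """Decode IOB tags back to (start, end, label) spans. Simplified:
    an I- tag only continues the chunk ending at the previous token."""
    chunks = []
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            chunks.append([i, i + 1, tag[2:]])
        elif tag.startswith("I-") and chunks and chunks[-1][1] == i:
            chunks[-1][1] = i + 1
    return [tuple(c) for c in chunks]

# 9 tokens: Estimated volume was a light 2.4 million ounces .
tags = to_iob(9, [(0, 2, "NP"), (5, 7, "QP")])
# tags == ['B-NP', 'I-NP', 'O', 'O', 'O', 'B-QP', 'I-QP', 'O', 'O']
```

Encoding chunks this way is what lets a linear-chain CRF, a sequence labeler, perform phrase recognition.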

15 Features for base chunking Estimated volume was a light 2.4 million ounces. VBN NN VBD DT JJ CD CD NNS. ?

16 Features for non-base chunking volume was a light million ounces. NP VBD DT JJ QP NNS. NP VBN NN Estimated volume ?
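A sketch of what feature extraction for non-base chunking might look like. The exact feature templates are not recoverable from the slide, so the template set below (neighbouring symbols, a symbol bigram, and the head word of each phrase, as suggested by the "Estimated volume" annotation under the NP node) is an assumption:

```python
def features(symbols, heads, i):
    """Hypothetical feature templates for position i. For non-base chunking
    a symbol such as NP covers several words, so its head word is carried
    along as an extra feature."""
    feats = {
        "sym0": symbols[i],                                     # current symbol
        "sym-1": symbols[i - 1] if i > 0 else "<s>",            # left context
        "sym+1": symbols[i + 1] if i + 1 < len(symbols) else "</s>",
        "head0": heads[i],                                      # head word of the phrase
        "bigram": (symbols[i - 1] if i > 0 else "<s>") + "_" + symbols[i],
    }
    return feats

# 2nd-iteration sequence from slide 8, with per-symbol head words.
symbols = ["NP", "VBD", "DT", "JJ", "QP", "NNS", "."]
heads = ["volume", "was", "a", "light", "million", "ounces", "."]
f = features(symbols, heads, 0)
```

Because the chunked sequence shrinks at every level, the same feature code serves every iteration; only the head-word bookkeeping distinguishes base from non-base chunking.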

17 Finding the best parse
Scoring the entire parse tree: the score of a tree combines the probabilities of the chunking steps that derive it.
The best derivation can be found by depth-first search.

18 Depth-first search
(figure: search tree branching from POS tagging into base chunking, then further chunking levels)

19 Finding the best parse

20 Extracting multiple hypotheses from a CRF
A* search
– Uses a priority queue
– Suitable when the top n hypotheses are needed
Branch-and-bound
– Depth-first
– Suitable when a probability threshold is given
Example CRF hypotheses: BIOOOB (0.3), BIIOOB (0.2), BIOOOO (0.18)
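The priority-queue (A*) variant can be sketched under a simplified model in which each position has independent tag log-probabilities; the real system scores full CRF sequences, but the queue discipline is the same. `top_n` and the `best_rest` heuristic are illustrative names, not the authors' code:

```python
import heapq, math

def top_n(log_probs, n):
    """log_probs[t][tag] -> log p(tag at t). Returns up to n complete
    sequences as (probability, tags), best first. Priority = exact prefix
    score + best achievable completion (admissible, so pop order is exact)."""
    T = len(log_probs)
    best_rest = [0.0] * (T + 1)
    for t in range(T - 1, -1, -1):          # best score obtainable from t..T-1
        best_rest[t] = best_rest[t + 1] + max(log_probs[t].values())
    heap = [(-best_rest[0], ())]            # (negated upper bound, partial tags)
    out = []
    while heap and len(out) < n:
        neg_bound, seq = heapq.heappop(heap)
        t = len(seq)
        if t == T:                          # complete: bound is the exact score
            out.append((math.exp(-neg_bound), seq))
            continue
        prefix = -neg_bound - best_rest[t]  # exact log-score of the prefix
        for tag, lp in log_probs[t].items():
            heapq.heappush(heap, (-(prefix + lp + best_rest[t + 1]), seq + (tag,)))
    return out

# Toy two-position distribution over B/I/O.
log_probs = [{"B": math.log(0.7), "I": math.log(0.2), "O": math.log(0.1)},
             {"B": math.log(0.1), "I": math.log(0.6), "O": math.log(0.3)}]
hyps = top_n(log_probs, 3)
# best first: ('B','I') 0.42, then ('B','O') 0.21, then ('I','I') 0.12
```

The branch-and-bound alternative on the slide explores the same tree depth-first and prunes any branch whose upper bound falls below the given probability threshold, instead of keeping a priority queue.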

21 Experiments
Penn Treebank corpus
– Training: sections 2-21
– Development: section 22
– Evaluation: section 23
Training
– Three CRF models: part-of-speech tagger, base chunker, non-base chunker
– Took 2 days on an AMD Opteron 2.2GHz

22 Training the CRF chunkers
Maximum likelihood + L1 regularization
– L1 regularization helps avoid overfitting and produces compact models
– OWL-QN algorithm (Andrew and Gao, 2007)
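OWL-QN itself is involved, but the reason L1 regularization yields compact models can be shown with the soft-thresholding (proximal) step used by simpler L1 solvers. This is an illustration of the sparsifying effect, not the OWL-QN algorithm:

```python
def soft_threshold(w, c):
    """Proximal operator of c*|w|: shrinks every weight toward zero and
    clips weights smaller than c to exactly 0.0 - hence compact models."""
    if w > c:
        return w - c
    if w < -c:
        return w + c
    return 0.0

def l1_step(weights, grads, lr, c):
    """One gradient step on the negative log-likelihood followed by the
    L1 proximal step (ISTA-style update with regularization strength c)."""
    return [soft_threshold(w - lr * g, lr * c) for w, g in zip(weights, grads)]

w = l1_step([0.50, -0.02, 0.30], [0.0, 0.0, 0.0], lr=1.0, c=0.1)
# the small middle weight is clipped to exactly 0.0
```

An L2 penalty, by contrast, only shrinks weights proportionally and never zeroes them, which is why L1 is the natural choice when a small model is wanted.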

23 Chunking performance (section 22, all sentences)
Symbol | # Samples | Recall | Precision | F-score
NP     | 317,      |        |           |
VP     | 76,       |        |           |
PP     | 66,       |        |           |
S      | 33,       |        |           |
ADVP   | 21,       |        |           |
ADJP   | 14,       |        |           |
:      | :         | :      | :         | :
All    | 579,      |        |           |

24 Beam width and parsing performance (section 22, all sentences; 1,700 sentences)
Beam | Recall | Precision | F-score | Time (sec)

25 Comparison with other parsers (section 23, all sentences; 2,416 sentences)
Parser                    | Recall | Prec. | F-score | Time (min)
This work (deterministic) |        |       |         |
This work (beam = 4)      |        |       |         |
Huang (2008)              |        |       | 91.7    | Unk
Finkel et al. (2008)      |        |       |         | >250
Petrov & Klein (2008)     |        |       | 88.3    | 3
Sagae & Lavie (2006)      |        |       |         |
Charniak & Johnson (2005) |        |       |         | Unk
Charniak (2000)           |        |       |         |
Collins (1999)            |        |       |         |

26 Discussions
Improving chunking accuracy
– Semi-Markov CRFs (Sarawagi and Cohen, 2004)
– Higher-order CRFs
Increasing the size of the training data
– Create a treebank by parsing a large number of sentences with an accurate parser
– Train the fast parser using the treebank

27 Conclusion
Full parsing by cascaded chunking
– Chunking with CRFs
– Depth-first search
Performance
– F-score = 86.9 (12 msec/sentence)
– F-score = 88.4 (42 msec/sentence)
Available soon

