Bidirectional CRF for NER


1 Bidirectional CRF for NER
Jin Mao, Postdoc, School of Information, University of Arizona. Sept 14th, 2016

2 Agenda
CRF Interpretation
Sequence Orders
Feature Styles
Model Integration

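3 CRF Interpretation: CRF
Let X := (x1,...,xT) be a sequence of tokens from a tokenized input sentence, and Y := (y1,...,yT) be a sequence of tags. A CRF tagger selects the Y that maximizes its conditional probability given X:
$$P(Y \mid X) = \frac{1}{Z(X)} \exp\Big(\sum_{t=1}^{T} \sum_{\alpha} \theta_\alpha f_\alpha(y_{t-1}, y_t, X)\Big)$$
where f is a vector of binary features, θ is the vector of their weights, and Z(X) normalizes over all tag sequences.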

4 CRF Interpretation: CRF
The probability of a tag yt depends only on its neighboring tags, given the entire sentence X. The tagging by a CRF model is mainly determined by the features fα(yt−1, yt, X) and their corresponding weights θα.
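To make the roles of the binary features and their weights concrete, here is a minimal brute-force sketch of a linear-chain CRF score on a toy sentence. The tag set, the `<s>` start tag, the feature names, and the weights are all hypothetical and chosen only for illustration; a real tagger would use dynamic programming rather than enumerating every tag sequence.

```python
import itertools
import math

TAGS = ["B", "I", "O"]  # assumed BIO tag set, not taken from the slides
START = "<s>"           # hypothetical start tag standing in for y_0


def features(prev_tag, tag, tokens, t):
    """Names of the binary features that fire at position t (toy examples)."""
    fired = [f"trans:{prev_tag}->{tag}", f"word:{tokens[t].lower()}|{tag}"]
    if tokens[t][:1].isupper():
        fired.append(f"cap|{tag}")
    return fired


def score(tags, tokens, weights):
    """Sum of the weights of all features firing along the tag sequence."""
    total, prev = 0.0, START
    for t, tag in enumerate(tags):
        total += sum(weights.get(name, 0.0) for name in features(prev, tag, tokens, t))
        prev = tag
    return total


def conditional_probability(tags, tokens, weights):
    """P(Y|X): exponentiated score divided by the sum over all taggings."""
    z = sum(math.exp(score(c, tokens, weights))
            for c in itertools.product(TAGS, repeat=len(tokens)))
    return math.exp(score(tags, tokens, weights)) / z


def best_tagging(tokens, weights):
    """The tag sequence with the highest score, hence the highest P(Y|X)."""
    return max(itertools.product(TAGS, repeat=len(tokens)),
               key=lambda c: score(c, tokens, weights))
```

With only a few non-zero weights, the tagging is decided entirely by which features fire, which is the point made on this slide.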

5 CRF Interpretation: Feature Function
Define features in factored representation (Sha and Pereira, 2003):
$$f_\alpha(y_{t-1}, y_t, X) = p(X, t)\, q(y_{t-1}, y_t)$$
where p(X,t) is a predicate (i.e. a boolean function) on X and the current position t, and q(yt−1,yt) is a predicate on pairs of tags, the transition predicate.
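The factored representation can be sketched as the conjunction of an observation predicate and a transition predicate. The helper names below (`is_capitalized`, `make_transition`, `factored_feature`) are hypothetical and only illustrate the idea; they are not taken from any CRF toolkit.

```python
def is_capitalized(tokens, t):
    """Observation predicate p(X, t): is the token at position t capitalized?"""
    return tokens[t][:1].isupper()


def make_transition(prev_tag, tag):
    """Transition predicate q(y_{t-1}, y_t) for one specific pair of tags."""
    return lambda a, b: a == prev_tag and b == tag


def factored_feature(p, q):
    """Binary feature that fires only when both p and q hold."""
    def f(prev_tag, tag, tokens, t):
        return int(p(tokens, t) and q(prev_tag, tag))
    return f


# Example: fires when a capitalized token is tagged B right after an O tag.
cap_after_o = factored_feature(is_capitalized, make_transition("O", "B"))
```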

6 CRF Sequence Orders: Forward vs. Backward
[Figure: token positions under forward and backward parsing; surviving position labels: T-1+1, T-t+2]

7 CRF Sequence Orders: Forward vs. Backward
For the t-th token, the Kullback-Leibler (KL) divergence is used to measure the information gain from the prior distribution of yt to its posterior distribution after either the previous or the next tag becomes available. On the training corpus from BioCreative 2, the next tag could be more helpful than the previous tag.
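A rough sketch of such an information-gain measurement follows. It estimates the prior of a tag from unigram counts and the posterior from adjacent-tag counts, then reports one KL divergence (in bits) per conditioning tag. The exact formula and the BioCreative 2 numbers from the slide are not recoverable, so the estimator, the function names, and any data passed in are assumptions for illustration only.

```python
import math
from collections import Counter


def kl_divergence(posterior, prior):
    """KL(posterior || prior) in bits over a shared set of tags."""
    return sum(p * math.log2(p / prior[tag]) for tag, p in posterior.items() if p > 0)


def information_gain(tag_sequences, context="previous"):
    """Map each context tag c to KL(P(y_t | neighbor = c) || P(y_t)).

    context="previous" conditions on y_{t-1}; any other value conditions on y_{t+1}.
    """
    unigrams, pairs = Counter(), Counter()
    for tags in tag_sequences:
        unigrams.update(tags)
        for left, right in zip(tags, tags[1:]):
            if context == "previous":
                pairs[(left, right)] += 1  # (context tag, current tag)
            else:
                pairs[(right, left)] += 1  # next tag is the context
    total = sum(unigrams.values())
    prior = {tag: n / total for tag, n in unigrams.items()}
    gains = {}
    for ctx in prior:
        row = {cur: n for (c, cur), n in pairs.items() if c == ctx}
        row_total = sum(row.values())
        if row_total:
            posterior = {cur: n / row_total for cur, n in row.items()}
            gains[ctx] = kl_divergence(posterior, prior)
    return gains
```

Comparing `information_gain(corpus, "previous")` with `information_gain(corpus, "next")` on a tagged corpus gives the kind of forward-versus-backward comparison the slide describes.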

8 CRF Sequence Orders: Forward vs. Backward
The prediction accuracy of tag bigrams

9 CRF Feature Styles: Feature Function
Two feature styles are contrasted: MALLET-style features, and HMM-style features as described in Lafferty et al. (2001), whose state transition features are observation-independent.

10 CRF Feature Styles: Symmetric Property
A CRF model is symmetric.

11 CRF Feature Styles: Symmetric Property
A CRF with HMM-style features is symmetric if:
(1) two special tags are attached to the head and tail of Y;
(2) the training set is the same, only in a different order;
(3) all p predicates are defined either on a single token or symmetrically with regard to the current position t.
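The relation between forward and backward positions used in this part of the talk can be sketched as follows. The function names are hypothetical, positions are 1-based as on the slides, and the special head and tail tags are left out to keep the sketch short.

```python
def reverse_example(tokens, tags):
    """Build the backward-parsing version of one training example."""
    return list(reversed(tokens)), list(reversed(tags))


def backward_position(t, T):
    """1-based position in the reversed sequence of forward position t."""
    return T - t + 1


tokens = ["the", "BRCA1", "gene", "is", "mutated"]
tags = ["O", "B", "I", "O", "O"]
tokens_b, tags_b = reverse_example(tokens, tags)
```

The forward neighbor t-1 lands at backward position T-(t-1)+1 = T-t+2, so the "previous" tag of the forward model becomes the "next" tag of the backward model.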

12 CRF Feature Styles: Symmetric Property
Thus, [equations and figure not recoverable; surviving labels: forward position T, backward position T-t+2, and the term g2(YB; T-(t-1)+1)]
But are g1 + g2 for the t-th token not identical?

13 CRF Feature Styles: Symmetric Property
[Figure: alignment of forward and backward positions; surviving labels: t and t-1 (forward); T-t+2, T-t+1, T-t and T-1+1 (backward)]
Mallet-style: g1 is no longer symmetric: g1(X,Y; t) ≠ g1(XB,YB; tB)

14 CRF Feature Styles: Tools
CRF++: HMM-style (default), Mallet-style (optional)
Mallet: Mallet-style (default), HMM-style and others (optional)

15 CRF Model Integration
Label results: the 10 best results with scores
Find matches

16 CRF Model Integration: Integration Method
Simple set operations (intersection and union) failed to improve performance because they lead to a trade-off between recall and precision.
Heuristic method:
(1) compute the intersection of the bi-directional parsing results and select the solution in the intersection that minimizes the sum of its output scores;
(2) for the other 18 solutions, select the labeled terms that appear in a dictionary and have a length greater than three.
The dictionary for the last step consists of approved gene symbols and aliases obtained from HUGO (Eyre et al., 2006).
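The two-step heuristic can be sketched roughly as below. Several details are assumptions rather than facts from the slides: taggings are BIO tuples already mapped back to forward order, a lower score is better, "length" means the character length of the term, and the fallback when the two lists share no tagging (take the best backward result) is borrowed from the variant described on the next slide. All function names are hypothetical.

```python
def extract_mentions(tokens, tags):
    """Collect the token spans labeled as mentions under an assumed BIO scheme."""
    mentions, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                mentions.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            if current:
                mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    return mentions


def integrate(tokens, forward_nbest, backward_nbest, dictionary):
    """Combine two n-best lists of (tagging, score) pairs into a set of mentions."""
    fwd, bwd = dict(forward_nbest), dict(backward_nbest)
    common = set(fwd) & set(bwd)
    if common:
        # Step 1: shared tagging with the smallest summed score.
        best = min(common, key=lambda y: fwd[y] + bwd[y])
    else:
        # Assumed fallback: best-scoring backward tagging.
        best = min(bwd, key=bwd.get)
    result = set(extract_mentions(tokens, best))
    # Step 2: from the remaining taggings, keep only dictionary terms longer than three.
    for tagging in (set(fwd) | set(bwd)) - {best}:
        for term in extract_mentions(tokens, tagging):
            if len(term) > 3 and term in dictionary:
                result.add(term)
    return result
```

Step 1 favors precision by requiring agreement between the two parsing directions, while step 2 recovers recall from the lower-ranked taggings under a dictionary check.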

17 CRF Model Integration: Integration Method
Heuristic method:
(1) Add models of more orders (1, 2, 3) to Mallet.
(2) Find the intersection with the lowest scores.
(3) If no tagging result appears in the top-10 lists of all models, simply select the best tagging result of the Order 1 backward model.
(4) Find the intersection of the two CRF++ models.
(5) Take the union of the CRF++ results and the Mallet results.

18 CRF Model Integration Integration Method

19 Conclusions (1) Different types of feature construction affect whether a CRF model is symmetric. (2) Backward parsing models enjoy a slight advantage over forward parsing according to the information gain analysis. (3) The combination of different models can achieve higher F-scores.

20 Reference This presentation is from: Hsu, C. N., Chang, Y. M., Kuo, C. J., Lin, Y. S., Huang, H. S., & Chung, I. F. (2008). Integrating high dimensional bi-directional parsing models for gene mention tagging. Bioinformatics, 24(13), i286-i294.

21 Thank you!

