Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition. Authors: Andrew Borthwick, John Sterling, Eugene Agichtein, Ralph Grishman.



Presentation transcript:

1 Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition Authors: Andrew Borthwick, John Sterling, Eugene Agichtein, Ralph Grishman Speaker: Shasha Liao

2 Contents – Named Entity Recognition (NER) – Maximum Entropy (ME) – System Architecture – Results – Conclusions

3 Named Entity Recognition (NER) Given a tokenization of a test corpus and a set of n tags (here n = 7), NER is the problem of assigning one of 4n+1 tags to each token. – x_begin, x_continue, x_end, x_unique for each category x, plus a shared "other" tag MUC-7 categories: – Proper names (people, organizations, locations) – Expressions of time – Quantities – Monetary values – Percentages
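The 4n+1 tagging scheme can be sketched as follows (the seven category names are how I read the MUC-7 types, so treat them as illustrative rather than verbatim from the slide):

```python
# Sketch of the MENE tag set: for each of the n = 7 MUC-7 categories,
# four positional tags (begin/continue/end/unique) plus one shared
# "other" tag, giving 4n + 1 = 29 tags in total.
CATEGORIES = ["person", "organization", "location",
              "date", "time", "money", "percent"]
POSITIONS = ["begin", "continue", "end", "unique"]

TAGS = [f"{c}_{p}" for c in CATEGORIES for p in POSITIONS] + ["other"]

print(len(TAGS))  # 4 * 7 + 1 = 29
```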

4 Named Entity Recognition (NER) Example: Jim/per_unique bought/other 300/qua_unique shares/other of/other Acme/org_begin Corp./org_end in/other 2006/time_unique ./other

5 Maximum Entropy (ME) Statistical modeling technique Estimates a probability distribution from partial knowledge Principle: the correct probability distribution is the one that maximizes entropy (uncertainty) subject to the constraints imposed by what is known

6 Maximum Entropy (ME) ---build ME model
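The model built on this slide is a conditional log-linear model over binary features of a history h and future f, normalized over the tag set. A minimal sketch, with all function and variable names assumed rather than taken from the slide:

```python
import math

# Hedged sketch of the ME model form: binary feature functions
# g_i(history, future) with weights lambda_i, normalized by Z(h).
def me_probability(history, future, features, weights, futures):
    def unnormalized(f):
        # exp of the sum of weights of all features that fire on (h, f)
        return math.exp(sum(w for g, w in zip(features, weights)
                            if g(history, f)))
    z = sum(unnormalized(f) for f in futures)  # normalizing constant Z(h)
    return unnormalized(future) / z
```

For example, a single feature that fires on the future "other" with weight log 3 gives p(other) = 3/(3+1) = 0.75 over a two-tag future set.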

7 Maximum Entropy (ME) --- Initialize Features

8 Maximum Entropy (ME) --- ME Estimation

9 Maximum Entropy (ME) --- Generalized Interactive Scaling
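Generalized Iterative Scaling fits the feature weights iteratively; here is a hedged sketch of one update step, assuming every (history, future) pair activates exactly C features (the usual slack-feature assumption), with illustrative variable names:

```python
import math

# One GIS update: lambda_i <- lambda_i + (1/C) * log(empirical_i / expected_i),
# where empirical_i is the feature's count in the training data and
# expected_i is its expected count under the current model.
def gis_step(weights, empirical_counts, expected_counts, C):
    return [w + (1.0 / C) * math.log(emp / exp_)
            for w, emp, exp_ in zip(weights, empirical_counts, expected_counts)]
```

Each step moves the model's expected feature counts toward the empirical counts; iterating to convergence yields the maximum-entropy solution.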

10 System Architecture --- Features (1) Feature set – Binary: similar to those in BBN's Nymble (IdentiFinder) system – Lexical: all tokens with a count of 3 or more – Section: date, preamble, text, … – Dictionary: name lists – External system: the futures output by other systems become histories for MENE – Compound: conjunction of an external-system feature with a section feature
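MENE-style binary features pair a predicate on the history with a specific future; a sketch with hypothetical predicate and helper names:

```python
# Illustrative binary feature constructor: the feature fires (returns 1)
# only when its predicate on the history holds AND the future matches.
def make_feature(predicate, target_future):
    def g(history, future):
        return 1 if predicate(history) and future == target_future else 0
    return g

# A lexical-style feature: current token is "Corp." and the future is org_end.
corp_org_end = make_feature(lambda h: h.get("token") == "Corp.", "org_end")
print(corp_org_end({"token": "Corp."}, "org_end"))  # 1
```

External-system and compound features fit the same mold: the predicate simply inspects another system's prediction (or a conjunction of that with the section) instead of the token itself.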

11 System Architecture --- Features (2) Feature selection (features discarded) – Features which activate on the default value of a history view (in 99% of cases a token is not a name) – Lexical features predicting the future "other" with fewer than 6 occurrences (rather than the usual cutoff of 3) – Features which predict "other" at token positions -2 and +2

12 System Architecture --- Decoding and Viterbi Search Viterbi search: dynamic programming – Finds the highest-probability legal path through the lattice of conditional probabilities – Example: "Mike England": person_start (0.66) cannot be followed by gpe_unique (0.6), since p(gpe_unique | person_start) = 0; person_start (0.66) followed by person_end (0.3) is legal, with p(person_end | person_start) = 0.7
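The search can be sketched compactly (toy numbers from the slide's "Mike England" example; a transition probability of 0 encodes an illegal tag pair, and the data structures are illustrative, not from the paper):

```python
# Minimal Viterbi over a tag lattice.
# obs_probs: one {tag: p(tag | token)} dict per token;
# trans: {(prev_tag, tag): transition probability}.
def viterbi(obs_probs, trans):
    # Initialize with the first token's tag probabilities.
    paths = {tag: (p, [tag]) for tag, p in obs_probs[0].items()}
    for col in obs_probs[1:]:
        paths = {
            tag: max(
                ((prev_p * trans.get((prev, tag), 0.0) * p, path + [tag])
                 for prev, (prev_p, path) in paths.items()),
                key=lambda cand: cand[0])
            for tag, p in col.items()}
    return max(paths.values(), key=lambda cand: cand[0])

# "Mike England": person_start -> gpe_unique is illegal (p = 0), so the
# lower-scoring but legal person_end reading wins.
prob, tags = viterbi(
    [{"person_start": 0.66},
     {"gpe_unique": 0.6, "person_end": 0.3}],
    {("person_start", "gpe_unique"): 0.0,
     ("person_start", "person_end"): 0.7})
print(tags)  # ['person_start', 'person_end']
```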

13 Results (1)

14 Results (2) Probable reasons: – Dynamic updating of the vocabulary during decoding (reference resolution), e.g. resolving later mentions of "Andrew Borthwick" – Binary model vs. multi-class model

15 Conclusion Future work: – Incorporate long-range reference resolution – Use general compound features – Use acronyms Advantages of MENE: – Can incorporate information from previous tokens – Features can overlap – Highly portable – Easily combined with other systems

