Information Extraction Entity Extraction: Statistical Methods Sunita Sarawagi.


1 Information Extraction
  Entity Extraction: Statistical Methods
  Sunita Sarawagi

2 What Are Statistical Methods?
  “Statistical methods of entity extraction convert the extraction task to a problem of designing a decomposition of the unstructured text and then labeling various parts of the decomposition, either jointly or independently.”
  Models
    Token-level
    Segment-level
    Grammar-based
  Training
    Likelihood
    Max-margin

3 Token-level Models
  Text is a sequence of tokens (characters, words, or n-grams)
  An entity label is assigned to each token
  Generalization of the classification problem
  Feature selection is important

4 Features
  Word features
    The surface word itself is a strong indicator of which label to use
  Orthographic features
    Capitalization patterns (cap-words)
    Presence of special characters
    Alphanumeric generalization of the characters in the token
  Dictionary lookup features
  f : (x, y, i) → R
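The feature types on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `token_features`, the feature names, and the `dictionary` argument are assumptions, not part of the presentation. The sketch computes the part of f(x, y, i) that depends on the observation x and the position i; a model would pair each entry with a candidate label y.

```python
import re

def token_features(tokens, i, dictionary=frozenset()):
    """Return illustrative word, orthographic, and dictionary features for tokens[i]."""
    tok = tokens[i]
    # Alphanumeric generalization: uppercase -> A, lowercase -> a, digit -> 9.
    shape = re.sub(r"[A-Z]", "A", tok)
    shape = re.sub(r"[a-z]", "a", shape)
    shape = re.sub(r"[0-9]", "9", shape)
    return {
        # Word feature: the surface form itself.
        "word=" + tok.lower(): 1.0,
        # Orthographic features.
        "is_cap_word": float(tok[:1].isupper() and tok[1:].islower()),
        "has_special_char": float(bool(re.search(r"[^A-Za-z0-9]", tok))),
        "shape=" + shape: 1.0,
        # Dictionary lookup feature (the dictionary is a hypothetical set of lowercase names).
        "in_dictionary": float(tok.lower() in dictionary),
    }
```

For example, `token_features(["Sunita", "Sarawagi"], 0, frozenset({"sunita"}))` fires the cap-word, dictionary, and `shape=Aaaaaa` features.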

5 Models for Labeling Tokens
  Logistic classifier
  Support Vector Machine (SVM)
  Hidden Markov Models (HMMs)
  Maximum Entropy Markov Model (MEMM)
  Conditional Markov Model (CMM)
  Conditional Random Fields (CRFs)
    Single joint distribution Pr(y|x)
    Scoring function

6 Segment-level Models
  Text is a sequence of segments
  An entity label is assigned to each segment
  Features span multiple tokens

7 Entity-level Features
  Exact segment match
  Similarity function such as TF/IDF
  Segment length
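A minimal sketch of these three segment features, assuming a hypothetical list of known entity strings (`entries`) and a hypothetical table of IDF weights (`idf`, defaulting to 1.0 for unseen terms). The names and the cosine form of the TF/IDF similarity are illustrative choices, not taken from the presentation.

```python
import math
from collections import Counter

def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms missing from idf get weight 1.0 (an assumption)."""
    tf = Counter(t.lower() for t in tokens)
    return {t: c * idf.get(t, 1.0) for t, c in tf.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def segment_features(tokens, start, end, entries, idf):
    """Features for the candidate segment tokens[start:end] (end exclusive)."""
    seg = tokens[start:end]
    seg_text = " ".join(seg).lower()
    seg_vec = tfidf_vector(seg, idf)
    best_sim = max(
        (cosine(seg_vec, tfidf_vector(e.split(), idf)) for e in entries),
        default=0.0,
    )
    return {
        "exact_match": float(seg_text in {e.lower() for e in entries}),
        "max_tfidf_sim": best_sim,
        "length": float(end - start),
    }
```

Unlike token-level features, each of these is computed over the whole candidate segment, which is what lets a segment-level model compare a multi-word span against a dictionary entry.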

8 Global Segmentation Models
  Probability distribution over segmentations
  Goal is to find the segmentation s such that w·f(x, s) is maximized
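When the score w·f(x, s) decomposes into a sum over segments, the maximization can be done with a segment-level variant of the Viterbi dynamic program. The sketch below assumes that decomposition; `score` is a hypothetical callable standing in for the per-segment value of w·f, and `max_len` is an assumed bound on segment length.

```python
def best_segmentation(n, labels, score, max_len):
    """Return (total, segments) for the highest-scoring segmentation of n tokens.

    score(start, end, label, prev_label) is a hypothetical per-segment score for
    tokens start..end-1; prev_label is None for the first segment.
    Each returned segment is a (start, end, label) triple with end exclusive.
    """
    # best[j][y] = (score, backpointer) of the best segmentation of the first
    # j tokens whose last segment carries label y.
    best = [dict() for _ in range(n + 1)]
    best[0][None] = (0.0, None)
    for end in range(1, n + 1):
        for y in labels:
            cand = (float("-inf"), None)
            for start in range(max(0, end - max_len), end):
                for prev, (prev_score, _) in best[start].items():
                    total = prev_score + score(start, end, y, prev)
                    if total > cand[0]:
                        cand = (total, (start, prev))
            best[end][y] = cand
    # Pick the best final label, then follow backpointers to recover segments.
    y = max(best[n], key=lambda lab: best[n][lab][0])
    total = best[n][y][0]
    segments, end = [], n
    while end > 0:
        start, prev = best[end][y][1]
        segments.append((start, end, y))
        end, y = start, prev
    return total, segments[::-1]
```

The running time grows with n × max_len × |labels|², so the bound on segment length is what keeps this practical.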

9 Grammar-based Models
  Production-rule oriented
  Produce parse trees
  Scoring function for each production

10 Training Algorithms
  The model outputs some y
    Sequence of labels for sequence models
    Segmentation of x for segment-level models
    Parse tree for grammar-based models
  The output is the argmax of s(y) = w·f(x, y), where f(x, y) is a feature vector
  Two types of training methods
    Likelihood-based training
    Max-margin training
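To make the prediction rule concrete, here is a brute-force sketch for the sequence case: it enumerates every label sequence y and returns the one with the highest s(y) = w·f(x, y). The feature definition (`global_features`, counting word/label and label/label pairs) is an assumption chosen for illustration, and exhaustive enumeration is only feasible for toy inputs; practical systems use dynamic programming instead.

```python
from itertools import product

def dot(w, f):
    """Sparse dot product w·f with both vectors stored as dicts."""
    return sum(w.get(k, 0.0) * v for k, v in f.items())

def global_features(x, y):
    """Hypothetical f(x, y): counts of (word, label) and (previous label, label) pairs."""
    f = {}
    for i, (tok, lab) in enumerate(zip(x, y)):
        key = ("emit", tok.lower(), lab)
        f[key] = f.get(key, 0.0) + 1.0
        if i > 0:
            key = ("trans", y[i - 1], lab)
            f[key] = f.get(key, 0.0) + 1.0
    return f

def predict(x, labels, w):
    """Return the label sequence (a tuple) maximizing s(y) = w·f(x, y)."""
    return max(
        product(labels, repeat=len(x)),
        key=lambda y: dot(w, global_features(x, y)),
    )
```

Training, in either of the two styles listed on the slide, amounts to choosing w so that this argmax agrees with the labeled examples.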

11 Likelihood Trainer
  Probability distribution
  Log probability distribution
  Find the weight vector w that maximizes the log-likelihood of the training data
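The formulas on this slide did not survive in the transcript. As a sketch consistent with the scoring function s(y) = w·f(x, y) used earlier, the standard log-linear form is given below. The training-set notation (x_ℓ, y_ℓ) for ℓ = 1..N and the normalizer Z_w are notational assumptions, and any regularization term the original slide may have included is omitted.

```latex
% Conditional distribution induced by the scoring function
\Pr(y \mid x, w) = \frac{\exp\big(w \cdot f(x, y)\big)}{Z_w(x)},
\qquad
Z_w(x) = \sum_{y'} \exp\big(w \cdot f(x, y')\big)

% Log-likelihood of the training set, maximized over w
L(w) = \sum_{\ell=1}^{N} \log \Pr(y_\ell \mid x_\ell, w)
     = \sum_{\ell=1}^{N} \Big( w \cdot f(x_\ell, y_\ell) - \log Z_w(x_\ell) \Big)

% Gradient: observed features minus expected features under the model
\nabla L(w) = \sum_{\ell=1}^{N} \Big( f(x_\ell, y_\ell)
     - \mathbb{E}_{\Pr(y' \mid x_\ell, w)}\big[ f(x_\ell, y') \big] \Big)
```

The expectation in the gradient is why likelihood training needs an inference routine for expected feature values and not only for the single best output.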

12 Likelihood Trainer

13 Max-margin Training
  “an extension of support vector machines for training structured models”
  Find weight vector w
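The objective itself is missing from the transcript. A common margin-rescaled formulation of a structured support vector machine, given here as an assumption and not as the slide's exact formula, is the following. The symbols C (trade-off constant), ξ_ℓ (slack variables), and err (a loss between label structures, for example the number of mislabeled tokens) are introduced for illustration.

```latex
\min_{w,\,\xi} \quad \frac{1}{2}\,\lVert w \rVert^{2} + C \sum_{\ell=1}^{N} \xi_\ell

\text{subject to} \quad
w \cdot f(x_\ell, y_\ell) \;\ge\; w \cdot f(x_\ell, y) + \mathrm{err}(y, y_\ell) - \xi_\ell
\quad \forall\, y \ne y_\ell,\; \ell = 1, \dots, N,
\qquad \xi_\ell \ge 0
```

Each constraint asks the correct output y_ℓ to outscore every alternative y by a margin that grows with how wrong y is.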

14 Max-margin Training

15 Inference Algorithms

16 MAP for Sequential Labeling
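The body of this slide is not in the transcript. For sequence models whose score decomposes over adjacent label pairs, the highest-scoring (MAP) labeling is usually found with the Viterbi algorithm; the sketch below assumes that decomposition. `local_score` is a hypothetical callable standing in for the contribution of position i to w·f(x, y).

```python
def viterbi(n, labels, local_score):
    """Return (score, path) of the highest-scoring label sequence of length n.

    local_score(i, prev_label, label) is a hypothetical per-position score;
    prev_label is None at position 0.
    """
    if n == 0:
        return 0.0, []
    # delta[i][y]: best score of any labeling of positions 0..i that ends in y.
    delta = [{y: local_score(0, None, y) for y in labels}]
    back = [{}]
    for i in range(1, n):
        delta.append({})
        back.append({})
        for y in labels:
            prev = max(labels, key=lambda p: delta[i - 1][p] + local_score(i, p, y))
            delta[i][y] = delta[i - 1][prev] + local_score(i, prev, y)
            back[i][y] = prev
    # Choose the best final label and follow backpointers to the start.
    last = max(labels, key=lambda y: delta[n - 1][y])
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return delta[n - 1][last], path[::-1]
```

The cost is proportional to n × |labels|², compared with |labels|ⁿ for enumerating every labeling.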

17 MAP for Segmentations

18 MAP for Parse Trees

19 Expected Feature Values for Sequential Labelings

20 Summary
  Most prominent models used
    Maximum entropy taggers (MaxEnt)
    Hidden Markov Models (HMMs)
    Conditional Random Fields (CRFs)
  CRFs are now established as state-of-the-art
  Segment-level and grammar-based CRFs are not as popular

21 Further Readings
  Active learning
  Bootstrapping from structured data
  Transfer learning and domain adaptation
  Collective inference

