Presentation is loading. Please wait.

Presentation is loading. Please wait.

Automatically Constructing a Dictionary for Information Extraction Tasks Ellen Riloff Proceedings of the 11 th National Conference on Artificial Intelligence,

Similar presentations


Presentation on theme: "Automatically Constructing a Dictionary for Information Extraction Tasks Ellen Riloff Proceedings of the 11 th National Conference on Artificial Intelligence,"— Presentation transcript:

1 Automatically Constructing a Dictionary for Information Extraction Tasks Ellen Riloff Proceedings of the 11 th National Conference on Artificial Intelligence, 1993.

2 Overview Problem: Domain specific dictionary Portability Practicality Solution: AutoSlog Automatic Training Approach Representative corpus Terrorist event descriptions

3 Information Extraction Selective Concept Extraction CIRCUS: Sentence Analyzer Domain-specific Dictionary Concept node definition Enabling conditions Slots (e.g., victims, perpetrators, …)  Syntactic expectation  Hard and soft constraints

4 Extraction Procedure Sentence CIRCUS Concept Node Dictionary Concept Nodes

5 Observations News reports follow certain stylistic conventions. Assumption: the first reference to a targeted piece of information is most likely where the relationship between that information and the event is made explicit. The immediate linguistic context surrounding the targeted information usually contains the words or phrases that describe its role in the events.

6 Conceptual Anchor Points A conceptual anchor point is a word that should activate a concept CIRCUS: trigger word 13 Heuristics

7 Automated Dictionary Construction Input A set of training text Answer keys Output: a set of concept node definitions Procedure: how to generate a concept node definition?

8 A Good Concept Node Def. passive-verb

9 Domain Specifications A set of mappings from template slots to concept node slots Hard and soft constraints for each type of concept node slot A set of mappings from template types to concept node types

10 Another Good Concept Node Def.

11 A Bad Concept Node Def.

12 Reasons of Bad Definitions When a sentence contains the targeted string but does not describe the event. When a heuristic proposes the wrong conceptual anchor point. When CIRCUS incorrectly analyzes the sentence.

13 Evaluation Terrorist Event Description MUC-4 Corpus 1500 texts 1258 answer keys A human (5 hours) Keeps: 450 edits 4280 string fillers 1237 concept node definitions

14 Comparative Results System/ Test Set RecallPrecisionF-measure MUC-4/TST3465650.51 AutoSlog/TST3435648.65 MUC-4/TST4444041.90 AutoSlog/TST4394541.79 96.3 99.7

15 Conclusions Achieve Above 90% performance of the hand-crafted dictionary (1500 person hours) Improve performance Create dictionary from scratch No explicit domain theory is needed Need an annotated training corpus Make progress for scalable and portable IE systems


Download ppt "Automatically Constructing a Dictionary for Information Extraction Tasks Ellen Riloff Proceedings of the 11 th National Conference on Artificial Intelligence,"

Similar presentations


Ads by Google