Survey of Semantic Annotation Platforms


1 Survey of Semantic Annotation Platforms
SAC 2005
Lawrence Reeve, Hyoil Han

2 Semantic Annotation
Creating semantic labels within documents for the Semantic Web
Used to support:
- Advanced searching (e.g. concept)
- Information visualization (using ontology)
- Reasoning about Web resources
Converting syntactic structures into knowledge structures (human -> machine)

3 Semantic Annotation Process

4 Semantic Annotation Concerns
- Scale, volume
  - Existing & new documents on the Web
- Manual annotation
  - Expensive – economic, time
  - Subject to personal motivation
- Schema complexity
- Storage
  - Support for multiple ontologies
  - Within or external to source document?
- Knowledge base refinement
- Access: how are annotations accessed?
  - API, custom UI, plug-ins

5 Semantic Annotation Platforms
Why semantic annotation platforms (‘SAPs’)?
- Reduces human involvement
- Consistent application of ontologies
- Reduced cost – economic & time
- Scalability
- Multiple ontologies for a single document

6 Semantic Annotation Platforms
Characteristics
- Provide many services, not just annotation
  - Storage: ontology, KB, and annotation
  - Access APIs (query annotations)
- Integrate information extraction methods
  - Support for IE (gazetteers)
- Extensible
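The storage and access services listed on this slide can be pictured with a minimal sketch. It assumes annotations are kept separately from the source text as character offsets plus an ontology class; the `Annotation` and `AnnotationStore` names and their fields are hypothetical and are not taken from any of the surveyed platforms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One semantic label, stored apart from the document it describes."""
    doc_id: str
    start: int      # character offset where the labelled span begins
    end: int        # character offset just past the labelled span
    concept: str    # ontology class, e.g. "Organization"
    ontology: str   # name of the ontology the class belongs to


class AnnotationStore:
    """Hypothetical in-memory store with a small query API."""

    def __init__(self):
        self._annotations = []

    def add(self, annotation):
        self._annotations.append(annotation)

    def query(self, doc_id=None, concept=None, ontology=None):
        # Each filter is optional; None means "do not filter on this field".
        return [
            a for a in self._annotations
            if (doc_id is None or a.doc_id == doc_id)
            and (concept is None or a.concept == concept)
            and (ontology is None or a.ontology == ontology)
        ]
```

Because each record carries its own `ontology` field, one document can hold labels from several ontologies at once, which is the "multiple ontologies for single document" requirement raised earlier in the deck.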

7 SAP General Architecture

8 SAP Classification

9 SAP Classification
Pattern-based
- Pattern discovery
  - Iterative learning
  - Provide initial seed set
  - Find new entities -> find new patterns
  - Repeat
- Rules
  - Manually define rules to find entities in text
  - Simple label matching
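The iterative pattern-discovery loop on this slide can be sketched as follows. This is a toy illustration, not the algorithm of any surveyed platform: it assumes a "pattern" is just the pair of words to the left and right of a known entity, and that candidate entities are capitalized words. The `bootstrap` function and its parameters are hypothetical.

```python
import re


def bootstrap(corpus, seeds, rounds=5):
    """Alternate between learning context patterns from known entities
    and applying those patterns to find new entities."""
    entities = set(seeds)
    patterns = set()
    for _ in range(rounds):
        # Learn patterns: the (left word, right word) around each known entity.
        new_patterns = set()
        for sentence in corpus:
            for entity in entities:
                m = re.search(r"(\w+) " + re.escape(entity) + r" (\w+)", sentence)
                if m:
                    new_patterns.add((m.group(1), m.group(2)))
        # Apply patterns: any capitalized word in the same context is a candidate.
        new_entities = set()
        for left, right in patterns | new_patterns:
            regex = re.compile(
                re.escape(left) + r" ([A-Z]\w+) " + re.escape(right)
            )
            for sentence in corpus:
                new_entities.update(regex.findall(sentence))
        # Stop early once a round discovers nothing new.
        if new_patterns <= patterns and new_entities <= entities:
            break
        patterns |= new_patterns
        entities |= new_entities
    return entities, patterns
```

Starting from the single seed `"Acme"` and a corpus where `"Globex"` shares one context with the seed and another context with `"Initech"`, the loop reaches `"Initech"` in the second round. Real systems score and filter patterns to limit drift; this sketch accepts every pattern it finds.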

10 SAP Classification
Machine-learning based
- Wrapper induction
  - LP2
  - Uses structural and linguistic information
  - Produces tagging & correction rules as output
- Statistical models
  - Hidden Markov Model
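To make the Hidden Markov Model entry concrete, here is a minimal sketch of Viterbi decoding used to tag tokens as entity or non-entity. The two states, all probabilities, and the `unk` fallback for unseen words are invented for illustration; a real system would estimate them from annotated training data.

```python
def viterbi(tokens, states, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most probable state sequence for the tokens."""
    if not tokens:
        return []
    # best[i][s] = (probability of the best path ending in s at token i,
    #               previous state on that path)
    best = [{
        s: (start_p[s] * emit_p[s].get(tokens[0], unk), None) for s in states
    }]
    for tok in tokens[1:]:
        row = {}
        for s in states:
            row[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(tok, unk), p)
                for p in states
            )
        best.append(row)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for row in reversed(best[1:]):
        state = row[state][1]
        path.append(state)
    return list(reversed(path))


# Hypothetical toy model: "O" = outside any entity, "ORG" = organization.
STATES = ("O", "ORG")
START_P = {"O": 0.8, "ORG": 0.2}
TRANS_P = {"O": {"O": 0.8, "ORG": 0.2}, "ORG": {"O": 0.6, "ORG": 0.4}}
EMIT_P = {
    "O": {"shares": 0.3, "of": 0.3, "rose": 0.3},
    "ORG": {"acme": 0.5, "corp": 0.4},
}
```

With this toy model, `viterbi(["shares", "of", "acme", "corp", "rose"], STATES, START_P, TRANS_P, EMIT_P)` labels `acme corp` as `ORG` and the remaining tokens as `O`. For long inputs the products should be replaced by sums of log probabilities to avoid underflow.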

11 SAP Classification Multistrategy
- Combine pattern and machine-learning approaches
- Did not find a platform that implements this approach
- Platform extensibility is important for implementation

12 Semantic Annotation Platforms
Selection
- Idea is to get a representative sample of platforms using various information extraction techniques
- System needed to be a platform offering services, not just an algorithm

13 Semantic Annotation Platforms

14 Language Toolkits
- GATE – language processing system
  - Component architecture, SDK, IDE
  - ANNIE (‘A Nearly-New IE system’): tokenizer, gazetteer, POS tagger, sentence splitter, etc.
  - JAPE (Java Annotation Patterns Engine): provides regular-expression based pattern/action rules
- Amilcare
  - Adaptive IE system designed for document annotation
  - Based on LP2
  - Uses ANNIE
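The pattern/action idea behind JAPE can be illustrated with a short Python sketch. This is not JAPE syntax and does not use GATE: JAPE rules match over existing annotations such as tokens and gazetteer lookups, whereas this simplified stand-in matches regular expressions over raw text. The two rules and the `annotate` function are hypothetical.

```python
import re

# Each rule pairs a pattern (left-hand side) with an action (right-hand side)
# that turns a regex match into an annotation record.
RULES = [
    (
        re.compile(r"\b(?:Mr|Ms|Dr)\. ([A-Z][a-z]+)"),
        lambda m: {"type": "Person", "start": m.start(1),
                   "end": m.end(1), "text": m.group(1)},
    ),
    (
        re.compile(r"\b([A-Z][a-z]+) (?:Inc|Ltd|Corp)\."),
        lambda m: {"type": "Organization", "start": m.start(),
                   "end": m.end(), "text": m.group(0)},
    ),
]


def annotate(text, rules=RULES):
    """Run every rule over the text and collect the resulting annotations."""
    annotations = []
    for pattern, action in rules:
        for match in pattern.finditer(text):
            annotations.append(action(match))
    # Return annotations in document order.
    return sorted(annotations, key=lambda a: a["start"])
```

Calling `annotate("Dr. Smith joined Acme Corp. in 2004.")` yields a `Person` annotation for `Smith` followed by an `Organization` annotation for `Acme Corp.`. Keeping the action separate from the pattern is what lets a rule author decide which part of a match becomes the annotated span.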

15 KIM (2003)
- Ontology, KB, semantic annotation, indexing and retrieval server, front-ends (Web UI, IE plug-in)
- KIMO ontology
  - 250 classes, 100 properties
  - 80,000 entities from general news corpus in KB (plus >100,000 aliases)
- IE
  - Uses GATE, JAPE
  - Gazetteers (from KB)
Source:

16 Ont-O-Mat (2002)
- Uses Amilcare
  - Wrapper induction (LP2)
- Extensible
  - Adapted in 2004 for PANKOW algorithm
  - Disambiguation by maximal evidence
  - Proper nouns + ontology -> linguistic phrases
Source: kcap2001-annotate-sub.pdf

17 MUSE (2003)
- Pipeline of processing resources (PRs)
  - PRs called conditionally based on text attributes
- Makes use of JAPE
  - Adaptive rules
  - Can link multiple resources together
  - Gazetteer + part-of-speech tagger
  - Resolve entity ambiguities
Source:
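A pipeline whose processing resources run conditionally on text attributes might look like the sketch below. It is an assumption-laden illustration rather than MUSE itself: the only text attribute inspected is whether the text is entirely lowercase, and the gazetteer contents, resource functions, and `run` helper are all hypothetical.

```python
GAZETTEER = {"Acme", "Smith"}  # hypothetical entity list


def tokenizer(doc):
    doc["tokens"] = [t.strip(".,") for t in doc["text"].split()]


def cased_gazetteer(doc):
    # Exact-case lookup, suitable for normally capitalized text.
    doc["entities"] = [t for t in doc["tokens"] if t in GAZETTEER]


def caseless_gazetteer(doc):
    # Case-insensitive lookup, for text that has lost its capitalization.
    lowered = {g.lower() for g in GAZETTEER}
    doc["entities"] = [t for t in doc["tokens"] if t.lower() in lowered]


# Each entry is (condition on the document, processing resource).
PIPELINE = [
    (lambda doc: True, tokenizer),
    (lambda doc: not doc["text"].islower(), cased_gazetteer),
    (lambda doc: doc["text"].islower(), caseless_gazetteer),
]


def run(text, pipeline=PIPELINE):
    """Run each processing resource whose condition holds for the document."""
    doc = {"text": text, "ran": []}
    for condition, resource in pipeline:
        if condition(doc):
            resource(doc)
            doc["ran"].append(resource.__name__)
    return doc
```

For `run("acme hired smith.")` only the tokenizer and the caseless gazetteer execute, while `run("Acme hired Smith.")` takes the cased path instead. Expressing the conditions as data keeps the choice of resources out of the resources themselves.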

18 SemTag (2003)
- Large-scale annotation
  - Annotations separate from source
  - “Semantic Label Bureau”
- Uses the TAP taxonomy
- Approach is:
  - Find match to label in taxonomy
  - Save window before & after match
  - Perform disambiguation
- Main contribution is using taxonomy for disambiguation
Source: resources/semtag.pdf
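The three-step approach on this slide (match a label, save a window, disambiguate) can be sketched as below. This is a simplified stand-in, not SemTag's actual disambiguation algorithm: it scores each candidate taxonomy node by plain word overlap with the saved window. The taxonomy entries, node paths, and context vocabularies are invented for the example and are not real TAP data.

```python
# Hypothetical taxonomy: label -> {node path: words typical of that node}.
TAXONOMY = {
    "Jaguar": {
        "Animal/Cat/Jaguar": {"cat", "jungle", "prey", "species"},
        "Product/Car/Jaguar": {"car", "engine", "drive", "dealer"},
    },
}


def spot(tokens, label, window=3):
    """Yield (position, context) for each occurrence of label, where the
    context is the window of tokens before and after the match."""
    for i, tok in enumerate(tokens):
        if tok == label:
            before = tokens[max(0, i - window):i]
            after = tokens[i + 1:i + 1 + window]
            yield i, before + after


def disambiguate(context, candidates):
    """Pick the node whose vocabulary overlaps most with the context,
    or None when no candidate has any supporting evidence."""
    words = {w.lower() for w in context}
    scores = {node: len(words & vocab) for node, vocab in candidates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def tag(text, window=3):
    tokens = text.replace(".", "").replace(",", "").split()
    results = []
    for label, candidates in TAXONOMY.items():
        for position, context in spot(tokens, label, window):
            results.append((position, label, disambiguate(context, candidates)))
    return results
```

On `"The dealer said the Jaguar engine is new."` the window contains `dealer` and `engine`, so the car node wins; on `"A Jaguar stalks its prey in the jungle."` the window contains `prey`, so the animal node wins. Returning `None` when nothing overlaps lets a caller skip the annotation instead of guessing.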

19 Platform Effectiveness
*as reported by platform authors

20 Summary
- Several platforms developed in the last several years
  - Large implementation effort; many services
- Differentiated by
  - IE methods used
  - Services provided
- Future IE integration will likely improve annotation accuracy
- Extension of existing platforms will allow for quicker research

