deepschema.org: An Ontology for Typing Entities in the Web of Data

deepschema.org: An Ontology for Typing Entities in the Web of Data Panayiotis Smeros, Amit Gupta, Michele Catasta and Karl Aberer LDOW ’17. 3 April, 2017. Perth, WA, Australia.

What is deepschema.org?
- Class hierarchy, taxonomy
- Describes all the possible environments that an entity exists in
- Comprises information from Wikidata and schema.org
- Main features:
  - generic, cross-domain
  - rich, evolving as fast as the Web
  - traversable
  - accurate
03/04/2017 deepschema.org: An Ontology for Typing Entities in the Web of Data

Related Work
Features considered for each source: rich, evolving as fast as the Web; traversable; accurate.
- YAGO Taxonomy
  Example: <wikicat_People_murdered_in_British_Columbia> rdfs:subClassOf <wordnet_person_100007846> .
- DBpedia Ontology: manually constructed; only 685 classes
- Wikidata Class Hierarchy: crowdsourced schema; no tree structure
- schema.org: manually constructed; used by billions of web pages
- deepschema.org

Wikidata Class Hierarchy

Wikidata Class Hierarchy
Extraction Phase
- RDFS entailment rules:
  - A rdf:type B ⇒ class(B)
  - A rdfs:subClassOf B ⇒ class(A) ∧ class(B) ∧ subclass(A, B)
- Mapping to Wikidata properties:
  - rdfs:subClassOf ≡ P279 (subclass of)
  - rdf:type ≡ P31 (instance of)
Filtering Phase
- Goal: keep deepschema.org generic
- Filtered out:
  - Ontologies from domain-specific KBs
  - Classes with no English label
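The two entailment rules and the English-label filter can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the triple format, the function names and the `ex:` identifiers are assumptions, and the filter for domain-specific KBs is left out because the slides do not say how those are detected.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Edge = Tuple[str, str]

INSTANCE_OF = "P31"   # Wikidata "instance of", read as rdf:type
SUBCLASS_OF = "P279"  # Wikidata "subclass of", read as rdfs:subClassOf


def extract_hierarchy(triples: Iterable[Triple]) -> Tuple[Set[str], Set[Edge]]:
    """Apply the two entailment rules to (subject, property, object) triples."""
    classes: Set[str] = set()
    subclass_edges: Set[Edge] = set()
    for subject, prop, obj in triples:
        if prop == INSTANCE_OF:
            # A rdf:type B  =>  class(B)
            classes.add(obj)
        elif prop == SUBCLASS_OF:
            # A rdfs:subClassOf B  =>  class(A), class(B), subclass(A, B)
            classes.update((subject, obj))
            subclass_edges.add((subject, obj))
    return classes, subclass_edges


def keep_english_labelled(
    classes: Set[str], subclass_edges: Set[Edge], english_labels: Dict[str, str]
) -> Tuple[Set[str], Set[Edge]]:
    """Drop classes without an English label, and edges that touch them."""
    kept = {c for c in classes if c in english_labels}
    kept_edges = {(a, b) for a, b in subclass_edges if a in kept and b in kept}
    return kept, kept_edges


# Hypothetical toy data; the "ex:" identifiers are not real Wikidata items.
toy_triples = [
    ("ex:rex", "P31", "ex:Dog"),
    ("ex:Dog", "P279", "ex:Animal"),
    ("ex:Animal", "P279", "ex:Unlabelled"),
]
toy_labels = {"ex:Dog": "dog", "ex:Animal": "animal"}

all_classes, all_edges = extract_hierarchy(toy_triples)
classes, edges = keep_english_labelled(all_classes, all_edges, toy_labels)
```

Note that an instance such as `ex:rex` never becomes a class on its own; only the object of an `instance of` statement does.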

Structure of the Class Hierarchy
- Classes: 123,033
- Subclass relations: 126,688
- Disconnected subgraphs: 4,263
- Root classes: 14,084
- Leaf classes: 102,434
- Avg. number of subclasses per class: 1.03
- Avg. depth of hierarchy: 7.93
One subgraph contains 96% of the total classes and 97% of the total subclass relations.
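Most of these figures can be derived from the list of subclass edges alone. The sketch below shows one way to do so, under assumed definitions that the slides do not spell out: a root is a class with no superclass, a leaf is a class with no subclass, and a disconnected subgraph is a weakly connected component. Under the same reading, the average number of subclasses per class is simply edges divided by classes, which agrees with the slide: 126,688 / 123,033 ≈ 1.03.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def hierarchy_stats(classes: Set[str], subclass_edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute basic structural statistics from (child, parent) subclass edges."""
    edges = set(subclass_edges)
    has_parent: Set[str] = set()
    has_child: Set[str] = set()
    neighbours = defaultdict(set)
    for child, parent in edges:
        has_parent.add(child)
        has_child.add(parent)
        neighbours[child].add(parent)
        neighbours[parent].add(child)

    roots = classes - has_parent   # assumed definition: no superclass
    leaves = classes - has_child   # assumed definition: no subclass

    # Count weakly connected components with an iterative depth-first search.
    seen: Set[str] = set()
    components = 0
    for start in classes:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)

    return {
        "classes": len(classes),
        "subclass_relations": len(edges),
        "disconnected_subgraphs": components,
        "root_classes": len(roots),
        "leaf_classes": len(leaves),
        "avg_subclasses_per_class": len(edges) / len(classes) if classes else 0.0,
    }


# Hypothetical toy hierarchy: B and C are subclasses of A, D is isolated.
stats = hierarchy_stats({"A", "B", "C", "D"}, [("B", "A"), ("C", "A")])
```

With these definitions an isolated class such as `D` counts as both a root and a leaf.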

Number of Instances per Wikidata Class
- The first class with more direct than inherited instances is human

Language Coverage of Wikidata Classes
- Support for 350 languages
- For a non-English class hierarchy we get at least ~50% loss of information

Number of Classes per Wikidata External Contributor
- 50% of the classes have no provenance information

Integration with schema.org

Integration Heuristics
(w = Wikidata class, s = schema.org class)
- Exact Match: (language)w → (Language)s
- Lemma Match: (languages)w → (Language)s
- Head Match: (Kalapuyan languages)w → (Language)s
- Vector Cosine Similarity: (warehouse)w → (Store)s
- Head Similarity: (Survey motor boat)w → (Vessel)s
- Subclass/Instance Similarity
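The three label-based heuristics (exact, lemma and head match) can be illustrated with the examples from the slide. This is a rough sketch rather than the authors' implementation: `naive_lemma` is a crude plural stripper standing in for a real lemmatizer, the head of a label is assumed to be its last word, and all function names are hypothetical.

```python
from typing import Optional


def naive_lemma(word: str) -> str:
    """Very rough stand-in for a lemmatizer: lower-case and strip a plural ending."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def head(label: str) -> str:
    """Assume the head of an English noun phrase is its last word."""
    return label.split()[-1]


def match_kind(wikidata_label: str, schema_label: str) -> Optional[str]:
    """Return the first label heuristic that links the two labels, if any."""
    w, s = wikidata_label.strip(), schema_label.strip()
    if not w or not s:
        return None
    if w.lower() == s.lower():
        return "exact"
    if naive_lemma(w) == naive_lemma(s):
        return "lemma"
    if naive_lemma(head(w)) == naive_lemma(s):
        return "head"
    return None


results = {
    "language": match_kind("language", "Language"),
    "languages": match_kind("languages", "Language"),
    "Kalapuyan languages": match_kind("Kalapuyan languages", "Language"),
    "warehouse": match_kind("warehouse", "Store"),
}
```

A pair such as warehouse and Store falls through all three checks, which is where the similarity-based heuristics on the slide take over.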

Resulting pairs
Cosine similarity threshold    # of pairs
0.5                            15,112
0.6                            8,494
0.7                            7,329
0.8                            6,120
0.9                            5,586
[Figure: diagram with the labels schema.org, Wikidata, deepschema.org, subClass, equivalentClass]
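The table shows fewer pairs surviving as the cosine similarity threshold rises. The sketch below shows the thresholding step on its own. The three-dimensional vectors are invented for illustration and the function names are assumptions; the slides do not say which embedding model was used.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairs_above(
    wikidata_vectors: Dict[str, Sequence[float]],
    schema_vectors: Dict[str, Sequence[float]],
    threshold: float,
) -> List[Tuple[str, str, float]]:
    """Keep every (Wikidata class, schema.org class) pair at or above the threshold."""
    pairs = []
    for w_label, w_vec in wikidata_vectors.items():
        for s_label, s_vec in schema_vectors.items():
            score = cosine_similarity(w_vec, s_vec)
            if score >= threshold:
                pairs.append((w_label, s_label, score))
    return pairs


# Hypothetical toy vectors, not taken from any real embedding model.
wikidata_vectors = {"warehouse": [0.9, 0.1, 0.2]}
schema_vectors = {"Store": [0.8, 0.2, 0.3], "Vessel": [0.1, 0.9, 0.1]}

loose = pairs_above(wikidata_vectors, schema_vectors, 0.2)
strict = pairs_above(wikidata_vectors, schema_vectors, 0.5)
```

Raising the threshold from 0.2 to 0.5 drops the warehouse and Vessel pair while keeping warehouse and Store, the same trade-off between coverage and precision that the table reports at scale.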

Evaluation (Accuracy, Traversability and Genericity)

Accuracy: Crowdsourcing Evaluation
- CrowdFlower platform
- ~100 workers
- Majority voting (2 out of 3)
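Majority voting with three judgements per item can be sketched as follows. The label names ("correct", "incorrect"), the pair identifiers and the way accuracy is aggregated are assumptions made for illustration; the slides state only that a 2-out-of-3 majority was used.

```python
from collections import Counter
from typing import Dict, List, Optional


def majority_vote(judgements: List[str], required: int = 2) -> Optional[str]:
    """Return the label chosen by at least `required` workers, or None."""
    if not judgements:
        return None
    label, count = Counter(judgements).most_common(1)[0]
    return label if count >= required else None


def accuracy(votes_per_pair: Dict[str, List[str]]) -> float:
    """Share of pairs whose majority label is 'correct' (hypothetical label name)."""
    if not votes_per_pair:
        return 0.0
    decided = [majority_vote(votes) for votes in votes_per_pair.values()]
    return sum(1 for label in decided if label == "correct") / len(decided)


# Hypothetical judgements for four class pairs, three workers each.
votes = {
    "pair-1": ["correct", "correct", "incorrect"],
    "pair-2": ["correct", "correct", "correct"],
    "pair-3": ["incorrect", "incorrect", "correct"],
    "pair-4": ["correct", "incorrect", "correct"],
}
score = accuracy(votes)
```

With two possible labels and three workers there is always a majority, so `majority_vote` returns None only for an empty list or when a third label splits the vote.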

Accuracy: Crowdsourcing Evaluation
- Wikidata: 92%
- schema.org: 100% (assumed)
- YAGO (Wikipedia Categories - WordNet): 95%
- PARIS (YAGO - DBpedia): 90%
- Comparable with integration techniques for similar data sources

Traversability
- False class definition → partOf relations interpreted as subclassOf
- Topic not covered by schema.org → class Child Abuse

Genericity
- Oxford English Dictionary: 3000 most frequent English words
- Nouns and noun phrases
- 81% coverage

Conclusions & Future Work
- Integrate more data sources
  - Facebook’s Open Graph
- Include customizable filters
  - Control the granularity of the ontology
  - Choose specific topics of interest
- Employ deepschema.org in use-cases
  - Discover the appropriate type (class) of an entity, given a context
- deepschema.org: generic, cross-domain; rich, evolving as fast as the Web; traversable; accurate

Thanks for your attention! Questions?