LOD for the Rest of Us Tim Finin, Anupam Joshi, Varish Mulwad and Lushan Han University of Maryland, Baltimore County 15 March 2012

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

CILC2011 A framework for structured knowledge extraction and representation from natural language via deep sentence analysis Stefania Costantini Niva Florio.
Lukas Blunschi Claudio Jossen Donald Kossmann Magdalini Mori Kurt Stockinger.
Logics for Data and Knowledge Representation Projects and thesis introduction.
Linked data: P redicting missing properties Klemen Simonic, Jan Rupnik, Primoz Skraba {klemen.simonic, jan.rupnik,
RDF and RDB 1 Some slides adapted from a presentation by Ivan Herman at the Semantic Technology & Business Conference, 2012.
Knowledge Graph: Connecting Big Data Semantics
Chapter 3 Querying RDF stores with SPARQL. TL;DR We will want to query large RDF datasets, e.g. LOD SPARQL is the SQL of RDF SPARQL is a language to query.
SmartER Semantic Cloud Sevices Karuna P Joshi University of Maryland, Baltimore County Advisors: Dr. Tim Finin, Dr. Yelena Yesha.
Research topics Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology.
Semantic Search Jiawei Rong Authors Semantic Search, in Proc. Of WWW Author R. Guhua (IBM) Rob McCool (Stanford University) Eric Miller.
Semantics For the Semantic Web: The Implicit, the Formal and The Powerful Amit Sheth, Cartic Ramakrishnan, Christopher Thomas CS751 Spring 2005 Presenter:
Ontology-Based Free-Form Query Processing for the Semantic Web Mark Vickers Brigham Young University MS Thesis Defense Supported by:
Module 2b: Modeling Information Objects and Relationships IMT530: Organization of Information Resources Winter, 2007 Michael Crandall.
DartGrid Browser-based mapping tool of SQL to RDF Point Template Zhejiang University & OpenLink Software.
OMAP: An Implemented Framework for Automatically Aligning OWL Ontologies SWAP, December, 2005 Raphaël Troncy, Umberto Straccia ISTI-CNR
RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.
Databases From A to Boyce Codd. What is a database? It depends on your point of view. For Manovich, a database is a means of structuring information in.
LOD 123: Making the semantic web easier to use Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi.
Information Extraction with Linked Life Data 19/04/2011.
Querying RDF Data with Text Annotated Graphs Lushan Han, Tim Finin, Anupam Joshi and Doreen Cheng SSDBM’15 
Semantic Web outlook and trends May The Past 24 Odd Years 1984 Lenat’s Cyc vision 1989 TBL’s Web vision 1991 DARPA Knowledge Sharing Effort 1996.
Linked DataTables Automatically Generating Linked Data from Tables Varish Mulwad University of Maryland, Baltimore County November 15, 2011.
INF 384 C, Spring 2009 Ontologies Knowledge representation to support computer reasoning.
PLATFORM INDEPENDENT SOFTWARE DEVELOPMENT MONITORING Mária Bieliková, Karol Rástočný, Eduard Kuric, et. al.
Tables to Linked Data Zareen Syed, Tim Finin, Varish Mulwad and Anupam Joshi University of Maryland, Baltimore County
Lushan Han, Tim Finin, Cynthia Parr, Joel Sachs, and Anupam Joshi RDF123: from Spreadsheets to RDF.
Author: William Tunstall-Pedoe Presenter: Bahareh Sarrafzadeh CS 886 Spring 2015.
Publishing and Interacting with Linked Data Roberto Garcia, Josep Maria Brunetti, Antonio López-Muzás, Juan Manuel Gimeno, Rosa Gil WIMS’11 Conference,
2014-May-07. What is the problem? What have others done? What is our solution? Does it work? Outline 2.
Master Informatique 1 Semantic Technologies Part 11Direct Mapping Werner Nutt.
-1- Philipp Heim, Thomas Ertl, Jürgen Ziegler Facet Graphs: Complex Semantic Querying Made Easy Philipp Heim 1, Thomas Ertl 1 and Jürgen Ziegler 2 1 Visualization.
RDF and triplestores CMSC 461 Michael Wilson. Reasoning  Relational databases allow us to reason about data that is organized in a specific way  Data.
Q2Semantic: A Lightweight Keyword Interface to Semantic Search Haofen Wang 1, Kang Zhang 1, Qiaoling Liu 1, Thanh Tran 2, and Yong Yu 1 1 Apex Lab, Shanghai.
The Semantic Web: there and back again
Natural Language Questions for the Web of Data 1 Mohamed Yahya, Klaus Berberich, Gerhard Weikum Max Planck Institute for Informatics, Germany 2 Shady Elbassuoni.
T2LD – An automatic framework for extracting, interpreting and representing tables as linked data Varish Mulwad Master’s Thesis Defense Advisor: Dr. Tim.
Government Linked Data Tables Automatically Generating Government Linked Data from Tables Varish Mulwad University of Maryland, Baltimore County.
PHS / Department of General Practice Royal College of Surgeons in Ireland Coláiste Ríoga na Máinleá in Éirinn Knowledge representation in TRANSFoRm AMIA.
Introduction to the Semantic Web and Linked Data Module 1 - Unit 2 The Semantic Web and Linked Data Concepts 1-1 Library of Congress BIBFRAME Pilot Training.
Using linked data to interpret tables Varish Mulwad September 14,
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Shridhar Bhalerao CMSC 601 Finding Implicit Relations in the Semantic Web.
Using linked data to interpret tables Varish Mulwad, Tim Finin, Zareen Syed and Anupam Joshi University of Maryland, Baltimore County November 8, 2010.
Issues in Ontology-based Information integration By Zhan Cui, Dean Jones and Paul O’Brien.
Steven Seida How Does an RDF Knowledge Store Compare to an RDBMS?
A Semantic Web Approach for the Third Provenance Challenge Tetherless World Rensselaer Polytechnic Institute James Michaelis, Li Ding,
DeepDive Model Dongfang Xu Ph.D student, School of Information, University of Arizona Dec 13, 2015.
Chapter 3 Querying RDF stores with SPARQL
CityStateMayorPopulation BaltimoreMDS.C.Rawlings-Blake637,418 SeattleWAM.McGinn617,334 BostonMAT.Menino645,169 RaleighNCC.Meeker405,791 We are laying a.
RDF and Relational Databases
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
Linked Data inferringsemanticstables Generating Linked Data by inferring the semantics of tables Varish Mulwad University of Maryland, Baltimore.
Linked Data for the Rest of Us Tim Finin, Varish Mulwad and Lushan Han University of Maryland, Baltimore County 12 January 2012
Making Software Agents Smarter Tim Finin University of Maryland, Baltimore County ICAART 2010, 22 January 2010
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
Semantic Interoperability in GIS N. L. Sarda Suman Somavarapu.
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
Of 24 lecture 11: ontology – mediation, merging & aligning.
Semantic Graph Mining for Biomedical Network Analysis: A Case Study in Traditional Chinese Medicine Tong Yu HCLS
RDF and RDB 1 Some slides adapted from a presentation by Ivan Herman at the Semantic Technology & Business Conference, 2012.
Wikitology Wikipedia as an Ontology
Text Based Similarity Metrics and Delta for Semantic Web Graphs
Lecture 12: Data Wrangling
UMBC AN HONORS UNIVERSITY IN MARYLAND
Property consolidation for entity browsing
DBpedia 2014 Liang Zheng 9.22.
Information Networks: State of the Art
Template-based Question Answering over RDF Data
Semantic-Web, Triple-Strores, and SPARQL
Presentation transcript:

LOD for the Rest of Us Tim Finin, Anupam Joshi, Varish Mulwad and Lushan Han University of Maryland, Baltimore County 15 March

TL;DR Version Linked Open Data is hard to create Linded Open Data is hard to query Two ongoing UMBC dissertations hope to make it easier – Varish Mulwad: Generating linked data from tables – Lushan Han: Querying linked data with a quasi-NL interface Both need statistics on large amounts of LOD data and/or text 2

Generating Linked Data by Inferring the Semantics of Tables Generating Linked Data by Inferring the Semantics of Tables Research by Varish Mulwad 3

Goal: Table => LOD* nalBasketballAssociationTeams Player height in meters dbprop:team * DBpedia 4

Goal: Table => yago:. is rdfs:label of dbpedia-owl:BasketballPlayer. is rdfs:label of yago:NationalBasketballAssociationTeams. "Michael is rdfs:label of dbpedia:Michael Jordan. dbpedia:Michael Jordan a dbpedia-owl:BasketballPlayer. "Chicago is rdfs:label of dbpedia:Chicago Bulls. dbpedia:Chicago Bulls a yago:. is rdfs:label of dbpedia-owl:BasketballPlayer. is rdfs:label of yago:NationalBasketballAssociationTeams. "Michael is rdfs:label of dbpedia:Michael Jordan. dbpedia:Michael Jordan a dbpedia-owl:BasketballPlayer. "Chicago is rdfs:label of dbpedia:Chicago Bulls. dbpedia:Chicago Bulls a yago:NationalBasketballAssociationTeams. RDF Linked Data All this in a completely automated way * DBpedia 5

2010 Preliminary Heuristic System Class prediction for column: 77% Entity Linking for table cells: 66% Examples of class label prediction results: Column – Nationality Prediction – MilitaryConflict Column – Birth Place Prediction – PopulatedPlace Examples of class label prediction results: Column – Nationality Prediction – MilitaryConflict Column – Birth Place Prediction – PopulatedPlace Predict Class for Columns Linking the table cells Identify and Discover relations T2LD Framework 6

KB m,n,o Joint Assignment/ Inference Model Joint Assignment/ Inference Model KB Domain Knowledge – Linked Data Cloud / Medical Domain / Open Govt. Domain Query Linked Data Domain Independent Framework Generate a set of classes/entities x,y,z a,b,c 7

One Challenge: Interpreting Literals Population 690, , , ,000 Age Population? Profit in $K ? Age in years? Percent? Many columns have literals, e.g., numbers Predict properties based on cell values Cyc had hand coded rules: humans don’t live past 120 We extract value distributions from LOD resources Differ for subclasses, e.g., age of people vs. political leaders vs. professional athletes Represent as measurements: value + units Metric: possibility/probability of values given distribution 8

GoRelations: Intuitive Query System for Linked Data GoRelations: Intuitive Query System for Linked Data Research By Lushan Han 9

Querying LOD is Too Hard Querying DBpedia requires a lot of a user – Understand the RDF model – Master SPARQL, a formal query language – Understand ontology terms: 320 classes & 1600 properties ! – Know instance URIs (>1M entities !) – Term heterogeneity (Place vs. PopulatedPlace) Querying large LOD sets overwhelming Natural language query systems still a research goal 10

A Pragmatic Solution Goal: allow users with a basic RDF understanding to query LOD collections – To explore what data is available – To get answers to questions – To create SPARQL queries for reuse or adaptation Desiderata: Easy, Accurate, Fast Key idea: Reduce problem complexity by having (1) User enter a simple graph, and (2) Annotate it words and phrases 11

Semantic Graph Interface Nodes denote entities and links binary relations Entities described by two unrestricted terms: name or value and type or concept Result entities marked with ? and those not with * A compromise between a natural language Q&A system and SPARQL – Users provide compositional structure of the question – Free to use their own terms in annotating the structure 12

Step 1: Find Terms For each concept or relation in the graph, generate the k most semantically similar candidate ontology classes or properties Similarity metric based on distributional similarity, LSA, and WordNet. 13

Another Example Football players who were born in the same place as their team’s president 14

Assemble the best interpretation using statistics of the RDF data Primary measure is pointwise mutual informa- tion (PMI) between RDF terms in LOD collection This measures the degree to which two RDF terms or types occur together in the knowledge base In good interpretations, ontology terms associate in the way that their corresponding user terms connect in the semantic graph Step 2: Disambiguate 15

Example of Translation result Concepts: Place => Place, Author => Writer, Book => Book Properties: born in => birthPlace, wrote => author (inverse direction) 16

The translation of a semantic graph query to SPARQL is straightforward given the mappings Step3: SPARQL Generation Concepts Place => Place Author => Writer Book => Book Relations born in => birthPlace wrote => author 17

Preliminary Evaluation 33 test questions from 2011 Workshop on Question Answering over Linked Data answerable using DBpedia Three human subjects unfamiliar with DBpedia translated the test questions into semantic graph queries Compared with two top natural language QA systems: PowerAqua and True Knowledge PowerAquaTrue Knowledge 18

19

Final Conclusions Linked Data is an emerging paradigm for sharing structured and semi-structured data – Backed by machine-understandable semantics – Based on successful Web languages and protocols Generating and exploring Linked Data resources can be challenging – Schemas are large, too many URIs New tools for mapping tables to Linked Data and translating structured natural language queries help reduce the barriers 20