ANHAI DOAN ALON HALEVY ZACHARY IVES Chapter 12: Ontologies and Knowledge Representation PRINCIPLES OF DATA INTEGRATION.

Slides:



Advertisements
Similar presentations
Charting the Potential of Description Logic for the Generation of Referring Expression SELLC, Guangzhou, Dec Yuan Ren, Kees van Deemter and Jeff.
Advertisements

CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.
OWL - DL. DL System A knowledge base (KB) comprises two components, the TBox and the ABox The TBox introduces the terminology, i.e., the vocabulary of.
An Introduction to Description Logics
An Introduction to RDF(S) and a Quick Tour of OWL
1 A Description Logic with Concrete Domains CS848 presentation Presenter: Yongjuan Zou.
Of 27 lecture 7: owl - introduction. of 27 ece 627, winter ‘132 OWL a glimpse OWL – Web Ontology Language describes classes, properties and relations.
1 Ontology Language Comparisons doug foxvog 16 September 2004.
An Extensible System for Merging Two Models Rachel Pottinger University of Washington Supervisors: Phil Bernstein and Alon Halevy.
Chapter 8: Web Ontology Language (OWL) Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.
Constraint Logic Programming Ryan Kinworthy. Overview Introduction Logic Programming LP as a constraint programming language Constraint Logic Programming.
Firewall Policy Queries Author: Alex X. Liu, Mohamed G. Gouda Publisher: IEEE Transaction on Parallel and Distributed Systems 2009 Presenter: Chen-Yu Chang.
Description Logics. Outline Knowledge Representation Knowledge Representation Ontology Language Ontology Language Description Logics Description Logics.
DL systems DL and the Web Ilie Savga
Part 6: Description Logics. Languages for Ontologies In early days of Artificial Intelligence, ontologies were represented resorting to non-logic-based.
FiRE Fuzzy Reasoning Engine Nikolaos Simou National Technical University of Athens.
A Really Brief Crash Course in Semantic Web Technologies Rocky Dunlap Spencer Rugaber Georgia Tech.
Set theory Sets: Powerful tool in computer science to solve real world problems. A set is a collection of distinct objects called elements. Traditionally,
RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.
An Introduction to Description Logics. What Are Description Logics? A family of logic based Knowledge Representation formalisms –Descendants of semantic.
1 MASWS Multi-Agent Semantic Web Systems: OWL Stephen Potter, CISA, School of Informatics, University of Edinburgh, Edinburgh, UK.
Okech Odhiambo Faculty of Information Technology Strathmore University
INF 384 C, Spring 2009 Ontologies Knowledge representation to support computer reasoning.
Ming Fang 6/12/2009. Outlines  Classical logics  Introduction to DL  Syntax of DL  Semantics of DL  KR in DL  Reasoning in DL  Applications.
Building an Ontology of Semantic Web Techniques Utilizing RDF Schema and OWL 2.0 in Protégé 4.0 Presented by: Naveed Javed Nimat Umar Syed.
OWL 2 in use. OWL 2 OWL 2 is a knowledge representation language, designed to formulate, exchange and reason with knowledge about a domain of interest.
Michael Eckert1CS590SW: Web Ontology Language (OWL) Web Ontology Language (OWL) CS590SW: Semantic Web (Winter Quarter 2003) Presentation: Michael Eckert.
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Dimitrios Skoutas Alkis Simitsis
LDK R Logics for Data and Knowledge Representation ClassL (part 3): Reasoning with an ABox 1.
An Introduction to Description Logics (chapter 2 of DLHB)
Semantic web course – Computer Engineering Department – Sharif Univ. of Technology – Fall Description Logics: Logic foundation of Semantic Web Semantic.
Semantic Web - an introduction By Daniel Wu (danielwujr)
Advanced topics in software engineering (Semantic web)
LDK R Logics for Data and Knowledge Representation Description Logics (ALC)
EEL 5937 Ontologies EEL 5937 Multi Agent Systems Lecture 5, Jan 23 th, 2003 Lotzi Bölöni.
More on Description Logic(s) Frederick Maier. Note Added 10/27/03 So, there are a few errors that will be obvious to some: So, there are a few errors.
1 Artificial Intelligence Applications Institute Centre for Intelligent Systems and their Applications Stuart Aitken Artificial Intelligence Applications.
Artificial Intelligence 2004 Ontology
Organization of the Lab Three meetings:  today: general introduction, first steps in Protégé OWL  November 19: second part of tutorial  December 3:
DL Overview Second Pass Ming Fang 06/19/2009. Outlines  Description Languages  Knowledge Representation in DL  Logical Inference in DL.
LDK R Logics for Data and Knowledge Representation ClassL (Propositional Description Logic with Individuals) 1.
The Semantic Web Riccardo Rosati Dottorato in Ingegneria Informatica Sapienza Università di Roma a.a. 2006/07.
ONTOLOGY ENGINEERING Lab #2 – September 8,
1 First order theories (Chapter 1, Sections 1.4 – 1.5) From the slides for the book “Decision procedures” by D.Kroening and O.Strichman.
Of 38 lecture 6: rdf – axiomatic semantics and query.
Description Logics Dr. Alexandra I. Cristea. Description Logics Description Logics allow formal concept definitions that can be reasoned about to be expressed.
ece 627 intelligent web: ontology and beyond
Knowledge Representation and Reasoning University "Politehnica" of Bucharest Department of Computer Science Fall 2011 Adina Magda Florea
Presented by Kyumars Sheykh Esmaili Description Logics for Data Bases (DLHB,Chapter 16) Semantic Web Seminar.
Of 29 lecture 15: description logic - introduction.
LDK R Logics for Data and Knowledge Representation Description Logics: family of languages.
Ontology Technology applied to Catalogues Paul Kopp.
Ccs.  Ontologies are used to capture knowledge about some domain of interest. ◦ An ontology describes the concepts in the domain and also the relationships.
Semantics of RDF(S) and OWL Ivan Herman, W3C (Last update: 28 Sep 2007) Ivan Herman.
1 Representing and Reasoning on XML Documents: A Description Logic Approach D. Calvanese, G. D. Giacomo, M. Lenzerini Presented by Daisy Yutao Guo University.
OWL (Ontology Web Language and Applications) Maw-Sheng Horng Department of Mathematics and Information Education National Taipei University of Education.
The Semantic Web By: Maulik Parikh.
Knowledge Representation Part II Description Logic & Introduction to Protégé Jan Pettersen Nytun.
ece 720 intelligent web: ontology and beyond
Semantic Web Foundations
Introduction to Description Logics
ece 720 intelligent web: ontology and beyond
Logics for Data and Knowledge Representation
Logics for Data and Knowledge Representation
Logics for Data and Knowledge Representation
Logics for Data and Knowledge Representation
Representations & Reasoning Systems (RRS) (2.2)
CIS Monthly Seminar – Software Engineering and Knowledge Management IS Enterprise Modeling Ontologies Presenter : Dr. S. Vasanthapriyan Senior Lecturer.
Presentation transcript:

ANHAI DOAN ALON HALEVY ZACHARY IVES Chapter 12: Ontologies and Knowledge Representation PRINCIPLES OF DATA INTEGRATION

Outline  Introduction to Knowledge Representation and its relevance to data integration  Description Logics: a family of KR languages  The Semantic Web and its languages

Knowledge Representation  Knowledge representation (KR) focuses on more expressive languages that database schemata and integrity constraints:  Designed for artificial intelligence applications (e.g., natural language understanding, planning) where complex relationships exist between objects.  KR uses ontologies to represent relationships between elements in a knowledge base.  KR is relevant to data integration because relationships between data sources can be complex.  The use of KR in data integration was investigated since the early days of the field.

KR in Data Integration: Example Mediated schema: ontology with classes and relationships Data sources: S1 has comedies and S2 documentaries S3 : movies with at least two awards S4 : comedies with at least one Oscar

Example: Part 1 S1 is relevant to Q 1 because Comedy is a subclass of Movie (by subsumption)

Example: Part 2 S2 is irrelevant to Q 2 because Comedy and Documentary are declared disjoint.

Example: Part 3(a) S3 is relevant to Q 3 because movies with two awards will definitely satisfy the second subgoal.

Example: Part 3(b) S4 is relevant to Q 3 because oscar is a sub-property of award.

Outline Introduction to Knowledge Representation and its relevance to data integration  Description Logics: a family of KR languages  The Semantic Web and its languages

Description Logics: Introduction  Description Logics are a subset of first-order logic:  Only unary predicates (called concepts) and binary predicates (called roles, properties).  Knowledge bases are composed of:  T-box: defining the concepts and the roles  A-box: including ground facts about individuals  Complex concepts are defined by concept descriptions:  The expressive power of the language is determined by the set of constructors in the grammar of concept descriptions  Complex roles can also be defined via constructors

T-Boxes  Can include statements of the form:  A is a base concept and C can be a concept description.  Example grammar for concept descriptions – see next slide. (it should really be a square inclusion)

An example Grammar for Concept Descriptions. C,D are complex concepts. A is a base concept. Many other constructors possible: union, existential quantification, equality on role paths,…

Example Terminology a 1 : Italians are people (really. Don’t laugh!) a 2 : Comedies are movies a 3 : Comedies are disjoint from documentaries a 4 : Movies have at most one director a 5 : Award movies are those that have at least one award a 6 : Italian hits are award movies whose director is Italian

Abox: the Ground Facts  A set of assertions of the form C(a), or R(a,b)  b is called an R-filler of a.  C and R can be concept descriptions  Akin to asserting that a tuple is in a view rather than in base relations  Below, we state that LaStrada is an Italian hit, we’re not given the director or the award it won.

Semantics of Description Logics  Semantics are based on interpretations.  Given a knowledge base Δ, the models of Δ are the interpretations that are consistent Δ’s T-box and A- box.  Any fact that is true in all models of Δ are said to be entailed by Δ.

Interpretations: Formally  An interpretation I contains a non-empty domain of objects, O I.  I assigns an object a I in to every constant a in the A- box.  We make the unique names assumption: a≠b implies that a I ≠b I  I assigns C I, a subset of O I, to every concept C  I assigns a binary relation R I, a subset of O I x O I to every role R.

Extensions of Complex Expressions The extensions of concept and role descriptions are given by the following equations. (#S denotes the cardinality of the set S).

Conditions on Models  An interpretation of Δ is a model if the following conditions hold:

Example Interpretation Assume an interpretation with the identity mapping on individuals in the knowledge base and a few extra elements ( Director1, Award1, Actor2, …). The following interpretation is a model:

Example Interpretation  Notes:  We do not know the director of LaStrada or its award.  Removing LifeIsBeautiful from Comedy would make it a non-model.  Adding another director would also make it a non-model.

Inference in Description Logics  This is where all the action is: coming up with efficient algorithms for inference and understanding the complexity of the problems.  Subsumption (only for the T-box):  A concept C is said to be subsumed by concept D w.r.t. a T- box T, if in every model, I, of T,  Examples: is subsumed by

Query answering with DLs  The simple case – instance checking:  Does Δ entail C(a) or R(a,b)? i.e., does C(a)/R(a,b) hold in every model of Δ?  The more general problem is query answering. Find the answers to a conjunctive query  where g 1,…, g n can be concept and/or role names.

Semantics of Conjunctive Queries Compute the answer to Q in every model of Δ Any tuple that is in the intersection of the answers is entailed by Δ. This should remind you of the semantics of certain answers! Let’s look at a few examples.

Query Answering: Example 1  Consider the Q1 over the following A-box  Applying Q1 directly to the A-box would yield no answers ( award would not be matched)  However, ItalianHits(LifeIsBeautiful) implies that LifeIsBeautiful won at least one award.  Hence, LifeIsBeautiful should be in the answer!

Query Answering: Example 2  Consider Q2:  With the following A-box  {Comedy(LaFunivia), director(LaFunivia,Antonioni), Italian(Antonioni)}  Neither conjunctive query will yield an answer because we know nothing about awards.  However, we can reason by cases that the following is entailed by Q2.

End of Example 2  Ok, we have  But that’s not enough to infer that LaFunivia should be in the answer.  However, we also know that movies have at most one director, so:  Hence, LaFunivia is an answer to Q2.

Comparing DLs to OODB  Object-oriented databases:  Also focus on unary and binary relations  OODB’s are more focused on modeling the physical aspects of objects and their properties  An object can only belong to a single (most specific) class.  Description logics are about knowledge and complex relationships:  Class membership can be inferred  An individual can belong to multiple classes.

Comparing DLs to Relational Views  In principle, concept descriptions are view definitions  Relational views employ: selection, projection, join, union and apply to more than unary and binary relations.  DLs: universal quantification, number restrictions, intersection, …  Subsumption = query containment  Universal quantification and number restrictions would require negation in conjunctive queries. Hence containment would be undecidable  In DL’s you can put facts directly in views (i.e., complex concept).

Outline Introduction to Knowledge Representation and its relevance to data integration Description Logics: a family of KR languages  The Semantic Web and its languages

The Semantic Web  Basic idea: annotate content on Web pages with semantics  Specify that a web page is about a restaurant, where the address appears on the page, and what are the menu items.  On a page with restaurant reviews, mark the restaurants with a global identifier so the review and restaurant data can be fused.  Without these annotations, systems need to infer this correlations and are often wrong.

Languages, Languages…  RDF: Resource Description Framework  Language for marking up data  Triples with a few cool features  RDF Schema (RDFS): basic schema for RDF documents  OWL: Web Ontology Language. Comes in multiple flavors:  Owl-Lite  Owl-DL, Owl-Full  All these languages are influenced by KR formalisms (some more and some less)

RDF Basics  RDF triples are statements about “resources”  They are of the form:  (subject, predicate, value)  Names can get long (because they can be URLs), so we often use qnames (qualified names)  ex: instead of

RDF as a Graph Uniform Resource Identifiers: available beyond a single data set.

Uniform Resource Identifiers  In a typical database, identifiers are used only internally. They have no meaning outside the database.  RDF uses URI’s for subjects, predicates and optionally for values  Hence, multiple data sets can refer to the same identifier.  Key benefit for data integration!  Note: this does not entail standardization!  You’re free to invent your own, but you’re encouraged to reuse existing URIs so your data meshes well with others

Blank Nodes You can assign IDs to blank nodes, but they are internal to a document. Blank node

Reification  Reification is a way of stating statements about statements:  Provenance, uncertainty, date asserted, …  To reify, make the statement itself into a resource  Once reified, you can state its properties:

RDF Schema  Enables declaring classes, subclass hierarchies, membership in a class, and restrictions on domains and ranges of classes.  Important: a class can be an instance of another class!

RDFS: Declaring Properties  RDFS: you can declare sub-properties, domains and ranges of properties.

OWL: Web Ontology Language  Languages based on description logics but without the unique-names assumption  sameAs and differentFrom specify whether two individuals are the same/different.  OWL-Lite: intersection, number restrictions (but only with 0 or 1), universal and existential quantification on properties.  OWL-DL: + union, complement, disjointness, number restrictions, enumeration ( Sunday, Monday.. ), hasValue (filler for property value), and more.  OWL-Full: OWL-DL + reification.

SparQL: Querying RDF  Language based on matching of triples  Borrows ideas from conjunctive queries and XQuery Result:

SparQL: The Construct Clause

Summary of Chapter 12  Knowledge Representation enables modeling complex relationships between classes and objects.  The languages of the Semantic Web apply these ideas to the Web context and with URI’s.  The constant challenge: the tradeoff between expressive power and computational complexity of reasoning  Question to ponder: Can we live with a fast reasoning algorithm that misses some derivations occasionally?