09/12/2003 Peer-to-Peer Information Systems – WS 03/04 1 Piazza: Data Management Infrastructure for Semantic Web Applications Alon Y. Halevy, Zachary G.


1 Piazza: Data Management Infrastructure for Semantic Web Applications
09/12/2003, Peer-to-Peer Information Systems – WS 03/04
Alon Y. Halevy, Zachary G. Ives, Peter Mork, Igor Tatarinov
Speaker: Sergey Chernov
Tutor: Jens Graupmann

2 Outline
1. INTRODUCTION. SEMANTIC WEB.
2. PIAZZA: SYSTEM OVERVIEW
3. IMPLEMENTATION DETAILS
   3.1 MAPPING LANGUAGE
   3.2 QUERY ANSWERING ALGORITHM
4. CONCLUSIONS.

3 Introduction
► Goal: Data Integration and Knowledge Management
► Problem: Web data lacks machine-understandable semantics
► Solution: Semantic Web?

4 The Semantic Web *
► Web sites include structural annotations
   • You can pose meaningful queries on them.
   • Ontologies provide the semantic glue.
   • Internal implementation of web sites left open.
► Agents perform tasks:
   • Query one or more web sites
   • Perform updates (e.g., set schedules)
   • Coordinate actions
   • Trust each other (or not).
► I.e., agents operating on a gigantic heterogeneous distributed database.
(* View by A. Halevy)

5 General requirements
► Robust infrastructure for querying
   • Peer data management systems.
► Facilitate mapping between different structures. Need tools for:
   • Locating relevant structures
   • Easily joining the semantic web.
► Get data into structured form
   • Should we worry about the legacy web?

6 Using views for specifying mappings
► Local-As-View (LAV). Data sources are described as views over the mediated schema.
► Global-As-View (GAV). The mediated schema is described as a set of views over the data sources.
[Figure: two diagrams, each showing a Mediated Schema above Site A, Site B, and Site C]
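The difference between the two styles can be illustrated in a few lines of Python. This is only a minimal sketch: the relation names, the sample tuples, and the helper functions (`mediated_book_gav`, `relevant_sources`) are hypothetical and do not come from the slides. In the GAV part the mediated relation is computed as a view over the sources; in the LAV part each source is described by a predicate over the mediated relation, which a mediator could use to decide which sources are relevant.

```python
# Hypothetical source data: each site exports (title, year) tuples.
site_a_books = {("Foundations of Databases", 1995)}
site_b_books = {
    ("Principles of Database Systems", 1982),
    ("Foundations of Databases", 1995),
}


def mediated_book_gav():
    """GAV: the mediated relation Book is defined as a view over the sources."""
    return site_a_books | site_b_books


# LAV: each source is described as a view over the mediated relation Book.
# Assumption for illustration: Site A only holds books published after 1990,
# Site B holds books from any year.
lav_descriptions = {
    "site_a": lambda title, year: year > 1990,
    "site_b": lambda title, year: True,
}


def relevant_sources(year):
    """Return the sources whose LAV description admits books of this year."""
    return sorted(
        name for name, admits in lav_descriptions.items() if admits("", year)
    )
```

With GAV, answering a query over Book amounts to evaluating the view; with LAV, the system first has to work out from the descriptions which sources can contribute, as `relevant_sources` does here in a very reduced form.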

7 Mapping
► A mapping “AB” specifies how structured data in the schema of node A is represented in the schema of node B.
[Figure: a Mediated Schema (MS) and Sites A, B, and C, connected by mappings “AB”, “BA”, “BC”, “CB”, “A-MS”, “MS-A”, “C-MS”, and “MS-C”]

8 Piazza: Peer Data-Management System
► Goal:
   • Large scale autonomous sharing of structured data
► Peer data management system (PDMS)
   • Autonomous peers export data in their own schemas
   • Pair-wise mappings between peers
   • Generalization of a data integration system
   • NOT a P2P file sharing system

9 Relationship of PDMS to…
► P2P overlay networks (the “Structured World”)
► Data integration systems (no central logical mediated schema)
► Federated databases (scale, ad-hoc nature)
► Distributed databases (no central administration)

10 Representing Data
► A spectrum of possibilities:
   • Relational tables, some integrity constraints
   • XML: can encode relational, hierarchical
      ◦ XQuery: emerging standard query language (SQL for XML)
   • RDF: “XML on drugs”.
      ◦ Sees only the logic; ignores other aspects.
   • DAML+OIL
      ◦ Full-blown knowledge representation language.
► They all have semantics; just different expressive powers.
► We keep the data simple. Mappings between data at different peers are more complex.

11 Peer Data Management
► Mappings are query expressions
   • DbResearcher(x)  Researcher(x), Area(x, DB)
   • DbResearcher(x), Office(x, DBLab) = DbLabMember(x)
[Figure: peers DB Projects, MIT, UW, UCB, and Stanford, each with its own relational schema. Relations shown in the figure:]
   Area(areaID, name, descr)
   Project(projID, name, sponsor)
   ProjArea(projID, areaID)
   Pubs(pubID, projName, title, venue, year)
   Author(pubID, author)
   Member(projName, member)
   Project(projID, name, descr)
   Student(studID, name, status)
   Faculty(facID, name, rank, office)
   Advisor(facID, studID)
   ProjMember(projID, memberID)
   Paper(papID, title, forum, year)
   Author(authorID, paperID)
   Area(areaID, name, descr)
   Project(projID, areaID, name)
   Pub(pubID, title, venue, year)
   PubAuthor(pubID, authorID)
   PubProj(pubID, projID)
   Member(memID, projID, name, pos)
   Alumn(name, year, thesis)
   Members(memID, name)
   Projects(projID, name, startDate)
   ProjFaculty(projID, facID)
   ProjStudents(projID, studID)
   …
   Direction(dirID, name)
   Project(pID, dirID, name)
   …

12 Piazza mapping language (1)
► XML/XML Example
Target: pubs book* title author* name publisher* name
Source: authors author* full-name publication* title pub-type
{: $a IN document("source.xml")/authors/author,
   $t IN $a/publication/title,
   $typ IN $a/publication/pub-type
   WHERE $typ = "book" :}
{ $t }
{: $a/full-name :}
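The effect of the XML/XML mapping above can be sketched as an ordinary transformation in Python. This is a hedged illustration, not Piazza's implementation: the sample document, the function name `apply_mapping`, and the exact nesting of the target (`title` and `author/name` inside `book`) are assumptions, and the sketch pairs each title with the `pub-type` of the same publication, which is one reading of the variable bindings.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document following the source schema of the slide.
SOURCE_XML = """
<authors>
  <author>
    <full-name>Jane Doe</full-name>
    <publication><title>XML Basics</title><pub-type>book</pub-type></publication>
    <publication><title>A Short Note</title><pub-type>article</pub-type></publication>
  </author>
</authors>
"""


def apply_mapping(source_xml):
    """Build a target <pubs> tree from a source <authors> document."""
    source = ET.fromstring(source_xml)
    pubs = ET.Element("pubs")
    # $a IN document("source.xml")/authors/author
    for a in source.findall("author"):
        # Assumption: $t and $typ are taken from the same publication element.
        for pub in a.findall("publication"):
            if pub.findtext("pub-type") != "book":  # WHERE $typ = "book"
                continue
            book = ET.SubElement(pubs, "book")
            ET.SubElement(book, "title").text = pub.findtext("title")  # { $t }
            author = ET.SubElement(book, "author")
            # {: $a/full-name :}
            ET.SubElement(author, "name").text = a.findtext("full-name")
    return pubs


if __name__ == "__main__":
    print(ET.tostring(apply_mapping(SOURCE_XML), encoding="unicode"))
```

Only publications of type "book" produce a target `book` element; the article in the sample data is skipped.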

13 Piazza mapping language (2)
► piazza:id attribute
Target: pubs book* title author* name publisher* name
Source: authors author* full-name publication* title pub-type
{: $a IN document("source.xml")/authors/author,
   $t IN $a/publication/title,
   $typ IN $a/publication/pub-type
   WHERE $typ = "book" :}
{ $t }
{: $a/full-name :}

14 Piazza mapping language (3)
► Partial mapping
Target: pubs book* title author* name publisher* name
Source: authors author* full-name publication* title pub-type
{: $a IN document("source.xml")/authors/author,
   $t IN $a/publication/title,
   $typ IN $a/publication/pub-type
   WHERE $typ = "book"
   PROPERTY $t >= 'A' AND $t < 'B' :}
[: {: PROPERTY $this IN {"PrintersInc", "PubsInc"} :} :]

15 Query Answering Algorithm
► Problem
   • Evaluate query Q at P1 given a network of mappings
► Reformulate the query over all relevant peers
   • Chaining of mappings using a combination of query composition and query rewriting
► Q_P1(x) :- DbResearcher(x)
   • Query Composition
      ◦ M: DbResearcher(x)  Researcher(x), Area(x, DB)
      ◦ Q_P2(x)  Researcher(x), Area(x, DB)
   • Query Rewriting
      ◦ M: DbResearcher(x), Office(x, DBLab) = DbLabMember(x)
      ◦ Q_P3(x)  DbLabMember(x)
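The query composition step can be sketched as a single unfolding pass over a conjunctive query. The sketch below is a simplification under a stated assumption: the first mapping is read as a definition of DbResearcher in terms of Researcher and Area. The encoding of atoms as tuples and the names `DEFINITIONS` and `unfold` are hypothetical, and the query rewriting step (answering queries using views) is not covered.

```python
# A query body is a list of atoms; an atom is (predicate, argument tuple).
# Assumption: the mapping is read as a definition
#   DbResearcher(x) defined by Researcher(x), Area(x, DB)
DEFINITIONS = {
    "DbResearcher": (("x",), [("Researcher", ("x",)), ("Area", ("x", "DB"))]),
}


def unfold(body, definitions):
    """Replace each defined atom by its definition (one unfolding step)."""
    result = []
    for pred, args in body:
        if pred not in definitions:
            result.append((pred, args))
            continue
        head_vars, def_body = definitions[pred]
        # Map the head variables of the definition to the atom's arguments;
        # anything else (such as the constant DB) is kept as it is.
        rename = dict(zip(head_vars, args))
        for def_pred, def_args in def_body:
            result.append((def_pred, tuple(rename.get(v, v) for v in def_args)))
    return result


# Q_P1(y) :- DbResearcher(y)
q_p1 = [("DbResearcher", ("y",))]
q_p2 = unfold(q_p1, DEFINITIONS)
```

After the call, `q_p2` holds the atoms Researcher(y) and Area(y, DB), which corresponds to the body of the reformulated query over the second peer.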

16 Query Reformulation (1)
Mapping:
{: $people=/S1/people :}
{: $name=$people/faculty/name/text() :} { $name }
{: $student=$people/student/text() :} { $student }
{: $faculty=$people/faculty,
   $name=$faculty/name/text(),
   $advisee=$faculty/advisee/text()
   where $advisee=$student :} { $name }
Query:
{ for $faculty in /S1/people/faculty,
      $name in $faculty/name/text(),
      $advisee in $faculty/advisee/text()
  where $name = "Ullman"
  return {$advisee} }

17 Query Reformulation (2)
Query:
{ for $faculty in /S1/people/faculty,
      $name in $faculty/name/text(),
      $advisee in $faculty/advisee/text()
  where $name = "Ullman"
  return {$advisee} }
[Figure: query tree pattern: S1 / people / faculty with children name and advisee, condition $name = "Ullman", output {$advisee}]
[Figure: mapping tree pattern: S1 / people with children faculty / name (output {$name}), student (output {$student}), and faculty with children name and advisee, condition $advisee=$student, output {$name}]

18 Query Reformulation (3)
[Figure: query tree pattern: S1 / people / faculty with children name and advisee, condition $name = "Ullman", output {$advisee}]
[Figure: mapping tree pattern: S1 / people with children faculty / name (output {$name}), student (output {$student}), and faculty with children name and advisee, condition $advisee=$student, output {$name}]
Reformulated query:
{ for $student in /S2/people/student,
      $advisor in $student/advisor/text(),
      $name in $student/name/text()
  where $advisor = "Ullman"
  return { $name } }
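The relationship between the original query over S1 and its reformulation over S2 can be checked on a small example. The sketch below is illustrative only: the two sample documents and the function names `advisees_s1` and `advisees_s2` are hypothetical, and the S2 document is simply assumed to hold the same advising facts in the student-centred layout that the reformulated query expects.

```python
import xml.etree.ElementTree as ET

# Hypothetical S1 data: advisees are listed under each faculty member.
S1 = ET.fromstring("""
<S1><people>
  <faculty><name>Ullman</name><advisee>Alice</advisee><advisee>Bob</advisee></faculty>
  <faculty><name>Widom</name><advisee>Carol</advisee></faculty>
  <student>Alice</student><student>Bob</student><student>Carol</student>
</people></S1>
""")

# Hypothetical S2 data: the same facts, with the advisor under each student.
S2 = ET.fromstring("""
<S2><people>
  <student><name>Alice</name><advisor>Ullman</advisor></student>
  <student><name>Bob</name><advisor>Ullman</advisor></student>
  <student><name>Carol</name><advisor>Widom</advisor></student>
</people></S2>
""")


def advisees_s1(root, faculty_name):
    """Query over S1: advisees of the faculty member with the given name."""
    return [
        advisee.text
        for faculty in root.findall("people/faculty")
        if faculty.findtext("name") == faculty_name
        for advisee in faculty.findall("advisee")
    ]


def advisees_s2(root, faculty_name):
    """Reformulated query over S2: students whose advisor has the given name."""
    return [
        student.findtext("name")
        for student in root.findall("people/student")
        if student.findtext("advisor") == faculty_name
    ]
```

On these two documents both functions return the same students for a given advisor, which is the behaviour a reformulation is meant to preserve.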

19 Reformulation times
► Table 1: The test queries and their respective running times.
Query  Description                                        Reformulation time  # of reformulations
Q1     XML-related projects.                              0.5 sec             12
Q2     Co-authors who reviewed each other's work.         0.9 sec             25
Q3     PC members with a paper at the same conference.    0.2 sec             3
Q4     PC chairs of recent conferences + their projects.  0.5 sec             24
Q5     Conflicts-of-interest of PC members.               0.7 sec             36

20 Current and the Future
► Current status
   • Demo scenario using XML
   • Looking at real domains (Bio dbs, NASA dbs)
► Future Work
   • More efficient reformulation algorithm
   • Semantic network analysis: eliminate redundant mappings and inconsistent mappings
   • Query caching to speed up query evaluation

21 Conclusions
► Mapping language for mapping between sets of XML source nodes with different document structures
► Architecture that uses the transitive closure of mappings to answer queries
► Algorithm for query answering over this transitive closure of mappings, which is able to follow mappings in both forward and reverse directions
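The idea of answering queries over the transitive closure of mappings can be pictured as a graph traversal over peers. The following is a minimal sketch under assumptions: the peer names are taken from the earlier example, but the particular mapping edges and the function `reachable_peers` are hypothetical, and real reformulation also has to translate the query along each edge rather than just find the peers.

```python
from collections import deque

# Hypothetical pairwise mappings between peers, stored as directed edges.
MAPPINGS = [("UW", "Stanford"), ("Stanford", "UCB"), ("MIT", "UCB")]


def reachable_peers(start, mappings):
    """Peers reachable from `start`, following mappings in both directions."""
    neighbours = {}
    for source, target in mappings:
        neighbours.setdefault(source, set()).add(target)  # forward direction
        neighbours.setdefault(target, set()).add(source)  # reverse direction
    seen = {start}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for nxt in sorted(neighbours.get(peer, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Because edges are followed in both directions, a query posed at UW can reach MIT here even though no mapping points from UCB to MIT.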

22 Thank You!

23 Further literature
1. Alon Y. Halevy, Zachary G. Ives, Dan Suciu, Igor Tatarinov: Schema Mediation for Large-Scale Semantic Data Sharing
2. Igor Tatarinov, Zachary Ives, Jayant Madhavan, Alon Halevy, Dan Suciu, Nilesh Dalvi, Xin (Luna) Dong, Yana Kadiyska, Gerome Miklau, Peter Mork: The Piazza Peer Data Management Project
3. Alon Y. Halevy, Zachary G. Ives, Dan Suciu, Igor Tatarinov: Schema Mediation in Peer Data Management Systems
4. Alon Halevy, Oren Etzioni, AnHai Doan, Zachary Ives, Jayant Madhavan, Luke McDowell, Igor Tatarinov: Crossing the Structure Chasm
5. Madhan Arumugam, Amit Sheth, I. Budak Arpinar: Towards Peer-to-Peer Semantic Web: A Distributed Environment for Sharing Semantic Knowledge on the Web
6. J. Hendler, T. Berners-Lee, E. Miller: Integrating Applications on the Semantic Web

