WISDOM D0.P1 – Integrated System Prototype. WISDOM (Web Intelligent Search based on DOMain ontologies): Demo. Sonia Bergamaschi, Paolo Bouquet, Paolo Ciaccia.


1 WISDOM D0.P1 – Integrated System Prototype
WISDOM (Web Intelligent Search based on DOMain ontologies): Demo
Sonia Bergamaschi [1], Paolo Bouquet [2], Paolo Ciaccia [3], and Paolo Merialdo [4]
[1] Università degli Studi di Modena e Reggio Emilia
[2] Università degli Studi di Trento
[3] Università degli Studi di Bologna
[4] Università degli Studi Roma Tre
6 December 2006
http://www.dbgroup.unimo.it/wisdom

2 Overview
The WISDOM project aims at studying, developing and experimenting with methods and techniques for searching and querying data sources available on the Web.
The goal of the project: definition of a software framework that allows computer applications to leverage the huge amount of information content offered by Web sources (typically, as Web sites).
The context:
• the number of sources of interest might be extremely large
• sources are independent and autonomous of one another
These factors raise significant issues, in particular because such an information space implies heterogeneities at different levels of abstraction (format, logical, semantic). Providing effective and efficient methods for answering queries in such a scenario is the challenging task of the project.

3 Overview
In WISDOM, super-peers are built that contain data from Web sources belonging to the same domain. Super-peers are connected by semantic mappings in a Super-peer Network. The end user formulates a query against a specific super-peer. The answer includes data extracted from all the super-peers relevant to the query.
From a functional point of view, the WISDOM project may be divided into two parts:
A) Building a Super-peer Network, where:
• Web sources are grouped into super-peers;
• each super-peer exports a Semantic Peer Ontology synthesizing the knowledge of the involved sources;
• the Semantic Peer Ontologies are related by means of simple semantic mappings.
B) Querying a Super-peer Network, where:
• a graphical interface allows the user to formulate a query according to a Semantic Peer Ontology;
• the query is rewritten for each super-peer relevant to the answer;
• the query is reformulated inside each super-peer according to the involved sources;
• the query is executed locally and the results are provided to the user.

4 Functional Flow-Diagram
1) Data from Web sites are extracted by means of wrappers (http://dbgroup.unimo.it/wisdom/prototipi/D1.P4.html)
2) Data are annotated according to a lexical reference (http://dbgroup.unimo.it/wisdom/prototipi/D1.P5.html)
3) A semantic peer ontology is created for each semantic peer (http://dbgroup.unimo.it/wisdom/prototipi/D1.P1.html)
4) The semantic peer ontologies are related by means of mappings (http://dbgroup.unimo.it/wisdom/prototipi/D2.P1.html)

5 Building a Super-peer Network: the local sources
We tested the system by using MOMIS (http://dbgroup.unimo.it/Momis), Road Runner (http://www.dia.uniroma3.it/db/roadRunner) and MELIS (http://dbgroup.unimo.it/wisdom-unimo/melis) to create a Super-peer Network composed of three peers, each one integrating 2-3 tourism Web sites:
Peer 1
• bbitaly: http://www.bbitalia.it/default_eng.asp
• touring: http://www.touring.it
Peer 2
• guidacampeggi: http://www.guidacampeggi.com
• saperviaggiare: http://www.touringclub.com/ITA/viaggiatori/dove_mangiare
• venere.com: http://www.venere.com
Peer 3
• bedandbreakfast: http://www.bed-and-breakfast.it
• booking: http://www.booking.com
• opificidigitali: http://www.opificidigitali.it
…

6 Demonstration Scenario: Peer 1
Peer 1 abstracts the local classes into the global classes.
Global classes: hotels, restaurants
Local classes: hotels (bbitaly), restaurants (touring)

7 Demonstration Scenario: Peer 2
Peer 2 abstracts the local classes into the global classes.
Global classes: hotels, campings, facilities
Local classes: hotels (venere), hotels (saperviaggiare), maps (venere), campings (guidacampeggi), facilities (guidacampeggi), facilities (venere)

8 Demonstration Scenario: Peer 3
Peer 3 abstracts the local classes into the global classes.
Global classes: hotels, restaurants, features
Local classes: hotels (booking), judgement_hotel (booking), bedandbreakfast (bedandbreakfast), features (bedandbreakfast), restaurants (opificidigitali), features_bb (bedandbreakfast), conditions_hotel (booking)

9 Wrapping Web Sources
Each super-peer is created by extracting data with Road Runner:
1. Identify the sources (Web sites) to be wrapped
2. For each source: infer a site schema and collect pages containing information of interest
3. From a set of sample pages, infer a wrapper library
4. Apply the wrapper library over the set(s) of pages collected in step 2
[Figure: wrapper generator workflow. (1) Identify sample pages; (2) collect pages similar in structure to those of the sample set; (3) generate a wrapper library; (4) apply the wrapper library to extract data from pages, producing the output data. All/InDesit: infer site schema.]

10 Wrapping Web Sources: the Demo
The demonstration:
• 11 Web sites, delivering information about hotels, campings, B&Bs and restaurants
• The Web site schema inference module (Indesit) was configured (when possible) to collect pages of interest from these sites
• Indesit generated a Web schema for each Web site: the output description was used to collect about 8000 pages
• 16 wrappers were inferred by means of the wrapper generation module RoadRunner
• The extracted data were stored in 11 relational databases (one per source)
• The Indesit Web schemas can be used to refresh data

11 Annotating Data Sources with respect to a lexical reference
MELIS: Meaning Elicitation and Lexical Integration System

12 Annotating Data Sources with respect to a lexical reference
Meaning Elicitation Process: for each element (class or property) in the Input Ontology, MELIS extracts all candidate senses from WordNet. It then filters the candidate senses by using Domain Ontologies and a collection of heuristic rules.
[Figure: a Domain Ontology and an Input Ontology with sense-annotated elements, e.g. Building #1, Restaurant #2, Home page #1, City #1, Hotel #1, Name #2, B&B #1, Address #3, Edifice #2, Motel #1.]
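The meaning elicitation step described above can be sketched roughly as follows. This is only an illustration, not MELIS itself: the sense inventory `CANDIDATE_SENSES` is a hand-made stand-in for WordNet, the function name `elicit` is hypothetical, and the single filtering rule (keep the senses already used in the domain ontology, otherwise keep every candidate) is an assumed placeholder for the heuristic rules mentioned on the slide.

```python
# Toy stand-in for WordNet: label -> candidate sense ids (hypothetical data).
CANDIDATE_SENSES = {
    "hotel": ["hotel#1"],
    "name": ["name#1", "name#2", "name#3"],
    "city": ["city#1", "city#2"],
    "address": ["address#1", "address#2", "address#3"],
}


def elicit(input_labels, domain_senses):
    """Annotate each input label with the senses that survive filtering."""
    annotations = {}
    for label in input_labels:
        candidates = CANDIDATE_SENSES.get(label.lower(), [])
        # Assumed heuristic: prefer senses already chosen in the domain ontology.
        kept = [s for s in candidates if s in domain_senses]
        # If the domain ontology gives no evidence, keep all candidates.
        annotations[label] = kept or candidates
    return annotations


# Senses assumed to be present in the domain ontology.
domain = {"hotel#1", "name#2", "city#1"}
result = elicit(["Hotel", "Name", "City", "Address"], domain)
```

In this sketch "Name" is disambiguated to a single sense because the domain ontology already uses it, while "Address" stays ambiguous and would need further rules or manual review.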

13 Annotating Data Sources with respect to a lexical reference
OUTPUT. Language: WISDOM-OWL (OWL DL, DB Annotations, Lexical Annotations)
[Example output fragment, markup lost in extraction: bed_and_breakfast, BB.bed_and_breakfast.url, bed_and_breakfast, 1]

14 Building a Super-Peer Ontology
Super-peer ontologies were built by means of the MOMIS system, extended for the specific purposes of the project. In particular, techniques were introduced for adding sources to, and removing sources from, an existing ontology without restarting the process from scratch. The MOMIS process for building a domain ontology is based on the following steps:
[Figure: steps of the MOMIS process]

15 Building Peer 2
Peer 2 was created by integrating the local sources venere (493 hotels), saperviaggiare (977 hotels) and guidacampeggi (183 campings).
Source venere, local classes: hotels, maps, facilities

16 Building Peer 2
Source guidacampeggi, local classes: campings, facilities
Source saperviaggiare, local class: hotels

17 Peer 2 Ontology

18 Creating inter-peer mappings
For each node of a peer that is compatible with elements of other peers, we create a Mapping Element that describes the relationship between them.
[Figure: Hotel #1 nodes of Peer 1 (BBItaly), Peer 2 (Venere) and Peer 3 (Booking) with sense-annotated properties such as Name #2, City #1, Address #3, URL #1, Logo #1, Price #4, Phone Number #1 and e-mail #1, linked by mappings.]

19 Creating inter-peer mappings
OUTPUT Mappings:
[Example mapping element, markup lost in extraction: mappingElement#3, Datatype2Datatype, equivalent, 1, Peer1.hotels.address, address, address, address#9#0#C, address#9#0#C, wordnet21, 107938889, 0.0, 1.0, 0.0, 0.0]

20 Querying a Super-Peer Network
Querying the Super-Peer Network involves:
• Formulating the query at a peer on the ontology local to that peer
• Rewriting the query according to neighboring peers’ ontologies using the semantic mappings
• Selecting the peers that are more relevant to the query (using content summaries and semantic information about the rewritings)
• Sending the rewritten query to the relevant neighboring peers
• Translating the query to execute it on the local sources
  • This involves relaxing preference expressions that are not directly manageable by the underlying query executor
• Executing the query locally (by translating it using the local mappings) and sending the results to the local query processor
• Collecting the results from both the local sources and the neighboring peers
• Building the final result
  • This may involve performing additional computations to enforce the original preference relation that was relaxed to be performed locally
• Presenting the result to the user

21 Functional Flow-Diagram
At peer p0, the user formulates a request using the M-FIRE interface, which produces a query Q in a SPARQL-like syntax. The semantic parser translates the query into an internal format that can be easily manipulated by the query processor (QP). The QP rewrites the query with respect to the local ontology (to send it to the MOMIS Query Manager) or to a remote ontology (to send it to the QP of a neighboring peer).
[Figure: GUI (M-FIRE), Semantic Parser and Query Processor at peer p0. The GUI produces Q, the parser produces q, and the Query Processor sends q0 to the local L0 / GVV0 (MOMIS Query Manager) and qj to the (QP of) peer pj.]

22 Query Formulation with M-FIRE
The user interface prototype
The query formulation prototype implements the M-FIRE framework and includes two components: a client component for visual rendering and user interaction handling, and a server component implementing the M-FIRE representation and navigation engines.
Metaphors
M-FIRE allows one to declaratively define how a given RDF document shall be graphically represented, by supplying metaphors as parameters to a generic representation engine. Metaphors also determine how the user’s actions on the delivered representation shall be translated into queries over the underlying knowledge base. Our prototype includes two metaphors, featuring:
• Two alternative ways of representing the ontology schema
• One single way of formulating conjunctive queries on the ontology schema
• One single table-like view of the query results

23 Query Formulation with M-FIRE
Ontology schema representation
After selecting a knowledge base to be explored and a metaphor for its presentation, the user is provided with a representation of the ontology schema. This is how the two alternative metaphors represent a schema:
One metaphor:
• Classes are rendered as tables, where the left pane shows an intuitive icon for the class and the right pane lists the properties that apply to that class
• Each datatype property has a light yellow background and a black font
• Each object property has a yellow background and a dark red font; moreover, on the left of the property name, an icon is shown for each class in the range of the property
The other metaphor:
• Classes are rendered as tables, where the upper pane shows an intuitive icon for the class on the left of the class name, and the lower pane lists the properties that apply to that class
• Each datatype property has a cyan background, is written in italics and has a black font; on the right side, the name of each class in the property range is shown
• Each object property has a light cyan background and a blue font; moreover, on the right side of the property name, the name of each class in the property range is shown

24 Query Formulation with M-FIRE
Query formulation
Conjunctive queries are formulated by left-clicking or right-clicking on the properties to be included in the result (projection) or for which filters are to be specified (selection):
• A left click on a datatype property selects that property for output (object properties cannot be selected for output)
• A left click on an object property means that the clicked property will be used to perform a join between two classes (each class can only participate in one join with each other class)
• A right click on a datatype property allows the user to express a filter on that property (the comparison operator and the target value may be specified through a dialog box)

25 Input to the Query Processor
Queries with joins are formulated similarly to queries without joins, by specifying the correct path between join properties through mouse clicks. The query is finally passed to the local query processor.

26 Query Processor Components
Internally, the QP ranks the rewritten queries in order to execute only the “best” ones. Then, the optimizer produces the actual queries that will be sent to the local executor (e.g., by “relaxing” some preferences) or to neighboring peers.
[Figure: components Semantic Parser, Rewriter, Ranker and Optimizer, with inputs Ont 0, Semantic Mappings, Content Summaries and Peers’ Metadata. Q (SPARQL-like) is parsed into q (internal form), rewritten into rq1 ... rqM, ranked down to rq1 ... rqk, and optimized into a Plan Tree j with queries qj,0 ... qj,i ... qj,N.]

27 Inter-Peer Query Rewriting
Mapping extension with scores
RelationGrade: measures the similarity among the corresponding elements

28 Inter-Peer Query Rewriting
Query reformulation example (Source: Superpeer2, Target: Superpeer3)
Source query:
    BASE
    SELECT ?N ?A ?C ?P ?S
    FROM Peer2
    WHERE {
      Peer2.hotels Peer2.hotels.name ?N ;
                   Peer2.hotels.address ?A ;
                   Peer2.hotels.city ?C ;
                   Peer2.hotels.price ?P ;
                   Peer2.hotels.url ?HURL.
      Peer2.services Peer2.services.faciltity ?S ;
                     Peer2.services.url ?SURL.
      FILTER ((?HURL=?SURL) && (?HURL='Rimini') && (?P>50) && (?P<80) && (?S = 'air conditioning')).
    }
    PREFERRING min(?P)
    LIMIT 50
The Rewriter produces the UNION of the following three queries:
    BASE
    SELECT ?N ?P ?S
    FROM Peer3
    WHERE {
      Peer3.hotels Peer3.hotels.name ?N ;
                   Peer3.hotels.price_single ?P ;
                   Peer3.hotels.url ?HURL.
      Peer3.features Peer2.features.url ?SURL.
      FILTER ((?HURL=?SURL) && (?HURL='Rimini') && (?P>50) && (?P<80)).
    }
    PREFERRING min(?P)
    LIMIT 50
UNION
    BASE
    SELECT ?N ?P ?S
    FROM Peer3
    WHERE {
      Peer3.hotels Peer3.hotels.name ?N ;
                   Peer3.hotels.price_double ?P ;
                   Peer3.hotels.url ?HURL.
      Peer3.features Peer2.features.url ?SURL.
      FILTER ((?HURL=?SURL) && (?HURL='Rimini') && (?P>50) && (?P<80)).
    }
    PREFERRING min(?P)
    LIMIT 50
UNION
    BASE
    SELECT ?N ?P ?S
    FROM Peer3
    WHERE {
      Peer3.hotels Peer3.hotels.name ?N ;
                   Peer3.hotels.price_triple ?P ;
                   Peer3.hotels.url ?HURL.
      Peer3.features Peer2.features.url ?SURL.
      FILTER ((?HURL=?SURL) && (?HURL='Rimini') && (?P>50) && (?P<80)).
    }
    PREFERRING min(?P)
    LIMIT 50
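The way one source query fans out into a union of target queries can be sketched as below. This is a simplified illustration under stated assumptions, not the WISDOM Rewriter: the `MAPPINGS` table and its scores are invented for the example, the function name `rewrite` is hypothetical, unmapped properties are simply dropped, and combining scores by multiplication is an assumption rather than the project's scoring rule.

```python
from itertools import product

# Hypothetical inter-peer mappings: source property -> [(target property, score)].
MAPPINGS = {
    "Peer2.hotels.name": [("Peer3.hotels.name", 1.0)],
    "Peer2.hotels.price": [
        ("Peer3.hotels.price_single", 0.8),
        ("Peer3.hotels.price_double", 0.8),
        ("Peer3.hotels.price_triple", 0.8),
    ],
}


def rewrite(properties):
    """Return one rewriting per combination of mapped target properties.

    Properties without a mapping are dropped from the rewriting.
    Each rewriting is a pair (target properties, combined score).
    """
    mapped = [MAPPINGS[p] for p in properties if p in MAPPINGS]
    rewritings = []
    for combo in product(*mapped):
        score = 1.0
        for _, s in combo:
            score *= s  # assumed combination rule: product of mapping scores
        rewritings.append(([target for target, _ in combo], score))
    return rewritings


query = ["Peer2.hotels.name", "Peer2.hotels.address", "Peer2.hotels.price"]
rewritings = rewrite(query)
```

With these assumed mappings the source query yields three rewritings, one per price property of the target peer, which mirrors the three-way UNION in the example.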

29 Query reformulation: details of target output
[Figure: screenshot showing the target schema, the 1st rewritten query, the 1st query score and percentages, the 1st query terms rewriting details, and the global union score.]

30 Inter-Peer Query Rewriting
The Ranker component of the query processor ranks all the available rewritings (according to user preferences) in order to execute only the “best” ones (e.g., to maximize the number of retrieved results, or the semantic similarity of the rewritten query with respect to the original one).
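A top-k selection of this kind can be sketched in a few lines. The candidate rewritings, their `similarity` values and their `estimated_results` counts are made-up illustration data, and `best_rewritings` is a hypothetical helper; how the actual Ranker computes its scores is not specified on the slide.

```python
import heapq


def best_rewritings(rewritings, k, key):
    """Return the k highest-ranked rewritings according to the given key."""
    return heapq.nlargest(k, rewritings, key=key)


# Hypothetical rewritings with assumed similarity scores and result estimates.
candidates = [
    {"id": "rq1", "similarity": 0.9, "estimated_results": 12},
    {"id": "rq2", "similarity": 0.6, "estimated_results": 40},
    {"id": "rq3", "similarity": 0.8, "estimated_results": 25},
]

# Two alternative user preferences, expressed as two ranking keys.
by_similarity = best_rewritings(candidates, 2, key=lambda r: r["similarity"])
by_results = best_rewritings(candidates, 2, key=lambda r: r["estimated_results"])
```

Passing the ranking criterion as a key function keeps the selection logic independent of which preference the user picks.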

31 Inter-Peer Query Forwarding
The QP of the neighboring peer receives the query, solves it locally and (possibly) forwards it to its neighbors (care is taken to ensure that each peer receives a query only once).
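The "each peer receives a query only once" guarantee can be illustrated with a small simulation. This is an assumption-laden sketch: it models the network centrally as an adjacency dictionary and uses a shared `seen` set, whereas a real distributed deployment would need some other mechanism (for example query identifiers remembered by each peer). The function name `forward` and the peer names are hypothetical.

```python
from collections import deque


def forward(network, start):
    """Return the order in which peers receive the query, each exactly once."""
    received = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for neighbor in network.get(peer, []):
            if neighbor not in seen:  # skip peers that already got the query
                seen.add(neighbor)
                received.append(neighbor)
                queue.append(neighbor)
    return received


# Hypothetical fully connected network of three peers.
network = {
    "peer1": ["peer2", "peer3"],
    "peer2": ["peer1", "peer3"],
    "peer3": ["peer1", "peer2"],
}
order = forward(network, "peer2")
```

Even though every peer is a neighbor of every other peer here, the query reaches each of them a single time.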

32 Query Reformulation for Local Execution
The Optimizer component of the QP translates the rewritten query into SQL (e.g., preference expressions are relaxed into ORDER BY clauses). Finally, the Executor sends the query to the local query executor (MOMIS Query Manager) and waits for the results.
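As a rough illustration of relaxing a preference into an ORDER BY clause, the sketch below turns a `min`/`max` preference into a sort direction. The function `relax_preference` and the tuple encoding of the preference are hypothetical, the `LIMIT` suffix is dialect-dependent, and the real Optimizer may handle richer preference expressions than this.

```python
def relax_preference(sql, preference, limit=None):
    """Append an ORDER BY clause approximating a min/max preference."""
    func, column = preference  # e.g. ("min", "H.price")
    direction = {"min": "ASC", "max": "DESC"}[func]
    relaxed = f"{sql} ORDER BY {column} {direction}"
    if limit is not None:
        # LIMIT is supported by many, but not all, SQL dialects.
        relaxed += f" LIMIT {limit}"
    return relaxed


sql = relax_preference(
    "SELECT H.name, H.price FROM hotels AS H",
    ("min", "H.price"),
    limit=50,
)
```

Sorting is only an approximation of the preference: the original preference relation still has to be enforced when the partial results are merged.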

33 Local Query Execution
The MOMIS Query Manager reformulates the query taking into account the intra-peer mappings defined in a semantic peer between the local classes and the global classes of the GVV (Global Virtual View). The mappings are defined by using a GAV (Global as View) approach: each global class of the GVV is expressed by means of the full-disjunction operator over the local classes.
Query rewriting
• GAV approach: the query is processed by means of unfolding
Fusion and Reconciliation of the local answers into the global answer
• Object Identification: join conditions among local classes
• Inconsistencies: resolution functions to deal with conflicts

34 Query rewriting and execution
Global Virtual View (GVV) query: q0 = scqG1 ⋈ scqG2, where scqG1 is the single class query on Hotels and scqG2 is the single class query on Services.
[Figure: the single class queries are unfolded into the local queries L1scqG1, L2scqG1, L3scqG1, L1scqG2 and L2scqG2, which are executed on the local sources. Local schemas: VENERE (hotels, facilities, map_hotels), SAPERVIAGGIARE (hotels), GUIDACAMPEGGI (facilities).]

35 Query rewriting
q0 =
    SELECT H.name, H.address, H.city, H.price, S.facility, S.structure_name, S.structure_city
    FROM hotels as H, services as S
    WHERE H.city = S.structure_city and H.name = S.structure_name and H.city = 'rimini' and H.price > 50 and H.price < 80 and S.facility = 'air conditioning'
    order by H.price
SINGLE CLASS QUERIES
scqG1 =
    SELECT H.name, H.address, H.city, H.price
    FROM hotels as H
    WHERE (H.city = 'rimini') and (H.price > 50) and (H.price < 80)
scqG2 =
    SELECT S.facility, S.structure_name, S.structure_city
    FROM services as S
    WHERE (S.facility = 'air conditioning')
UNFOLDING
L3scqG1 =
    SELECT hotels.name2, hotels.address, hotels.price, hotels.city
    FROM hotels
    WHERE ((city) = ('rimini') and ((price) > (50) and (price) < (80)))
L2scqG1 =
    SELECT maps_hotels.hotels_name2, maps_hotels.hotels_city
    FROM maps_hotels
    WHERE (hotels_city) = ('rimini')
L1scqG1 =
    SELECT hotels.name, hotels.address, hotels.city
    FROM hotels
    WHERE (city) = ('rimini')
L1scqG2 =
    SELECT facilities_hotels.hotel_name2, facilities_hotels.hotels_city, facilities_hotels.facility
    FROM facilities_hotels
    WHERE (facility) = ('air conditioning')
L2scqG2 =
    SELECT facilities_campings.campings_name, facilities_campings.campings_city, facilities_campings.name
    FROM facilities_campings
    WHERE (name) = ('air conditioning')

36 Fusion and Reconciliation
[Figure: the partial results returned by VENERE, SAPERVIAGGIARE and GUIDACAMPEGGI (the L1scqG1, L2scqG1, L3scqG1, L1scqG2 and L2scqG2 result sets) are combined by full joins. L1scqG1 full join L2scqG1 full join L3scqG1 gives the scqG1 result set; L1scqG2 full join L2scqG2 gives the scqG2 result set. The q0 result set is scqG1 join scqG2.]

37 Fusion and Reconciliation
scqG2 result set = L1scqG2 full join L2scqG2:
    guidacampeggi.facilities
    full outer join venere.facilities on (
        (venere.facilities.facility) = (guidacampeggi.facilities.name)
        AND (venere.facilities.hotels_city) = (guidacampeggi.facilities.campings_city)
        AND (venere.facilities.hotel_name2) = (guidacampeggi.facilities.campings_name))
scqG1 result set = L1scqG1 full join L2scqG1 full join L3scqG1:
    saperviaggiare.hotels
    full outer join venereEn.hotels on (
        ((venereEn.hotels.name2) = (saperviaggiare.hotels.name)
        AND (venereEn.hotels.city) = (saperviaggiare.hotels.city)))
    full outer join venereEn.maps_hotels on (
        ((venereEn.maps_hotels.hotels_name2) = (saperviaggiare.hotels.name)
        AND (venereEn.maps_hotels.hotels_city) = (saperviaggiare.hotels.city))
        OR ((venereEn.maps_hotels.hotels_name2) = (venereEn.hotels.name2)
        AND (venereEn.maps_hotels.hotels_city) = (venereEn.hotels.city)))
q0 = scqG1 result set join scqG2 result set:
    SELECT H.name, H.address, H.city, H.price, S.facility, S.structure_name, S.structure_city
    FROM hotels as H, facilities as S
    WHERE (H.city = S.structure_city) AND (H.name = S.structure_name)
    ORDER BY H.price ASC
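The fusion step above relies on full outer joins between local result sets. The following in-memory sketch shows the idea on two tiny, invented result sets; it is not the MOMIS implementation. The rows, the normalized attribute names (`name`, `city`) and the rule "the left source wins on conflicting attributes" are all assumptions standing in for the real join conditions and resolution functions.

```python
def full_outer_join(left, right, key):
    """Full outer join of two lists of dicts on the given key function."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(key(row), []).append(row)

    joined, matched = [], set()
    for l_row in left:
        k = key(l_row)
        if k in right_by_key:
            matched.add(k)
            for r_row in right_by_key[k]:
                # Assumed resolution rule: left values win on conflicts.
                joined.append({**r_row, **l_row})
        else:
            joined.append(dict(l_row))  # left row without a partner

    for k, rows in right_by_key.items():
        if k not in matched:
            joined.extend(dict(r) for r in rows)  # right rows without a partner
    return joined


# Hypothetical local result sets with attribute names already normalized.
saperviaggiare = [
    {"name": "Hotel A", "city": "rimini", "address": "Via Roma 1"},
    {"name": "Hotel B", "city": "rimini", "address": "Via Po 2"},
]
venere = [
    {"name": "Hotel A", "city": "rimini", "price": 60},
    {"name": "Hotel C", "city": "rimini", "price": 75},
]

fused = full_outer_join(saperviaggiare, venere, key=lambda r: (r["name"], r["city"]))
```

Hotels known to only one source are kept with partial information, which is the reason a full outer join is used instead of an inner join.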

38 Local Query Execution
The MOMIS Query Manager at work

39 Building the Final Result
Local results are forwarded by MOMIS to the query processor. The Executor component also retrieves results from neighboring peers, computes the overall result by taking into account the original user preferences, and forwards it to the M-FIRE interface.
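Building the overall result can be pictured as merging the partial answers and then re-applying the user's preference. The sketch below assumes a simple case: results are flat dictionaries, exact duplicates are dropped, and the preference is a "minimize this attribute" ordering followed by a limit. The function `build_final_result` and the sample rows are hypothetical.

```python
def build_final_result(local_rows, neighbor_rows, preference_key, limit):
    """Merge partial results, drop exact duplicates, and enforce the preference."""
    merged, seen = [], set()
    for row in local_rows + neighbor_rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            merged.append(row)
    merged.sort(key=preference_key)  # e.g. PREFERRING min(price)
    return merged[:limit]


# Hypothetical partial results from the local sources and a neighboring peer.
local = [{"name": "Hotel A", "price": 70}, {"name": "Hotel B", "price": 55}]
neighbors = [{"name": "Hotel C", "price": 60}, {"name": "Hotel B", "price": 55}]

final = build_final_result(local, neighbors, lambda r: r["price"], limit=2)
```

Re-sorting after the merge matters because each peer applied the relaxed preference only to its own data.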

40 Showing Results in M-FIRE
Results are finally shown in M-FIRE using a table-based form:
• Solutions are listed vertically, each one with its own table
• For each solution, variable bindings are listed vertically
• For each binding a row is provided, where the property name corresponding to the binding variable is shown on the right side, and the (literal) value is shown on the left side

