
1 ©2003, Philippe Cudré-Mauroux, EPFL-I&C-IIF, Distributed Information Systems Lab
The Chatty Web: Emergent Semantics Through Gossiping
WWW2003
Karl Aberer, Philippe Cudré-Mauroux, Manfred Hauswirth
Distributed Information Systems Laboratory (LSIR), Swiss Federal Institute of Technology, Lausanne (EPFL)

2 Outline
Problem statement / motivations
Model
Intrinsic criteria
Extrinsic criteria
Case study
Conclusions

3 The Problem (1)
[Figure: network of peers showing the Swissprot site at Geneva, a lab at MIT, a lab in Trondheim, the EMBLChange site at Cambridge, and a query posted at EPFL; attribute labels: organism, species]
Legend: EMBLChange peers (species, …); SwissProt peers (authors, titles, organism, …); other peers (authors, …)

4 The Problem (2)
How to obtain semantic interoperability among heterogeneous data sources without relying on pre-existing, global semantic models?

5 Motivations
Querying the semantic web
  – Large number of data / meta-data sources
  – Heterogeneous and decentralized by nature
Federating loosely coupled specialized databases
  – Pre-existing schemas
  – No query propagation
Introducing meta-data support in modern P2P applications
  – Complete decentralization and self-organization
  – Imposing a global schema makes no sense

6 Outline of the solution
[Figure: the same peer network (a lab in Trondheim, the EMBLChange site at Cambridge, the Swissprot site at Geneva, a lab at MIT, a query posted at EPFL) annotated with translation links: organism -> authors, organism -> species, species -> organism]
Legend: EMBLChange peers (species, …); SwissProt peers (authors, titles, organism, …); other peers (authors, …)
Local translations enabling global agreements

7 Query Forwarding
To whom shall we send the queries?
  – To peers likely to send us a response
Simplistic solutions
  – Local neighboring (same schema): low recall
  – Query flooding (entire network): low precision, high network load
Semantic Gossiping
  – Query forwarding by selecting the right peers
  – Query-dependent PHBs (Per-Hop Behaviors)
  – Analysis of the query and of its transformed versions
      Intrinsic measures (syntactic distances)
      Extrinsic measures (semantic distances)

8 The Model
Network of peers
Identical (Gnutella model) / related schemas
Queries
Translations between some schemas
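
The model above can be pictured as a small data structure. The following is only a sketch: the class names `Peer` and `TranslationLink`, the representation of a query as a list of attribute names, and the sample mapping are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class TranslationLink:
    """Directed translation from one schema to another (hypothetical structure)."""
    target: str                 # name of the peer this link points to
    mapping: dict[str, str]     # source attribute -> target attribute

    def translate(self, query_attrs: list[str]) -> list[str]:
        # Attributes that have no mapping are dropped by the translation.
        return [self.mapping[a] for a in query_attrs if a in self.mapping]


@dataclass
class Peer:
    name: str
    schema: set[str]
    links: list[TranslationLink] = field(default_factory=list)


# Two peers with related schemas; attribute names follow the slides' example.
embl = Peer("EMBLChange", {"species"})
swissprot = Peer(
    "SwissProt",
    {"authors", "titles", "organism"},
    links=[TranslationLink("EMBLChange", {"organism": "species"})],
)

query = ["organism", "authors"]
translated = swissprot.links[0].translate(query)
print(translated)  # ['species']
```

Here `authors` has no counterpart in the target schema, so the translated query keeps only one of the two original attributes.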

9 On Translations

10 The Syntactic Similarity
Measures the degree of similarity between the original and the transformed queries
Based on the number of attributes lost during translations
Takes into account the importance of the attributes and the selectivity of the predicates
Two different values: projection / selection
Can be computed iteratively, i.e. for every transformation step
Results kept on a per-attribute basis in a feature vector
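
A minimal sketch of how such a per-attribute feature vector could be maintained step by step. The slides do not give the actual formula, so everything here is an assumption: a preserved attribute scores 1.0, a lost one 0.0, and the similarity is the weighted average of those scores. Function names, attribute names, and weights are hypothetical.

```python
def translate_step(state, mapping):
    """Apply one translation step.

    `state` maps each ORIGINAL attribute to its current name,
    or to None once the attribute has been lost.
    """
    return {
        orig: (mapping.get(cur) if cur is not None else None)
        for orig, cur in state.items()
    }


def feature_vector(state):
    # Assumed scoring: 1.0 if the attribute survived, 0.0 if it was lost.
    return {orig: (1.0 if cur is not None else 0.0) for orig, cur in state.items()}


def weighted_similarity(vector, weights):
    # Weighted average of the per-attribute scores.
    total = sum(weights.values())
    return sum(weights[a] * vector[a] for a in vector) / total


weights = {"title": 2.0, "author": 1.0, "year": 1.0}
state = {a: a for a in weights}  # before any translation

step1 = {"title": "name", "author": "creator", "year": "date"}
step2 = {"name": "label", "creator": "writer"}  # "date" has no mapping

for mapping in (step1, step2):
    state = translate_step(state, mapping)

vector = feature_vector(state)
similarity = weighted_similarity(vector, weights)
print(vector)      # {'title': 1.0, 'author': 1.0, 'year': 0.0}
print(similarity)  # 0.75
```

Keeping the state keyed by the original attribute is what allows the vector to be updated at every transformation step without re-reading the original query.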

11 The Semantic Similarity
The syntactic similarity always assumes that the translations are semantically correct
  – This is usually not the case in reality
We define extrinsic measures for the semantic quality of the translations
  – Query cycle analysis
  – Result analysis
Semantics as an agreement among peers

12 Cycle Analysis
Query cycle detection
Based on the original query it forwarded and on the returning query it received, p_1 may assess the correctness of the translation T_{1->2}

13 Cycle Analysis (2)
What happened to an attribute A_i present in the original query?
  – T_{1->n}(A_i) = A_i: positive cycle
  – T_{1->n}(A_i) = A_j with j ≠ i: negative cycle
  – T_{1->n}(A_i) undefined (attribute lost along the cycle): no semantic information
Probabilistic analysis based on the positive and negative feedback we receive
  – ε_s: probability of p_1's translation being incorrect
  – ε_f: probability of another translation being incorrect
  – δ: probability of two errors being compensated somehow
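
The three cases can be illustrated by composing the attribute mappings along a cycle and comparing the result with the original attribute. This is a sketch under assumptions: translations are modelled as plain dictionaries, and the function names and sample mappings are hypothetical.

```python
def compose(mappings, attr):
    """Follow `attr` through a chain of translations; None means it was lost."""
    current = attr
    for mapping in mappings:
        if current not in mapping:
            return None
        current = mapping[current]
    return current


def classify_cycle(mappings, attr):
    result = compose(mappings, attr)
    if result is None:
        return "none"        # attribute lost: no semantic information
    if result == attr:
        return "positive"    # the attribute came back unchanged
    return "negative"        # the attribute came back as a different one


# A cycle p1 -> p2 -> p3 -> p1 with hypothetical schemas.
t12 = {"title": "name", "author": "creator", "year": "date"}
t23 = {"name": "label", "creator": "writer"}
t31 = {"label": "title", "writer": "year"}   # "writer" is mapped wrongly
cycle = [t12, t23, t31]

feedback = {a: classify_cycle(cycle, a) for a in ("title", "author", "year")}
print(feedback)
# {'title': 'positive', 'author': 'negative', 'year': 'none'}
```

A negative result only says that some translation on the cycle is wrong, not which one, which is why the slides follow up with a probabilistic analysis.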

14 Cycle Analysis (3)
Combining the different probabilities, we obtain the likelihood l() of receiving a series of feedback, some positive, others negative
No assumption on error probabilities
Probability of p_1's translation being correct (we could take density functions into account here): [equation not preserved in the transcript]
For every outgoing link, peers keep a corresponding semantic feature vector
Similar method for result analysis (association rule mining)
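
Since the equations did not survive in the transcript, the following is only one plausible way to turn cycle feedback into a probability, not the authors' formula. Assumed model: a cycle is positive if every link is correct, or if errors happen to compensate; the unknown error and compensation probabilities are averaged over a uniform grid; and the two hypotheses "p_1's link is correct" and "p_1's link is incorrect" get equal prior weight. All names are hypothetical.

```python
def p_positive(n_links, p_self_wrong, p_other_wrong, p_compensate):
    """Probability of positive feedback on a cycle of n_links translations."""
    all_correct = (1 - p_self_wrong) * (1 - p_other_wrong) ** (n_links - 1)
    return all_correct + (1 - all_correct) * p_compensate


def likelihood(feedback, p_self_wrong, p_other_wrong, p_compensate):
    """Likelihood of a series of (n_links, is_positive) observations."""
    result = 1.0
    for n_links, is_positive in feedback:
        p = p_positive(n_links, p_self_wrong, p_other_wrong, p_compensate)
        result *= p if is_positive else (1 - p)
    return result


def prob_translation_correct(feedback, steps=20):
    # Midpoint grid over (0, 1): no fixed assumption on the other probabilities.
    grid = [(k + 0.5) / steps for k in range(steps)]
    l_correct = sum(likelihood(feedback, 0.0, f, d) for f in grid for d in grid)
    l_wrong = sum(likelihood(feedback, 1.0, f, d) for f in grid for d in grid)
    return l_correct / (l_correct + l_wrong)


mostly_positive = [(3, True), (4, True), (3, True)]
all_negative = [(3, False), (4, False), (3, False)]

print(prob_translation_correct(mostly_positive) > 0.5)  # True
print(prob_translation_correct(all_negative) < 0.5)     # True
```

With no feedback at all the estimate stays at 0.5, and each additional positive or negative cycle moves it up or down.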

15 To Gossip or not
We compute similarity measures S_i based on
  – the four feature vectors
  – the weights W (relative importance) of the attributes
We forward the query using a translation link if S_i ≥ S_i^min for every i, where the S_i^min are query-defined thresholds for the different similarity values
Consequently, queries are only forwarded to peers likely to understand them syntactically and semantically
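
A sketch of this forwarding decision, assuming that each similarity value is the weighted average of one feature vector and that a link is used only if every value reaches its threshold. The vector names (`projection`, `selection`, `cycle`, `result`), the scores, and the weights are hypothetical and chosen only for illustration.

```python
def similarity(feature_vector, weights):
    """Weighted average of per-attribute scores; missing attributes count as 0."""
    total = sum(weights.values())
    return sum(w * feature_vector.get(a, 0.0) for a, w in weights.items()) / total


def should_forward(feature_vectors, weights, thresholds):
    """Forward only if every similarity value reaches its query-defined threshold."""
    return all(
        similarity(feature_vectors[name], weights) >= minimum
        for name, minimum in thresholds.items()
    )


weights = {"title": 2.0, "author": 1.0}
feature_vectors = {
    "projection": {"title": 1.0, "author": 1.0},   # syntactic
    "selection": {"title": 1.0, "author": 0.5},    # syntactic
    "cycle": {"title": 0.9, "author": 0.3},        # semantic
    "result": {"title": 0.6, "author": 0.6},       # semantic
}

lenient = {name: 0.5 for name in feature_vectors}
strict = dict(lenient, cycle=0.8)

print(should_forward(feature_vectors, weights, lenient))  # True
print(should_forward(feature_vectors, weights, strict))   # False
```

Because the thresholds travel with the query, the same link can be used for a tolerant query and skipped for a stricter one.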

16 Case study (1)
First tests performed inside the lab
  – 7 nodes (semantic domains)
  – 21 translation links (some erroneous)
Test-bed uses
  – Java-based query processor and translator (based on IPSI XQuery processor)
  – P2P application (JXTA) for creating and exporting schema + data
Initial results only

17 Case study (2)
Query = FOR $project IN "project_A.xml"/* RETURN $project/title
Analysis for A (thresholds set to 0.5):
Syntactic analysis: C has no representation for the title
  – C will never receive the query
Semantic analysis:
  – Translation link A to B: 0.79
  – Translation link A to C: 1 (cf. syntactic analysis)
  – Translation link A to D: 0.26
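
Applying the 0.5 threshold to the values reported on this slide gives the following picture. The slide only states the outcome for C, so the treatment of B and D is an inference, and the sketch assumes their syntactic scores pass the threshold. Variable names are hypothetical.

```python
THRESHOLD = 0.5

# Semantic values reported on the slide for the links leaving peer A.
semantic = {"B": 0.79, "C": 1.0, "D": 0.26}

# C cannot represent "title", so it fails the syntactic check (from the slide).
# Assumption: B and D pass the syntactic check.
syntactically_blocked = {"C"}

targets = [
    peer
    for peer, score in semantic.items()
    if peer not in syntactically_blocked and score >= THRESHOLD
]
print(targets)  # ['B']
```

C is excluded despite its semantic value of 1 because the syntactic criterion fails, and D is excluded because 0.26 is below the threshold.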

18 Conclusions
Semantic interoperability in a bottom-up, decentralized manner
Global agreement from purely local interactions
Relies on local translations between the different schemas
Semantics viewed as an agreement
2-dimensional analysis:
  – intrinsic (syntax)
  – extrinsic (semantics)
Results used to determine whether or not it is useful to forward a query to a certain group

19 The Chatty Web: Emergent Semantics Through Gossiping
WWW2003
lsirwww.epfl.ch
www.p-grid.org

