What to Send First? A Study of Utility in the Semantic Web Mike Dean 1, Prithwish Basu 1, Ben Carterette 2, Craig Partridge 1, and James Hendler 3 1 Raytheon.


1 What to Send First? A Study of Utility in the Semantic Web
Mike Dean 1, Prithwish Basu 1, Ben Carterette 2, Craig Partridge 1, and James Hendler 3
1 Raytheon BBN Technologies
2 University of Delaware
3 Rensselaer Polytechnic Institute
Joint Large and Heterogeneous Data and Quantified Formalization Workshop (LHD+SemQuant 2012)
Boston, Massachusetts
12 November 2012
Copyright 2012 Raytheon BBN Technologies

2 Outline
Problem
Our Solution
Future Work

3 Problem
Transfer a knowledge base in a constrained or intermittent communication environment
–Tactical military
–Large football game or conference
Send the most important information first
–Prioritize statements based on their utility
Account for inference
–No need to transfer inferred statements

4 Utility
The utility of a statement can be calculated by a preference function U(S, s), where S is the set of statements in a knowledge base and s ∈ S
Somewhat arbitrarily
–Utility ranges from 0 to 1
–The total utility of all statements in S should equal 1
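Written out, the two conventions on this slide amount to the following constraints on the preference function (a restatement of the slide, using its own symbols U, S, and s):

$$0 \le U(S, s) \le 1 \quad \text{for all } s \in S, \qquad \sum_{s \in S} U(S, s) = 1.$$

Because the total is fixed at 1, the utility of a statement is relative to the knowledge base it is measured in: the same statement can score differently when S changes.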

5 Preference Functions
Ideally, users would provide a preference function suitable for a given context
–Difficult to extract or derive
Need a default preference function when nothing more specific is available
We selected inverse frequency as the default
Motivations
–Surprise in previous research on Semantic Information Theory
–Term frequency-inverse document frequency in Information Retrieval systems

6 RDF Utility
We consider each URI and literal to be a symbol
We compute the utility of a statement by averaging the inverse frequencies of its subject, predicate, and object components and then normalizing the results
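The computation described on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the authors' Jena-based code. It assumes that the inverse frequency of a symbol is 1 divided by the number of times the symbol occurs anywhere in the KB (in any position), and that normalizing means dividing by the sum so that utilities total 1. The slides do not spell out either detail. The function name and the example triples are illustrative.

```python
from collections import Counter

def inverse_frequency_utility(statements):
    """Map each (subject, predicate, object) triple to a normalized utility."""
    statements = set(statements)
    # Assumption: count every occurrence of a symbol, in any position, across the KB.
    counts = Counter(symbol for triple in statements for symbol in triple)
    # Raw score: mean inverse frequency of the three components.
    raw = {t: sum(1.0 / counts[sym] for sym in t) / 3.0 for t in statements}
    # Normalize so that the utilities of all statements sum to 1.
    total = sum(raw.values())
    return {t: score / total for t, score in raw.items()}

# Illustrative KB: a short subclass chain plus one instance.
kb = [
    (":A", "rdfs:subClassOf", ":B"),
    (":B", "rdfs:subClassOf", ":C"),
    (":C", "rdfs:subClassOf", ":D"),
    (":D", "rdfs:subClassOf", ":E"),
    (":a", "rdf:type", ":A"),
]
utility = inverse_frequency_utility(kb)
most_useful = max(utility, key=utility.get)
```

In this example the rdf:type statement ranks highest, because :a and rdf:type each occur only once while rdfs:subClassOf occurs four times.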

7 Inference
Statements can be used to infer other statements
–We want to quantify this by computing the inference contribution of each of these statements
Statements can have different utilities in different KBs
–We’re particularly interested in the initial (ground) KB and its deductive closure
The total inference contribution is 1 minus the summed utility of the ground statements in the deductive closure
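One reading of the last point, in notation that is assumed here rather than taken from the slides: write G for the ground KB and C for its deductive closure, with G ⊆ C. Because the utilities measured in C sum to 1, whatever utility is not carried by ground statements is what inference supplies:

$$\text{total inference contribution} \;=\; 1 - \sum_{g \in G} U(C, g) \;=\; \sum_{s \in C \setminus G} U(C, s).$$

The second form makes the interpretation explicit: it is the combined utility of the statements that never need to be transmitted, because the receiver can infer them.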

8 Framework
An experiment consists of
–A set of statements (KB)
–An inference procedure – we used RDF Schema
–A preference function – we used inverse frequency
–A statement ranking function, which uses various computed values
Implemented using Jena, its rule based reasoner, and its Derivation interface
We accumulate utility in the deductive closure as statements are transmitted and inferred
–Generate a transcript and a cumulative utility graph
–An experiment can be summarized by its average cumulative utility
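The accumulation step can be illustrated with a small Python sketch. It is not the authors' implementation (which uses Jena). It assumes that average cumulative utility is the mean, over transmission steps, of the utility the receiver holds after each step, which the slides do not define explicitly. The function names, the toy statements p, q, r, and their utility values are all hypothetical.

```python
def average_cumulative_utility(order, utility, infer):
    """Return (average, per-step totals) for one transmission order.

    order:   ground statements in the order they are sent
    utility: statement -> utility measured in the deductive closure
    infer:   function mapping a set of received statements to its closure
    """
    received = set()
    cumulative = []
    for statement in order:
        received.add(statement)
        known = infer(received)  # received statements plus what they entail
        cumulative.append(sum(utility[s] for s in known))
    # Assumption: the summary value is the plain mean over transmission steps.
    return sum(cumulative) / len(cumulative), cumulative

# Toy inference procedure: "r" follows once both "p" and "q" are known.
def toy_infer(known):
    return set(known) | ({"r"} if {"p", "q"} <= known else set())

toy_utility = {"p": 0.2, "q": 0.3, "r": 0.5}  # hypothetical values summing to 1

avg_pq, steps_pq = average_cumulative_utility(["p", "q"], toy_utility, toy_infer)
avg_qp, steps_qp = average_cumulative_utility(["q", "p"], toy_utility, toy_infer)
```

Both orders reach the full utility after the second statement, but sending q first scores higher on average because more utility arrives earlier. The per-step totals are the values a cumulative utility graph would plot.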

9 Data Sets
POTUS – Wikipedia information about Presidents of the United States
FOAF – My FOAF profile + FOAF vocabulary
Cascade – Discussed later
Data sets and code are available at http://asio.bbn.com/2012/04/utility/

10 Ranking
Gold standard: Ranking by utility in the deductive closure + inference contribution
Inference contribution is rather difficult and expensive to compute
–Most reasoners provide 1 justification, not all
Also tried several heuristics
–Utility in the initial KB
–Utility in the deductive closure
–Random T box, then random A box
Base case: random
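The simpler heuristics listed above are easy to express as ranking functions. The Python sketch below is illustrative only: the function names are invented, the example utilities are hypothetical, and treating a statement as T box whenever its predicate is one of four RDFS schema terms is an assumption made here, not a definition from the slides. The gold standard is omitted because it requires per-statement inference contributions.

```python
import random

# Assumption: a statement counts as T box when its predicate is one of these schema terms.
SCHEMA_PREDICATES = {"rdfs:subClassOf", "rdfs:subPropertyOf", "rdfs:domain", "rdfs:range"}

def rank_by_utility(statements, utility):
    """Highest utility first; `utility` may be measured in the initial KB or in the closure."""
    return sorted(statements, key=lambda s: utility[s], reverse=True)

def rank_tbox_then_abox(statements, rng):
    """Schema statements in random order, then instance statements in random order."""
    tbox = [s for s in statements if s[1] in SCHEMA_PREDICATES]
    abox = [s for s in statements if s[1] not in SCHEMA_PREDICATES]
    rng.shuffle(tbox)
    rng.shuffle(abox)
    return tbox + abox

def rank_random(statements, rng):
    """Base case: a uniformly random order."""
    shuffled = list(statements)
    rng.shuffle(shuffled)
    return shuffled

kb = [
    (":a", "rdf:type", ":A"),
    (":A", "rdfs:subClassOf", ":B"),
    (":B", "rdfs:subClassOf", ":C"),
]
example_utility = {kb[0]: 0.5, kb[1]: 0.3, kb[2]: 0.2}  # hypothetical values

rng = random.Random(0)  # seeded so that runs are repeatable
by_utility = rank_by_utility(kb, example_utility)
schema_first = rank_tbox_then_abox(kb, rng)
baseline = rank_random(kb, rng)
```

Each function returns a transmission order, so any of them can be plugged into an experiment as its statement ranking function.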

11 Results of Different Ranking Functions
Cumulative utility for 262 statements in the POTUS data set

12 Observations
We can effectively order statements to increase or maximize average cumulative utility
Using inverse frequency
–Inferred RDFS statements are of lower utility
–Matches intuitions and practice regarding rdfs:Resource, etc.
Ranking based on simpler heuristics appears promising
–More research is needed

13 Cascade Data Set
@prefix rdf:.
@prefix rdfs:.
@prefix :.
:A rdfs:subClassOf :B.
:B rdfs:subClassOf :C.
:C rdfs:subClassOf :D.
:D rdfs:subClassOf :E.
:a rdf:type :A.
Possible to analyze all 5! = 120 possible permutations
What order do you think is best?

14 Cascade Data Set (2)
Average Cumulative Utility for all 120 permutations of cascade statements

15 Cascade Data Set (3)
Statements
0. :D rdfs:subClassOf :E.
1. :B rdfs:subClassOf :C.
2. :C rdfs:subClassOf :D.
3. :a rdf:type :A.
4. :A rdfs:subClassOf :B.
Best results: average cumulative utility 0.639
–01423
–04123
–04213
–10423
–24013
–40123
–40213
–42013
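Because the cascade data set has only five statements, every transmission order can be scored exhaustively. The Python sketch below shows the shape of such an experiment under loudly simplified assumptions: the closure applies only two RDFS rules (rdfs11, subclass transitivity, and rdfs9, type inheritance), utility is inverse frequency measured in that closure, and the average is a plain mean over transmission steps. The authors' framework uses Jena's RDFS reasoner and also accounts for inference contribution, so this sketch should not be expected to reproduce the reported 0.639 or the same set of best orders.

```python
from collections import Counter
from itertools import permutations

SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"

# Statement numbering (0 to 4) follows the slide.
CASCADE = [
    (":D", SUBCLASS, ":E"),
    (":B", SUBCLASS, ":C"),
    (":C", SUBCLASS, ":D"),
    (":a", TYPE, ":A"),
    (":A", SUBCLASS, ":B"),
]

def closure(statements):
    """Fixpoint of two RDFS rules only (a simplification): rdfs11 and rdfs9."""
    known = set(statements)
    while True:
        edges = [(s, o) for s, p, o in known if p == SUBCLASS]
        # x subClassOf/type y, and y subClassOf z  =>  x subClassOf/type z
        derived = {(s, p, sup) for s, p, o in known if p in (SUBCLASS, TYPE)
                   for sub, sup in edges if o == sub}
        if derived <= known:
            return known
        known |= derived

def inverse_frequency_utility(statements):
    """Mean inverse symbol frequency per triple, normalized to sum to 1."""
    counts = Counter(sym for t in statements for sym in t)
    raw = {t: sum(1.0 / counts[sym] for sym in t) / 3.0 for t in statements}
    total = sum(raw.values())
    return {t: r / total for t, r in raw.items()}

# Utility is measured once, in the deductive closure of the full KB.
UTILITY = inverse_frequency_utility(closure(CASCADE))

def average_cumulative_utility(order):
    """Mean utility held by the receiver after each transmitted statement."""
    received, running = [], []
    for index in order:
        received.append(CASCADE[index])
        running.append(sum(UTILITY[t] for t in closure(received)))
    return sum(running) / len(running)

scores = {order: average_cumulative_utility(order) for order in permutations(range(5))}
best = max(scores.values())
best_orders = sorted("".join(map(str, o)) for o, v in scores.items() if abs(v - best) < 1e-12)
```

Swapping in a fuller reasoner or a different preference function changes only `closure` and `inverse_frequency_utility`; the permutation loop stays the same.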

16 Contributions
Introducing utility into the Semantic Web
Quantifying inference
A new problem
An evaluation framework

17 Future Directions
Incorporating user-defined preferences
Employing more sophisticated inference (e.g. OWL RL)
Working with (much) larger data sets
Generalizing our framework into a toolkit
Considering bits required to encode messages
Addressing multi-party situations with different preference functions
Modeling information fusion

