2008 © Martin Dzbor, 34th SofSem Conf., Slovakia Best of Both “Using Semantic Web Technologies to Enrich User Interaction with the Web, and Vice-Versa”


1 2008 © Martin Dzbor, 34th SofSem Conf., Slovakia Best of Both “Using Semantic Web Technologies to Enrich User Interaction with the Web, and Vice-Versa” Martin Dzbor Knowledge Media Institute, The Open University (UK)

2 Outline
23 Jan © Martin Dzbor, 34th SofSem Conf., Slovakia
- Motivation, gaps in current tools
- Value for the users
  - Exposing implicit semantics of legacy and public data
- Taking advantage of (semantic) data redundancy
  - Revyu.com case: linking open data project
  - Watson case: gateway capable of analyzing and finding SW data
  - PowerMagpie case: bringing implicit semantics to the user
  - User interaction case: revisiting familiar GUIs with semantics
- Wrap up, next generation semantic web tools

3 The Web and Meaning
- Size of the information pool can be intimidating
  - 2000: 7 million unique sites [OCLC report]
  - 2005: billion documents [Gulli, Signorini + Yahoo!]
  - 2007: ~30 billion pages + 1 billion users [Netcraft report]
- It’s not the pages that carry the bulk of meaning
  - Number of facts and assertions is many times larger
  - Effect of large and complex systems applies
    - Meaning of a page ≠ sum of meanings of embedded facts
- Even more meaning is in links
  - Links and relations pose substantial challenges

4 [Diagram: a ‘Semantic View’ of a research domain.]
- Concept nodes: Publications, Sources, Centres, Projects, Languages, Authors, Technologies, Research Issues, Atomic Concepts
- Relation labels: co-occur, publish, discussed_in, situated_in, expert_in, implemented_in, has_src, relates_to, criticizes, coauthor, investigates, has_key, active_in, researched_by
- Ontology types shown: xyz:Author, foaf:Person, abc:Institute, abc:University, skos:Document, dolce:Activity, prj:Task, skos:Language, xyz:Relation, …
- Example annotation: OWL, XML, Markup_Lang; ‘Class’ represents a collection of entities

5 Meaning and Interpretation
- Constructivist view of knowledge on the Web
  - “Most of our intelligent behaviour relies on the capability to see and make connections.” [Vannevar Bush, 1946]
  - Yet connections are subjective
    - Meaning on the Web thus arises in the eyes of the user, the reader
- Fact-based knowledge retrieval does not necessarily match the established meaning
  - A document with terms ‘truth’, ‘holocaust’
    - But not ‘the truth’

6 Beyond Retrieval Queries
- Human task is typically much more than a query
  - Few activities we carry out can be directly & uniquely translated to (formal) queries…
  - Often, multiple queries need to be connected & data from them interpreted, contextualized…
- Interpretations are often imprecise
  - Queries are hard for users to (re)formulate & expand
- Where Semantic Web can help
  - Embed initial queries into potential exploratory paths
    - Don’t just respond to the queries
    - Give alternatives, suggest what next/else can be done, and why/why not

7 Comes Semantic Web
- “Semantic Web is a Web of data” [Berners-Lee 2001]
  - Actually, of connected and exposed data… [Altova.com]
  - Ideally, of interchangeable data… [W3C SW Activity]
- Where is the meaning?
  - Interchange enabled by committing data/facts to the same thing
- Our interests
  - Expose connections and commitments also in places where they are so far implicit and hidden,…
  - …using the existing web content as an enabling asset rather than something intimidating,…
  - …to support ordinary users in exploring and effectively making sense of this vast information space.

8 From Motivation to Strategy
- Key differences from other similar work:
  - Limited manual handcrafting: aiming for scale and automated knowledge acquisition (KA)
  - Designing for users: supporting not only knowledge sharing but also doing something with that knowledge
- Software development approach
  - Based on formative evaluations (w/real users)
  - Tapping into legacy data sources, often DBs (e.g. DBLP)
  - Using information extraction techniques to enrich and validate potential meanings of gleaned data (e.g. Corder, ExpertSearch)
  - Exposing the semantics of (hypothesized and validated) relations rather than individual ‘tags’

9 Sample Findings from One Study
- Product: ASPL [KnowledgeWeb, Magpie projects]
  - Web-based platform and plug-in to support learners on the Web to perform knowledge-intensive analyses in a given domain
- Some key findings we had to address:
  - Users not keen on ‘declarative semantics’
    - e.g. showing query results is insufficient; users wanted to know what (else) can be done with results, how to use them to learn something
  - Users (esp. more experienced) expect the semantic system to ‘know the domain’
    - e.g. individuals are often better characterized by the research communities they belong to, by abstraction
  - Resource finding and retrieval are not ‘selling points’
    - e.g. there are tools finding information more efficiently (e.g. Google);
    - value is in supporting exploratory, interactive and customizable interaction

10 Going beyond mere retrieval/search …
- Example: generalizations and abstractions
  - Interpret aggregations over a simple property in DBLP to formalize semantically richer relationships; e.g.:
    - Research community membership,
    - Expertise and leadership in a particular research area, etc.
  - Based on information retrieval, but the automated composition of partial findings provides richer means to navigate/explore
  - Opportunity to conceptualize results as new knowledge assertions
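A minimal sketch of the aggregation idea on this slide, in Python. It is an illustration only, not the ASPL implementation: the publication records, the venue-to-community mapping, the `member_of` relation name and the `min_papers` threshold are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (author, venue) pairs standing in for rows of a DBLP-style table.
PUBLICATIONS = [
    ("A. Author", "ISWC"),
    ("A. Author", "ISWC"),
    ("A. Author", "EKAW"),
    ("B. Writer", "ISWC"),
    ("B. Writer", "SIGMOD"),
    ("B. Writer", "SIGMOD"),
    ("B. Writer", "SIGMOD"),
]

# Hypothetical mapping from venues to research communities.
VENUE_COMMUNITY = {
    "ISWC": "Semantic Web",
    "EKAW": "Semantic Web",
    "SIGMOD": "Databases",
}


def community_membership(publications, venue_community, min_papers=2):
    """Aggregate per-author venue counts into (author, member_of, community) assertions."""
    counts = defaultdict(Counter)
    for author, venue in publications:
        community = venue_community.get(venue)
        if community is not None:
            counts[author][community] += 1

    assertions = []
    for author in sorted(counts):
        for community, n in sorted(counts[author].items()):
            # Assumed threshold: enough papers in a community implies membership.
            if n >= min_papers:
                assertions.append((author, "member_of", community))
    return assertions


if __name__ == "__main__":
    for triple in community_membership(PUBLICATIONS, VENUE_COMMUNITY):
        print(triple)
```

The point of the sketch is that a plain count over one property (the venue of a paper) is re-expressed as a new, navigable assertion about a person.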

11 Positioning Semantic Tools
[Three position charts placing the tools Masque, Blinkx, Google, TextDigger, Precise, AquaLog, AskNow, Ask, Hakia, Ilqua and ASPL/DBLP++; the individual positions are not recoverable from the transcript.]
- Position Chart 1: Result ranking function (sources, style). Axis labels: semantic relevance; popularity, statistics, etc.; none / full automation; automated + user choice; manual + interactive
- Position Chart 2: Explanatory function in post processing (technique, style). Axis labels: ordering; classification + clustering; visual clustering + labelling; summaries / automatically embedded explanations; explanation upon request; explanation not present
- Position Chart 3: Query formulation (support, means). Axis labels: full support; partial support; limited use / keywords; phrases; NL sentences; examples

12 Sample task: Expertise
- Query: research topic
- Service: domain interpretation
- Query modifiers: filter, scope

13 Rich GUI to ASPL content
- Faceted DBLP (http://dblp.l3s.de)
- Facets shown correspond to the same data as presented by ASPL in different screens/services; here they also act as query modifiers
- Data record enables further navigation as a means of query refinement and actual data access (e.g. PDF, BibTeX, DOI)

14 Modularity + Semantics → Sustainability
- All produced user end points remain after the end of the project that funded their development
  - Hosted by their institutions and/or ongoing projects
  - REASE and other RDF content is interesting for W3C SWEO for education and outreach purposes
- ASPL functionalities continue to be extended
  - E.g. in the context of an independent collaboration between OU and FAO’s Knowledge Systems Division
  - Essentially, the entire ASPL ‘pipework’ can be reused; only the domain ontology has to reflect FAO needs
  - Further use of the ASPL technology is currently being explored (e.g. bioinformatics, biological pathogens, etc.)

15 Magpie/ASPL in practice
- Annotations equal to user choosing ontological view
- FAO’s Agrovoc layered over an (arbitrary) web page
- Semantic browsing intertwined with ‘classic’ browsing and shown as ‘taggings’
- Legend: semantic proximity link; Web link

16 Semantics of Redundancy
- Reusing (often non-semantic) data to produce semantic annotations and interchangeability
  - Previous examples reuse non-semantic data (SQL DB)
  - While data content in DB is not semantic, certain composite queries have a well-defined semantic interpretation
  - Hence, such queries act as if they were feeding semantic annotations onto the singular Web resources
- More importantly, this approach to exposing semantics takes advantage of the nature of the Web
  - Information on the Web is captured redundantly
  - Law of large numbers & statistical correlations

17 Same Idea in Linking Open Data
- Some information is captured redundantly in several places
  - Use it as a standard ‘JOIN’ in SQL queries…
  - Say that the two statements are about the same thing…
  - …which gives access to additional information from (e.g.) specialized data sets
- Handy in:
  - eliminating the eternal bane of data sharing = form filling
  - seeding the data input forms with ‘obvious knowledge’
- Source:
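The ‘JOIN through sameness’ idea can be sketched in a few lines of Python. This is a hedged illustration, not Revyu.com code: the URIs, field names and the `merge_on_same_as` helper are hypothetical, and plain dictionaries stand in for RDF data in which the links would typically be `owl:sameAs` statements.

```python
# Hypothetical records from two independently published data sets.
REVIEWS = {
    "http://example.org/reviews/book/42": {"rating": 4, "reviewer": "alice"},
}
CATALOGUE = {
    "http://example.com/catalogue/item/9": {"title": "An Example Book", "pages": 250},
}

# Sameness statements: each pair says both identifiers denote the same thing.
SAME_AS = [
    ("http://example.org/reviews/book/42", "http://example.com/catalogue/item/9"),
]


def merge_on_same_as(left, right, same_as):
    """Join two keyed data sets through explicit sameness links, like an SQL JOIN."""
    merged = {}
    for left_id, right_id in same_as:
        if left_id in left and right_id in right:
            # Left-hand values win on conflict; the right-hand record
            # only contributes fields the left-hand record lacks.
            merged[left_id] = {**right[right_id], **left[left_id]}
    return merged


if __name__ == "__main__":
    print(merge_on_same_as(REVIEWS, CATALOGUE, SAME_AS))
```

In this sketch the reviewer only supplies a rating; the title and page count arrive through the link, which is the ‘seeding forms with obvious knowledge’ effect the slide describes.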

18 Revyu.com in Linking Open Data
- Linked data gleaned from (e.g.) Amazon
- User’s data from a minimalist data form
- Expressing the sameness in formal RDF
- Returning the mash-up back (to the Web)
- Semantic data disguised as a folksonomy

19 Novelty vs. Familiarity
- Novelty of technologies like Semantic Web has drawbacks
  - Hard to sustain over a longer period
  - Creates resistance to the proposed change
- Technology (Semantic Web) is not the sole new thing the user has to cope with!
  - Many tools assume new user roles
  - Many tools assume new interaction modalities
- Try a different view…
  - Instead of pushing new technology, we need to improve the overall user experience
  - ‘How can Semantic Web make the task I’m doing different = easier, faster, simpler,…’

20 Getting Hold of Semantic Web
[Diagram labels: NG SW Application, Semantic Web, Smart Feature]
- New applications need to exploit SW at large
  - Dynamically retrieving relevant semantic data
  - Combining several, heterogeneous models (ontologies)
- Need tools and infrastructures to efficiently access the knowledge available on SW: a Gateway…

21 Why Gateway?
- Functionality beyond discovering, indexing, and retrieving is necessary
  - Because of heterogeneity in terms of data quality…
  - Because of heterogeneity in terms of data coverage…
  - Because of a substantial degree of knowledge duplication…
- Watson Case:

22 Analyzing Semantic Content
- Have to deal with heterogeneity; great variety in:
  - Size
  - Coverage
  - Richness
  - Etc.

23 Watson Architecture
[Architecture diagram; labels recoverable from the transcript:]
- Stages: Collecting, Analyzing, Querying
- Components: Crawling, Parsing (Jena), Validation/Analysis, Indexing, Keyword Search, SPARQL Query, Ontology Exploration
- Stores: Repository, URLs, Metadata, Indexes
- Flow labels: discover (WWW), populate, extract, use, retrieve, request, queries

24 Watson Web User Interface:

25 Collecting and Analyzing Knowledge
- Web and DB retrieval techniques crawl through data repositories and pages with semantic content
- E.g. in October 2007 Watson collected tens of thousands of semantic documents
  - That represents millions of RDF entities (most of them being instances)
- Yet, in terms of models…
  - Conceptually ‘same’ data often occur in numerous duplicates and near duplicates
  - Which affects reliability of reasoning
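One way to make ‘near duplicates’ concrete is set overlap between the triples of two semantic documents. The Python sketch below uses Jaccard similarity; this is a swapped-in, generic technique chosen for illustration, not a description of how Watson detects duplicates. The document names, triples and the 0.8 threshold are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(documents, threshold=0.8):
    """Return pairs of document names whose triple sets overlap at or above the threshold."""
    names = sorted(documents)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if jaccard(documents[first], documents[second]) >= threshold:
                pairs.append((first, second))
    return pairs


# Hypothetical semantic documents, each reduced to a set of (s, p, o) triples.
_SHARED = {
    ("Human", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("Animal", "subClassOf", "Organism"),
    ("Human", "label", "human"),
    ("Human", "hasPart", "Head"),
}
DOCS = {
    "onto_a": set(_SHARED),
    "onto_b": _SHARED | {("Human", "hasPart", "Hand")},  # one extra triple
    "onto_c": {("Virus", "subClassOf", "Comp_Program")},
}

if __name__ == "__main__":
    print(near_duplicates(DOCS))
```

Here `onto_a` and `onto_b` share five of six distinct triples, so they are flagged as near duplicates, while `onto_c` shares nothing with either.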

26 A Gateway to the Semantic Web

27 Watson & Multiple Ontologies in Use
- In rapid (ontology) prototyping and modelling
  - Near-duplicate models of a term (e.g. ‘Human’)
  - Chunk of a model around selected node reused by acknowledging its redundancy
  - Thus, new ontology created from reused fragments of existing (often tested) models:
    - Engineering process is much faster
    - The outcome is a less error-prone model

28 Watson & Multiple Ontologies in Use
- Embedding semantics into ordinary web pages, plain text, and other content currently without it
- Purpose
  - Impose a particular interpretative frame onto a web page to bias its interpretation
  - Highlight conceptual entities that are key in a particular context
  - Enable user navigation and browsing in (a part of) Semantic Web knowledge space
- Case: my Magpie framework [Dzbor et al in JWS]
  - Usually, I draw people’s attention to what can be done with it
  - But today let’s look at limitations

29 Magpie (Dzbor et al. 2007):

30 Magpie vs. Semantic Web Browsing
- User picks one ontology to annotate pages
  - Precision:  ; recall:  (?)
- Annotated entities carry one meaning only
  - E.g. Virus  Comp_Program
- Entity-based approach to annotating only
  - Usually instances like ‘OWL’ as members of categories
  - Visual presentation limited to entity highlighting
- System offers a range of applicable ontologies
  - Let users balance P/R
- In reality, it’s useful to see and use also other senses
  - E.g. what if Virus  Organism
- Sometimes other views are better in a specific context
  - Concept senses view
  - Ontology network view
  - Topic view, etc.

31 Towards PowerMagpie
[Process diagram; labels recoverable from the transcript:]
- Web Page: calculate page ‘fingerprint’ (characteristics of the web page)
- Use ‘fingerprint’ to discover SW content
- Relevant SW content: enrich by redundancy; rank, filter,…; customize presentation
- GUIs: visualize; send to user
- User: let user select; browse via semantic links; adapt ‘fingerprint’; improve SW content retrieval
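The slides do not define what a page ‘fingerprint’ contains, so the following Python sketch makes a loud assumption: the fingerprint is simply the most frequent non-stopword terms of the page, and semantic content is ranked by how many of those terms match entity labels. The function names, the stopword list and the sample ontologies are hypothetical and do not reflect the actual PowerMagpie implementation.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on"}


def page_fingerprint(text, size=5):
    """Return the `size` most frequent non-stopword terms (ties broken alphabetically)."""
    terms = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    counts = Counter(terms)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:size]]


def rank_ontologies(fingerprint, ontology_labels):
    """Rank ontologies by how many fingerprint terms occur among their entity labels."""
    wanted = set(fingerprint)
    scores = {
        name: len(wanted & {label.lower() for label in labels})
        for name, labels in ontology_labels.items()
    }
    # Keep only ontologies with at least one match; best score first.
    return sorted(
        ((name, score) for name, score in scores.items() if score > 0),
        key=lambda item: (-item[1], item[0]),
    )


if __name__ == "__main__":
    page = "Tuna and albacore are fish. Tuna fishing in the Atlantic affects tuna stocks."
    ontologies = {
        "fisheries": ["Tuna", "Albacore", "YellowFin"],
        "software": ["Virus", "Program"],
    }
    print(rank_ontologies(page_fingerprint(page), ontologies))
```

Adapting the fingerprint after user selection, as the diagram suggests, would amount to re-weighting the terms before the ranking step; that feedback loop is left out of this sketch.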

32 Selecting Multiple Ontologies
[Diagram centred on a core ontology; the relation symbols between the example terms did not survive extraction.]
- Ontology extension, by declaration: e.g. Carnivor  Animal & eats.Meat
- Different ont. frames, by inconsistency: e.g. YellowFin  AtlanticHabitat
- Ontology specialization, by reference: e.g. Albacore  Tuna
- Analogous ontology, by mapping: e.g. Contamination  Polution
- Other relevant semantic data, by classification: e.g. SoilAcidification
- Known irrelevant semantic data, by classification: e.g. Baltic

33 Example: Entities in Ontology Networks
- None of the following views is ‘typically ontological’
  - Entities are presented in a more familiar ‘tag style’
  - Node positions reflect semantic proximity, similarity, ‘sameness’
- These are truly from one ontology: proximity = ontological distance
- These are collated from multiple sources: proximity = repetitive contextual co-occurrence
- Source: Cipher project, 2005
- Source: NeOn project, 2008

34 Example: Selecting Ontologies
- A concept occurring in more than one ontology: redundancy of occurrence
- A concept occurring in more than one role, sense: redundancy of meaning
- A concept from topically more distant ontology: divergence into new frame
- Statistics of the ontology or entity: provenance of information
- Versions of the same ontology discovered by PowerMagpie: temporality of occurrence
- PowerMagpie analyzes a web page, proposes and justifies relevant entities
  - Supporting divergent navigation
  - Supporting time snapshots
  - Acknowledging multiple meanings
  - Exposing redundancy of occurrences
  - Etc.
- Next step: improve the user interaction, GUI
  - Making the ontology-driven interaction more serendipitous, natural and embedded in standard browsing

35 Where is ‘Best of Both’?
- Essentially in two contradictory properties:
  - Ontologies are expressive, well-defined
  - But they are fairly sparse in terms of content
- It’s the sparseness that makes meaningful browsing and navigation difficult
  - In any single ontology we are merely performing graph/tree navigation = often falling into the ‘closed world assumption’
  - Semantic Web affords more flexibility
    - It may not enable us to tell which sense of a term we see
    - But it is sufficiently connected to enable us to tell the difference between the senses:
  - On the level of ontology networks but also individuals

36 Comparing SW and KBS
- Why should it work now if it hasn’t in the past?
  - There are some key changes in play:

                    Classic KBS     SW Systems
  Representation    'Clean'         'Good Enough'
  Size              Small/Medium    Extra Huge
  Repr. Schema      Homogeneous     Heterogeneous
  Quality           High            Very Variable
  Degree of trust   High            Very Variable

37 Key Paradigm Shift
- Intelligence in classic KBS: a function of sophisticated, task-centric problem solving
- Intelligence in SW systems: a side-effect of size and heterogeneity (Collective Intelligence)
  - Is due to information and data redundancy
  - There are not only numerous documents with little formal semantic structure, but also…
  - numerous formal attempts to conceptualize user views

38 How Far Are We?
- Solutions working in a new dynamic context (run-time rather than design-time)
  - Example: Ontology Mapping
    - So far: mostly design-time mapping of (two) complete ontologies
    - Mapping many partial, incomplete ontologies, ontological modules?
  - Example: Ontology Selection
    - So far: largely by querying, user-mediated ontology retrieval
    - Selecting networks of non-contradictory partial ontologies?
  - Example: Ontology Modularization
    - So far: by and large has the user in the loop, consistency-driven
    - Many diverse drivers (access rights, trust, scale, summarization,…)
- The context of the above tasks is changing

39 Next Generation SW Tools
- Do away with singular data sources
  - Incl. ontologies, classification trees, maps,…
- For them ontology becomes a dynamic notion
  - Ontology = a selection of modules appropriate to a particular context, situation, user, task,…
- New challenges arise
  - Discovering semantic content and relations in it
  - Modularizing large sources of semantic content
  - Selecting (parts of) semantic content or ontologies
  - Supporting user interaction on such a large scale
  - Etc.

40 Some Reading
- Dzbor, M. - Motta, E. - Domingue, J.B.: Magpie: Experiences in supporting Semantic Web browsing. Journal of Web Semantics, Vol.5, No.3, pp Elsevier Publishers, The Netherlands.
- Dzbor, M. - Motta, E.: Semantic Web Technology to Support Learning about the Semantic Web. In 13th Intl. Conf. on Artificial Intelligence in Education (AIED). July 2007, California, US.
- Sabou, M. - Lopez, V. - Motta, E. (2006). Ontology Selection for the Real Semantic Web: How to Cover the Queen’s Birthday Dinner?. Proc. of the EKAW 2006 Conf., Podebrady, Czech Republic.
- D'Aquin, M. - Sabou, M. - Motta, E. (2006). Modularization: A key for the dynamic selection of relevant knowledge components. ISWC 2006 Workshop on Ontology Modularization, Georgia, US.
- Motta, E. (2006). Knowledge Publishing and Access on the Semantic Web: A Socio-Technological Analysis. IEEE Intelligent Systems, Vol.21, No.3, pp IEEE Press, US.

41 Acknowledgements and Web Sites
- Work presented has been developed in the context of the following projects and activities:
  - Magpie (info, demo, download):
  - PowerMagpie (info, demo, download):
  - Watson (info, UI, API download):
  - NeOn project:
  - OpenKnowledge project:
  - Papers cited and personal pages:
- Acknowledging funding from European Commission’s Framework 6 (NeOn & OpenKnowledge), EPSRC and NERC
- Also thanks to Laurian Gridinoc, Enrico Motta, Joerg Diederich, etc. for their input to some of the ideas presented

