1 ICLP-09 Enabling serendipitous search on the Web of Data using Prolog Jan Wielemaker VU University Amsterdam.


1 1 ICLP-09 Enabling serendipitous search on the Web of Data using Prolog Jan Wielemaker VU University Amsterdam

2 2 ICLP-09 Issues addressed. Recent developments reshaped the Web: the web moved from “Web of documents” to “Web of data” and “Web of applications”; “Open” and “Linked” data make massive amounts of data available to be processed by machines. How can we deploy Prolog in this environment?

3 3 ICLP-09 Overview Introducing the semantic search engine “ClioPatria”; description of the problem it addresses Why (not) use Prolog for semantic web applications? Processing RDF-data Applying Prolog in web-servers Creating interactive web-applications Wrap-up

4 PART I The ClioPatria use-case: Integrate digital collections of multiple museums and connect them to background knowledge

5 Collection and Meta-data Schema Vocabularies

6 6 Background knowledge

7 7 The Web: documents and links URL Web-link (untyped hyperlink)

8 8 The Semantic, or Data Web: data and links URL Web link Painter “Henri Matisse” Getty ULAN creator Dublin Core Painting “Green Stripe (M me Matisse)” Royal Museum of Fine Arts, Copenhagen

9

10

11 … nice graph, but... What about semantics? What about structure?

12 Semantic Web data model: RDF. 1 fact = R(O1, O2) = 1 “triple”; many facts = labelled graph = RDF. URIs as identifiers, typed relations between typed objects. Has many different syntaxes (XML (W3C), N3, Turtle, graphical, etc.). Doesn’t matter: it’s a data model. Slide by Frank van Harmelen

13 Semantic Web data model: RDF Schema hierarchy of types, hierarchy of relations, domain/range-constraints simple: no negation, disjunction, universal Slide by Frank van Harmelen

14 Semantic Web data model: OWL and SWRL. Everything you wanted to say but cannot say in RDF(S): negation, disjunction, cardinality, limited universal, relational algebra (trans, symm); still no composition of relations (DL-based). SWRL: rules with DL concepts as atoms. (Diagram labels: Full, DL, Lite.) Slide by Frank van Harmelen

15 15 Structure for thesauri

16 Structure for works of Art

17 From meta-data to semantic meta-data Thesaurus Schema mapping (SKOS) Meta-data Schema mapping (VRA) Thesaurus alignment Meta-data mapping 5 collections → 11,000,000 triples

18 Part of a large cloud of linked data!

19 The challenge How to make use of this network for search? Can we search better? Can we present better?

20 ClioPatria A Prolog web-server with RDF-store Developed to explore this challenge Explore graph using best-first search based on semantic distance Cluster results based on relation to query
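The slide names best-first graph exploration by semantic distance but does not show it. A minimal sketch of the idea follows, assuming a hypothetical `edge/3` relation whose third argument is a made-up distance (lower means semantically closer). The weights, node names, and the predicates `best_first/3` and `search/4` are illustrative and are not ClioPatria's actual code or weighting scheme.

```prolog
:- use_module(library(lists)).

% edge(From, To, Distance): hypothetical weighted links in the graph.
edge(matisse,       fauvism,       1).
edge(matisse,       musee_matisse, 2).
edge(fauvism,       derain,        1).
edge(musee_matisse, nice,          3).
edge(derain,        london_bridge, 1).

% best_first(+Start, +MaxDistance, -Reached)
% Reached is a list of Node-Distance pairs in the order the nodes were
% expanded, i.e. closest first. Nodes further away than MaxDistance are
% never put on the agenda.
best_first(Start, Max, Reached) :-
    search([0-Start], [], Max, Reached).

search([], _, _, []).
search([Dist-Node|Agenda], Seen, Max, Reached) :-
    (   memberchk(Node, Seen)              % already expanded via a shorter path
    ->  search(Agenda, Seen, Max, Reached)
    ;   Reached = [Node-Dist|Rest],
        findall(D1-Next,
                ( edge(Node, Next, W),
                  D1 is Dist + W,
                  D1 =< Max
                ),
                New),
        append(Agenda, New, Agenda1),
        keysort(Agenda1, Sorted),          % keep the agenda ordered by distance
        search(Sorted, [Node|Seen], Max, Rest)
    ).
```

A query such as `?- best_first(matisse, 3, R).` lists the nodes within distance 3 of `matisse`, nearest first. Clustering the results by their relation to the query would be a separate step that this sketch leaves out.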

21 ClioPatria: “Matisse”. Result clusters: “Matisse” in the title; Located in “Musee Matisse”; Created by “Matisse”; Paintings in the same style as used by “Matisse”

22 Serendipitous? Serendipity is the effect by which one accidentally discovers something fortunate, especially while looking for something else entirely unrelated (Wikipedia). The search is not based on any schema. It can find results through unexpected paths. It often finds many unintended results (i.e., it answers multiple “graph” queries). This remains manageable due to clustering → “Post-query disambiguation”

23 Serendipitous … “Picasso”: Things made from “Picasso marble”

24 ClioPatria fact-sheet
Prolog: 246 files, 67,500 lines
Developers: 3 core, about 10 occasional
Triples loaded: used with up to 22,000,000; scales to 300,000,000 in 64 GB memory
Usage: known to be in use in 6 projects
http://e-culture.multimedian.nl/software/ClioPatria.shtml

25 25 ICLP-09 Part-II Using Prolog for the Semantic Web

26 26 ICLP-09 The neaties vs. the scruffies. (DL-)Logic background: in search of expressive logics and correct and efficient resolution techniques. LP: F-Logic, ASP, ALP, FO(.), … (Marc Denecker). Webby background: in search of doing something useful with huge amounts of shallow and inconsistent facts. Simple logics; techniques need be neither sound nor complete.

27 27 ICLP-09 Why NOT Prolog? The core concepts in the Web community are: Networking, Concurrency, Web-page generation, Internationalization, ... These are typically not associated with Prolog.

28 28 ICLP-09 Why Prolog? RDF fits nicely with the relational model of Prolog: with a little work it does everything SPARQL can … but it is much more flexible. Most languages in the SW community can be translated into Horn clauses: OWL (large subset); rule languages: SWRL, RIF; ...

29 29 ICLP-09 The Semantic Web seen from Prolog. Pure predicate rdf/3: rdf(?Subject, ?Predicate, ?Object) is nondet. URI → Atom. Literal → literal(Atom), literal(lang(Code, Atom)) or literal(type(URI, Atom))
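To make the term representation concrete, here is a small sketch that uses plain Prolog facts as a stand-in for the store. The predicate `triple/3`, the helpers `label/2` and `literal_text/2`, and the `http://example.org/` URIs are assumptions for illustration; the talk's rdf/3 is backed by a dedicated indexed store, not by ordinary clauses.

```prolog
% triple(Subject, Predicate, Object): hypothetical stand-in for rdf/3.
% URIs are atoms; literals are wrapped as on the slide.
triple('http://example.org/matisse',
       'http://example.org/type',
       'http://example.org/Person').
triple('http://example.org/matisse',
       'http://example.org/label',
       literal(lang(fr, 'Henri Matisse'))).
triple('http://example.org/green_stripe',
       'http://example.org/creator',
       'http://example.org/matisse').
triple('http://example.org/green_stripe',
       'http://example.org/label',
       literal('Green Stripe')).

% literal_text(+Literal, -Text): strip the language or type wrapper.
literal_text(literal(lang(_, Text)), Text) :- !.
literal_text(literal(type(_, Text)), Text) :- !.
literal_text(literal(Text), Text).

% label(?Resource, ?Text): the label of a resource, whatever literal form
% it was stored in.
label(Resource, Text) :-
    triple(Resource, 'http://example.org/label', Literal),
    literal_text(Literal, Text).
```

Because the relation is a pure predicate, the same facts answer "what is the label of this resource" and "which resource carries this label" depending on which arguments are bound.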

30 30 ICLP-09 URI: XML Namespaces. Namespaces are expanded at compile-time by means of rules for goal_expansion/2, so
rdf(S, 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'http://e-culture.multimedian.nl/ns/getty/ulan#Person').
can be written as
rdf(S, rdf:type, ulan:'Person').
Toplevel and debugger results are made readable again using portray/1
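The rewriting step itself can be sketched as an ordinary predicate. The version below is an illustration only: `prefix/2`, `expand_resource/2` and `expand_rdf_goal/2` are hypothetical names, and the sketch rewrites a goal term when called rather than hooking into goal_expansion/2 at compile time as the slide describes. The two namespace URIs are the ones shown on the slide.

```prolog
% prefix(Alias, BaseURI): hypothetical prefix table.
prefix(rdf,  'http://www.w3.org/1999/02/22-rdf-syntax-ns#').
prefix(ulan, 'http://e-culture.multimedian.nl/ns/getty/ulan#').

% expand_resource(+Term, -Expanded)
% Alias:Local becomes one atom if Alias is a known prefix; anything else
% (variables, plain atoms, literals) is passed through unchanged.
expand_resource(Term, URI) :-
    nonvar(Term),
    Term = Alias:Local,
    atom(Alias),
    atom(Local),
    prefix(Alias, Base),
    !,
    atom_concat(Base, Local, URI).
expand_resource(Term, Term).

% expand_rdf_goal(+Goal0, -Goal): expand all three arguments of rdf/3.
expand_rdf_goal(rdf(S0, P0, O0), rdf(S, P, O)) :-
    expand_resource(S0, S),
    expand_resource(P0, P),
    expand_resource(O0, O).
```

A compile-time hook would call something like `expand_rdf_goal/2` once per goal, so the running program only ever sees full atoms and pays no run-time cost for the short notation.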

31 31 ICLP-09 A simple example
?- module(rdfs_entailment).
rdfs_entailment: ?- rdf(X, rdf:type, ulan:'Person'),
                    rdf(X, rdfs:label, literal('Matisse, Henri')),
                    rdf(Work, dc:creator, X).

32 32 ICLP-09 Optimising
?- In = ( rdf(X, rdf:type, ulan:'Person'),
          rdf(X, rdfs:label, literal('Matisse, Henri')),
          rdf(Work, dc:creator, X)
        ),
   rdf_optimise(In, Goal).
Goal = ( rdf(X, rdfs:label, literal('Matisse, Henri')),
         rdfql_carthesian([ bag([], rdf(X, rdf:type, ulan:'Person')),
                            bag([Work], rdf(Work, dc:creator, X))
                          ])
       ).
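The intuition behind the reordering is to run the most selective pattern first. The sketch below is a deliberately simplified heuristic, not the algorithm used by rdf_optimise/2: it counts the solutions of each pattern in isolation and sorts by that count. The `triple/3` facts and the predicates `pattern_cost/2` and `order_patterns/2` are hypothetical.

```prolog
:- use_module(library(aggregate)).
:- use_module(library(pairs)).

% Hypothetical data: three persons, two labels, two works.
triple(matisse,      type,    person).
triple(picasso,      type,    person).
triple(braque,       type,    person).
triple(matisse,      label,   literal('Matisse, Henri')).
triple(picasso,      label,   literal('Picasso, Pablo')).
triple(green_stripe, creator, matisse).
triple(guernica,     creator, picasso).

% pattern_cost(+Pattern, -Count): number of solutions of Pattern on its
% own. aggregate_all/3 does not leave the pattern's variables bound.
pattern_cost(Pattern, Count) :-
    aggregate_all(count, Pattern, Count).

% order_patterns(+Patterns, -Ordered): cheapest pattern first.
order_patterns(Patterns, Ordered) :-
    map_list_to_pairs(pattern_cost, Patterns, Keyed),
    keysort(Keyed, Sorted),
    pairs_values(Sorted, Ordered).
```

For the three patterns of the slide's query, the label lookup has one solution and therefore moves to the front, which matches the shape of the optimised goal shown above. A real optimizer would also account for how earlier goals bind the variables of later ones; this sketch ignores that.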

33 33 ICLP-09 Advantages of Prolog over SPARQL Flexibility and reuse:   We can mix with arbitrary Prolog code   We can name and combine queries   We can do recursion This is similar to SQL vs. Prolog, but   Processing RDF involves pattern-matching, rules and recursion, while datatypes are less important.
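A short sketch of what naming, combining and recursion look like in practice. All facts and predicate names (`triple/3`, `work_style/2`, `style_or_broader/2`, `work_in_style/2`, `artists_in_style/2`) are hypothetical, and the recursion assumes the `broader` hierarchy has no cycles.

```prolog
% Hypothetical data: works, their styles, and a small style hierarchy.
triple(green_stripe, style,   fauvism).
triple(green_stripe, creator, matisse).
triple(guernica,     style,   cubism).
triple(guernica,     creator, picasso).
triple(fauvism,      broader, modern_art).
triple(cubism,       broader, modern_art).
triple(modern_art,   broader, art).

% A named query: reusable like any other predicate.
work_style(Work, Style) :-
    triple(Work, style, Style).

% Recursion: Style itself or any style above it (assumes no cycles).
style_or_broader(Style, Style).
style_or_broader(Style, Broader) :-
    triple(Style, broader, Parent),
    style_or_broader(Parent, Broader).

% Combining queries: works whose style falls under Style.
work_in_style(Work, Style) :-
    work_style(Work, Direct),
    style_or_broader(Direct, Style).

% Mixing with arbitrary Prolog: collect and sort the artists.
artists_in_style(Style, Artists) :-
    findall(Artist,
            ( work_in_style(Work, Style),
              triple(Work, creator, Artist)
            ),
            Found),
    sort(Found, Artists).
```

For example, `?- artists_in_style(modern_art, As).` walks the hierarchy for every work and returns the sorted, duplicate-free list of creators.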

34 34 ICLP-09 Prolog ↔ SPARQL: One SPARQL query to get result; Multiple SPARQL queries; Fetch triple-by-triple and process in client

35 35 ICLP-09 Reasoning Reasoning is connected to a language (RDFS, OWL, SWRL, …) Reasoning derives facts from the triple store that are not explicitly provided in the dataset.

36 36 ICLP-09 Options for Reasoning (I). Reasoning adds (virtual) triples (entailment): the only API is rdf(S,P,O). Forward reasoning: easy to implement; difficult to handle database updates; can explode using richer languages (e.g., OWL). Backward reasoning: non-termination under SLD resolution; need for optimization of conjunctions; easy to provide alternative reasoners
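The termination problem of backward reasoning shows up as soon as the class hierarchy contains a cycle. The sketch below answers type queries through a subClassOf closure and guards the recursion with a visited list. The predicates `triple/3`, `rdfs/3`, `subclass_of/2` and `reachable/3` and all data are hypothetical; this is one possible guard, not the rule set of any particular RDFS reasoner.

```prolog
:- use_module(library(lists)).

% Hypothetical explicit triples; person and agent form a cycle.
triple(painter, subClassOf, artist).
triple(artist,  subClassOf, person).
triple(person,  subClassOf, agent).
triple(agent,   subClassOf, person).
triple(matisse, type,       painter).

% rdfs(?S, ?P, ?O): explicit triples plus virtual type triples derived by
% walking subClassOf upwards from each explicit type.
rdfs(S, P, O) :-
    triple(S, P, O0),
    (   P == type
    ->  subclass_of(O0, O)
    ;   O = O0
    ).

% subclass_of(+Class, ?Super): reflexive, transitive closure.
subclass_of(Class, Super) :-
    reachable(Class, [Class], Super).

% The visited list stops the walk from re-entering a class on the current
% path, so the cycle person -> agent -> person cannot loop forever.
reachable(Class, _, Class).
reachable(Class, Visited, Super) :-
    triple(Class, subClassOf, Next),
    \+ memberchk(Next, Visited),
    reachable(Next, [Next|Visited], Super).
```

Without the `memberchk/2` guard, the query `?- rdfs(matisse, type, C).` would keep producing answers indefinitely under plain SLD resolution. Tabling is another way to obtain termination here.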

37 37 ICLP-09 Alternative entailment reasoners as Prolog modules: Core RDF-DB rdf/3; RDFS rdf/3; OWL-Horst rdf/3; ...

38 38 ICLP-09 Options for Reasoning (II) Based on Abstract Syntax. Dedicated high-level API. Forward reasoning: Transformation (Thea OWL(-2) library). Backward reasoning. Thea: http://www.semanticweb.gr/TheaOWLLib/ By Vangelis Vassiliadis and Chris Mungall

39 39 ICLP-09 Reasoning with Abstract Syntax API. Diagram labels: Core RDF-DB rdf/3 ...; Thea (OWL-2) subClassOf/2 (Forward: Transformation); RDFS rdfs_individual_of/2, rdfs_subclass_of/2, ... (Backward: Prolog rules)

40 40 ICLP-09 Options for reasoning (summary) Entailment-based   Uniform query API → app can switch entailment   Query API is low-level   (Using forward reasoning) entailed graph is added to database   → Difficult to deal with multiple languages Abstract-syntax based   Each language has its own query API   Query API is high-level   Easy to deal with multiple languages

41 41 ICLP-09 A closer look at the RDF store: requirements Efficient in any instantiation-pattern (full indexing) Deal with property-hierarchy Deal with owl:sameAs Literal indexing (prefix, full-text,...) Scalable to 10-100 M-triples

42 42 ICLP-09 Options for rdf/3 (I: Using Prolog) Prolog dynamic database   We need multiple indexes (e.g., YAP)   Cannot exploit domain-specific aspects: Property-hierarchy matching Facts are ground, unordered and support limited types   Hard to provide statistics for the optimizer because they are also domain-specific

43 43 ICLP-09 Options for rdf/3 (II: Using an external store). External store: slow connection (need to intern/extern URI-as-atom); we do not want (most of) the reasoning

44 44 ICLP-09 Options for rdf/3 (III: Dedicated C) Using dedicated C-library   Can optimize for space based on limited datatypes   Use atom-handles in the database (no intern/extern)   Sort literals in an AVL-tree (prefix search)   Keep counts (for query optimizations)   Fast binary load/save format

45 45 ICLP-09 RDF Processing (summary) Expressing graph-patterns mixed with auxiliary Prolog is easy   This is enough for a large part of RDF processing in semantic web applications Reasoning   Forward closure (easy, big, no changes)   Backward: termination issues (tabling can help)   Extending rdf/3 ↔ Using abstract language

46 46 ICLP-09 Part III Web-Applications

47 47 ICLP-09 Web-Application Reference Architecture (Three Tier Model). Diagram labels: Database; Application Logic; Presentation generation; Web Browser; Web 2.0: JavaScript; Web 3.0 (Semantic Web): RDF, Linked Data

48 48 ICLP-09 Protocols and Standards. Diagram labels: RDF Database; Application Logic; HTTP; SPARQL; Prolog; HTTP; ?

49 49 ICLP-09 Prolog-to-HTTP. Diagram: Web-Server (Tomcat, .NET, ...); Interface (JPL, InterProlog, PrologBeans, ...); Application (Prolog); HTTP. Need to program in Tomcat/.NET/... & Prolog. Difficult deployment. JPL: one process (JNI/C interface); fast, but hard to debug. InterProlog/PrologBeans/...: (proprietary network)

50 50 ICLP-09 Prolog-to-HTTP. Diagram: Web-Server and Interface (Prolog library HTTP); Application (Prolog). Easy debugging. Easily extend the HTTP interface. Not `industry standard'. But … many languages provide an HTTP server library

51 51 ICLP-09 Apache Deployment: using Apache reverse-proxy and load-balancer
ServerName www.swi-prolog.org
ProxyPass / http://localhost:3040/
(Diagram labels: Prolog, VNC, Port 80, Port 3040)

52 52 ICLP-09 VNC server console

53 53 ICLP-09 /api/search?q=picasso&count=100
:- use_module(library(http/http_dispatch)).
:- use_module(library(http/http_parameters)).
:- use_module(library(http/http_json)).

:- http_handler('/api/search', search, []).

% Handle /api/search: read the parameters, run the search, reply as JSON.
search(Request) :-
    http_parameters(Request,
                    [ q(Q, []),
                      start(S, [default(0)]),
                      count(C, [default(25)])
                    ]),
    search(Q, S, C, Results),
    reply_json(Results).
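The handler calls search/4 without showing it. The sketch below is one hypothetical way to fill that gap for experimentation: it matches the query against a few made-up labels and returns the requested page. `indexed_label/1` and `slice/4` are invented names, and the real ClioPatria search is graph-based, not a substring scan.

```prolog
:- use_module(library(lists)).

% indexed_label(Label): hypothetical searchable labels.
indexed_label('Matisse, Henri').
indexed_label('Picasso, Pablo').
indexed_label('Picasso marble').
indexed_label('Musee Matisse').

% search(+Query, +Start, +Count, -Results)
% Results holds at most Count labels containing Query, skipping the first
% Start matches. A lowercase Query matches case-insensitively.
search(Query, Start, Count, Results) :-
    findall(Label,
            ( indexed_label(Label),
              sub_atom_icasechk(Label, _, Query)
            ),
            Matches),
    slice(Matches, Start, Count, Results).

% slice(+List, +Start, +Count, -Slice): a page of List, clipped to its end.
slice(List, Start, Count, Slice) :-
    length(List, Len),
    Skip is min(Start, Len),
    length(Prefix, Skip),
    append(Prefix, Rest, List),
    length(Rest, RestLen),
    Take is min(Count, RestLen),
    length(Slice, Take),
    append(Slice, _, Rest).
```

Clipping `Start` and `Count` to the list length means an out-of-range page yields an empty list instead of failing, which keeps the JSON reply well-formed for any parameter values.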

54 54 ICLP-09 Summary HTTP support. Writing the HTTP-server in Prolog gives us: a good single-language development environment; incremental compilation: live-updating the server; deployment can be direct or through a proxy. Not so big: 12,000 lines for the core HTTP client and server; HTML and JSON read/write; parameters, sessions, authorization, logging

55 55 ICLP-09 Part IV Creating Interactive Web Applications using Prolog

56 56 ICLP-09 Web of Documents (Original drawing by Tim Berners-Lee)

57 57 ICLP-09 Interactive Web-Applications Server needs to keep track of client (sessions) Client needs light-weight updates of the interface … but HTTP is state-less …

58 58 ICLP-09 Introducing State Negotiate a session-key between client and server   Server associates state with this key   Client modifies the interface using JavaScript → AJAX

59 59 ICLP-09 What is AJAX not?

60 60 ICLP-09 Case Create a web-interface for the N-queens problem   Interaction Select size of board Select implementation (Prolog ↔ clp(FD)) Get first solution Get next solution or stop   State in backtrackable Prolog program By   Torbjörn Lager, Markus Triska, Jan Wielemaker
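For reference, a plain-Prolog solver with the queens(N, L) shape that the client later sends to the server could look like the sketch below. It is a generate-and-test illustration, not the code used in the demo; `safe/1` and `no_attack/3` are hypothetical helper names.

```prolog
:- use_module(library(lists)).

% queens(+N, -Qs): Qs is a list of N column numbers, one per row, such
% that no two queens attack each other. Using a permutation rules out
% shared rows and columns, so only diagonals need checking.
queens(N, Qs) :-
    numlist(1, N, Columns),
    permutation(Columns, Qs),
    safe(Qs).

% safe(+Qs): no queen shares a diagonal with a queen in a later row.
safe([]).
safe([Q|Qs]) :-
    no_attack(Q, Qs, 1),
    safe(Qs).

% no_attack(+Q, +Others, +RowDistance)
no_attack(_, [], _).
no_attack(Q, [Q1|Qs], D) :-
    Q =\= Q1 + D,
    Q =\= Q1 - D,
    D1 is D + 1,
    no_attack(Q, Qs, D1).
```

Each solution is found on backtracking, which is exactly the state the web interface has to keep alive between "first" and "next" requests. Generate-and-test becomes slow as the board grows, which is one reason to offer a clp(FD) variant alongside it.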

61 61 ICLP-09 Step I: create initial page. Diagram labels: Browser (DOM, JavaScript); WEB Application Server (HTTP); Initial HTML Page; Builds initial DOM; Initial HTML +JS

62 62 ICLP-09

63 63 ICLP-09 Step II: Add local interaction. Diagram labels: Browser (DOM, JavaScript); WEB Application Server (HTTP); Initial HTML Page; Builds initial DOM; Initial HTML +JS; Local Interaction

64 64 ICLP-09

65 65 ICLP-09 Options...
<input type="button" id='opts' name="options" value="Options …" onClick="showOptions(true)">

function showOptions(show) {
  document.getElementById("options").style.display = show ? "block" : "none";
}

66 66 ICLP-09 OK: applyOptions()
function applyOptions() {
  var size = parseInt(document.getElementById("size").value);

  // Set client state in global variables
  if ( document.getElementById("queens").checked == true ) {
    algorithm = "queens";
  } else {
    algorithm = "clpfd_queens";
  }
  if ( size < 2 || size > 40 ) {
    alert("Size must be in the range 2..40");
  } else {
    boardsize = size;
    showOptions(false);
    // Update the interface by changing the DOM
    document.getElementById("N").innerHTML = size;
    document.getElementById("who").innerHTML =
      (algorithm == "queens" ? "Prolog" : "clp(FD)");
    document.getElementById("board").innerHTML = board(boardsize, boardwidth);
  }
}
→ NO server interaction

67 67 ICLP-09

68 68 ICLP-09 Step-III: Add server interaction. Diagram labels: Browser (DOM, JavaScript); WEB Application Server (HTTP); Initial HTML Page; Builds initial DOM; Initial HTML +JS; Local Interaction; Server Interaction

69 69 ICLP-09 First...
<input type="button" id='first' name="first" value="First" onClick="first()">

function first() {
  working();
  // Server request
  YAHOO.util.Connect.asyncRequest(
    'GET',
    "/prolog/first?goal="+algorithm+"("+boardsize+",L)",
    { success: update });   // what to do when the server responds
}

70 70 ICLP-09 Client code-fragment: handle response
function update(o) {
  // Process as JSON
  var solution = YAHOO.lang.JSON.parse(o.responseText);

  // Update DOM based on JSON reply
  if ( solution.solution ) {
    if ( solution.next == true ) {
      setButtons(true);
    } else {
      setButtons(false);
    }
    clearBoard();
    setQueens(solution.solution.args[1].value);
    document.getElementById("msg").innerHTML =
      "CPU: " + solution.time.toPrecision(2) + " sec.";
  } else if ( solution.error ) {
    setButtons(false);
    document.getElementById("msg").innerHTML = " "+solution.error+" ";
  } else {
    setButtons(false);
    document.getElementById("msg").innerHTML = "There are no more solutions.";
  }
}

71 71 ICLP-09 setQueens(): Replace DOM fragment
function setQueens(squareList) {
  // Square ids have the form "row-column"
  for (var i = 1; i <= boardsize; i++) {
    var id = i + "-" + (squareList[i-1].value);
    document.getElementById(id).innerHTML = " ";
  }
}

72 72 ICLP-09

73 73 ICLP-09 Backtracking state in the server. Diagram labels: HTTP Worker thread; Thread session-1 … Thread session-N (State, Backtrack); GET /prolog/next session-id=1; Prolog-term; JSON Document

74 74 ICLP-09 Backtracking state
solve(Goal, Bindings, ThreadID) :-
    thread_self(Me),
    thread_statistics(Me, cputime, T0a),
    State = client(ThreadID, T0a),
    solve_2(Goal, Bindings, Solution),          % (Guarded) actual goal
    State = client(Client, T0),
    thread_statistics(Me, cputime, T1),
    Time is T1 - T0,
    solution_time(Solution, Time),
    nb_setarg(2, State, T1),
    debug(prolog_server, 'Sending: ~q', [Solution]),
    thread_send_message(Client, Solution),      % Send reply
    solution_type(Solution, Type),
    (   Type == last
    ->  true
    ;   Type == true
    ->  catch(thread_get_message(command(From, Command)),   % Wait for user
              _, Command = stop),
        debug(prolog_server, 'Command: ~q', [Command]),
        nb_setarg(1, State, From),
        Command == stop
    ;   true
    ).
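The essence of the pattern, stripped of timing and error handling, is a thread that sends one solution, blocks until the client asks for more, and backtracks into the goal on anything other than `stop`. The sketch below illustrates that with SWI-Prolog threads and message queues. The predicates `start_solver/3`, `solve_loop/3`, `first_solution/2`, `next_solution/2` and `stop_solver/1` and the message terms are hypothetical, not the demo server's API.

```prolog
:- use_module(library(lists)).

% start_solver(?Template, +Goal, -Solver)
% Run Goal in its own thread. Answers arrive on a private reply queue.
start_solver(Template, Goal, solver(Thread, Replies)) :-
    message_queue_create(Replies),
    thread_create(solve_loop(Template, Goal, Replies), Thread, []).

% The solver thread: send a solution, then wait for a command. Any
% command other than `stop` fails the test and so backtracks into Goal.
solve_loop(Template, Goal, Replies) :-
    (   call(Goal),
        thread_send_message(Replies, solution(Template)),
        thread_get_message(Command),
        Command == stop
    ->  true
    ;   thread_send_message(Replies, no_more)
    ).

% first_solution(+Solver, -Answer): solution(Term) or no_more.
first_solution(solver(_, Replies), Answer) :-
    thread_get_message(Replies, Answer).

% next_solution(+Solver, -Answer): ask the thread to backtrack.
next_solution(solver(Thread, Replies), Answer) :-
    thread_send_message(Thread, next),
    thread_get_message(Replies, Answer).

% stop_solver(+Solver): end the thread if it is still waiting, clean up.
stop_solver(solver(Thread, Replies)) :-
    catch(thread_send_message(Thread, stop), _, true),
    thread_join(Thread, _),
    message_queue_destroy(Replies).
```

A session would then map its key to a `solver(Thread, Replies)` term, so that each HTTP request for the next solution only has to exchange one pair of messages with the waiting thread.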

75 75 ICLP-09 AJAX has many architectures From http://www.openajax.org/member/wiki/Whitepaper_20060730

76 76 ICLP-09 Where does the JavaScript come from? Widget Library: AjaxAnywhere, MochiKit, YUI, ... User Code: Instantiation; Set attributes; Refine methods

77 77 ICLP-09 Options for generating application JavaScript Write a JavaScript file and link it from the HTML page   Code is in two places → Good split if API is stable Poor for prototyping and often changing APIs Write JavaScript in Prolog strings and include in page   Messy syntax (Python """long string""")

78 78 ICLP-09 Generate from Prolog terms? Works well for HTML (e.g., html_write, PiLLoW). But JavaScript customization often places code-fragments in object-properties: no simple interface such as, e.g., XPCE: Create/Set property/Call method. A full mapping of JS code to Prolog syntax is probably not transparent enough for users

79 79 ICLP-09 Wrap-Up The “Web of data” is out there   Prolog is an excellent tool for processing RDF The interactive “Web 2.0” is out there   Web 2.0 is (relatively) language independent   Prolog is a suitable server component for Web 2.0

80 80 ICLP-09 Future Directions Enhance RDF support:   Improve scalability   Higher level reasoning Provide tabling Generalise optimizers Enhance web-programming support   Explore cleaner integration with AJAX Merge into Prolog-Commons Initiative

81 81 ICLP-09 Links http://www.swi-prolog.org http://e-culture.multimedian.nl/software/ClioPatria.shtml http://www.swi-prolog.org/Publications.html

82 82 Part of the Dutch knowledge-economy project MultimediaN. Partners: VU, CWI, UvA, DEN, ICN. People: Alia Amin, Lora Aroyo, Mark van Assem, Victor de Boer, Lynda Hardman, Michiel Hildebrand, Laura Hollink, Marco de Niet, Borys Omelayenko, Marie-France van Orsouw, Jacco van Ossenbruggen, Guus Schreiber, Jos Taekema, Annemiek Teesing, Anna Tordai, Jan Wielemaker, Bob Wielinga. Artchive.com, RKD, Rijksmuseum Amsterdam, Dutch ethnology musea (Amsterdam, Leiden), National Library (Bibliopolis). http://e-culture.multimedian.nl

