@ eBiquity Lab, CSEE, UMBC Swoogle Tutorial (Part I: Swoogle R & D) A brief introduction to Swoogle An overview of Swoogle research A summary of Swoogle.


1 Swoogle Tutorial (Part I: Swoogle R & D)
  - A brief introduction to Swoogle
  - An overview of Swoogle research
  - A summary of Swoogle development
  Presented by eBiquity Lab, CSEE, UMBC

2 1. Introduction
  - Motivation
  - Swoogle in the Semantic Web
  - Glossary
  - Swoogle Architecture

3 Motivation
  - (Google + Web) has made us all smarter.
  - Something similar is needed by people and software agents for information on the Semantic Web.

4 The Role of Swoogle in the Semantic Web
  [Diagram. Labels: Semantic Web services; Semantic Web data (SW data service, database, (Web) document, RDF document); software agents and applications; Swoogle as directory/digest service, service finder, and data finder. Edge labels: uses, digests, searches.]

5 Concepts Explained
  [Diagram, rendered here as triples]
  SWO (ontology document) http://xmlns.com/foaf/1.0/
    foaf:Person  rdf:type         rdfs:Class       (Class)
    foaf:Person  rdfs:subClassOf  wordNet:Agent
    foaf:mbox    rdf:type         rdf:Property     (Property)
    foaf:mbox    rdfs:domain      foaf:Person
    Classes and properties are Terms.
  SWI (instance document) http://foo.com/foaf.rdf
    http://foo.com/foaf.rdf#finin  rdf:type   foaf:Person     (Individual)
    http://foo.com/foaf.rdf#finin  foaf:mbox  finin@umbc.edu
  Both kinds of document are SWDs.
  NOTE: Qualified Names (QNames) are used to shorten well-known namespaces as follows:
    rdf:     => http://www.w3.org/1999/02/22-rdf-syntax-ns#
    rdfs:    => http://www.w3.org/2000/01/rdf-schema#
    foaf:    => http://xmlns.com/foaf/1.0/
    wordNet: => http://xmlns.com/wordnet/1.6/
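The QName note can be illustrated with a small sketch. The prefix table repeats the slide's mapping as printed, except that the rdfs namespace is written with its trailing "#" as in the W3C namespace URI; the helper name expand_qname is hypothetical and not part of Swoogle.

```python
# Prefix table taken from the slide's note. The trailing "#" on the rdfs
# namespace follows the W3C namespace URI (an editorial assumption here).
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/1.0/",
    "wordNet": "http://xmlns.com/wordnet/1.6/",
}


def expand_qname(qname: str, prefixes: dict = PREFIXES) -> str:
    """Expand 'prefix:local' into a full URI reference (hypothetical helper)."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {qname!r}")
    # A QName is just the namespace URI concatenated with the local name.
    return prefixes[prefix] + local


print(expand_qname("foaf:Person"))
print(expand_qname("rdf:type"))
```

Because the expansion is plain string concatenation, two QNames denote the same term only if both the namespace and the local name match exactly.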

6 Glossary
  Document
    - A Semantic Web Document (SWD) is an online document written in Semantic Web languages (i.e. RDF and OWL).
    - An ontology document (SWO) is a SWD that contains mostly term definitions (i.e. classes and properties). It corresponds to the T-Box in Description Logic.
    - An instance document (SWI or SWDB) is a SWD that contains mostly class individuals. It corresponds to the A-Box in Description Logic.
  Term
    - A term is a non-anonymous RDF resource which is the URI reference of either a class or a property.
      Example: foaf:Person rdf:type rdfs:Class
  Individual
    - An individual is a non-anonymous RDF resource which is the URI reference of a class member.
      Example: http://.../foaf.rdf#finin rdf:type foaf:Person
  In Swoogle, a document D is a valid SWD if and only if JENA* correctly parses D and produces at least one triple.
  *JENA is a Java framework for writing Semantic Web applications. http://www.hpl.hp.com/semweb/jena2.htm
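The validity rule (the parser succeeds and yields at least one triple) can be sketched as follows. Swoogle uses Jena for parsing; the toy_parse function below is only a hypothetical stand-in that accepts one whitespace-separated "subject predicate object ." statement per line, so that the example runs without any RDF library.

```python
from typing import Callable, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def is_swd(text: str, parse: Callable[[str], Iterable[Triple]]) -> bool:
    """Return True iff `parse` succeeds on `text` and yields at least one triple."""
    try:
        triples = list(parse(text))
    except Exception:
        # Any parse failure means the document is not a valid SWD.
        return False
    return len(triples) >= 1


def toy_parse(text: str) -> List[Triple]:
    """Hypothetical stand-in for a real RDF parser such as Jena."""
    triples: List[Triple] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.endswith("."):
            raise ValueError(f"malformed statement: {line!r}")
        parts = line[:-1].split()
        if len(parts) != 3:
            raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
        triples.append((parts[0], parts[1], parts[2]))
    return triples


print(is_swd("foaf:Person rdf:type rdfs:Class .", toy_parse))  # parses, 1 triple
print(is_swd("", toy_parse))                                   # parses, 0 triples
print(is_swd("<html>hello</html>", toy_parse))                 # parse error
```

Passing the parser in as an argument keeps the rule itself separate from the choice of RDF library.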

7 Swoogle Architecture
  [Diagram. Stages: SWD discovery, metadata creation, data analysis, interface. Components: The Web, Candidate URLs, Web Crawler, SWD Reader, SWD Cache, SWD Metadata, IR analyzer, SWD analyzer, Web Server, Web Service, Agent Service.]

8 2. Swoogle Research
  - Discovery
  - Digest
  - Search & Navigation
  - Rank
  - Statistics

9 Discovery -- research
  Discovering URLs of possible SWDs automatically
    - Google crawler
    - Focused crawler
    - Semantic Web crawler, e.g. scutter
  Revisiting URLs

10 Discovery -- results
  Crawler performance
    - The Google crawler is the best
    - The focused crawler needs to be improved
  Verified pure SWDs are only about 1/3 of discovered URLs.
  Some NSWDs contain an embedded RDF graph.

                     SWD             NSWD            Undecided      TOTAL
  Focused crawler      1,465   7%     10,580  52%       8,292       20,337
  Google crawler     273,023  36%    369,371  49%     110,794      753,188
  swd_crawler         61,870  15%    285,506  70%      57,709      405,085
  TOTAL              336,358         665,457          176,795    1,178,610

  Source: Swoogle (2005-Jan-05)
  SELECT `discovered_by`, sum(isRDF), sum(1-isRDF), count(*) FROM `digest_url` WHERE 1 group by discovered_by
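A minimal sketch of how the per-crawler query on this slide aggregates rows, run against an in-memory SQLite table. The column names discovered_by and isRDF come from the slide's query; the url column, the rest of the schema, and the five sample rows are assumptions made purely for illustration, and the real Swoogle database is not SQLite as far as the slides say.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: only `discovered_by` and `isRDF` appear in the slide's query.
conn.execute(
    "CREATE TABLE digest_url (url TEXT, discovered_by TEXT, isRDF INTEGER)"
)
sample_rows = [  # made-up rows, not Swoogle data
    ("http://example.org/a.rdf", "google crawler", 1),
    ("http://example.org/b.html", "google crawler", 0),
    ("http://example.org/c.owl", "swd_crawler", 1),
    ("http://example.org/d.html", "swd_crawler", 0),
    ("http://example.org/e.html", "swd_crawler", 0),
]
conn.executemany("INSERT INTO digest_url VALUES (?, ?, ?)", sample_rows)

# Same aggregation as on the slide: isRDF is 1 for a SWD and 0 otherwise,
# so SUM(isRDF) counts SWDs and SUM(1 - isRDF) counts the remaining URLs.
query = """
    SELECT discovered_by, SUM(isRDF), SUM(1 - isRDF), COUNT(*)
    FROM digest_url
    GROUP BY discovered_by
    ORDER BY discovered_by
"""
summary = conn.execute(query).fetchall()
for crawler, swd, other, total in summary:
    print(f"{crawler}: {swd} SWD ({swd / total:.0%}), {other} other, {total} total")
```

The percentages in the slide's table are each count divided by that crawler's row total, for example 1,465 / 20,337 for the focused crawler.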

11 Digest -- research
  Document metadata
    - Annotative: general metadata, SWD metadata, ontology metadata
    - Inter-document relations
    - Document-term relations
  Term metadata
    - Term definition
    - Inter-term relations
        Class-property bond (C-P bond): rdfs:domain
        Property-class bond (P-C bond): rdfs:range

12 Document Metadata
  Web document metadata
    - When/how discovered/fetched
    - Suffix of URL
    - Last-modified time
    - Document size
  SWD metadata
    - Language features: OWL species, RDF encoding
    - Statistical features: # of defined/used terms, # of declared/used namespaces, ontology ratio
    - Ontology rank
  Ontology annotation
    - Label
    - Version
    - Comment
  Relations
    - Links to other SWDs: imported SWDs, referenced SWDs, extended SWDs, prior version
    - Links to terms: classes/properties defined, classes/properties used

13 Digest “Time” Ontology (document view) Demo 2(a)

14 Document-Term Relation
  [Diagram; the relations recoverable from its labels are listed below]
  http://xmlns.com/foaf/1.0/
    foaf:Person  rdf:type         rdfs:Class      => defined Class
    foaf:Person  rdfs:subClassOf  wordNet:Agent
    foaf:mbox    rdf:type         rdf:Property    => defined Property
    foaf:mbox    rdfs:domain      foaf:Person
  http://www.cs.umbc.edu/~finin/foaf.rdf
    rdf:type   foaf:Person       => populated Class
    foaf:mbox  finin@umbc.edu    => populated Property
  http://foo.com/foaf.rdf
    http://foo.com/foaf.rdf#finin  rdf:type   foaf:Person      => defined Individual
    http://foo.com/foaf.rdf#finin  foaf:mbox  finin@umbc.edu
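The four document-term relations on this slide can be sketched over a list of triples. This is a simplified reading, not Swoogle's actual digest code: it treats a subject typed as rdfs:Class, owl:Class, or rdf:Property as defined, every rdf:type object as a populated class, and every predicate as a populated property. Under that assumption a schema document also "populates" rdfs:Class; the slides do not say whether Swoogle counts it that way.

```python
RDF_TYPE = "rdf:type"
CLASS_TYPES = {"rdfs:Class", "owl:Class"}
PROPERTY_TYPES = {"rdf:Property"}


def document_term_relations(triples):
    """Classify the terms one document defines and populates (simplified sketch)."""
    rel = {
        "defines_class": set(),
        "defines_property": set(),
        "populates_class": set(),
        "populates_property": set(),
    }
    for s, p, o in triples:
        # Every predicate that appears is used, i.e. populated, by this document.
        rel["populates_property"].add(p)
        if p == RDF_TYPE:
            # The object of rdf:type is a class that gains a member here.
            rel["populates_class"].add(o)
            if o in CLASS_TYPES:
                rel["defines_class"].add(s)
            elif o in PROPERTY_TYPES:
                rel["defines_property"].add(s)
    return rel


# Triples taken from the slide's example documents.
swo = [
    ("foaf:Person", "rdf:type", "rdfs:Class"),
    ("foaf:Person", "rdfs:subClassOf", "wordNet:Agent"),
    ("foaf:mbox", "rdf:type", "rdf:Property"),
    ("foaf:mbox", "rdfs:domain", "foaf:Person"),
]
swi = [
    ("http://foo.com/foaf.rdf#finin", "rdf:type", "foaf:Person"),
    ("http://foo.com/foaf.rdf#finin", "foaf:mbox", "finin@umbc.edu"),
]

swo_rel = document_term_relations(swo)
swi_rel = document_term_relations(swi)
print(sorted(swo_rel["defines_class"]), sorted(swo_rel["defines_property"]))
print(sorted(swi_rel["populates_class"]), sorted(swi_rel["populates_property"]))
```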

15 Digest “Time” Ontology (term view) Demo 2(b)

16 Term Metadata
  Example term: foaf:Person
  Term definition
    - rdfs:subClassOf -- foaf:Agent
    - rdfs:label -- “Person”
  C-P bond (from SWI): foaf:name, dc:title
  C-P bond (from SWO): foaf:mbox, foaf:name
  [Diagram with sources Onto 1, Onto 2, and SWD3. Statements shown about foaf:Person: foaf:mbox rdfs:domain; rdf:type owl:Class; rdfs:label “Person”; rdfs:subClassOf foaf:Agent; an instance with rdf:type foaf:Person and foaf:name “Tim Finin”.]
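One way to read the two kinds of C-P bond is sketched below, under stated assumptions: a bond "from SWO" is taken from an explicit rdfs:domain declaration, and a bond "from SWI" is inferred whenever a property is used on a resource typed with the class. The function names and the sample triples (including the blank node _:b1 and the dc:title value) are hypothetical.

```python
from collections import defaultdict


def cp_bonds_from_swo(triples):
    """Class -> properties declared with that class as rdfs:domain."""
    bonds = defaultdict(set)
    for s, p, o in triples:
        if p == "rdfs:domain":
            bonds[o].add(s)
    return dict(bonds)


def cp_bonds_from_swi(triples):
    """Class -> properties actually used on instances of that class."""
    types = defaultdict(set)
    for s, p, o in triples:
        if p == "rdf:type":
            types[s].add(o)
    bonds = defaultdict(set)
    for s, p, o in triples:
        if p != "rdf:type":
            # Bond the property to every class the subject is typed with.
            for cls in types.get(s, ()):
                bonds[cls].add(p)
    return dict(bonds)


ontology = [  # illustrative ontology triples
    ("foaf:mbox", "rdfs:domain", "foaf:Person"),
    ("foaf:name", "rdfs:domain", "foaf:Person"),
]
instances = [  # illustrative instance triples
    ("_:b1", "rdf:type", "foaf:Person"),
    ("_:b1", "foaf:name", "Tim Finin"),
    ("_:b1", "dc:title", "Dr."),
]

swo_bonds = cp_bonds_from_swo(ontology)
swi_bonds = cp_bonds_from_swi(instances)
print(sorted(swo_bonds["foaf:Person"]))
print(sorted(swi_bonds["foaf:Person"]))
```

With these inputs the two bond sets match the slide's lists: foaf:mbox and foaf:name from the ontology side, foaf:name and dc:title from the instance side.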

17 Digest Term “Person” Demo 4

18 Term Distribution (grouped by local name)
  case-insensitive       case-sensitive
  Name          656       1  name          560      11  source     129
  Person        399       2  Person        357      12  email      125
  Title         349       3  title         292      13  Book       124
  Location      334       4  description   242      14  address    121
  Description   288       5  location      213      15  Event      117
  Date          257       6  type          196      16  Location   114
  Type          242       7  date          173      17  author     111
  country       236       8  value         154      18  Animal     111
  Address       212       9  Organization  134      19  Country    104
  organization  186      10  country       130      20  language   103
  total       72502          total       76827
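Grouping terms by local name, with and without case folding, can be sketched as below. Taking the local name as the part after the last "#" or "/" is a common heuristic assumed here, not something the slides specify, and the sample URIs are made up. The case-insensitive count lowercases every name, whereas the slide shows one original spelling per group.

```python
from collections import Counter


def local_name(uri: str) -> str:
    """Part of the URI after the last '#' or '/' (assumed heuristic)."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[cut + 1:]


def count_local_names(term_uris, case_sensitive=True):
    """Count how many distinct term URIs share each local name."""
    names = (local_name(u) for u in set(term_uris))
    if not case_sensitive:
        names = (n.lower() for n in names)
    return Counter(names)


terms = [  # illustrative term URIs
    "http://xmlns.com/foaf/1.0/name",
    "http://example.org/a#Name",
    "http://example.org/b#name",
    "http://example.org/b#Person",
]

sensitive = count_local_names(terms, case_sensitive=True)
insensitive = count_local_names(terms, case_sensitive=False)
print(sensitive.most_common())
print(insensitive.most_common())
```

Case folding merges groups such as "Name" and "name", which is why the case-insensitive total on the slide (72502) is smaller than the case-sensitive one (76827).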

19 Digest -- result
  Ontological Term Distribution (populated, defined)

  type      Pop.  Def.   # terms   %    Total terms   # populated   %    Total populated
  class     0     1      83,602    88%                          0   0%
            1     0       3,954     4%                  1,002,961   13%
            1     1       7,065     7%   94,621         6,483,485   87%   7,486,446
  property  0     1      42,853    73%                          0   0%
            1     0       8,312    14%                  2,438,455   6%
            1     1       7,836    13%   59,001        36,899,842   94%  39,338,297

  Source: Swoogle (2005-Jan-05)
  SELECT res_type, sign(cnt_instance_populate>0), sign(cnt_swd_def>0), count(*), sum(cnt_instance_populate) FROM `digest_term` WHERE 1 group by res_type, sign(cnt_instance_populate>0), sign(cnt_swd_def>0)

20 Search & Navigation -- research
  The Semantic Web is not the Web
  Search service
    - Document search: an RDF document is not free text
    - Term search: URIref and compound local name
  Navigation service
    - The RDF graph: typed links
    - The web of RDF documents: few hyperlinks
    - The social network of agents: trust & provenance

21 Find “Time” Ontology (Demo 1)
  We can search for an ontology with a set of keywords. For example, “time”, “before”, and “after” are basic concepts of a “Time” ontology.

22 Find Term “Person” (Demo 3)
  Callouts: Not capitalized! URIrefs are case-sensitive!

23 Current Swoogle Navigation Model
  A URIref refers to
    - a term, i.e. an instance of an RDFS class/property
    - an individual, i.e. a resource that populates terms
  A SWD could be
    - a SWO: term definitions
    - a SWI: individuals
  Observations
    - RDF resources are semantically linked in the RDF graph
    - SWDs are poorly linked because there is no explicit hyperlink concept
    - Ontologies are more interesting
  Approach
    - Build inter-document relations
    - Rational surfing model
  [Diagram labels: SWOs, SWIs, HTML documents, images, audio files, video files]

24 Semantic Web Navigation Model (new!)
  [Diagram. Labels, in slide order:]
    - Nodes: URL, URIref, Resource, RDF Document, Ontology, Namespace
    - Document-term links: populatesClass, populatesProperty, refersClass, refersProperty, definesClass, definesProperty
    - Ontology kinds: rdfsOntology, owldlOntology
    - Inter-document links: owl:imports, owl:priorVersion, owl:backwardCompatibleWith, owl:incompatibleWith, rdfs:seeAlso, rdfs:isDefinedBy
    - Namespace links: isDefinedBy, isUsedBy, usesNamespace
    - Term links: rdfs:subClassOf, sameNamespace, sameLocalname, …
    - Entry points: RDF Graph Navigation, Term Search, Document Search

25 Ranking -- research
  Surfing models
  Ranking method
    - PageRank variation

  Model                    What to rank     Scope          Idea
  Rational surfing model   SWD              Semantic Web   Summarize inter-document relations as EX, TM, IM, PV
  Plain graph model        Resource         RDF graph      RDF graph is browsed as a weighted directed graph
  RDFS-based model         Resource         RDF graph      RDF graph is browsed only with RDFS semantics
  SW navigation model      Resource & SWD   Semantic Web   Assume Swoogle is used in navigation
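A PageRank variation over typed inter-document links can be sketched as follows. This is a generic weighted PageRank, not Swoogle's published algorithm: the per-relation weights, the damping factor, the iteration count, and the short document names are all hypothetical, and only the relation codes EX, TM, IM, PV come from the slide.

```python
# Hypothetical weights per relation code; the slides do not give values.
RELATION_WEIGHTS = {"IM": 4.0, "EX": 3.0, "TM": 2.0, "PV": 1.0}


def weighted_pagerank(links, weights, damping=0.85, iterations=50):
    """links: iterable of (source_doc, relation_code, target_doc)."""
    docs = sorted({d for s, _, t in links for d in (s, t)})
    n = len(docs)
    # Accumulated link weight from each source to each target.
    out = {d: {} for d in docs}
    for s, rel, t in links:
        out[s][t] = out[s].get(t, 0.0) + weights[rel]

    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        new = {d: (1.0 - damping) / n for d in docs}
        for s in docs:
            total = sum(out[s].values())
            if total == 0.0:
                # Dangling document: spread its rank evenly over all documents.
                for d in docs:
                    new[d] += damping * rank[s] / n
            else:
                # Follow each link with probability proportional to its weight.
                for t, w in out[s].items():
                    new[t] += damping * rank[s] * w / total
        rank = new
    return rank


links = [  # illustrative link graph
    ("finin_foaf", "TM", "foaf"),
    ("foaf", "EX", "wordnet"),
    ("foaf", "TM", "rdfs"),
    ("wordnet", "TM", "rdfs"),
]
ranks = weighted_pagerank(links, RELATION_WEIGHTS)
for doc, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{doc}: {score:.3f}")
```

In this toy graph the schema document that everything else points to ends up with the highest rank, which matches the tutorial's remark that well-known ontologies are highly ranked.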

26 Ranking with Rational Surfing Model: An Example
  [Diagram. Documents: http://www.cs.umbc.edu/~finin/foaf.rdf, http://xmlns.com/foaf/1.0/, http://www.w3.org/2000/01/rdf-schema, http://xmlns.com/wordnet/1.6/. Statements shown: rdf:type foaf:Person and foaf:mbox finin@umbc.edu; foaf:Person rdf:type rdfs:Class and rdfs:subClassOf wordNet:Person; rdfs:subClassOf rdf:type rdf:Property; rdfs:Class; wordNet:Person rdf:type and rdfs:subClassOf wordNet:Individual. Inter-document relations are labeled TM and EX.]

27 Demo 6 Swoogle's top 10
  This report is dynamically generated from the latest data and takes 5 to 10 seconds.
  Swoogle uses a PageRank-like algorithm to rank Semantic Web documents. Well-known ontologies are highly ranked.

28 Statistics -- research
  Summarize the dataset collected by Swoogle
    - Swoogle Watch: Swoogle Today, distribution of visited URLs, document discovery log, term discovery log
    - Semantic Web Watch: SWD distribution by last-modified month, by website, and by suffix
    - Ontology Watch: term (class/property) usage, namespace usage

29 Demo 5(a) Swoogle Today

30 Demo 5(b) Swoogle Statistics
  [Chart labels: FOAF, Trustix, W3C, Stanford]

31 Demo 5(c) Swoogle Statistics

32 Miscellaneous
  Submit URL for focused crawler
  Swoogle Web Service (delivered in Sept.) http://swoogle.umbc.edu/webservice/
    - Search document
    - Search term
    - Term digest

33 Submit URL for focused crawler (Demo 7)
  When you can't find your ontologies in Swoogle, they may not have been indexed by Swoogle yet. Please submit them to increase their visibility.
  Where to submit: from the site map, or when your query fails.

34 3. Summary
  - Summary
  - Current Status

35 Summary
  Versions (timeline 2004 to 2005): Swoogle (Mar, 2004), Swoogle2 (Sep, 2004), Swoogle3
  Features, in slide order:
    - Automated SWD discovery
    - SWD metadata creation and search
    - Ontology rank (rational surfer model)
    - Swoogle watch
    - Web interface
    - Ontology dictionary
    - Swoogle statistics
    - Web service interface (WSDL)
    - Bag of URIref IR search
    - Better discovery & revisit strategies
    - Better navigation models
    - Semantic Web dataset
    - Index instance data
    - More metadata (ontology mapping)
    - Better web service interfaces

36 Current Status
  Swoogle Watch reported (Jan 6, 2005)
    - 46.7M triples
    - 336K SWDs: 4K ontologies
    - 153K terms: 94K classes & 59K properties
  Ongoing work
    - Research: self-adaptive SWD discovery; efficient SWD digest and RDF graph abstract; Semantic Web navigation model
    - Engineering: enhancing the Web Service interface

