Finding knowledge, data and answers on the Semantic Web


1 Finding knowledge, data and answers on the Semantic Web
Tim Finin, University of Maryland, Baltimore County
Joint work with Li Ding, Anupam Joshi, Yun Peng, Cynthia Parr, Pranam Kolari, Pavan Reddivari, Sandor Dornbush, Rong Pan, Akshay Java, Joel Sachs, Scott Cost and Vishal Doshi
This work was partially supported by DARPA contract F , NSF grants CCR and IIS and grants from IBM, Fujitsu and HP.

2 This talk Motivation Swoogle Semantic Web search engine
Use cases and applications Conclusions

3 Google has made us smarter
Software agents will need something similar to maximize the use of information on the semantic web.

4 But what about our agents?
Software agents will need something similar to maximize the use of information on the semantic web. Agents still have a very minimal understanding of text and images.

5 But what about our agents?
Software agents will need something similar to maximize the use of information on the semantic web.
Swoogle: a Google for knowledge on the Semantic Web is needed by software agents and programs.

6 This talk Motivation Swoogle Semantic Web search engine
Use cases and applications Conclusions

7 Running since summer 2004 1.6M RDF docs, 300M triples, 10K ontologies, 15K namespaces, 1.3M classes, 175K properties, 43M instances, 420 registered users

8 Swoogle Architecture
[Figure: Swoogle architecture diagram. Stages: Discovery, Analysis, Index, Search Services. Components: Candidate URLs, Bounded Web Crawler, Google Crawler, SwoogleBot, SWD classifier, Ranking, IR Indexer, SWD Indexer, document cache, Semantic Web metadata, Web Server, Web Service. Legend: human (html) and machine (rdf/xml) access, information flow, Swoogle’s web interface. Source of documents: the Web.]

9 This talk Motivation Swoogle Semantic Web search engine
Use cases and applications Conclusions

10 Applications and use cases
Supporting Semantic Web developers (1)
  Ontology designers, vocabulary discovery, who’s using my ontologies or data?, use analysis, errors, statistics, etc.
Searching specialized collections
  Spire: aggregating observations and data from biologists (2)
  InferenceWeb: searching over and enhancing proofs
  SemNews: Text Meaning of news stories
Supporting SW tools
  Triple shop: finding data for SPARQL queries (3)

11 1

12 80 ontologies were found that had these three terms
By default, ontologies are ordered by their ‘popularity’, but they can also be ordered by recency or size. Let’s look at this one

13 Basic Metadata
hasDateDiscovered:
hasDatePing:
hasPingState: PingModified
type: SemanticWebDocument
isEmbedded: false
hasGrammar: RDFXML
hasParseState: ParseSuccess
hasDateLastmodified:
hasDateCache:
hasEncoding: ISO
hasLength: 18K
hasCntTriple: 311.00
hasOntoRatio: 0.98
hasCntSwt: 94.00
hasCntSwtDef: 72.00
hasCntInstance: 8.00

14

15 rdfs:range was used 41 times to assert a value.
owl:ObjectProperty was instantiated 28 times time:Cal… defined once and used 24 times (e.g., as range)
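
Counts like these can be derived from a document's triples. The following is a minimal sketch, not Swoogle's actual implementation: it assumes triples are plain (subject, predicate, object) tuples of prefixed names, and the `ex:` terms and the `DEFINING_TYPES` set are made up for illustration.

```python
from collections import Counter

RDF_TYPE = "rdf:type"
# Assumption: a term counts as "defined" when it is typed as one of these.
DEFINING_TYPES = {"rdfs:Class", "owl:Class", "rdf:Property",
                  "owl:ObjectProperty", "owl:DatatypeProperty"}

def term_usage(triples):
    """Return (times used as predicate, times instantiated, times defined) per term."""
    used_as_predicate = Counter()
    instantiated = Counter()
    defined = Counter()
    for s, p, o in triples:
        used_as_predicate[p] += 1          # e.g. rdfs:range asserting a value
        if p == RDF_TYPE:
            instantiated[o] += 1           # e.g. owl:ObjectProperty instantiated
            if o in DEFINING_TYPES:
                defined[s] += 1            # the subject is being defined here
    return used_as_predicate, instantiated, defined

# Hypothetical toy ontology.
triples = [
    ("ex:hasOwner", "rdf:type", "owl:ObjectProperty"),
    ("ex:hasOwner", "rdfs:range", "ex:Person"),
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:hasPet", "rdf:type", "owl:ObjectProperty"),
    ("ex:hasPet", "rdfs:range", "ex:Animal"),
]
pred, inst, defs = term_usage(triples)
print(pred["rdfs:range"], inst["owl:ObjectProperty"], defs["ex:Person"])
```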

16 These are the namespaces this ontology uses. Clicking on one shows all of the documents using the namespace. All of this is available in RDF form for the agents among us.

17 Here’s what the agent sees. Note the swoogle and wob (web of belief) ontologies.

18 We can also search for terms (classes, properties) like terms for “person”.

19 10K terms associated with “person”! Ordered by use.
Let’s look at foaf:Person’s metadata

20

21

22

23 87K documents used foaf:gender with a foaf:Person instance as the subject

24 3K documents used dc:creator with a foaf:Person instance as the object
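
One way to picture how such per-document usage statistics could be gathered is sketched below. It is an illustration under stated assumptions, not Swoogle's code: `docs` is a hypothetical mapping from document URL to a list of (subject, predicate, object) tuples, and the example URLs and blank-node labels are invented.

```python
def docs_using(docs, prop, cls, position="subject"):
    """URLs of documents that use `prop` with an instance of `cls`
    in the given position ("subject" or "object")."""
    hits = set()
    for url, triples in docs.items():
        # Instances of cls declared within this document.
        instances = {s for s, p, o in triples if p == "rdf:type" and o == cls}
        for s, p, o in triples:
            node = s if position == "subject" else o
            if p == prop and node in instances:
                hits.add(url)
                break
    return hits

# Hypothetical documents.
docs = {
    "http://example.org/a.rdf": [("_:x", "rdf:type", "foaf:Person"),
                                 ("_:x", "foaf:gender", "male")],
    "http://example.org/b.rdf": [("_:d", "dc:creator", "_:y"),
                                 ("_:y", "rdf:type", "foaf:Person")],
    "http://example.org/c.rdf": [("_:z", "foaf:gender", "female")],
}
as_subject = docs_using(docs, "foaf:gender", "foaf:Person")
as_object = docs_using(docs, "dc:creator", "foaf:Person", position="object")
```

Document c.rdf is not counted because it never types `_:z` as a foaf:Person; a real system might also infer the type from the property's domain.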

25 Swoogle’s archive saves every version of a SWD it’s seen.

26

27 2 An NSF ITR collaborative project with
University of Maryland, Baltimore County; University of Maryland, College Park; U. of California, Davis; Rocky Mountain Biological Laboratory

28 An invasive species scenario
Nile Tilapia fish have been found in a California lake. Can this invasive species thrive in this environment? If so, what will be the likely consequences for the ecology? So…we need to understand the effects of introducing this fish into the food web of a typical California lake

29 Food Webs
A food web models the trophic (feeding) relationships between organisms in an ecology.
Food web simulators are used to explore the consequences of changes in the ecology, such as the introduction or removal of a species.
A location’s food web is usually constructed from studies of the frequencies of the species found there and the known trophic relations among them.
Goal: automatically construct a food web for a new location using existing data and knowledge.
ELVIS: Ecosystem Location Visualization and Information System
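
As a rough illustration of what a food web and a removal "simulation" look like as data, here is a minimal sketch. It is not the ELVIS simulator: the web is a hypothetical dictionary from each species to the set of species it eats, and the species other than algae and ostracods are invented for the example.

```python
def starved_after_removal(web, removed):
    """Species left with no remaining prey after `removed` disappears,
    following the cascade up the web. Basal species (no prey) never starve."""
    gone = {removed}
    changed = True
    while changed:
        changed = False
        for species, prey in web.items():
            if species not in gone and prey and prey <= gone:
                gone.add(species)
                changed = True
    return gone - {removed}

# Hypothetical toy web: consumer -> set of prey.
web = {
    "algae": set(),
    "ostracod": {"algae"},
    "trout": {"ostracod"},
    "heron": {"trout"},
}
cascade = starved_after_removal(web, "ostracod")
```

Real simulators model population dynamics rather than all-or-nothing starvation; the sketch only shows the graph structure the slide describes.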

30 East River Valley Trophic Web
The web structure in the image is organized vertically, with node color representing trophic level. Red nodes represent basal species, such as plants and detritus, orange nodes represent intermediate species, and yellow nodes represent top species or primary predators. Links characterize the interaction between two nodes, and the width of the link attenuates down the trophic cascade (i.e. a link is thicker at the predator end and thinner at the prey end).

31 Species List Constructor
Click a county, get a species list

32 The problem
We have data on what species are known to be in the location and can further restrict and fill in with other ecological models. But we don’t know which of these the Nile Tilapia eats or who might eat it. We can reason from taxonomic data (similar species) and known natural history data (size, mass, habitat, etc.) to fill in the gaps.

33

34 Food Web Constructor
Predict food web links using database and taxonomic reasoning. In a new estuary, Nile Tilapia could compete with ostracods (green) to eat algae. Predators (red) and prey (blue) of ostracods may be affected.
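
The taxonomic reasoning step can be sketched as "borrow the known prey of the closest relatives". This is a simplified guess at the idea, not the Food Web Constructor's algorithm: the `taxonomy` layout (genus first, then family) and the diet entries are illustrative assumptions, not real observations.

```python
def predict_prey(species, taxonomy, known_prey):
    """Predict prey for `species` from its closest relatives with known prey.
    taxonomy maps species -> ranks ordered from most to least specific.
    Returns (predicted prey set, relatives used as evidence)."""
    for level, taxon in enumerate(taxonomy[species]):
        relatives = [s for s, ranks in taxonomy.items()
                     if s != species and s in known_prey
                     and len(ranks) > level and ranks[level] == taxon]
        if relatives:
            predicted = set().union(*(known_prey[r] for r in relatives))
            return predicted, relatives
    return set(), []

# Illustrative data only; the diets below are made up.
taxonomy = {
    "Oreochromis niloticus": ["Oreochromis", "Cichlidae"],
    "Oreochromis mossambicus": ["Oreochromis", "Cichlidae"],
    "Tilapia zillii": ["Tilapia", "Cichlidae"],
}
known_prey = {
    "Oreochromis mossambicus": {"algae", "detritus"},
    "Tilapia zillii": {"macrophytes"},
}
prey, evidence = predict_prey("Oreochromis niloticus", taxonomy, known_prey)
```

Returning the relatives alongside the prediction mirrors the idea of examining the evidence behind each predicted link.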

35 Evidence Provider
Examine evidence for predicted links.

36 Status
Goal is ELVIS (Ecosystem Location Visualization and Information System) as an integrated set of web services for constructing food webs for a given location.
Background ontologies
  SpireEcoConcepts: concepts and properties to represent food webs, and ELVIS-related tasks, inputs and outputs
  ETHAN (Evolutionary Trees and Natural History): concepts and properties for ‘natural history’ information on species, derived from data in the Animal Diversity Web and other taxonomic sources
Under development
  Connect to visualization software
  Connect to triple shop to discover more data

37 3 UMBC Triple Shop http://sparql.cs.umbc.edu/
Online SPARQL RDF query processing with several interesting features
  Automatically finds SWDs for given queries using the Swoogle backend database
  Datasets, queries and results can be saved, tagged, annotated, shared, searched for, etc.
  RDF datasets as first class objects
    Can be stored on our server or downloaded
    Can be materialized in a database or (soon) as a Jena model

38 Web-scale semantic web data access
[Figure: interaction between an agent, the data access service, and the Web (indexed RDF data).]
  1. Agent: ask (“person”). Service: search vocabulary (search URIrefs in SW vocabulary), then inform (“foaf:Person”).
  2. Agent: compose query, then ask (“?x rdf:type foaf:Person”). Service: search URLs in SWD index, then inform (doc URLs).
  3. Agent: fetch docs, populate RDF database, query local RDF database.
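
The ask/inform exchange can be mimicked end to end with in-memory stand-ins. The sketch below is only an illustration of the flow: the two index dictionaries, the `WEB` dictionary, the example URLs and all function names are hypothetical, not Swoogle's API.

```python
# Hypothetical stand-ins for the service's indexes and for the Web.
VOCABULARY_INDEX = {"person": ["foaf:Person"]}
DOCUMENT_INDEX = {"foaf:Person": ["http://example.org/a.rdf",
                                  "http://example.org/b.rdf"]}
WEB = {
    "http://example.org/a.rdf": [("ex:alice", "rdf:type", "foaf:Person")],
    "http://example.org/b.rdf": [("ex:bob", "rdf:type", "foaf:Person"),
                                 ("ex:bob", "foaf:knows", "ex:alice")],
}

def search_vocabulary(keyword):
    """ask("person") -> inform("foaf:Person")"""
    return VOCABULARY_INDEX.get(keyword, [])

def search_documents(term):
    """ask("?x rdf:type foaf:Person") -> inform(doc URLs)"""
    return DOCUMENT_INDEX.get(term, [])

def fetch(url):
    """Fetch a document's triples from the (simulated) Web."""
    return WEB.get(url, [])

def find_instances(keyword):
    terms = search_vocabulary(keyword)
    if not terms:
        return []
    cls = terms[0]
    local_store = []                       # the agent's local RDF database
    for url in search_documents(cls):
        local_store.extend(fetch(url))     # fetch docs, populate database
    # Query the local database for ?x rdf:type cls.
    return sorted({s for s, p, o in local_store
                   if p == "rdf:type" and o == cls})

people = find_instances("person")
```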

39 Who knows Anupam Joshi? Show me their names, address and pictures

40 The UMBC ebiquity site publishes lots of RDF data, including FOAF profiles

41
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?p2name ?p2mbox ?p2pix
# FROM ???  (No FROM clause!)
WHERE {
  ?p1 foaf:surname "Joshi" .
  ?p1 foaf:firstName "Anupam" .
  ?p1 foaf:mbox ?p1mbox .
  ?p2 foaf:knows ?p3 .
  ?p3 foaf:mbox ?p1mbox .
  ?p2 foaf:name ?p2name .
  ?p2 foaf:mbox ?p2mbox .
  OPTIONAL { ?p2 foaf:depiction ?p2pix } .
}
ORDER BY ?p2name

42 Log in, specify a dataset, and enter the query without a FROM clause!

43

44

45 302 RDF documents were found that might have useful data.

46 We’ll select them all and add them to the current dataset.

47 We’ll run the query against this dataset to see if the results are as expected.

48 The results can be produced in any of several formats

49

50 Looks like a useful dataset. Let’s save it and also materialize it in the TS triple store.

51

52 We can also annotate, save and share queries.

53 Work in Progress
There are a host of performance issues
We plan on supporting some special datasets, e.g.,
  FOAF data collected from Swoogle
  Definitions of RDF and OWL classes and properties from all ontologies that Swoogle has discovered
Expanding constraints to select candidate SWDs to include arbitrary metadata and embedded queries
  FROM “documents trusted by a member of the SPIRE project”
We will explore two models for making this useful
  As a downloadable application for client machines
  As an (open source?) downloadable service for servers supporting a community of users

54 This talk Motivation Swoogle Semantic Web search engine
Use cases and applications State of the Semantic Web Conclusions

55 Will Swoogle Scale? How?
Here’s a rough estimate of the data in RDF documents on the semantic web based on Swoogle’s crawling

System/date  Terms     Documents  Individuals  Triples   Bytes
Swoogle2     1.5x10^5  3.5x10^5   7x10^6       5x10^7    7x10^9
Swoogle3     2x10^5    7x10^5     1.5x10^7     7.5x10^7  1x10^10
2006         1x10^6, 5x10^9, 5x10^11
2008         5x10^6, 5x10^13

We think Swoogle’s centralized approach can be made to work for the next few years if not longer.
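
For a feel of what these estimates imply per document and per triple, the two complete rows (Swoogle2 and Swoogle3) can be divided out. This is a back-of-the-envelope sketch using only the document, triple and byte figures quoted on the slide; the dictionary layout and variable names are our own.

```python
# Figures for the two complete rows of the slide's estimate.
estimates = {
    "Swoogle2": {"documents": 3.5e5, "triples": 5e7, "bytes": 7e9},
    "Swoogle3": {"documents": 7e5, "triples": 7.5e7, "bytes": 1e10},
}

ratios = {}
for system, e in estimates.items():
    ratios[system] = {
        "bytes_per_triple": e["bytes"] / e["triples"],
        "triples_per_document": e["triples"] / e["documents"],
    }
    print(system,
          round(ratios[system]["bytes_per_triple"]), "bytes/triple,",
          round(ratios[system]["triples_per_document"]), "triples/document")
```

Both rows come out at roughly 130 to 140 bytes per triple, so under these figures storage grows about linearly with the number of triples.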

56 How much reasoning should Swoogle do?
SwoogleN (N<=3) does limited reasoning
  It’s expensive
  It’s not clear how much should be done
More reasoning would benefit many use cases
  e.g., type hierarchy
Recognizing specialized metadata
  e.g., that ontology A maps some terms from B to C

57 A RDF Dictionary We’d hope to develop an RDF dictionary.
Given an RDF term, returns a graph of its definition
  Term → definition from “official” ontology
  Term+URL → definition from SWD at URL
  Term+* → union definition
Optional argument recursively adds definitions of terms in the definition, excluding RDFS and OWL terms
Optional argument identifies more namespaces to exclude
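
The lookup described here might behave like the sketch below. It is a guess at the behaviour, not an existing Swoogle service: triples are (subject, predicate, object) tuples of prefixed names, a term's "definition" is taken to be every triple with that term as subject, and `define`, `EXCLUDED` and the `ex:` ontology are hypothetical.

```python
# Namespaces whose terms are never expanded (RDF, RDFS and OWL by default).
EXCLUDED = ("rdf:", "rdfs:", "owl:")

def define(term, triples, recursive=False, excluded=EXCLUDED):
    """Return the triples defining `term`; with recursive=True, also the
    definitions of terms mentioned in it, skipping excluded namespaces."""
    graph, seen, todo = [], set(), [term]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        for s, p, o in triples:
            if s != current:
                continue
            graph.append((s, p, o))
            if recursive:
                for ref in (p, o):
                    if not ref.startswith(excluded) and ref not in seen:
                        todo.append(ref)
    return graph

# Hypothetical ontology.
ontology = [
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Person", "rdfs:label", "person"),
    ("ex:Course", "rdf:type", "owl:Class"),
]
shallow = define("ex:Student", ontology)
deep = define("ex:Student", ontology, recursive=True)
```

Passing `excluded=EXCLUDED + ("ex:",)` would correspond to the optional argument that names more namespaces to exclude.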

58 Conclusion
The web will contain the world’s knowledge in forms accessible to people and computers
  We need better ways to discover, index, search and reason over SW knowledge
SW search engines address different tasks than HTML search engines
  So they require different techniques and APIs
Swoogle-like systems can help create consensus ontologies and foster best practices
  Swoogle is for Semantic Web 1.0
  Semantic Web 2.0 will make different demands

59 For more information Annotated in OWL

