UMBC an Honors University in Maryland 1 Finding knowledge, data and answers on the Semantic Web Tim Finin University of Maryland, Baltimore County

Presentation transcript:

UMBC an Honors University in Maryland 1 Finding knowledge, data and answers on the Semantic Web
Tim Finin, University of Maryland, Baltimore County
Joint work with Li Ding, Anupam Joshi, Yun Peng, Cynthia Parr, Pranam Kolari, Pavan Reddivari, Sandor Dornbush, Rong Pan, Akshay Java, Joel Sachs, Scott Cost and Vishal Doshi
This work was partially supported by DARPA contract F , NSF grants CCR and IIS and grants from IBM, Fujitsu and HP.

UMBC an Honors University in Maryland 2 This talk Motivation Swoogle Semantic Web search engine Use cases and applications Observations Conclusions

UMBC an Honors University in Maryland 3 Google has made us smarter

UMBC an Honors University in Maryland 4 But what about our agents? Agents still have a very minimal understanding of text and images.

UMBC an Honors University in Maryland 5 But what about our agents? A Google for knowledge on the Semantic Web is needed by software agents and programs: Swoogle

UMBC an Honors University in Maryland 6 This talk Motivation Swoogle Semantic Web search engine Use cases and applications Observations Conclusions

UMBC an Honors University in Maryland 7 Running since summer M RDF docs, 320M triples, 10K ontologies, 15K namespaces, 1.3M classes, 175K properties, 43M instances, 600 registered users

UMBC an Honors University in Maryland 8 Swoogle Architecture
[Diagram with four layers: Discovery, Analysis, Index and Search Services]
Discovery: candidate URLs, bounded Web crawler, Google crawler, SwoogleBot
Analysis: SWD classifier, ranking, document cache
Index: IR indexer, SWD indexer, Semantic Web metadata
Search Services: Web server (human, html), Web service (machine, rdf/xml)
Legend: information flow from the Web and the Semantic Web; Swoogle's web interface

UMBC an Honors University in Maryland 9 A Hybrid Harvesting Framework
[Diagram: manual submission, RDF crawling, bounded HTML crawling and meta crawling (Google API calls) start from seed sets M, H and R, crawl the Web and feed the Swoogle sample dataset; an inductive learner uses that dataset to guide further crawling]

UMBC an Honors University in Maryland 10 Performance – Site Coverage
SW06MAR basic statistics (Mar 31, 2006)
– 1.3M SWDs from 157K websites
– 268M triples
– 61K SWOs, including >10K of high quality
– 1.4M SWTs using 12K namespaces
Significance
– Compare with existing works (DAML crawler, scutter)
– Compare SW06MAR with Google's estimated SWDs
[Chart: SWDs per website, by website]

UMBC an Honors University in Maryland 11 Performance – crawlers' contribution
– High SWD ratio: 42% of URLs are confirmed as SWDs
– Consistent growth rate: 3000 SWDs per day
– RDF crawler: best harvesting method
– HTML crawler: best accuracy
– Meta crawler: best at detecting websites
[Chart: # of documents]

UMBC an Honors University in Maryland 12 This talk Motivation Swoogle Semantic Web search engine Use cases and applications Observations Conclusions

UMBC an Honors University in Maryland 13 Applications and use cases
(1) Supporting Semantic Web developers
– Ontology designers, vocabulary discovery, who's using my ontologies or data?, use analysis, errors, statistics, etc.
(2) Searching specialized collections
– Spire: aggregating observations and data from biologists
– InferenceWeb: searching over and enhancing proofs
– SemNews: Text Meaning of news stories
(3) Supporting SW tools
– Triple shop: finding data for SPARQL queries

UMBC an Honors University in Maryland 14 1

UMBC an Honors University in Maryland 15 By default, ontologies are ordered by their ‘popularity’, but they can also be ordered by recency or size. 80 ontologies were found that had these three terms Let’s look at this one

UMBC an Honors University in Maryland 16 Basic Metadata
hasDateDiscovered:
hasDatePing:
hasPingState: PingModified
type: SemanticWebDocument
isEmbedded: false
hasGrammar: RDFXML
hasParseState: ParseSuccess
hasDateLastmodified:
hasDateCache:
hasEncoding: ISO
hasLength: 18K
hasCntTriple:
hasOntoRatio: 0.98
hasCntSwt:
hasCntSwtDef:
hasCntInstance: 8.00

UMBC an Honors University in Maryland 17

UMBC an Honors University in Maryland 18 rdfs:range was used 41 times to assert a value. owl:ObjectProperty was instantiated 28 times time:Cal… defined once and used 24 times (e.g., as range)

UMBC an Honors University in Maryland 19 These are the namespaces this ontology uses. Clicking on one shows all of the documents using the namespace. All of this is available in RDF form for the agents among us.

UMBC an Honors University in Maryland 20 Here’s what the agent sees. Note the swoogle and wob (web of belief) ontologies.

UMBC an Honors University in Maryland 21 We can also search for terms (classes, properties) like terms for “person”.

UMBC an Honors University in Maryland 22 10K terms associated with “person”! Ordered by use. Let’s look at foaf:Person’s metadata

UMBC an Honors University in Maryland 23

UMBC an Honors University in Maryland 24

UMBC an Honors University in Maryland 25

UMBC an Honors University in Maryland 26 87K documents used foaf:gender with a foaf:Person instance as the subject

UMBC an Honors University in Maryland 27 3K documents used dc:creator with a foaf:Person instance as the object

UMBC an Honors University in Maryland 28 Swoogle’s archive saves every version of a SWD it’s seen.

UMBC an Honors University in Maryland 29

UMBC an Honors University in Maryland 30 2 An NSF ITR collaborative project with University of Maryland, Baltimore County; University of Maryland, College Park; U. of California, Davis; Rocky Mountain Biological Laboratory

UMBC an Honors University in Maryland 31 An invasive species scenario Nile Tilapia fish have been found in a California lake. Can this invasive species thrive in this environment? If so, what will be the likely consequences for the ecology? So…we need to understand the effects of introducing this fish into the food web of a typical California lake

UMBC an Honors University in Maryland 32 Food Webs
A food web models the trophic (feeding) relationships between organisms in an ecology
– Food web simulators are used to explore the consequences of changes in the ecology, such as the introduction or removal of a species
– A location's food web is usually constructed from studies of the frequencies of the species found there and the known trophic relations among them.
Goal: automatically construct a food web for a new location using existing data and knowledge
ELVIS: Ecosystem Location Visualization and Information System

UMBC an Honors University in Maryland 33 East River Valley Trophic Web

UMBC an Honors University in Maryland 34 Species List Constructor Click a county, get a species list

UMBC an Honors University in Maryland 35 The problem
We have data on which species are known to be in the location, and we can further restrict and fill in that list with other ecological models. But we don't know which of these the Nile Tilapia eats or which might eat it. We can reason from taxonomic data (similar species) and known natural history data (size, mass, habitat, etc.) to fill in the gaps.

UMBC an Honors University in Maryland 36

UMBC an Honors University in Maryland 37 Food Web Constructor
Predict food web links using database and taxonomic reasoning. In a new estuary, Nile Tilapia could compete with ostracods (green) to eat algae. Predators (red) and prey (blue) of ostracods may be affected.
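
The taxonomic reasoning mentioned on this slide can be illustrated with a small Python sketch. It is not the ELVIS implementation. The toy taxonomy, the known links, the `min_shared` threshold and the function names are all assumptions made up for this example: a species with no recorded links borrows the prey and predators of close taxonomic relatives.

```python
# Hypothetical toy data: each species maps to its lineage, most general rank first.
TAXONOMY = {
    "nile_tilapia": ["fish", "cichlid", "oreochromis"],
    "blue_tilapia": ["fish", "cichlid", "oreochromis"],
    "largemouth_bass": ["fish", "centrarchid", "micropterus"],
}

# Hypothetical known trophic links, stored as (predator, prey) pairs.
KNOWN_LINKS = {
    ("blue_tilapia", "algae"),
    ("largemouth_bass", "blue_tilapia"),
}


def shared_depth(a, b):
    """Count how many leading taxonomic ranks two species have in common."""
    depth = 0
    for rank_a, rank_b in zip(TAXONOMY[a], TAXONOMY[b]):
        if rank_a != rank_b:
            break
        depth += 1
    return depth


def predict_prey(species, min_shared=2):
    """Guess prey by copying the prey of sufficiently close relatives."""
    predicted = set()
    for predator, prey in KNOWN_LINKS:
        if predator == species or predator not in TAXONOMY:
            continue
        if shared_depth(species, predator) >= min_shared:
            predicted.add(prey)
    return predicted


def predict_predators(species, min_shared=2):
    """Guess predators by copying the predators of sufficiently close relatives."""
    predicted = set()
    for predator, prey in KNOWN_LINKS:
        if prey == species or prey not in TAXONOMY:
            continue
        if shared_depth(species, prey) >= min_shared:
            predicted.add(predator)
    return predicted
```

With this toy data, the Nile Tilapia inherits "algae" as prey and the largemouth bass as a predator from the blue tilapia. A fuller version would presumably also weigh natural history data such as size and habitat, as the earlier slide suggests.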

UMBC an Honors University in Maryland 38 Evidence Provider Examine evidence for predicted links.

UMBC an Honors University in Maryland 39 Status
Goal is ELVIS (Ecosystem Location Visualization and Information System) as an integrated set of web services for constructing food webs for a given location.
Background ontologies
– SpireEcoConcepts: concepts and properties to represent food webs, and ELVIS related tasks, inputs and outputs
– ETHAN (Evolutionary Trees and Natural History): concepts and properties for 'natural history' information on species derived from data in the Animal Diversity Web and other taxonomic sources
Under development
– Connect to visualization software
– Connect to triple shop to discover more data

UMBC an Honors University in Maryland 40 3 UMBC Triple Shop
Online SPARQL RDF query processing with several interesting features
– Automatically finds SWDs for given queries using the Swoogle backend database
– Datasets, queries and results can be saved, tagged, annotated, shared, searched for, etc.
– RDF datasets as first class objects: they can be stored on our server or downloaded, and can be materialized in a database or (soon) as a Jena model

UMBC an Honors University in Maryland 41 Web-scale semantic web data access
[Diagram: message flow between an agent, a data access service and the Web]
– Agent: ask ("person"). Service: search vocabulary (search URIrefs in SW vocabulary), then inform ("foaf:Person")
– Agent: compose query, ask ("?x rdf:type foaf:Person"). Service: search URLs in SWD index, then inform (doc URLs)
– Agent: fetch docs, populate RDF database, query local RDF database
– Service: index RDF data from the Web
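
The exchange in this diagram can be sketched in Python as a keyword lookup, a document lookup, a fetch step and a local query. This is a rough illustration only. The three dictionaries stand in for the vocabulary index, the SWD index and the Web, and every name and URL in it is hypothetical rather than part of Swoogle's actual service interface.

```python
# Made-up stand-ins for the service's indexes and for documents on the Web.
VOCAB_INDEX = {"person": ["foaf:Person"]}
SWD_INDEX = {
    "foaf:Person": ["http://example.org/a.rdf", "http://example.org/b.rdf"],
}
DOCS = {
    "http://example.org/a.rdf": [("ex:alice", "rdf:type", "foaf:Person")],
    "http://example.org/b.rdf": [
        ("ex:bob", "rdf:type", "foaf:Person"),
        ("ex:rex", "rdf:type", "ex:Dog"),
    ],
}


def search_vocabulary(keyword):
    """ask("person") -> inform("foaf:Person"): keyword to term URIrefs."""
    return VOCAB_INDEX.get(keyword.lower(), [])


def search_documents(term):
    """ask about a term -> inform(doc URLs): term to documents that use it."""
    return SWD_INDEX.get(term, [])


def fetch(url):
    """Stand-in for fetching and parsing an RDF document."""
    return DOCS.get(url, [])


def find_instances(keyword):
    """Run the whole flow and return subjects typed with a matching term."""
    local_store = set()  # the agent's local RDF database
    results = set()
    for term in search_vocabulary(keyword):
        for url in search_documents(term):
            local_store.update(fetch(url))
        # Query the local store for "?x rdf:type <term>".
        results.update(
            s for (s, p, o) in local_store if p == "rdf:type" and o == term
        )
    return sorted(results)
```

Calling `find_instances("person")` walks all four steps and returns the two subjects typed as `foaf:Person` in the toy documents. The point of the design is that the agent never needs to know document URLs in advance.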

UMBC an Honors University in Maryland 42 Who knows Anupam Joshi? Show me their names, address and pictures

UMBC an Honors University in Maryland 43 The UMBC ebiquity site publishes lots of RDF data, including FOAF profiles

UMBC an Honors University in Maryland 44 No FROM clause!
PREFIX foaf:
SELECT DISTINCT ?p2name ?p2mbox ?p2pix
FROM ???
WHERE {
  ?p1 foaf:surname "Joshi" .
  ?p1 foaf:firstName "Anupam" .
  ?p1 foaf:mbox ?p1mbox .
  ?p2 foaf:knows ?p3 .
  ?p3 foaf:mbox ?p1mbox .
  ?p2 foaf:name ?p2name .
  ?p2 foaf:mbox ?p2mbox .
  OPTIONAL { ?p2 foaf:depiction ?p2pix } .
}
ORDER BY ?p2name
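
To make the join structure of this query concrete, here is a minimal Python sketch that evaluates the same pattern over a hand-written list of triples. It uses no SPARQL engine. The triples, mailbox addresses and the names other than Anupam Joshi are invented for illustration. Note how the query matches `?p3` to `?p1` through a shared `foaf:mbox` value, so a node in one document can stand for the same person described in another.

```python
# Invented sample data; in the Triple Shop these triples would come from the
# RDF documents found for the query.
TRIPLES = [
    ("_:aj", "foaf:surname", "Joshi"),
    ("_:aj", "foaf:firstName", "Anupam"),
    ("_:aj", "foaf:mbox", "mailto:aj@example.org"),
    ("_:p", "foaf:name", "Pat Example"),
    ("_:p", "foaf:mbox", "mailto:pat@example.org"),
    ("_:p", "foaf:knows", "_:x"),
    ("_:x", "foaf:mbox", "mailto:aj@example.org"),
    ("_:q", "foaf:name", "Quinn Example"),
    ("_:q", "foaf:mbox", "mailto:quinn@example.org"),
    ("_:q", "foaf:depiction", "http://example.org/quinn.jpg"),
    ("_:q", "foaf:knows", "_:aj"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for (s, p, o) in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the data."""
    return [s for (s, p, o) in TRIPLES if p == predicate and o == obj]


def who_knows(surname, first_name):
    """Return distinct (name, mbox, picture-or-None) rows ordered by name."""
    rows = set()
    targets = set(subjects("foaf:surname", surname)) & set(
        subjects("foaf:firstName", first_name)
    )
    for p1 in targets:
        for p1mbox in objects(p1, "foaf:mbox"):
            # ?p3 is any node that shares the target's mailbox.
            for p3 in subjects("foaf:mbox", p1mbox):
                for p2 in subjects("foaf:knows", p3):
                    for name in objects(p2, "foaf:name"):
                        for mbox in objects(p2, "foaf:mbox"):
                            # OPTIONAL: keep the row even without a picture.
                            pics = objects(p2, "foaf:depiction") or [None]
                            for pic in pics:
                                rows.add((name, mbox, pic))
    return sorted(rows, key=lambda row: row[0])
```

In this toy data, one acquaintance points at the target node directly and the other points at a separate node with the same mailbox. Both are found, and the one without a `foaf:depiction` still appears, with `None` in the picture column.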

UMBC an Honors University in Maryland 45 Enter query w/o FROM clause! log in specify dataset

UMBC an Honors University in Maryland 46

UMBC an Honors University in Maryland 47

UMBC an Honors University in Maryland RDF documents were found that might have useful data.

UMBC an Honors University in Maryland 49 We’ll select them all and add them to the current dataset.

UMBC an Honors University in Maryland 50 We’ll run the query against this dataset to see if the results are as expected.

UMBC an Honors University in Maryland 51 The results can be produced in any of several formats

UMBC an Honors University in Maryland 52

UMBC an Honors University in Maryland 53 Looks like a useful dataset. Let’s save it and also materialize it in the TS triple store.

UMBC an Honors University in Maryland 54

UMBC an Honors University in Maryland 55 We can also annotate, save and share queries.

UMBC an Honors University in Maryland 56 Work in Progress
There are a host of performance issues
We plan on supporting some special datasets, e.g.,
– FOAF data collected from Swoogle
– Definitions of RDF and OWL classes and properties from all ontologies that Swoogle has discovered
Expanding constraints to select candidate SWDs to include arbitrary metadata and embedded queries
– FROM “documents trusted by a member of the SPIRE project”
We will explore two models for making this useful
– As a downloadable application for client machines
– As an (open source?) downloadable service for servers supporting a community of users

UMBC an Honors University in Maryland 57 This talk Motivation Swoogle Semantic Web search engine Use cases and applications Observations Conclusions

UMBC an Honors University in Maryland 58 Will Swoogle Scale? How?
Here’s a rough estimate of the data in RDF documents on the semantic web based on Swoogle’s crawling
[Table: System/date, Terms, Documents, Individuals, Triples, Bytes; rows for Swoogle2, Swoogle3 and further estimates. The Swoogle2 row ends with 7x10^6 individuals, 5x10^7 triples and 7x10^9 bytes]
We think Swoogle’s centralized approach can be made to work for the next few years if not longer.

UMBC an Honors University in Maryland 59 How much reasoning should Swoogle do?
SwoogleN (N<=3) does limited reasoning
– It’s expensive
– It’s not clear how much should be done
More reasoning would benefit many use cases
– e.g., type hierarchy
Recognizing specialized metadata
– e.g., that ontology A maps some terms from B to C

UMBC an Honors University in Maryland 60 An RDF Dictionary
We hope to develop an RDF dictionary. Given an RDF term, it returns a graph of the term's definition
– Term → definition from the “official” ontology
– Term+URL → definition from the SWD at URL
– Term+* → union definition
– An optional argument recursively adds definitions of the terms in the definition, excluding RDFS and OWL terms
– An optional argument identifies more namespaces to exclude
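
The lookup modes on this slide could be sketched in Python roughly as follows. The slide describes a planned service, so this is only one possible reading of it. The `define` function, its parameters, the in-memory tables and the example terms are all hypothetical, and the choice to skip `rdf:` terms alongside RDFS and OWL terms is an added assumption.

```python
# Hypothetical store: term -> {source URL -> triples defining the term there}.
DEFINITIONS = {
    "ex:Student": {
        "http://example.org/official.owl": [
            ("ex:Student", "rdfs:subClassOf", "ex:Person"),
        ],
        "http://example.org/other.rdf": [
            ("ex:Student", "rdfs:label", "student"),
        ],
    },
    "ex:Person": {
        "http://example.org/official.owl": [
            ("ex:Person", "rdf:type", "owl:Class"),
        ],
    },
}
# Hypothetical record of which document counts as "official" for each term.
OFFICIAL = {
    "ex:Student": "http://example.org/official.owl",
    "ex:Person": "http://example.org/official.owl",
}
# Namespaces never followed during recursion (rdf: is an added assumption).
DEFAULT_EXCLUDED = ("rdf:", "rdfs:", "owl:")


def define(term, source=None, recursive=False, exclude=()):
    """Return a set of triples defining `term`.

    source=None uses the official ontology, source="*" takes the union over
    all documents, and any other value is treated as a document URL.
    """
    excluded = DEFAULT_EXCLUDED + tuple(exclude)
    graph, seen, pending = set(), set(), [term]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        by_source = DEFINITIONS.get(current, {})
        if source == "*":
            triples = [t for ts in by_source.values() for t in ts]
        else:
            url = source if source is not None else OFFICIAL.get(current)
            triples = by_source.get(url, [])
        graph.update(triples)
        if recursive:
            for triple in triples:
                for node in triple:
                    # Follow prefixed names only; plain literals have no colon.
                    if ":" in node and not node.startswith(excluded):
                        pending.append(node)
    return graph
```

For example, `define("ex:Student")` returns only the official triple, `define("ex:Student", source="*")` adds the label found in the second document, and `define("ex:Student", recursive=True)` also pulls in the definition of `ex:Person`.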

UMBC an Honors University in Maryland 61 This talk Motivation Swoogle Semantic Web search engine Use cases and applications Observations Conclusions

UMBC an Honors University in Maryland 62 Conclusion
The web will contain the world’s knowledge in forms accessible to people and computers
– We need better ways to discover, index, search and reason over SW knowledge
SW search engines address different tasks than HTML search engines
– So they require different techniques and APIs
Swoogle-like systems can help create consensus ontologies and foster best practices
– Swoogle is for Semantic Web 1.0
– Semantic Web 2.0 will make different demands

UMBC an Honors University in Maryland 63 Annotated in OWL For more information