1 Ontologies for mobile and pervasive computing a tutorial offered at Mobiquitous 2004 Tim Finin and Harry Chen 22 August 2004


2 UMBC an Honors University in Maryland 2 1 Introduction

3 Agenda
1 Introduction (10 Finin/Chen)
2 Ontologies (25 Finin)
3 Semantic web (40 Finin)
4 Uses in mobile and pervasive computing (30 Chen)
Break (30)
5 Example ontologies (20 Chen)
6 Ontology engineering (25 Finin)
7 Nuts and bolts (35 Chen)
8 Current research topics (10 Finin)
9 Closing and discussion (15 Finin/Chen)

4 Goals of the tutorial
We will provide the following in a format that can be understood by anyone with a strong background in computer science:
• A brief introduction to ontologies and their use in building information systems
• An overview of the Semantic Web and its approach to defining and using ontologies
• A closer look at how ontologies can be used to support mobile and pervasive computing applications, along with examples of useful ontologies
• Examples of current tools and techniques
• Some open issues and current research areas

5 Who we are
• And where we are coming from …
  - Background and experience in AI, knowledge-based systems, etc.
  - Interested in building intelligent mobile and ubiquitous systems
• Tim Finin
  - Professor of Computer Science at UMBC
• Harry Chen
  - PhD student in Computer Science at UMBC
• See bio sketches at end or visit web pages

6 Motivation

7 2 Ontologies

8 Why is this funny?
In “The analytical language of John Wilkins”, Jorge Luis Borges writes about a “certain Chinese encyclopaedia” that has the following categorization of animals:
(a) belonging to the emperor, (b) embalmed, (c) tame, (d) sucking pigs, (e) sirens, (f) fabulous, (g) stray dogs, (h) included in the present classification, (i) frenzied, (j) innumerable, (k) drawn with a very fine camelhair brush, (l) et cetera, (m) having just broken the water pitcher, (n) that from a long way off look like flies.

9 What’s an ontology? -- In Philosophy --
• Branch of metaphysics dealing with the nature of being
• “Ontology” is from the Greek ontos for being and logos for word
• An ontology is a theory of what exists
• It lets us experience, operate in, and talk about the world
• As Plato put it, we need to “carve nature at its joints”
• Successful communication requires a shared ontology
Figure: Aristotle described an ontology of basic categories of predicates to describe the world (Sowa, after Brentano)

10 What’s an ontology? -- In Organized Societies --
• A dictionary is an ontology of sorts.
• But ordinary people seldom need or use a dictionary in everyday life.
• Human organizations, like the EPA, do need to develop standards for terms and phrases.
• These typically give a specialized meaning that is unambiguous, and different from and/or narrower than the ordinary interpretation.
• These are usually given as a glossary or thesaurus of specialized terms.

11 Example: EPA’s “Terms of Environment”

12 What’s an ontology? -- in Information Systems --
• An explicit formal specification of how to represent the objects, concepts and other domain entities and the relationships among them.
• Common examples: UML diagrams, data dictionaries, DB schemas, conceptual schemas, API descriptions, knowledge bases, etc.
• Ontologies provide an abstract conceptualization of the information to be represented and a vocabulary of terms to use in the representation.
• Interoperability between two systems requires them to share a common ontology.

13 UML diagrams as Ontologies

14 DB schemas as Ontologies

15 Knowledge bases as Ontologies

16 Conceptual schemas as ontologies
• Databases are opaque
• Typical DB schemas don’t help much
• Conceptual schemas give the intended meaning of concepts used in a DB
• We assume they “bottom out” on common ontologies “understood” by the programs involved
• If not, we fail to capture the meaning …
Figure: Table: price (*stockNo: integer; cost: float); Auto Product Ontology; Product Ontology; Units & Measures Ontology
price(x, y) => ∃ (x’, y’) [auto_part(x’) & part_no(x’) = x & retail_price(x’, y’, Value-Inc) & magnitude(y’, US_dollars) = y]

17 Our Focus
• We’re technologists, rather than philosophers or bureaucrats, so our focus is on IT and ontologies.
• Making machine understandable ontologies
• Exploring how they can be used
• Exploring what “machine understandable” means
• Supporting other uses of ontologies with IT
• Knowledge management for NL ontologies

18 Top down vs. bottom up
• Philosophers build from the top down and are interested in capturing the most general concepts.
• Programmers tend to work from the bottom up, supporting a set of applications, with a little generality to help reuse and future development.
• Ex: the CHAT-80 system (Pereira and Warren, 1982), which answered NL questions about a geographic database.
• It is an example of a microworld ontology that supported NLP, query answering, and generation.

19 Tree of Porphyry
The oldest known tree diagram is the 3rd century AD work by the Greek philosopher Porphyry in his commentary on Aristotle. Substance was identified as the supreme genus, or the most general supertype.
Adapted from John Sowa

20 Blocks world
• The blocks world is a “microworld” used for NLP, vision, and planning.
• It consists of a table, a set of blocks of different shapes, sizes and colors, and a robot hand.
• Some typical domain constraints:
  - Only one block can be on another block.
  - Any number of blocks can be on the table.
  - The hand can only hold one block.
• Typical representation:
  ontable(a) ontable(c) on(b,a) handempty clear(b) clear(c) green(a) blue(b) red(c)
Figure: blocks A, B and C and the TABLE
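
The representation on this slide can be written down as a set of ground facts and checked against the domain constraints. The following is only a minimal sketch: the tuple encoding, the `holding` predicate and the `violations` helper are assumptions made for illustration and are not part of the tutorial.

```python
# Blocks-world state as a set of ground facts (tuples), mirroring the slide.
state = {
    ("ontable", "a"), ("ontable", "c"),
    ("on", "b", "a"),
    ("clear", "b"), ("clear", "c"),
    ("handempty",),
    ("green", "a"), ("blue", "b"), ("red", "c"),
}


def violations(facts):
    """Return a list of domain-constraint violations found in `facts`."""
    problems = []

    # Constraint: only one block can be directly on another block.
    supported = {}
    for fact in facts:
        if fact[0] == "on":
            _, top, bottom = fact
            supported.setdefault(bottom, set()).add(top)
    for bottom, tops in supported.items():
        if len(tops) > 1:
            problems.append(f"{sorted(tops)} are all on {bottom}")
        if ("clear", bottom) in facts:
            problems.append(f"{bottom} is marked clear but supports a block")

    # Constraint: the hand can hold at most one block.
    # ("holding" is a hypothetical predicate; the slide only shows handempty.)
    held = [fact for fact in facts if fact[0] == "holding"]
    if len(held) > 1:
        problems.append("hand holds more than one block")
    if held and ("handempty",) in facts:
        problems.append("hand is both empty and holding a block")

    return problems
```

The state from the slide passes the check, while adding `("on", "c", "a")` is reported, because block a would then support two blocks.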

21 Importance of ontologies in communication
• An example of the importance of ontologies is the fate of NASA’s Mars Climate Orbiter.
• It crashed into Mars on 9/23/1999.
• JPL used metric units in the program controlling the thrusters and Lockheed Martin used imperial units.
• Instead of establishing an orbit at an altitude of 140 km, it did so at 60 km, causing it to burn up in the Martian atmosphere.
• A richer representation would have avoided this.
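
One reading of “a richer representation” is that every quantity carries its unit explicitly, so a metric/imperial mismatch is converted or rejected rather than silently mixed. The sketch below assumes the exchanged quantity is thruster impulse and uses a hypothetical `Impulse` type; it illustrates the idea and is not a description of the actual flight software.

```python
from dataclasses import dataclass

# Conversion factors to the SI unit of impulse (newton-seconds).
# 1 lbf is defined as exactly 4.4482216152605 N.
TO_NEWTON_SECONDS = {"N*s": 1.0, "lbf*s": 4.4482216152605}


@dataclass(frozen=True)
class Impulse:
    """A magnitude that always travels together with its unit."""
    magnitude: float
    unit: str

    def to(self, unit):
        """Convert to another known unit; unknown units raise KeyError."""
        si_value = self.magnitude * TO_NEWTON_SECONDS[self.unit]
        return Impulse(si_value / TO_NEWTON_SECONDS[unit], unit)


def total_impulse(readings, unit="N*s"):
    """Sum readings from different sources after normalising their units."""
    return sum(reading.to(unit).magnitude for reading in readings)
```

With this encoding, a consumer that expects newton-seconds cannot mistake a value reported in pound-force seconds for a metric one, because the unit is part of the data rather than an implicit convention.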

22 Implicit vs. Explicit Ontologies
• Systems which communicate or cooperate must share an ontology.
• The shared ontology can be implicit or explicit.
• Implicit ontologies are common and are typically represented only by procedures and data structures.
• Explicit ontologies are (ideally) given a declarative representation in a well defined knowledge representation (KR) language.
• Explicit ontologies enable tools and programs to (partially) understand descriptions in ontologies if they understand the KR language.
• And declarative languages offer more opportunities for reasoning at an abstract level.

23 Conceptualizations, Vocabularies and Axiomatization
• There are three important aspects to explicit ontologies:
  - Conceptualization: modeling the domain in terms of objects, attributes and relations.
  - Vocabulary: assigning symbols or terms to refer to those objects, attributes and relations.
  - Axiomatization: encoding rules and constraints to capture significant aspects of the domain model.
• Ontologies for the same domain may differ at any or all of these three levels.

24 Simple examples
Figure: alternative taxonomies of fruit:
• fruit: pomme, citron, orange
• fruit: apple, lemon, orange
• fruit: apple, citrus (lime, lemon, orange), pear
• fruit: tropical, temperate

25 What kind of Ontologies? From controlled vocabularies to Cyc
Figure (after Deborah L. McGuinness, Stanford): a spectrum from simple taxonomies to expressive ontologies: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value restriction; Disjointness, inverse, part of…; General logical constraints.
Systems placed along the spectrum: Wordnet, CYC, RDF, DAML, OO, DB Schema, RDFS, IEEE SUO, OWL, UMLS.

26 Little and Big Ontologies
• Ontologies come in all sizes, but there does seem to be a split between those who like them huge and those who prefer small composable ones.
• Small ontologies include: Dublin Core, FOAF, etc.
• Some large, general ontologies are freely available:
  - Cyc: the original general purpose ontology
  - OntoSem: UMBC’s lexical KR system and ontology
  - WordNet: a large, on-line lexical reference system
  - World Fact Book: 5 MB of KIF sentences of geo-political facts
  - UMLS: NLM’s Unified Medical Language System
  - SUMO: Suggested Upper Merged Ontology

27 Dublin Core: an example of a simple ontology
• Developed by an OCLC workshop in Dublin ~95 as a metadata standard for digital library resources on the web.
• 15 core attributes
• Neutral on representation
• Available in several forms, including an RDF schema (http://purl.org/dc/elements/1.1/)
15 DC elements:
• Content elements: Coverage, Description, Relation, Source, Subject, Title, Type
• Intellectual Property: Contributor, Creator, Publisher, Rights
• Instantiation: Date, Format, Identifier, Language

28 Cyc
• CYC is a large KB which has been under continual development since ~1985.
• It is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb and heuristics for reasoning about objects and events of everyday life.
• CYC is encoded in the KR language CYCL.
• Open Cyc has ~6K concepts and ~60K assertions “capturing the most general concepts of human consensus reality”.

29 Cyc’s top level concepts

30 OntoSem ontology for Language Understanding
• UMBC’s OntoSem is a large ontology and KR system for natural language understanding tasks.
• It can be browsed online.
• It is intended to represent the meaning of NL text and guide its computation.

31 WordNet
• WordNet® is a lexical ontology inspired by psycholinguistic theories of human lexical memory.
• English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept.
• Synsets: {board, plank} {board, committee}
• Different relations link synonym sets (e.g. antonyms, generalizations, etc.)
• ~140K words
• Developed by the Cognitive Science Laboratory at Princeton and available in many forms
• Although linguistically motivated, many groups have used it as a general ontology of concepts.

32 IEEE Standard Upper Ontology
• An IEEE standards working group
• “This standard will specify an upper ontology that will enable computers to utilize it for applications such as data interoperability, information search and retrieval, automated inferencing, and natural language processing.”
• See the site for documents and archives of mailing list discussions.
• Two “starter documents” for SUOs: SUMO and IFF

33 World Fact Book
• Stanford’s WFB aims to semi-automatically construct a substantial KB of basic geographic, economic, political, and demographic knowledge about the world's nations.
• Source: CIA World Fact Book
• 5.2 MB: ~5K classes & 64K facts and rules encoded in KIF
• Available from http://www-ksl-svc.stanford.edu:5915/doc/wfb/ in several forms
• Example: resources, industries, commodities
  - Interrelated: crude-oil reserves, production, exports
  - Coal mining, computer industry, auto parts industry, …
• Specify basic definitions
  - A natural resource is a deposit of stuff; an industry is a collection of businesses; a commodity is an item whose sales can be measured as a continuous quantity
• Examine related classes & identify key factors
  - E.g., material, process, product, customer, location, task
• Define each industry as a conjunction of factors
  - 6 generative factors discriminate 500 industries
  - Organize values of factors (mining

34 UMLS: Unified Medical Language System
• Under development since 1986 by the National Library of Medicine
• Supports standardized medical terminology via a central dictionary + thesaurus + semantic network + search engine
• Purpose is to “aid the development of systems that help health professionals and researchers retrieve and integrate electronic biomedical information from a variety of sources and to make it easy for users to link disparate information systems, including computer-based patient records, bibliographic databases, factual databases, and expert systems”.
• There are four UMLS knowledge sources:
  - UMLS Metathesaurus
  - SPECIALIST Lexicon
  - UMLS Semantic Network
  - UMLS Information Sources Map

35 Ontology Languages
• Ontologies are thought of as a kind of knowledge base.
• Tools for representing and using ontologies are typically knowledge representation languages.
• These come in many forms (rule based systems, frame based systems, FOL theorem provers, etc.)
• In the rest of our tutorial we will focus on languages developed for the semantic web (RDF, OWL), because:
  - They offer an interesting mix of theory and practice
  - They are becoming widely used
  - Their characteristics are good for mobile and pervasive systems

36 Ontology Conclusions
• Shared ontologies are essential for increasing automation (agents, autonomic computing, language understanding, etc.)
• Ontology tools and standards are important
  - Good research has been done and is ready for exploitation
  - RDF and OWL will get ontologies out of the lab
• Small ontologies are in use today
  - See next section on the semantic web
• And large general ontologies are available
  - Cyc, WFB, WordNet, …

37 3 The semantic web

38 Overview
• Introduction
  - Opening thoughts, Motivation, History
• Languages
  - RDF, RDFS, OWL
• Tools
  - Editors, APIs, reasoners, …
• Applications
  - RSS, FOAF, Web sites, agents, IR, …
• On the research frontier
  - Open problems, current research, …
• Closing
  - Speculations, for more info

39 “XML is Lisp's bastard nephew, with uglier syntax and no semantics. Yet XML is poised to enable the creation of a Web of data that dwarfs anything since the Library at Alexandria.” -- Philip Wadler, Et tu XML? The fall of the relational empire, VLDB, Rome, September 2001.

40 “The web has made people smarter. We need to understand how to use it to make machines smarter, too.” -- Michael I. Jordan (UC Berkeley), paraphrased from a talk at AAAI, July 2002

41 “The Semantic Web will globalize KR, just as the WWW globalized hypertext” -- Tim Berners-Lee

42 “The multi-agent systems paradigm and the web both emerged around One has succeeded beyond imagination and the other has not yet made it out of the lab.” -- Anonymous, 2001

43 IOHO
• The web is like a universal acid, eating through and consuming everything it touches.
  - Web principles and technologies are equally good for wireless/pervasive computing.
• The semantic web is our first serious attempt to provide semantics for XML sublanguages.
• It will provide mechanisms for people and machines (agents, programs, web services) to come together.
  - In all kinds of networked environments: wired, wireless, ad hoc, wearable, etc.

44 Origins
Tim Berners-Lee’s (TBL) original 1989 WWW proposal described a web of relationships among named objects unifying many info. management tasks.
Capsule history:
• Guha’s MCF (~94)
• XML+MCF=>RDF (~96)
• RDF+OO=>RDFS (~99)
• RDFS+KR=>DAML+OIL (00)
• W3C’s SW activity (01)
• W3C’s OWL (03)

45 W3C’s Semantic Web Goals
Focus on machine consumption: "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Berners-Lee, Hendler and Lassila, The Semantic Web, Scientific American, 2001

46 TBL’s semantic web vision

47 Why is this hard?
After Frank van Harmelen and Jim Hendler

48 What a web page looks like to a machine…
After Frank van Harmelen and Jim Hendler

49 OK, so HTML is not helpful
Maybe we can tell the machine what the different parts of the text represent?
Figure labels: title, time, speaker, location, abstract, biosketch, host

50 XML to the rescue?
XML fans propose creating an XML tag set to use for each application. For talks, we can choose,, etc.
After Frank van Harmelen and Jim Hendler

51 XML ≠ machine accessible meaning
But, to your machine, the tags still look like this…. The tag names carry no meaning. XML DTDs and Schemas have little or no semantics.
After Frank van Harmelen and Jim Hendler

52 XML Schema helps
XML Schemas provide a simple mechanism to define shared vocabularies.
Figure: XML Schema file
After Frank van Harmelen and Jim Hendler

53 But there are many schemas
Figure: XML Schema file 1, XML Schema file 42
After Frank van Harmelen and Jim Hendler

54 There’s no way to relate schemas
Either manually or automatically. XML Schema is weak on semantics.
Figure: XML Schema file 1, XML Schema file 42

55 An Ontology level is needed
We need a way to define ontologies in XML
• So we can relate them
• So machines can understand (to some degree) their meaning
Ontologies add:
• Structure
• Constraints
• Mappings
Figure: XML Ontology 1, XML Ontology 42, XML Ontology 256; imports; = <>

56 What kind of Ontologies? From controlled vocabularies to Cyc
Figure (after Deborah L. McGuinness, Stanford): a spectrum from simple taxonomies to expressive ontologies: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value restriction; Disjointness, inverse, part of…; General logical constraints.
Systems placed along the spectrum: Wordnet, CYC, RDF, DAML, OO, DB Schema, RDFS, IEEE SUO, OWL, UMLS.

57 Dublin Core: a simple ontology
• Developed by an OCLC workshop in Dublin ~95 as a metadata standard for digital library resources on the web
• 15 core attributes
• Neutral on representation
• Available as an RDF schema
15 DC elements:
• Content elements: Coverage, Description, Relation, Source, Subject, Title, Type
• Intellectual Property: Contributor, Creator, Publisher, Rights
• Instantiation: Date, Format, Identifier, Language

58 Cyc – a complex ontology
• Cyc is a large, general purpose ontology with a reasoning engine, developed since ~1983 by MCC and Cycorp
  - Cyc KB has > 100k terms.
  - Terms are axiomatized by > 1M handcrafted assertions
  - Cyc inference engine has > 500 heuristic level modules
• Goal: encode “common sense” knowledge for general applications (e.g., NLP)
• Cyc in OWL:

59 Today and tomorrow
• Simple ontologies like FOAF & DC are in use today
  - We’ve crawled more than 1M FOAF RDF files
• We hope to be able to make effective use of ontologies like Cyc in the coming decade
  - There are skeptics …
  - It’s a great research topic …
• The SW community has a roadmap and some experimental languages …
• Industry is still holding back…
  - They are being conservative
• We need more experimentation and exploration

60 The Semantic Web Wave

61 Semantic Web Languages

62 Semantic web languages today
• Today there are, IOHO, three semantic web languages
  - RDF – Resource Description Framework
  - DAML+OIL – DARPA Agent Markup Language (deprecated)
  - OWL – Web Ontology Language
• Topic maps (http://topicmaps.org/) are another species, not based on RDF
• With more to come? …

63 RDF is the first SW language
RDF is a simple language for building graph based representations.
Figure: the RDF data model seen three ways:
• XML encoding: good for machine processing
• Graph: good for human viewing
• Triples: good for reasoning
  stmt(docInst, rdf_type, Document)
  stmt(personInst, rdf_type, Person)
  stmt(inroomInst, rdf_type, InRoom)
  stmt(personInst, holding, docInst)
  stmt(inroomInst, person, personInst)

64 The RDF Data Model
• An RDF document is an unordered collection of statements, each with a subject, predicate and object (aka triples)
• A triple can be thought of as a labelled arc in a graph
• Statements describe properties of web resources
• A resource is any object that can be pointed to by a URI:
  - a document, a picture, a paragraph on the Web, …
  - a book in the library, a real person (?)
  - E.g., isbn:// ……
• Properties themselves are also resources (URIs)

65 URIs are a foundation
• URI = Uniform Resource Identifier
  - "The generic set of all names/addresses that are short strings that refer to resources"
  - URLs (Uniform Resource Locators) are a subset of URIs, used for resources that can be accessed on the web
• URIs look like “normal” URLs, often with fragment identifiers to point to a document part
• URIs are unambiguous, unlike natural language terms
  - the web provides a global namespace
  - We assume references to the same URI are to the same thing

66 What does a URI mean?
• Sometimes URIs denote a web resource
  - E.g., a URI can denote a file
  - We can use RDF to make assertions about the resource, e.g., it’s an image and depicts a person with name Tim Finin, …
• Sometimes URIs denote concepts in the external world
  - E.g., a URI can denote a particular university located in Baltimore
  - This is done by social convention

67 The RDF Graph
• An RDF document is an unordered collection of triples
• The subject of one triple can be the object of another
• So the result is a directed, labelled graph
• A triple’s object can also be a literal, e.g., a string.
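
The graph view can be sketched in a few lines of plain Python. This is only an illustration: the `http://example.org/` namespace, the property names and the `Literal` wrapper are hypothetical, and a real application would normally use an RDF library rather than bare tuples.

```python
from collections import defaultdict, namedtuple

# A literal is a leaf value (e.g. a string), not a node that can have arcs.
Literal = namedtuple("Literal", "value")

EX = "http://example.org/"  # hypothetical namespace, for illustration only

# An unordered collection of (subject, predicate, object) triples.
triples = {
    (EX + "talk", EX + "creator", EX + "tim"),
    (EX + "tim", EX + "name", Literal("Tim Finin")),
    (EX + "tim", EX + "affiliation", EX + "umbc"),
}


def as_graph(triples):
    """Index triples as a directed, labelled graph: node -> [(label, node)]."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def reachable(graph, start):
    """Resources reachable from `start` by following arcs; literals are leaves."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, target in graph.get(node, []):
            if not isinstance(target, Literal) and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen
```

Here the resource `tim` is the object of one triple and the subject of two others, which is what chains the individual statements into a graph.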

68 Simple RDF Example
Figure: an RDF graph with the node ~finin/talks/idm02/, the literals “Intelligent Information Systems on the Web and in the Aether” and “Tim Finin”, and arcs labelled dc:Title, dc:Creator, bib:Aff, bib:name, bib:

69 XML encoding for RDF

70 N triple representation
• RDF can be encoded as a set of triples.
  "Intelligent Information Systems on the Web and in the Aether".
  _:j10949 "Tim Finin".
  _:j10949 _:j
• Note the gensym for the anonymous node _:j10949
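
A minimal sketch of how such lines are produced: each term is written as `<uri>`, `_:label` or a quoted string, and each statement ends with a period. The `(kind, value)` term encoding and the `example.org` URIs are assumptions for illustration, and only the most common string escapes are handled.

```python
def term(t):
    """Render one (kind, value) term in N-Triples syntax."""
    kind, value = t
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":  # anonymous node, written with a generated label
        return f"_:{value}"
    if kind == "literal":
        escaped = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        return f'"{escaped}"'
    raise ValueError(f"unknown term kind: {kind}")


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


# Hypothetical URIs; the slide's own URIs are not reproduced here.
example = [
    (("uri", "http://example.org/talk"),
     ("uri", "http://purl.org/dc/elements/1.1/creator"),
     ("bnode", "j10949")),
    (("bnode", "j10949"),
     ("uri", "http://example.org/name"),
     ("literal", "Tim Finin")),
]
```

Calling `print(to_ntriples(example))` prints one statement per line, with the anonymous node appearing as `_:j10949` in both.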

71 Triple Notes
• RDF triples have one of two forms:
  - <URI, URI, URI>
  - <URI, URI, quoted string>
• Triples are also easily mapped into logic
  - <subject, predicate, object> becoming predicate(subject, object)
  - With type(S, O) becoming O(S)
  - Example:
    subclass(man,person)
    sex(man,male)
    domain(sex,animal)
    man(adam)
    age(adam,100)
    Note: we’re not showing the actual URIs, for clarity.
• Triples are easily stored and managed in a DBMS
  - The flat nature of a triple is a good match for relational DBs
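
Both notes can be tried with the standard library alone. The sketch below maps the example triples to logical atoms and stores them in an in-memory SQLite table; the table name `triple`, the column names and the sample query are assumptions chosen for illustration.

```python
import sqlite3

# The slide's example, written as (subject, predicate, object) triples
# with short names instead of URIs.
triples = [
    ("man", "subclass", "person"),
    ("man", "sex", "male"),
    ("sex", "domain", "animal"),
    ("adam", "type", "man"),
    ("adam", "age", "100"),
]


def to_atom(subject, predicate, obj):
    """Map a triple to a logical atom; type(S, O) becomes O(S)."""
    if predicate == "type":
        return f"{obj}({subject})"
    return f"{predicate}({subject},{obj})"


atoms = [to_atom(*t) for t in triples]

# A single three-column table is enough to hold any set of triples.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE triple (s TEXT, p TEXT, o TEXT)")
db.executemany("INSERT INTO triple VALUES (?, ?, ?)", triples)

# Self-join: individuals whose type is declared a subclass of person.
rows = db.execute(
    "SELECT t.s FROM triple AS t JOIN triple AS c ON t.o = c.s "
    "WHERE t.p = 'type' AND c.p = 'subclass' AND c.o = 'person'"
).fetchall()
```

The query shows the usual cost of the flat layout: every extra hop in the graph becomes another self-join on the same table.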

72 N3 notation for RDF
• N3 is a compact notation for RDF that is easier for people to read, write and edit.
• Aka Notation 3, developed by TBL himself.
• Translators exist between N3 and the XML encoding, including a web form.
• So, it’s just “syntactic sugar”
• But XML is largely unreadable and even harder to write

73 N3
rdf: dc: bib:
dc:title "Intelligent Information Systems on the Web and in the Aether" ;
dc:creator [ bib:Name "Tim Finin"; bib: bib:Aff: "http://umbc.edu/" ].

74 A use case: FOAF
• FOAF (Friend of a Friend) is a simple ontology to describe people and their social networks.
  - See the FOAF project page
• We recently crawled the web and discovered over 1,000,000 valid RDF FOAF files.
  - Most of these are from the blogging system which encodes basic user info in FOAF
  - See Tim Finin 2410…37262c252e

75 FOAF Vocabulary
• Basics: Agent, Person, name, nick, title, homepage, mbox, mbox_sha1sum, img, depiction (depicts), surname, family_name, givenname, firstName
• Personal Info: weblog, knows, interest, currentProject, pastProject, plan, based_near, workplaceHomepage, workInfoHomepage, schoolHomepage, topic_interest, publications, geekcode, myersBriggs, dnaChecksum
• Documents & Images: Document, Image, PersonalProfileDocument, topic (page), primaryTopic, tipjar, sha1, made (maker), thumbnail, logo
• Projects & Groups: Project, Organization, Group, member, membershipClass, fundedBy, theme
• Online Accts: OnlineAccount, OnlineChatAccount, OnlineEcommerceAccount, OnlineGamingAccount, holdsAccount, accountServiceHomepage, accountName, icqChatID, msnChatID, aimChatID, jabberID, yahooChatID

76 FOAF: why RDF? Extensibility!
• FOAF vocabulary provides 50+ basic terms for making simple claims about people
• FOAF files can use other RDF terms too: RSS, MusicBrainz, Dublin Core, Wordnet, Creative Commons, blood types, starsigns, …
• RDF guarantees freedom of independent extension
  - OWL provides fancier data-merging facilities
• Result: freedom to say what you like, using any RDF markup you want, and have RDF crawlers merge your FOAF documents with others’ and know when you’re talking about the same entities.
After Dan Brickley
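
One common way crawlers decide that two FOAF descriptions are about the same person is to compare `mbox_sha1sum` values, the SHA1 hash of the person's `mailto:` URI. The sketch below assumes that approach; the flat dictionary encoding, the `smush` helper and the `example.org` data are hypothetical.

```python
import hashlib
from collections import defaultdict


def mbox_sha1sum(email):
    """SHA1 hex digest of the mailto: URI, as used for foaf:mbox_sha1sum."""
    return hashlib.sha1(f"mailto:{email}".encode("utf-8")).hexdigest()


def smush(descriptions):
    """Merge descriptions that share an mbox_sha1sum into one record."""
    merged = defaultdict(lambda: defaultdict(set))
    for description in descriptions:
        key = description["mbox_sha1sum"]
        for prop, value in description.items():
            merged[key][prop].add(value)
    return merged


# Two independently published documents describing the same (made-up) person.
doc_a = {"mbox_sha1sum": mbox_sha1sum("alice@example.org"), "nick": "al"}
doc_b = {
    "mbox_sha1sum": mbox_sha1sum("alice@example.org"),
    "homepage": "http://example.org/~alice",
}
people = smush([doc_a, doc_b])
```

Neither document had to know about the other or agree on which properties to use; the shared identifying value is enough to combine them.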

77 No free lunch!
Consequence:
• We must plan for lies, mischief, mistakes, stale data, slander
• Dataset is out of control, distributed, dynamic
• Importance of knowing who-said-what
  - Anyone can describe anyone
  - We must record data provenance
  - Modeling and reasoning about trust is critical
• Legal, privacy and etiquette issues emerge
• Welcome to the real world
After Dan Brickley

78 More RDF Vocabulary
• RDF has terms for describing lists, bags, sequences, etc.
• RDF also can describe triples through reification
• Enabling statements about statements
  :john bdi:believes _:s.
  _:s rdf:type rdf:Statement.
  _:s rdf:subject.
  _:s rdf:predicate cat:salePrice.
  _:s rdf:object "19.95".
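
Reification follows a fixed pattern, so it is easy to generate. In this sketch the `reify` helper, the generated node labels and the subject `:item42` are hypothetical; the four `rdf:` terms are the ones shown on the slide.

```python
import itertools

_node_ids = itertools.count(1)


def reify(subject, predicate, obj):
    """Describe a triple with four triples about a fresh statement node."""
    node = f"_:s{next(_node_ids)}"
    description = [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    ]
    return node, description


# ":item42" is a made-up subject; the slide's own subject is not shown.
statement, description = reify(":item42", "cat:salePrice", '"19.95"')

# A statement about the statement: who believes it.
belief = (":john", "bdi:believes", statement)
```

The original triple is described rather than asserted, which is what lets the graph record that `:john` believes it without claiming that it is true.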

79 RDF is being used!
• RDF has a solid specification
• RDF is being used in a number of web standards
  - CC/PP (Composite Capabilities/Preference Profiles)
  - P3P (Platform for Privacy Preferences Project)
  - RSS (RDF Site Summary)
  - RDF Calendar (~ iCalendar in RDF)
• And in other systems
  - Netscape’s Mozilla web browser
  - Open directory (http://dmoz.org/)
  - Adobe products via XMP (eXtensible Metadata Platform)
  - Web communities: LiveJournal, Ecademy, and Cocolog
• We’ve found over 1.6M RDF documents on the web.

80 RDF Schema (RDFS)
• RDF Schema adds taxonomies for classes & properties
  - subClass and subProperty
• and some metadata.
  - domain and range constraints on properties
• Several widely used KB tools can import and export in RDFS
Figure: Stanford Protégé KB editor
• Java, open sourced
• extensible, lots of plug-ins
• provides reasoning & server capabilities

81 RDFS Vocabulary
RDFS introduces the following terms and gives each a meaning w.r.t. the RDF data model:
• Terms for classes: rdfs:Class, rdfs:subClassOf
• Terms for properties: rdfs:domain, rdfs:range, rdfs:subPropertyOf
• Special classes: rdfs:Resource, rdfs:Literal, rdfs:Datatype
• Terms for collections: rdfs:member, rdfs:Container, rdfs:ContainerMembershipProperty
• Special properties: rdfs:comment, rdfs:seeAlso, rdfs:isDefinedBy, rdfs:label

82 RDF and RDF Schema
Figure: an RDF graph with nodes u:Chair, g:Person, g:name, rdfs:Class, rdfs:Property and “John Smith”, and arcs labelled rdf:type, g:name, rdfs:subclassOf and rdfs:domain

83 RDFS Classes and Resources
• RDFS defines the terms of resources and classes:
  - everything in RDF is a “resource”
  - “classes” are also resources, but…
  - they are also a collection of possible resources (i.e., individuals)
• Relationships are defined among resources:
  - “typing”: an individual belongs to a specific class
  - “subclassing”: an instance of one is also an instance of the other
  - As in object-based programming, but the same resource can have several types
• “Type” and “subclass” are simple statements on resources
  - resources can be identified by URIs
  - i.e., these statements can be described in RDF, too!

84 Properties in RDFS
• Property is a special class (rdf:Property)
  - i.e., properties are also resources
• Properties are constrained by their range and domain
  - i.e., what individuals can be on the “left” or on the “right”
  - E.g., parentOf is a property with domain=person and range=person
• There is also a possibility for a “sub-property”
  - E.g., fatherOf is a subProperty of parentOf

85 Properties in RDFS
• Properties are also resources…
• So properties of properties can be expressed as… RDF properties
  - E.g.: the range of the property parentOf is Person
• The RDF statement P rdfs:range C
  - Has subject=P, predicate=rdfs:range, object=C
  - And means:
    P is a property
    C is a class instance
    when using P, the “object” must be an individual in C

86 RDFS supports simple inferences
• An RDF ontology plus some RDF statements may imply additional RDF statements.
• This is not true of XML.
• Note that this is part of the data model and not of the accessing or processing code.
Given:
  parent rdfs:domain person; rdfs:range person.
  mother rdfs:subPropertyOf parent; rdfs:range person.
  eve mother cain.
Implied:
  parent a property.
  person a class.
  woman subClass person.
  mother a property.
  eve a person; a woman; parent cain.
  cain a person.
New and Improved! 100% Better than XML!!
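
The inferences on this slide can be reproduced with a small forward-chaining loop over four RDFS rules (subproperty, domain, range and subclass). This is a sketch, not a complete RDFS reasoner. It also adds one assumption that is not in the slide's premises as transcribed: `mother rdfs:domain woman`, without which "eve a woman" does not follow. "woman subClass person" is not derived by these rules.

```python
TYPE, DOMAIN, RANGE = "rdf:type", "rdfs:domain", "rdfs:range"
SUBPROP, SUBCLASS = "rdfs:subPropertyOf", "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply four RDFS rules repeatedly until no new triple appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))      # s p o, p subPropertyOf q => s q o
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))   # s p o, p domain C => s a C
                if p2 == RANGE and s2 == p:
                    new.add((o, TYPE, o2))   # s p o, p range C => o a C
                if p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))   # s a C, C subClassOf D => s a D
        if new <= facts:
            return facts
        facts |= new


premises = {
    ("parent", DOMAIN, "person"),
    ("parent", RANGE, "person"),
    ("mother", SUBPROP, "parent"),
    ("mother", DOMAIN, "woman"),   # assumption added for this sketch
    ("mother", RANGE, "person"),
    ("eve", "mother", "cain"),
}
closure = rdfs_closure(premises)
```

After the loop, `closure` contains the implied statements about eve and cain alongside the original premises, for example `("eve", "parent", "cain")`.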

87 UMBC an Honors University in Maryland 87 N3
:. (Here’s how you declare a namespace. This defines the “empty prefix” as referring to “this document”.)
<> rdfs:comment “This is an N3 example”. (<> is an alias for the URI of this document.)
:Person a rdfs:Class. (“Person is a class”. The “a” syntax is sugar for the rdf:type property.)
:Woman a rdfs:Class; rdfs:subClassOf :Person. (“Woman is a class and a subclass of Person”. Note the ; syntax.)
:eve a :Woman; :age “100”. (“eve is a woman whose age is 100.”)
:sister a rdf:Property; rdfs:domain :Person; rdfs:range :Woman. (“sister is a property from Person to Woman”.)
:eve :sister [a :Woman; :age 98]. (“eve has a sister who is a 98 year old woman”. The brackets introduce an anonymous resource.)
:eve :believe {:eve :age “100”}. (“eve believes that her age is 100”. The braces introduce a reified triple.)
[is :spouse of [is :sister of :eve]] :age 99. :eve.:sister.:spouse :age 99. (“the spouse of the sister of eve is 99”.)

88 UMBC an Honors University in Maryland 88 Is RDF(S) better than XML? Q: For a specific application, should I use XML or RDF? A: It depends…  XML's model is  a tree, i.e., a strong hierarchy  applications may rely on hierarchy position  relatively simple syntax and structure  not easy to combine trees  RDF's model is  a loose collection of relations  applications may do “database”-like search  not easy to recover hierarchy  easy to combine relations in one big collection  great for the integration of heterogeneous information
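
The "easy to combine relations" point can be illustrated with a small sketch, assuming each source is just a set of triples. The two sources, the `ex:` names and the helper functions `merge` and `match` are hypothetical.

```python
def merge(*graphs):
    """RDF-style merge: the union of all triples; duplicates collapse."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def match(graph, s=None, p=None, o=None):
    """Database-like lookup over triples; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Two hypothetical sources that describe the same resource.
staff_db = {
    ("ex:finin", "ex:worksAt", "ex:UMBC"),
    ("ex:finin", "ex:name", "Tim Finin"),
}
talks_db = {
    ("ex:talk42", "ex:speaker", "ex:finin"),
    ("ex:finin", "ex:name", "Tim Finin"),
}

combined = merge(staff_db, talks_db)
print(match(combined, s="ex:finin"))   # everything said about ex:finin
print(match(combined, o="ex:finin"))   # everything that points at ex:finin
```

Because both sources use the same identifier for the person, no schema alignment step is needed before querying the combined set. Merging two XML trees would first require deciding where one tree attaches to the other.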

89 UMBC an Honors University in Maryland 89 From where will the markup come?  A few authors will add it manually.  More will use annotation tools.  SMORE: Semantic Markup, Ontology and RDF Editor  Intelligent processors (e.g., NLP) can understand documents and add markup (hard)  Machine learning powered information extraction tools show promise  Lots of web content comes from databases & we can generate SW markup along with the HTML  See

90 UMBC an Honors University in Maryland 90 From where will the markup come?  In many tools, part of the metadata information is present, but thrown away at output  e.g., a business chart can be generated by a tool…  …it “knows” the structure, the classification, etc. of the chart  …but, usually, this information is lost  …storing it in metadata is easy!  So “semantic web aware” tools can produce lots of metadata  E.g., Adobe’s use of its XMP platform

91 UMBC an Honors University in Maryland 91 Problems with RDFS  RDFS is too weak to describe resources in sufficient detail, e.g.:  No localised range and domain constraints: can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants  No existence/cardinality constraints: can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents  No transitive, inverse or symmetrical properties: can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical  We need RDF terms providing these and other features.

92 UMBC an Honors University in Maryland 92 We’re going down a familiar road KR trends  55-65: arbitrary data structures  65-75: semantic networks  75-85: simple frame systems  85-95: description logics  95-??: logic?, rules? Web trends  95-97: XML as arbitrary structures  97-98: RDF  98-99: RDFS (schema) as a frame-like system  00-01: DAML+OIL  02-??: OWL…???... Only much faster!

93 UMBC an Honors University in Maryland 93 DAML+OIL = RDF + KR  DAML = DARPA Agent Markup Language  DARPA program with 17 projects & an integrator developing language spec, tools, applications for SW.  OIL = Ontology Inference Layer  An EU effort aimed at developing a layered approach to representing knowledge on the web.  Process  Joint Committee: US DAML and EU Semantic Web Technologies participants  DAML+OIL specs released in 2001  See  Includes model theoretic and axiomatic semantics

94 UMBC an Honors University in Maryland 94 W3C’s Web Ontology Language (OWL)  DAML+OIL begat OWL.  OWL released as W3C recommendation 2/10/04  See for OWL overview, guide, specification, test cases, etc.  Three sublanguages of OWL are defined, in decreasing order of complexity and expressiveness  OWL Full is the whole thing  OWL DL (Description Logic) introduces restrictions  OWL Lite is an entry level language intended to be easy to understand and implement

95 UMBC an Honors University in Maryland 95 OWL  RDF  An OWL ontology is a set of RDF statements  OWL defines semantics for certain statements  Does NOT restrict what can be said -- documents can include arbitrary RDF  But no OWL semantics for non-OWL statements  Adds capabilities common to description logics:  cardinality constraints, defined classes (=> classification), equivalence, local restrictions, disjoint classes, etc.  More support for ontologies  Ontology imports ontology, versioning, …  But not (yet) variables, quantification, & rules  A complete OWL reasoner is significantly more complex than a complete RDFS reasoner.

96 UMBC an Honors University in Maryland 96 OWL is based on Description Logic  Include a few slides here on the basics of DL

97 UMBC an Honors University in Maryland 97 OWL Class Constructors borrowed from Ian Horrocks

98 UMBC an Honors University in Maryland 98 OWL Axioms borrowed from Ian Horrocks

99 UMBC an Honors University in Maryland 99 OWL Language  Three species of OWL  OWL Full is union of OWL syntax and RDF  OWL DL restricted to FOL fragment (≈ DAML+OIL)  OWL Lite is “simpler” subset of OWL DL  Semantic layering  OWL DL semantics coincide with OWL Full semantics within the DL fragment  OWL DL based on the SHIQ family of Description Logics (it corresponds to SHOIN(D))  OWL DL benefits from many years of DL research  Well defined semantics  Formal properties well understood (complexity, decidability)  Known reasoning algorithms  Implemented systems (highly optimised)

100 UMBC an Honors University in Maryland 100 OWL Lite Features  RDF Schema Features  Class, rdfs:subClassOf, Individual  rdf:Property, rdfs:subPropertyOf  rdfs:domain, rdfs:range  Equality and Inequality  equivalentClass, equivalentProperty, sameAs (sameClassAs, samePropertyAs, sameIndividualAs in earlier drafts)  differentFrom (differentIndividualFrom in earlier drafts)  Restricted Cardinality  minCardinality, maxCardinality (restricted to 0 or 1)  cardinality (restricted to 0 or 1)  Property Characteristics  inverseOf, TransitiveProperty, SymmetricProperty  FunctionalProperty (unique), InverseFunctionalProperty  Property Restrictions  allValuesFrom, someValuesFrom (universal and existential local range restrictions)  Datatypes  Following the decisions of RDF Core.  Header Information  imports, Dublin Core Metadata, versionInfo
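
To make the property characteristics concrete, here is a minimal sketch of what a reasoner may conclude from inverseOf, TransitiveProperty and SymmetricProperty declarations. It assumes facts are plain tuples; the function name `apply_property_axioms` and the example individuals are hypothetical, and a real OWL reasoner does considerably more.

```python
def apply_property_axioms(facts, transitive=(), symmetric=(), inverse_of=None):
    """Saturate a set of (s, p, o) facts under three kinds of property axioms."""
    known = set(facts)
    inverse = dict(inverse_of or {})
    # inverseOf works in both directions, so add the reverse mapping as well.
    inverse.update({q: p for p, q in list(inverse.items())})
    while True:
        new = set()
        for s, p, o in known:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse:
                new.add((o, inverse[p], s))
            if p in transitive:
                new.update((s, p, o2) for s2, p2, o2 in known if p2 == p and s2 == o)
        if new <= known:
            return known
        known |= new


facts = {
    ("finger", "isPartOf", "hand"),
    ("hand", "isPartOf", "arm"),
    ("hand", "touches", "table"),
}
saturated = apply_property_axioms(
    facts,
    transitive={"isPartOf"},
    symmetric={"touches"},
    inverse_of={"hasPart": "isPartOf"},
)
for fact in sorted(saturated - facts):
    print(fact)
```

The example reuses the isPartOf, hasPart and touches properties from the "Problems with RDFS" slide: none of the printed facts could be derived with RDFS alone.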

101 UMBC an Honors University in Maryland 101 OWL Features  Class Axioms  oneOf (enumerated classes)  disjointWith  equivalentClass (formerly sameClassAs) applied to class expressions  rdfs:subClassOf applied to class expressions  Boolean Combinations of Class Expressions  unionOf  intersectionOf  complementOf  Arbitrary Cardinality  minCardinality  maxCardinality  cardinality  Filler Information  hasValue: descriptions can include specific value information

102 UMBC an Honors University in Maryland 102 OWL Ontologies  The owl:Ontology class describes an ontology  An ontology file should contain one instance of owl:Ontology  Ontology properties include  owl:imports, owl:versionInfo, owl:priorVersion  owl:backwardCompatibleWith, owl:incompatibleWith  rdfs:label, rdfs:comment can also be used  Deprecation control classes:  owl:DeprecatedClass, owl:DeprecatedProperty types

103 UMBC an Honors University in Maryland 103 Classes in OWL  In RDFS, you can subclass existing classes…  … but, otherwise, that is all you can do  In OWL, you can construct classes from existing ones by  enumerating their members  taking the intersection, union or complement of other classes  applying property restrictions  To do so, OWL introduces its own Class…  … and Thing to differentiate the individuals from the classes
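
The class constructors listed above can be mimicked with Python sets. This is only a closed-world illustration under the assumption that every class is the finite set of its known members; OWL itself is open-world and describes classes logically rather than by listing them. All individuals and names below are hypothetical.

```python
# Closed-world illustration: each class is the set of its known members.
thing = {"rex", "tom", "tweety", "alice", "bob"}
dog, cat, person = {"rex"}, {"tom"}, {"alice", "bob"}
has_owner = {("rex", "alice"), ("tweety", "bob")}  # (pet, owner) pairs


def some_values_from(prop, cls):
    """Individuals with at least one value of `prop` that lies in `cls`."""
    return {s for s, o in prop if o in cls}


house_pets = {"rex", "tom"}                   # oneOf: enumerate the members
mammal = dog | cat                            # unionOf
non_mammal = thing - mammal                   # complementOf, relative to thing
owned = some_values_from(has_owner, person)   # property restriction
owned_mammal = mammal & owned                 # intersectionOf

print(sorted(owned_mammal))
print(sorted(non_mammal))
```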

104 UMBC an Honors University in Maryland 104 A Simple OWL Example  Note the mixture of rdf (plant and animal are classes) and OWL (plant and animal are disjoint)

105 UMBC an Honors University in Maryland 105 OWL in One Slide  OWL is built on top of XML and RDF (namespace xmlns:owl="http://www.w3.org/2002/07/owl#")  It allows the definition, sharing, composition and use of ontologies  OWL is ~= a frame based knowledge representation language  It can be used to add metadata about anything which has a URI  everything has a URI  URIs are a W3C standard generalizing URLs  Example statement: Finin is a person.

106 UMBC an Honors University in Maryland 106 Semantic Web Services

107 UMBC an Honors University in Maryland 107 Semantic Web Services  A few slides on semantic web services

108 UMBC an Honors University in Maryland 108 Semantic web applications

109 UMBC an Honors University in Maryland 109 Two kinds of systems  RDF is being used to support many practical, useful applications  E.g., RSS, CC/PP, P3P, FOAF tools  RDF and OWL are being experimented with in many research prototypes  We’ll describe some research at UMBC  SWAD-Europe survey:   lists more than 50 applications in 12 categories…

110 UMBC an Honors University in Maryland 110 RSS  Rich Site Summary or RDF Site Summary  A lightweight multipurpose extensible metadata description & syndication format for the web  news & other headline syndication  weblog syndication  propagation of software update lists  Example shown: an RSS channel for UMBC AgentWeb (title, language en-us, copyright, …)

111 UMBC an Honors University in Maryland 111 CC/PP  Composite Capability/Preference Profiles  RDF-based W3C recommended standard for customizing web content for devices and users  It is a client profile data format  For describing device capabilities and user preferences  Enables adaptation of content presented to that device  It is not a standard explaining how the profile is transferred, or what attributes must be generated or recognized

112 UMBC an Honors University in Maryland 112 P3P: Platform for Privacy Preferences Project A W3C standard Web sites publish privacy practices in a standard computer-readable format. Enables tools built into browsers or separate applications that summarize privacy policies, compare privacy policies with user preferences, alert and advise users. Doesn’t require web-sites to change their server software. P3P support is in IE, Netscape

113 UMBC an Honors University in Maryland 113 ITTALKS ITTALKS is a database driven web site of IT related talks at UMBC and other institutions. The database contains information on –Seminar events –People (speakers, hosts, users, …) –Places (rooms, institutions, …) Web pages with DAML markup are generated. The DAML markup supports agent-based services relating to these talks.  Users get talk announcements based on their interests, locations and schedules.

114 UMBC an Honors University in Maryland 114 human view

115 UMBC an Honors University in Maryland 115 machine view

116 UMBC an Honors University in Maryland 116 ITTALKS Architecture (diagram)
Components: Web server + Java servlets (Apache Tomcat); DAML reasoning engine; DAML files; Agents; Databases (RDBMS, DB); People; Web Services (MapBlast, CiteSeer, Google, …)
Protocols shown on the links: HTML, SMS, WAP; FIPA ACL, KQML, DAML; SQL; HTTP, KQML, DAML, Prolog; HTTP; HTTP, WebScraping

117 UMBC an Honors University in Maryland 117 Travel Agent Game in Agentcities
Motivation: Market dynamics; Auction theory (TAC); Semantic web; Agent collaboration (FIPA & Agentcities)
Technologies: FIPA (JADE, April Agent Platform); Semantic Web (RDF, OWL); Web (SOAP, WSDL, DAML-S); Internet (Java Web Start)
Features: Open Market Framework; Auction Services; OWL message content; OWL Ontologies; Global Agent Community
Agents: Travel Agents; Auction Service Agent; Customer Agent; Bulletin Board Agent; Market Oversight Agent; Web Service Agents
Interaction labels (diagram): Request Direct Buy Report Direct Buy Transactions Bid CFP Report Auction Transactions Report Travel Package Report Contract Proposal
Ontologies: travel.owl – travel concepts; fipaowl.owl – FIPA content lang.; auction.owl – auction services; tagaql.owl – query language
FIPA platform infrastructure services, including directory facilitators enhanced to use DAML-S for service discovery
Acknowledgements: DARPA contract F and Fujitsu Laboratories of America. Students: Y. Zou, L. Ding, H. Chen, R. Pan. Faculty: T. Finin, Y. Peng, A. Joshi, R. Cost. 4/03

118 UMBC an Honors University in Maryland 118  Our research group’s web site generates both HTML and OWL.  HOW? This is relatively easy since the content is in a database.  PHP is sufficient for the job.  HTML pages have links to corresponding OWL  WHY? This exposes the information to programs and agents, so there is no more need for web scraping.
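
A rough sketch of the idea: the same database rows feed both a human view and a machine view. The slide mentions PHP and OWL; this illustration uses Python and the simpler N-Triples serialization of RDF instead. The base URI, the row layout and the function names are assumptions, while the FOAF and RDF namespace URIs are the published ones.

```python
from html import escape

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/people/"  # hypothetical base URI

# Stand-in for rows fetched from the site's database.
PEOPLE = [
    {"id": "finin", "name": "Tim Finin"},
    {"id": "chen", "name": "Harry Chen"},
]


def to_html(rows):
    """Human view: a list of links, one per person."""
    items = "".join(
        f'<li><a href="{BASE}{escape(row["id"])}">{escape(row["name"])}</a></li>'
        for row in rows
    )
    return f"<ul>{items}</ul>"


def to_ntriples(rows):
    """Machine view: two RDF statements per person, in N-Triples."""
    lines = []
    for row in rows:
        subject = f"<{BASE}{row['id']}>"
        name = row["name"].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f"{subject} <{RDF_TYPE}> <{FOAF}Person> .")
        lines.append(f'{subject} <{FOAF}name> "{name}" .')
    return "\n".join(lines)


print(to_html(PEOPLE))
print(to_ntriples(PEOPLE))
```

Since both views come from the same rows, they cannot drift apart, and an agent can read the triples directly instead of parsing the HTML.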

119 UMBC an Honors University in Maryland mobile & pervasive computing uses

120 UMBC an Honors University in Maryland 120 To be supplied

121 UMBC an Honors University in Maryland example ontologies

122 UMBC an Honors University in Maryland 122 To be supplied

123 UMBC an Honors University in Maryland Ontology Engineering Adapted from “Ontology Development 101: A Guide to Creating Your First Ontology” by Natalya F. Noy and Deborah L. McGuinness

124 UMBC an Honors University in Maryland 124 Engineering Ontologies vs. …  Designing an ontology is different from designing a DB schema or an OO design  Ontologies tend to  Reflect the structure of the world and/or the way we think about it  Focus on the abstract structure of concepts  Ignore the “physical” organization of information  DB or OO schemas tend to  Reflect the structure of the data, code, methods, …  Be concerned about the physical representation (int, double)  Be optimized for certain access assumptions  Ontology design is similar to designing a database schema  But using a richer representation language

125 UMBC an Honors University in Maryland 125 Ontology-Development Process  Noy and McGuinness recommend the following process: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances  Done iteratively, of course…

126 UMBC an Honors University in Maryland 126 Determine Domain and Scope Here are some questions to consider:  What is the domain that the ontology will cover?  For what are we going to use the ontology?  Will it be reused for different purposes?  Do we expect it to be extended?  What types of questions should the information in the ontology answer? Of course, the answers may change during the lifecycle.

127 UMBC an Honors University in Maryland 127 Consider Reuse  Why reuse other ontologies?  to save the effort  To avoid maintenance problems of redundant models  to interact with the tools that use other ontologies  to use ontologies that have been validated through use in applications  So, first check to see if you can reuse or extend all or part of another ontology  Check well known upper ontologies, ontology libraries and repositories, search engines  Upper ontologies (e.g., CYC, IEEE SUMO) are specifically designed for reuse

128 UMBC an Honors University in Maryland 128 Enumerate Important Terms  A good way to start is by writing down the terms that seem natural to use  And then organizing them into terms that seem to refer to  Class of objects (e.g., wine, whiteWine, seaFood)  Properties of objects (e.g., wineColor, winery)  Individual objects or values (e.g., “red”, “ChateauLafiteRothschild”)  Don’t obsess about distinguishing classes from individuals, or properties from classes  There’s time for that later

129 UMBC an Honors University in Maryland 129 Define Classes and the Class Hierarchy  A class is a concept in the domain and is modeled as a set of individuals  wines, whiteWines, wines made in California, etc.  An instance is an individual and may be a member of many classes  a glass of California wine you’ll have for lunch  Classes are organized into a taxonomy or hierarchy by the subclass (is-a) relation  Most ontology languages support “multiple inheritance” and do not allow shadowing or overriding  Some allow an object to be both a class and an individual  hondaCivic

130 UMBC an Honors University in Maryland 130 Define Properties of Classes – Slots  Slots in a class definition describe attributes of instances of the class and relations to other instances  E.g., each wine will have color, sugar content, producer, etc.  Things to think about: Is the slot defined at the right level?  Note: in most ontology languages, a class or individual inherits all the slots from all of its ancestors  Ontology languages differ as to whether or not slots are considered “first class objects” in their own right  Yes for the SW, no for most frame languages

131 UMBC an Honors University in Maryland 131 Constraints on properties  Property constraints (aka facets) describe or limit the set of possible values for a slot  Typical constraints: the type of value (e.g., a winery, an integer), the min and max number of values (e.g., exactly one wineColor), an enumerated set of individuals (“male”, “female”)  Some constraints involve pairs of properties (e.g., inverse, functional, …)  Some ontology languages allow other facets on properties, such as a default value.
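
The typical facets named above (value type, min and max number of values, enumerated values) can be checked mechanically. The following is a minimal sketch under the assumption that a slot's values are given as a Python list; `check_slot` and its parameters are hypothetical names, not part of any ontology tool.

```python
def check_slot(values, value_type=None, min_count=0, max_count=None, allowed=None):
    """Return a list of human-readable facet violations for one slot."""
    problems = []
    if len(values) < min_count:
        problems.append(f"expected at least {min_count} value(s), found {len(values)}")
    if max_count is not None and len(values) > max_count:
        problems.append(f"expected at most {max_count} value(s), found {len(values)}")
    for value in values:
        if value_type is not None and not isinstance(value, value_type):
            problems.append(f"{value!r} is not of type {value_type.__name__}")
        if allowed is not None and value not in allowed:
            problems.append(f"{value!r} is not one of {sorted(allowed)}")
    return problems


WINE_COLORS = {"red", "white", "rose"}

# "exactly one wineColor" means min_count=1 and max_count=1
print(check_slot(["red"], str, 1, 1, WINE_COLORS))
print(check_slot(["red", "white"], str, 1, 1, WINE_COLORS))
print(check_slot([], str, 1, 1, WINE_COLORS))
print(check_slot(["blue"], str, 1, 1, WINE_COLORS))
```

Note that this treats the constraints as closed-world integrity checks, which is how a frame system or database would use them. An OWL reasoner would instead treat a missing value as unknown rather than as an error.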

132 UMBC an Honors University in Maryland 132 Constraints on classes Some languages allow constraints on classes  Disjointness  The classes male and female are disjoint  i.e., no individual can be a member of both classes  Covering  The classes infant, child, teenager, adult cover their common superclass person  i.e., an individual of type person must belong to one or more of those subclasses  Definitions  Males who are not adults are necessarily boys
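
Disjointness and covering can be read as checks over instance data. A minimal sketch, assuming each individual is mapped to the set of classes it is asserted to belong to; the individuals and function names are hypothetical.

```python
def disjointness_violations(membership, class_a, class_b):
    """Individuals asserted to be in both of two disjoint classes."""
    return sorted(
        name for name, classes in membership.items()
        if class_a in classes and class_b in classes
    )


def covering_violations(membership, superclass, subclasses):
    """Members of the superclass that belong to none of the covering subclasses."""
    covering = set(subclasses)
    return sorted(
        name for name, classes in membership.items()
        if superclass in classes and not (classes & covering)
    )


membership = {
    "ann": {"person", "female", "adult"},
    "pat": {"person", "male", "female"},   # breaks disjointness
    "sam": {"person", "male"},             # breaks covering: no age group
}

print(disjointness_violations(membership, "male", "female"))
print(covering_violations(membership, "person", ["infant", "child", "teenager", "adult"]))
```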

133 UMBC an Honors University in Maryland 133 Create Instances  Ontologies tend to have many classes and few individuals  Unlike DBs which have few classes (tables) & many individuals (tuples)  The individuals tend to be  Enumerated data values (“white”, “rose”, “red”)  Individuals important to the domain (e.g., ChateauLafiteRothschild)

134 UMBC an Honors University in Maryland 134 Common Problems  Ontology design is part logic and part art  Some problems arise when assumptions are violated  I said all birds can fly, but I forgot about penguins  Cycles in the subClass hierarchy  Others are more heuristic  A class’s immediate subclasses should be at the same level of granularity  A class with just one subclass is suspect  A class with too many (10? 20?) subclasses is suspect  Don’t misuse the is-a relationship (e.g., using it for part-of)
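
Two of these problems are easy to detect automatically. The sketch below assumes the hierarchy is given as a dictionary from each class to its direct superclasses; `find_cycle` uses a standard depth-first search, and `only_children` flags classes with exactly one subclass. Both function names and the sample hierarchy are hypothetical.

```python
def find_cycle(subclass_of):
    """Return a list of classes forming a subclass cycle, or None if there is none."""
    visiting, done = set(), set()

    def visit(node, path):
        visiting.add(node)
        path.append(node)
        for parent in sorted(subclass_of.get(node, ())):
            if parent in visiting:                     # back edge: a cycle
                return path[path.index(parent):] + [parent]
            if parent not in done:
                found = visit(parent, path)
                if found:
                    return found
        path.pop()
        visiting.discard(node)
        done.add(node)
        return None

    for node in sorted(subclass_of):
        if node not in done:
            found = visit(node, [])
            if found:
                return found
    return None


def only_children(subclass_of):
    """Classes that have exactly one direct subclass (a design smell)."""
    children = {}
    for child, parents in subclass_of.items():
        for parent in parents:
            children.setdefault(parent, set()).add(child)
    return sorted(parent for parent, kids in children.items() if len(kids) == 1)


good = {"whiteWine": {"wine"}, "redWine": {"wine"}, "wine": {"drink"}}
bad = {"wine": {"drink"}, "drink": {"liquid"}, "liquid": {"wine"}}

print(find_cycle(good))
print(find_cycle(bad))
print(only_children(good))
```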

135 UMBC an Honors University in Maryland 135 Validation and Evaluation  Ontologies expressed in well defined languages (e.g., OWL) can and should be checked for inconsistencies  Data (e.g., individuals) can be checked against their ontologies for inconsistencies  Evaluating the quality of an ontology’s design is another matter  Heuristics help identify areas of concern: too many/few children, named classes that can be instantiated, etc.  Richly axiomatized ontologies are generally better  Simple taxonomies may not be very valuable

136 UMBC an Honors University in Maryland 136 Ontology Maintenance  Ontology merging  Given two or more overlapping ontologies, create a new one  Ontology mapping  Create a mapping between ontologies  Versioning and evolution  Compatibility between different versions of the same ontology  Compatibility between versions of an ontology and instance data

137 UMBC an Honors University in Maryland Nuts and Bolts

138 UMBC an Honors University in Maryland 138 To be supplied  Harry’s material

139 UMBC an Honors University in Maryland Current research topics

140 UMBC an Honors University in Maryland 140 On the Research Frontier  Developing useful upper ontologies  For Time, Space, Services, …  Some standard problems  Ontology alignment and mapping  Learning ontologies  Extending OWL  Adding rules, uncertainty, …  Developing query languages  RDFQuery, DQL, …  Integrations with  Agents, web services, information retrieval, …  Efficient tools  Good applications

141 UMBC an Honors University in Maryland 141 Some OWL Ontologies  Research efforts are developing useful upper ontologies for …  SERVICES: OWL-S describes properties and capabilities of services   TIME: DAML-time covers temporal concepts and properties common to any formalization of time   SPACE: DAML-spatial covers spatial concepts and properties 

142 UMBC an Honors University in Maryland 142 SWRL Semantic Web Rule Language  There are some simple things that cannot be expressed in OWL  Example: defining the uncle relation  Many want a rule language extension to fill the gap  SWRL proposal   has an abstract syntax, model theory and XML encoding  Allows Horn-like rules to be added to an OWL KB  Hoolet is an integrated OWL & SWRL reasoner 

143 UMBC an Honors University in Maryland 143 Uncle in SWRL (partial)  English: Your parent’s brothers are your uncles.  Prolog: uncle(X,Y) :- hasParent(X,Z), hasBrother(Z,Y).  Abstract SWRL syntax: Implies( Antecedent(hasParent(I-variable(x1) I-variable(x2)) hasBrother(I-variable(x2) I-variable(x3))) Consequent(hasUncle(I-variable(x1) I-variable(x3))))
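
For readers more comfortable with code than with rule syntax, the same rule can be sketched as a join over two relations. This assumes each relation is a set of pairs; the individuals are hypothetical, and a real SWRL engine would of course work over an OWL knowledge base rather than Python sets.

```python
def infer_uncles(has_parent, has_brother):
    """hasParent(x, z) and hasBrother(z, y) imply hasUncle(x, y)."""
    return {
        (child, uncle)
        for child, parent in has_parent
        for sibling, uncle in has_brother
        if parent == sibling
    }


has_parent = {("cain", "adam"), ("abel", "adam"), ("enoch", "cain")}
has_brother = {("cain", "abel"), ("abel", "cain")}

has_uncle = infer_uncles(has_parent, has_brother)
print(sorted(has_uncle))
```

The shared variable (Z in the Prolog version, x2 in SWRL) becomes the equality test `parent == sibling` that links the two relations.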

144 UMBC an Honors University in Maryland 144 XML Encoding of SWRL Uncle Rule x1 x2 x2 x3 x1 x3

146 146 Swoogle
Swoogle is a crawler based search & retrieval system for semantic web documents (SWDs) in RDF, OWL and DAML. It discovers SWDs and computes their metadata and relations, and stores them in an IR system.
Contributors include Tim Finin, Anupam Joshi, Yun Peng, R. Scott Cost, Jim Mayfield, Joel Sachs, Pavan Reddivari, Vishal Doshi, Rong Pan, Li Ding, and Drew Ogle. Partial research support was provided by DARPA contract F and by NSF by awards NSF-ITR-IIS and NSF-ITR-IDM May
Architecture (diagram labels): Web; SWD crawler; Focused Crawler; Google; Ontology discovery; cached files; Ontology Analyzer; Ontology Agents; DB; mySQL; php, myAdmin; Jena; IR engine; SIRE; APIs; Web interface; Web services; Agent services; Apache/Tomcat
Swoogle uses two kinds of crawlers to discover semantic web documents and several analysis agents to compute metadata and relations among documents and ontologies. Metadata is stored in a relational DBMS.
The web, like Gaul, is divided into three parts: the regular web (e.g. HTML), Semantic Web Ontologies (SWOs), and Semantic Web Instance files (SWIs). SWD = SWO + SWI. (Diagram labels: SWOs, SWIs, HTML documents, Images, CGI scripts, Audio files, Video files.)
SWD Rank: A SWD’s rank is a function of its type (SWO/SWI) and the rank and types of the documents to which it’s related.
SWD Properties: Language and level; encoding, number of triples, defined classes, defined properties, & defined individuals; type (SWO, SWI); form (RSS, FOAF, P3P, …); rank; weight; annotations; …
SWD Relations
Binary: R(D1,D2)
IM: D1 owl:imports D2
IMstar: transitive closure of IM
EX: D1 extends D2 by defining classes or properties subsumed by D2’s
PV: owl:priorVersion & subproperties
TM: D1 uses terms from D2
IN: D1 uses individual defined in D2
MAP: D1 maps some of its terms to D2’s
SIM: D1 & D2 are similar
EQ: D1 & D2 are identical
EQV: D1 & D2 have the same triples
Ternary: R(D1,D2,D3)
MP3: D1 maps a term from D2 to D3 using owl:sameClass, etc.
SWD IR Engine: Swoogle puts documents into a character n-gram based IR engine to compute document similarity and do retrieval from queries.
Swoogle v1 has ~12K SWDs & 100K relations. v2 will also catalog classes and properties and their metadata and have >1.6M SWDs.
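
The IMstar relation above is the transitive closure of the direct owl:imports relation IM. A minimal sketch of how it might be computed, assuming IM is given as a dictionary from each document to the documents it imports directly. The document names and the function name are hypothetical, and nothing here reflects how Swoogle itself is implemented.

```python
def imports_closure(imports):
    """Map each document to every document it imports, directly or indirectly."""
    closure = {}
    for doc in imports:
        reachable = set()
        stack = list(imports.get(doc, ()))
        while stack:
            current = stack.pop()
            if current not in reachable:
                reachable.add(current)
                stack.extend(imports.get(current, ()))
        closure[doc] = reachable
    return closure


# IM: direct owl:imports links between hypothetical documents
IM = {
    "travel.owl": {"time.owl", "space.owl"},
    "time.owl": {"upper.owl"},
    "space.owl": {"upper.owl"},
    "upper.owl": set(),
}

IMSTAR = imports_closure(IM)
print(sorted(IMSTAR["travel.owl"]))
```

The `reachable` check also keeps the traversal finite when two documents import each other.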

147 147 Knowledge Discovery in the Semantic Web: SEMDIS
NSF award ITR-IIS U. Georgia, Sheth (PI), Arpinar (CO-PI), Kochut, Miller. NSF award ITR-IIS UMBC, Joshi (PI), Yesha (CO-PI), Finin. June 2004
Objective: Design, prototype and evaluate a system supporting the discovery, indexing and querying of complex semantic relationships in the Semantic Web. The system maintains and utilizes trust and provenance information to enhance the relationship discovery.
Approach: Knowledge representation systems reason over semantic web content discovered on the web, which is reduced to triples that can be efficiently stored and processed in relational databases. Trust models and heuristics guide the formation of conclusions.
Broader impacts: Techniques and prototypes developed can be applied to a range of problems, including discovering new connections and relations in scientific information and homeland security.
SWETO is a large ontology covering several test-bed domains. It is populated with 800K instances and 1.M relations extracted from heterogeneous Web sources. SWETO was developed using the Semagix Freedom system. An experimental algorithm has been developed to integrate and rank discovered relationships.
A “web of belief” model and associated ontology is used to represent, integrate, and evaluate conclusions drawn from the large volume of heterogeneous assertions found in the data. (Ontology diagram labels: Reference, foaf:Agent, rdf:Statement, selects, Justification, Trust, Belief, Association, contains, foaf:Document, rdf:Resource, foaf:page, DocumentRelation, xsd:real [0,1], AssociationConnective, confidence, connective, source.)
Network figure: Golbeck’s Trust Network, DBLP Network and FOAF Network, linked by mapTo and knows; node roles sink, hub, source, island; people shown include A. Joshi, L. Ding, H. Chen, P. Kolari, F. Perich, Y. Yesha, J. Golbeck, J. Hendler, Kagal, A. Sheth, M. P. Singh, Y. Peng, T. Finin.

148 148 SPIRE: Semantic Prototypes in Research Ecoinformatics
Research support was provided by NSF, award NSF-ITR-IIS , PI Tim Finin, UMBC.
Spire is a distributed, interdisciplinary research project exploring how semantic web technology supports information discovery, integration, and sharing in scientific communities. We are building prototype tools and applications for inclusion in the National Biological Information Infrastructure (NBII), with a focus on the early detection and warning of invasive species.
Approach: We are building prototype tools and applications that demonstrate how semantic web technology supports information discovery, integration and sharing in scientific communities. The National Biological Information Infrastructure (NBII) and Invasive Species Forecasting System (ISFS) provide requirements and serve as testbeds for our prototypes.
Significant Results: SWOOGLE - a search engine for the semantic web. MoaM (Meal of a Meal) - given a species list, infer a food web. Photostuff - annotate regions of a picture with OWL. SWOOP - the first ontology editor written specifically for OWL. Ontologies for ecological interaction and observation data. Food web visualization and analysis tools that are driven by OWL ontologies and instance data. CRISIS CAT - an RDF based catalog of Invasive Species resources in California. Coordination with USGS, NASA, EPA, GBIF, and the Intergovernmental, Interagency Cooperation on Ecoinformatics.
Broader Impacts: Enable knowledge from one community to be effectively used by another. Harness the power of the citizen scientist. (The majority of invasives are discovered by amateurs.) Integrate research and education in the classroom.
Research Team: UMBC ebiquity (Finin); UC Davis ICE (Quinn); UMBC GEST Center (Sachs); RMBL PEaCE (Martinez); UMD MINDSWAP (Hendler); NASA GSFC (Schnase)
Figure captions:
The RMBL team expresses food webs in OWL using an ontology for ecological interaction they have constructed in coordination with other ecologists. The OWL model drives the simulation and visualization.
Spatial distribution of exotic plants at the Cerro Grande fire site. The statistical techniques used to generate these maps do not take trophic data as input. Yet.
Swoogle is a crawler based search and retrieval system for semantic web documents (SWDs) in RDF and OWL. It discovers SWDs and computes their metadata and relations, and stores them in an IR system. Users can search for ontologies or instance data, and hits are ranked according to our Ontology Rank algorithm.
Invasive species do more economic damage to the U.S. every year than all other natural disasters combined. Above: plants, animals, and a virus.
An ontology (found via Swoogle) is loaded into Photostuff to mark up regions of a field photograph.
The NBII California Information Node (CAIN), maintained by UC Davis, is a jumping off point to broader NBII deployment.
Meal of a Meal (after Friend of a Friend). We know Fish 1 eats Plant 1. We then infer that Fish 1 may also eat the taxonomic siblings of Plant 1: Plants 2 and 3. Similarly, we infer that the taxonomic siblings of Fish 1 - Fishes 2 and 3 - may eat Plant 1.
Coming Soon: ELVIS – an end to end application that starts with a location and produces a model of its food web. The Pond Project - a junior high school classroom activity to monitor the health of local ecosystems. Enhanced tools.
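
The Meal of a Meal inference described on this poster can be sketched as follows, assuming the taxonomy is a dictionary from each species to its parent taxon and the known food web is a set of (eater, eaten) pairs. The species and taxon names and the function name are hypothetical, and the actual MoaM tool may work differently.

```python
def meal_of_a_meal(eats, taxon_parent):
    """Propose candidate feeding links from the taxonomic siblings of known ones."""

    def siblings(species):
        parent = taxon_parent.get(species)
        if parent is None:
            return set()
        return {s for s, p in taxon_parent.items() if p == parent and s != species}

    candidates = set()
    for eater, eaten in eats:
        candidates.update((eater, other) for other in siblings(eaten))
        candidates.update((other, eaten) for other in siblings(eater))
    return candidates - set(eats)   # report only links that are new


taxon_parent = {
    "fish1": "fishGenus", "fish2": "fishGenus", "fish3": "fishGenus",
    "plant1": "plantGenus", "plant2": "plantGenus", "plant3": "plantGenus",
}
known = {("fish1", "plant1")}

for link in sorted(meal_of_a_meal(known, taxon_parent)):
    print(link)
```

The results are hypotheses ("may eat"), not facts, so in practice they would be stored separately from the observed links.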

149 UMBC an Honors University in Maryland Closing

150 UMBC an Honors University in Maryland 150 comment  We need to rework the conclusions some to add some for mobile and pervasive computing

151 UMBC an Honors University in Maryland 151 SW is work in progress  There are important language aspects which need more work: rules, queries, etc.  Many tools need to be created, e.g.,  Better editing tools  Annotation tools  Applications need to be explored  SW ideas will migrate into other standards (e.g., basic XML, WSDL)

152 UMBC an Honors University in Maryland 152 Lots of Open Questions  How expressive should the KR language be?  What kind of KR/reasoning system?  F.O. logic, logic programming, fuzzy, …  On Web Ontologies  One (e.g. CYC) or many (OWL)?  If many, composable (IEEE IFF) or monolithic (IEEE SUMO)?  Will general “upper ontologies” (e.g., IEEE SUO) be useful?  Will industry buy in?  Or continue to explore ad hoc XML based solutions?  How will it be used?  As markup? As alternative content? For both machines and people?  Is it good as a content language for agents? => Only experimentation will yield answers.

153 UMBC an Honors University in Maryland 153 Speculations  SW might be a chance for us to get intelligent agents out of the lab  Solving the symbol grounding problem  Rethinking agent communication  How do we get there?

154 UMBC an Honors University in Maryland 154 The symbol grounding problem  An argument against human-like AI is that it’s impossible unless machines share our perception of the world.  A solution to this “symbol grounding problem” is to give agents (soft or hard) human inspired senses.  But the world we experience is determined by our senses, and human and machine bodies may lead to different conceptions of the world (cf. Nagel’s What Is It Like To Be a Bat? )  Maybe the Semantic Web is a way out of this problem? MIT’s Cog

155 UMBC an Honors University in Maryland 155 Solving the symbol grounding problem  The web may become a common world that both humans and agents can understand.  Confession: the web is more familiar and real to me than much of the real world.  Physical objects can be tagged with low cost (e.g., $0.05) transponders or RFIDs encoding their URIs  See HP’s Cooltown project

156 UMBC an Honors University in Maryland 156 Rethinking the agent communication  Much multi-agent systems work is grounded in Agent Communication Languages (e.g., KQML, FIPA) and associated software infrastructure.  This paradigm was articulated ~1990, about the same time as the WWW was developed.  Our MAS approach has not yet left the laboratory, yet the Web has changed the world.  Maybe we should try something different?  The MAS communication paradigm has been peer-to-peer, message-oriented communication mediated by brokers and facilitators -- an approach inherited from client-server systems.

157 UMBC an Honors University in Maryland 157 Rethinking the agent communication A possible new paradigm?  Agents “publish” beliefs, requests, and other “speech acts” on web pages.  Brokers “search” for and “index” published content  Agents “discover” what peers have published on the web and browse for more details  Agents “speak for” content on web pages by  Answering queries about them  Accepting comments and assertions about them
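As a rough illustration of the publish / index / discover cycle on this slide, here is a minimal in-memory sketch. Everything in it is an assumption made for the example: the `Web`, `Broker`, and `Statement` classes, the URLs, and the topic strings are hypothetical, and a dictionary stands in for real web pages and a real crawler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One published "speech act" (hypothetical structure)."""
    agent: str
    act: str      # e.g. "belief" or "request"
    topic: str
    content: str


class Web:
    """Stand-in for the web: maps a page URL to the statements on it."""

    def __init__(self):
        self.pages = {}

    def publish(self, url, statement):
        self.pages.setdefault(url, []).append(statement)


class Broker:
    """Crawls published pages and indexes them by topic."""

    def __init__(self):
        self.index = {}

    def crawl(self, web):
        self.index = {}
        for url, statements in web.pages.items():
            for statement in statements:
                self.index.setdefault(statement.topic, set()).add(url)

    def search(self, topic):
        return sorted(self.index.get(topic, set()))


def discover(web, broker, topic):
    """An agent asks the broker for pages, then browses them for details."""
    return [s for url in broker.search(topic)
            for s in web.pages[url] if s.topic == topic]


if __name__ == "__main__":
    web, broker = Web(), Broker()
    web.publish("http://example.org/a1", Statement("a1", "belief", "printing", "printer-3 is idle"))
    web.publish("http://example.org/a2", Statement("a2", "request", "printing", "need a color printer"))
    broker.crawl(web)
    for s in discover(web, broker, "printing"):
        print(s.agent, s.act, s.content)
```

Unlike message passing, no agent addresses another directly here; peers only meet through what has been published and indexed.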

158 UMBC an Honors University in Maryland 158 How do we get there from here?  This semantic web emphasizes ontologies – their development, use, mediation, evolution, etc.  It will take some time to really deliver on the agent paradigm, either on the Internet or in a pervasive computing environment.  The development of complex systems is basically an evolutionary process.  Random search carried out by tens of thousands of researchers, developers and graduate students.

159 UMBC an Honors University in Maryland 159 Climbing Mount Improbable “The sheer height of the peak doesn't matter, so long as you don't try to scale it in a single bound. Locate the mildly sloping path and, if you have unlimited time, the ascent is only as formidable as the next step.” -- Richard Dawkins, Climbing Mount Improbable, Penguin Books, 1996.

160 UMBC an Honors University in Maryland 160 T.T.T: things take time  Prior to the 1890’s, papers were held together with straight pins.  The development of “spring steel” allowed the invention of the paper clip in  It took about 25 years (!) for the evolution of the modern “gem paperclip”, considered to be optimal for general use.

161 UMBC an Honors University in Maryland 161 So, we should …  Start with the simple and move toward the complex  E.g., from vocabularies to FOL theories  Allow many ontologies to bloom  Let natural evolutionary processes select the most useful as common consensus ontologies.  Support diversity in ontologies  Monocultures are unstable  There should be no THE ONTOLOGY FOR X.  The evolution of powerful, machine readable ontologies will take many years, maybe generations  Incremental benefits will more than pay for effort

162 UMBC an Honors University in Maryland 162 Discussion

163 UMBC an Honors University in Maryland 163 Back Matter

164 UMBC an Honors University in Maryland 164 comment  References and links need to be integrated and have some more on mobile and pervasive computing stuff.

165 UMBC an Honors University in Maryland 165 References and Links Knowledge Representation basics  Ronald Brachman and Hector Levesque, Representation and Reasoning, Morgan Kaufmann, May This is a good book that covers a range of knowledge representation issues and approaches.  John Sowa, Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Publishing Co., Pacific Grove, CA, This book has a more philosophical approach. Ontologies in general  Gómez-Pérez, A. (1998). Knowledge sharing and reuse. Handbook of Applied Expert Systems. Liebowitz, editor, CRC Press.  Uschold, M. and Gruninger, M. (1996). Ontologies: Principles, Methods and Applications. Knowledge Engineering Review 11(2) Some very general ontologies  Cyc,  Christiane Fellbaum (Ed), WordNet: An Electronic Lexical Database, Bradford Books, This is a collection of papers on WordNet and some of its uses. See for access to the WordNet ontology in various forms.

166 UMBC an Honors University in Maryland 166 References and Links Ontologies for mobile and pervasive computing  SOUPA,  Feng Pan and Jerry R. Hobbs, Time in OWL-S, AAAI Spring Symposium on Semantic Web services, April A useful subset of a comprehensive ontology for time is available at entry.owl Ontology Engineering  Natalya F. Noy and Deborah L. McGuinness (2001) “Ontology Development 101: A Guide to Creating Your First Ontology”  Farquhar, A. (1997). Ontolingua tutorial.

167 UMBC an Honors University in Maryland 167 References and Links Semantic Web  Tim Berners-Lee, James Hendler and Ora Lassila, The Semantic Web: A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities, Scientific American, May 2001  W3C RDF pages,  W3C OWL pages, Tools  Protégé ontology editor,  Jena Semantic Web Framework,

168 UMBC an Honors University in Maryland 168 Some recommended reading  The Semantic Web, Scientific American, May 2001, Tim Berners-Lee, James Hendler and Ora Lassila  Integrating applications on the Semantic Web, Jim Hendler, Tim Berners-Lee and Eric Miller, Journal IEE Japan, 122(10): ,  Ontology Development 101: A Guide to Creating Your First Ontology, N. Noy and D. McGuinness, KSL TR KSL-01-05, March  A Semantic Web Primer, Grigoris Antoniou and Frank van Harmelen, MIT Press, July

169 UMBC an Honors University in Maryland 169 Ontologies: Things to Read  D. McGuinness, Ontologies come of age, 2003  J. Sowa, Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Pub. Co., Pacific Grove CA, 2000  N. Noy, D. McGuinness, Ontology Development 101: A Guide to Creating your First Ontology  Lenat and R. Guha, Building Large Knowledge-Based Systems: Representation and Inference in CYC, CACM, pp ,

170 UMBC an Honors University in Maryland 170 Tim Finin Tim Finin (http://umbc.edu/~finin/) is a Professor in the Department of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County (UMBC). He has over 30 years of experience in the applications of Artificial Intelligence to problems in information systems, intelligent interfaces and robotics and is currently working on software agents, the semantic web, and mobile computing. He holds degrees from MIT and the University of Illinois and has also held positions at Unisys, the University of Pennsylvania, and the MIT AI Laboratory.

171 UMBC an Honors University in Maryland 171 Harry Chen Harry Chen (http://www.cs.umbc.edu/~hchen4/) is a Computer Science PhD Candidate at the University of Maryland, Baltimore County. Since the senior year of his undergraduate study, Chen has been involved in Pervasive Computing and Artificial Intelligence related research. His Masters thesis work on building a software agent architecture using the Jini technology was presented at the Second Jini Community Meeting in In 2000, 2001 and 2002, he participated in the summer internship program of the HP Labs in Palo Alto, California. He was awarded PhD research fellowships from the HP Labs in 2001 and In collaboration with other students and faculty advisors, he has published more than 20 refereed papers and developed more than 15 prototype systems. His PhD research focuses on developing a broker-centric agent architecture to support pervasive context-aware systems in smart spaces.

172 UMBC an Honors University in Maryland 172 Annotated in OWL For more information

