The Suggested Upper Merged Ontology (SUMO) at Age 7: Progress and Promise
© 2007 Adam Pease, Articulate Software - apease [at] articulatesoftware [dot] com


1 The Suggested Upper Merged Ontology (SUMO) at Age 7: Progress and Promise
Adam Pease
Articulate Software
Presented at Ontolog, 6 September 2007

2 Overview
SUMO is a large, open source, formal ontology stated in first-order logic
Mapped to a large multi-lingual lexicon
With open source tools for ontology development and application

3 What's New
More content about social relationships, justice and law, military events-people-processes
Wikipedia (DBpedia) links
Updated mappings to WordNet 3.0
New tests of inference and many new inference engines
SQL and XML generation tools
Many new academic and commercial uses

4 SUMO Prize
US$
Due December 1, 2007
Entries must be open source SUO-KIF files that extend SUMO
Judged on several criteria:
– Degree of formalization
– Scope and coverage
– Coherent new topic or domain
– Actual utility in an application

5 Pursuit of Rigor in Data Standards
Old-style (most common) standards specifications: (ISO 14258, Requirements for enterprise-reference architectures and methodologies)
  Time representation
  If an individual element of the enterprise system has to be traced then properties of time need to be modeled to describe short-term changes. If the property time is introduced in terms of duration, it provides the base to do further analyses (e.g., process time). There are two kinds of behavior description relative to time: static and dynamic.
Data-model standards (ISO , Product Description and Support)
  ENTITY product_context
    SUBTYPE OF (application_context_element);
    discipline_type : label;
  END_ENTITY;
Semantic-model standards (IEEE P SUMO, ISO , PSL Core)
  (forall (?t1 ?t2 ?t3)
    (=> (and (before ?t1 ?t2) (before ?t2 ?t3))
        (before ?t1 ?t3)))
Thanks to Steve Ray, NIST
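
The semantic-model axiom on slide 5 states that `before` is transitive. As a rough illustration of what a reasoner can do with such an axiom, here is a minimal Python sketch. The time points `t1` to `t4` and the function name `transitive_closure` are hypothetical and are not taken from SUMO or PSL.

```python
def transitive_closure(pairs):
    """Repeatedly apply: (before a b) and (before b c) => (before a c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical facts: only adjacent time points are asserted.
before = {("t1", "t2"), ("t2", "t3"), ("t3", "t4")}
derived = transitive_closure(before)

# The axiom lets us conclude facts that were never stated explicitly.
print(("t1", "t4") in derived)
```

A prose-only standard leaves this kind of conclusion to the human reader; the formal axiom makes it mechanically derivable.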

6 C.K. Ogden/I.A. Richards, The Meaning of Meaning: A Study in the Influence of Language upon Thought and The Science of Symbolism, London 1923, 10th edition 1969
[Diagram: semiotic triangle with corners Concept, Referent, and Term, edges labeled Refers To, Symbolizes, and Stands For, and the example "Orange"]
Ontology work should be here, since logic is needed to substitute for human thought.
Lots of ontology work has really been here.
Terms and Concepts from the slide of [Bargmeyer, Bruce, Open Metadata Forum, Berlin, 2005]
Slide adapted from (c) Key-Sun Choi for Pan Localization 2005

7 Imagine...your view of the web
CV
name: Joe Smith
education: BS Case Western Reserve, 1982 MS UC Davis,
work: ACME Software, programmer
private: Married, 2 children

8 ...and the Computer's View
CV
name
education
work
private
Thanks to Frank van Harmelen for the original idea of this slide and Peter Yim for the Chinese language content

9 But wait, we've got XML -

10 But wait, we've got XML -

11 But wait, we've got Taxonomies -
Person, Mammal, JoeSmith

12 But wait, we've got Taxonomies -
o4839, x931, i3729

13 Wait, we've got semantics -
[Diagram: Person, Mammal, JoeSmith connected by instance and subclass links]
implies
Mammal JoeSmith instance

14 Wait, we've got semantics -
[Diagram: Person, Mammal, JoeSmith connected by instance and subclass links]
implies
Mammal JoeSmith instance
[Same diagram with opaque symbols: u8475, x9834, p3489 connected by r53 and r22 links]
implies
x9834 p3489 r53
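
Slides 13 and 14 make the point that the inference depends only on the semantics of `instance` and `subclass`, not on readable names. A minimal Python sketch of that idea follows. The function name `infer_instances` is hypothetical, and the pairing of the opaque symbols with the readable ones (u8475 for Person, x9834 for Mammal, p3489 for JoeSmith) is an assumption read off the order of the symbols on the slide.

```python
def infer_instances(instance_facts, subclass_facts):
    """Apply: (instance x C) and (subclass C D) => (instance x D)."""
    inferred = set(instance_facts)
    changed = True
    while changed:
        changed = False
        for (x, c) in list(inferred):
            for (sub, sup) in subclass_facts:
                if c == sub and (x, sup) not in inferred:
                    inferred.add((x, sup))
                    changed = True
    return inferred


# Readable names, as a human sees them.
readable = infer_instances({("JoeSmith", "Person")}, {("Person", "Mammal")})

# Opaque names, as the machine sees them; the rule still fires.
opaque = infer_instances({("p3489", "u8475")}, {("u8475", "x9834")})

print(("JoeSmith", "Mammal") in readable, ("p3489", "x9834") in opaque)
```

The same rule produces the same conclusion in both cases, which is why defined relations matter more than evocative labels.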

15 Semantics Helps a Machine Appear Smart
A smart machine should be able to make the same inferences we do
(let's not debate the AI philosophy about whether it would actually be smart)

16 Definitions
An ontology is a shared conceptualization of a domain
An ontology is a set of definitions in a formal language for terms describing the world

17 Frames
Object- or term-centered
Frames, slots, values, (and attributes)
[Example frame] Adam: Person
  height: 5'8
  occupation: consultant
  cardinality: 1

18 Frame Restrictions
b is between a and c
– (between1 a betweenness1)
– (between2 b betweenness1)
– (between3 c betweenness1)
– vs
– (between a b c)
Adam is not an accountant
– (notOccupation Adam Accountant)
– vs
– (not (occupation Adam Accountant))
Existential vs. Universal quantification
Similar problems for many description logics
Very efficient computation however
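
The "between" example on slide 18 can be sketched in Python to show the extra bookkeeping that a frame-style, binary-only representation needs. This is only an illustration under stated assumptions: the tuple layout and the helper `rebuild_between` are hypothetical, and `betweenness1` is the reified object named on the slide.

```python
# Frame style: only binary slot/value pairs are available, so the ternary
# relation is reified as an extra object ("betweenness1").
frame_facts = {
    ("between1", "a", "betweenness1"),
    ("between2", "b", "betweenness1"),
    ("between3", "c", "betweenness1"),
}


def rebuild_between(facts):
    """Reassemble (between x y z) tuples from the three reified slots."""
    slots = {}
    for slot, value, obj in facts:
        slots.setdefault(obj, {})[slot] = value
    required = {"between1", "between2", "between3"}
    return {
        (s["between1"], s["between2"], s["between3"])
        for s in slots.values()
        if required <= set(s)
    }


# First-order style: one ternary fact says the same thing directly.
direct_facts = {("a", "b", "c")}

print(rebuild_between(frame_facts) == direct_facts)
```

Three facts plus an invented object are needed where first-order logic uses a single statement, which is the restriction the slide is pointing at.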

19 Digression: Implementation is Different from Representation
Why lose meaning at design time just because of runtime issues?
– We can't reason with English definitions, but that doesn't mean we shouldn't document our terms
Many different implementations may be done from the same representation
This does not mean that run time issues should be ignored at design time
– If you represent information you know can't be reasoned with, it had better not be essential in most conceivable applications

20 Many Ways to Use Ontology
As an information engineering tool
– Create a database schema
– Map the schema to an upper ontology
– Use the ontology as a set of reminders for additional information that should be included
As more formal comments
– Define an ontology that is used to create a DB or OO system
– Use a theorem prover at design time to check for inconsistencies
For taxonomic reasoning
– Do limited run-time inference in Prolog, a description logic, or even Java
For first order logical inference
– Full-blown use of all the axioms at run time

21 Upper Ontology
An attempt to capture the most general and reusable terms and definitions
Provokes thought on clarifying the meaning of more specific terms
Provides for large-scale reuse

22 Ontology vs Language and Knowledge
Ontology
– Expandable
– language independent
– machine understandable
Language
– understood by humans
– ambiguous
Knowledge
– changes rapidly
– may be local to an entity

23 Suggested Upper Merged Ontology
1000 terms, 4000 axioms, 750 rules
Mapped by hand to all of WordNet 1.6 then ported to 3.0
Development begun in 2000
– US Government small business grant
Associated domain ontologies totalling 20,000 terms and 70,000 axioms
Free
– SUMO is owned by IEEE but basically public domain
– Domain ontologies are released under GNU

24 SUMO (continued)
Formally defined, not dependent on a particular implementation
Open source toolset for browsing and inference
– http://sigmakee.sourceforge.net
Many uses of SUMO (independent of the SUMO authors and funders)
– http://www.ontologyportal.org/Pubs.html

25 SUMO Validation
Mapping to all of WordNet lexicon
– A check on coverage and completeness (at a given level of generality)
Peer review
– Open source since its inception
Formal validation with a theorem prover
– Free of contradictions (within a generous time bound for search)
Application to dozens of domain ontologies

26 WordNet
Lexical database
100,000 word senses – synsets
Created by George Miller's group at Princeton
Free
De facto standard in the linguistics world

27 WordNet to SUMO Mapping
WordNet synset plant, flora, plant_life is equivalent to the formal SUMO term 'Plant'
  n 03 plant 0 flora 0 plant_life 0 | a living organism lacking the power of locomotion &%Plant=
SUMO has axioms that explain formally what a plant is
  (=> (and (instance ?SUBSTANCE PlantSubstance)
           (instance ?PLANT Organism)
           (part ?SUBSTANCE ?PLANT))
      (instance ?PLANT Plant))
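
The mapping on slide 27 is carried by the `&%Plant=` annotation at the end of the WordNet gloss. A minimal Python sketch of how such an annotation could be extracted is shown below. The slide only confirms that `=` marks equivalence; treating `+` as "subsuming" and `@` as "instance" is an assumption about the mapping-file convention, and the names `SUFFIX_MEANING`, `MAPPING_RE`, and `parse_mappings` are hypothetical.

```python
import re

# '=' is the equivalence marker shown on the slide. The '+' and '@'
# entries are assumptions about the mapping convention, not from the slide.
SUFFIX_MEANING = {"=": "equivalent", "+": "subsuming", "@": "instance"}

# A SUMO term reference: '&%', the term name, then a one-character marker.
MAPPING_RE = re.compile(r"&%(\w+)([=+@])")


def parse_mappings(line):
    """Return (SUMO term, relation) pairs found in one mapping line."""
    return [
        (term, SUFFIX_MEANING[suffix])
        for term, suffix in MAPPING_RE.findall(line)
    ]


# The line is copied from the slide, including its gaps.
line = ("n 03 plant 0 flora 0 plant_life 0 | a living organism "
        "lacking the power of locomotion &%Plant=")
mappings = parse_mappings(line)
print(mappings)
```

Once the term name is extracted, the axioms attached to it in SUMO become available for any word in the synset.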

28 WordNet to SUMO Mapping
Most nouns map to classes
Most verbs map to subclasses of &%Process
Most adjectives map to a &%SubjectiveAssessmentAttribute
Most adverbs map to relations of &%manner

29 Internationalization
Translation of SUMO paraphrases to diverse multiple languages
– Some confidence there's no cultural or linguistic bias
– Chinese, Hindi, Tagalog, Czech, German, Italian, Korean, Romanian, Arabic
– Estonian and Hungarian in development
SUMO is linked to multiple very large lexicons (Euro WordNet, Balkanet, HowNet etc)
– English, Chinese, Italian, Arabic

30 SUMO Structure
[Diagram of SUMO sections: Structural Ontology, Base Ontology, Set/Class Theory, Numeric, Temporal, Mereotopology, Graph, Measure, Processes, Objects, Qualities]

31 SUMO+Domain Ontology
[Diagram of SUMO sections: Structural Ontology, Base Ontology, Set/Class Theory, Numeric, Temporal, Mereotopology, Graph, Measure, Processes, Objects, Qualities]
[Added layers: SUMO, Mid-Level, and the domain ontologies Military, Geography, Elements, Terrorist Attack Types, Communications, People, Transnational Issues, Financial Ontology, Terrorist, Economy, NAICS, Terrorist Attacks, …, France, Afghanistan, UnitedStates, Distributed Computing, Biological Viruses, WMD, ECommerce Services, Government, Transportation, World Airports]
Total Terms, Total Axioms, Rules

32 Are SUMO Terms Directly Usable?
Yes.
Study – 1/3 of upper ontology terms directly appear in answers on large test
– Cohen, P., Chaudhri, V., Pease A., and Schrag, R. (1999), Does Prior Knowledge Facilitate the Development of Knowledge Based Systems, In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-1999). Menlo Park, Calif.: AAAI Press. cohen-aaai99.ps
before (in time), agent (of a process), etc.

33 High Level Distinctions
The first fundamental distinction is that between Physical (things which have a position in space/time) and Abstract (things which don't)
[Class tree: Entity, with subclasses Physical and Abstract]

34 High Level Distinctions
Partition of Physical into Objects and Processes
[Class tree: Physical, with subclasses Object and Process]

35 Objects
[Class tree: Object, SelfConnectedObject, Substance, CorpuscularObject, Region, Collection]

36 Processes
[Class tree, grouped in the order listed on the slide]
DualObjectProcess: Substituting, Transaction, Comparing, Attaching, Detaching, Combining, Separating
InternalChange: BiologicalProcess, QuantityChange, Damaging, ChemicalProcess, SurfaceChange, Creation, StateChange, ShapeChange
IntentionalProcess: IntentionalPsychologicalProcess, RecreationOrExercise, OrganizationalProcess, Guiding, Keeping, Maintaining, Repairing, Poking, ContentDevelopment, Making, Searching, SocialInteraction, Maneuver
Motion: BodyMotion, DirectionChange, Transfer, Transportation, Radiating

37 Abstract
[Class tree: SetOrClass, Relation, Proposition, Quantity, Number, PhysicalQuantity, Attribute, Graph, GraphElement]

38 Case Roles
Roles that entities play in a Process
– agent, patient, instrument etc.

39 Case Roles
Brutus stabbed Caesar with a knife on Tuesday.
[Diagram: A Stabbing linked to Brutus (agent), Caesar (patient), A Knife (instrument), and A Tuesday (time)]

40 Case Roles
Brutus stabbed Caesar with a knife on Tuesday.
  (exists (?S ?K ?T)
    (and (instance ?S Stabbing)
         (instance ?K Knife)
         (instance ?T Tuesday)
         (agent ?S Brutus)
         (patient ?S Caesar)
         (time ?S ?T)
         (instrument ?S ?K)))
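
To show how the case-role formula on slide 40 could be queried, here is a minimal Python sketch. It assumes one skolemized reading of the formula, in which the existential variables ?S, ?K and ?T are replaced by the hypothetical constants `Stabbing1`, `Knife1` and `Tuesday1`. The helper `role_fillers` is also hypothetical.

```python
# Ground facts for one skolemized reading of the formula: the constants
# Stabbing1, Knife1 and Tuesday1 stand in for ?S, ?K and ?T.
facts = {
    ("instance", "Stabbing1", "Stabbing"),
    ("instance", "Knife1", "Knife"),
    ("instance", "Tuesday1", "Tuesday"),
    ("agent", "Stabbing1", "Brutus"),
    ("patient", "Stabbing1", "Caesar"),
    ("time", "Stabbing1", "Tuesday1"),
    ("instrument", "Stabbing1", "Knife1"),
}


def role_fillers(facts, role, process_class):
    """Find fillers of a case role across all processes of a given class."""
    processes = {
        s for (p, s, c) in facts if p == "instance" and c == process_class
    }
    return {o for (p, s, o) in facts if p == role and s in processes}


who_stabbed = role_fillers(facts, "agent", "Stabbing")
who_was_stabbed = role_fillers(facts, "patient", "Stabbing")
print(who_stabbed, who_was_stabbed)
```

Because the roles are explicit relations on the process, "who did it" and "to whom" are answered by the same simple lookup.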

41 Attributes and Roles
  (attribute JohnDoe Unemployed)
  (attribute GIJane Soldier)
  (attribute MyCar Blue)

42 Example Rules
  (=> (instance ?DRIVE Driving)
      (exists (?VEHICLE)
        (and (instance ?VEHICLE Vehicle)
             (patient ?DRIVE ?VEHICLE))))
If there's an instance of Driving, there's a Vehicle that participates in that action.
Not just an English definition for humans to read, but a logical definition that can be used in proofs.
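
The Driving rule on slide 42 can be applied mechanically. The following Python sketch is one naive way to do so, by forward chaining with a placeholder constant for the existentially quantified vehicle. It is not how Sigma or any particular theorem prover works; the function `apply_driving_rule` and the `VehicleOf_` naming scheme are hypothetical.

```python
def apply_driving_rule(facts):
    """For each Driving instance with no Vehicle patient, add a placeholder."""
    facts = set(facts)
    vehicles = {
        s for (p, s, c) in facts if p == "instance" and c == "Vehicle"
    }
    drivings = sorted(
        s for (p, s, c) in facts if p == "instance" and c == "Driving"
    )
    for drive in drivings:
        has_vehicle = any(
            p == "patient" and s == drive and o in vehicles
            for (p, s, o) in facts
        )
        if not has_vehicle:
            # Placeholder standing in for the vehicle the rule says exists.
            placeholder = f"VehicleOf_{drive}"
            facts.add(("instance", placeholder, "Vehicle"))
            facts.add(("patient", drive, placeholder))
            vehicles.add(placeholder)
    return facts


result = apply_driving_rule({("instance", "Drive1", "Driving")})
print(("patient", "Drive1", "VehicleOf_Drive1") in result)
```

Starting from nothing but the fact that a Driving occurred, the rule yields the existence of a Vehicle, which an English-only definition could not do.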

43 Commercial Application
One year project for Articulate Software
Working with a company that creates financial transaction systems for royalty payments
Re-engineer current ontology management business process, tools and ontology

44 Commercial Application
Extensive current ontology
Captured in spreadsheets
Local term names and definitions for every customer
– An essential part of their process
Ontology management system that exports XML & RDF
One end-user database is nearly 3GB
– Ontology functions can be batch processed

45 Project Goals
To add formality to existing model
– To support full logical inference, consistency checks
Give customers user-friendly ontology editor
– so that they can maintain the ontology
Create broader set of definitions
– Enable greater DB integration
– Enable expansion into new markets
Leverage work
Exercise SUMO and Sigma in business

46 Initial Tasks
Implement UI improvements to Sigma
– Simplified tree-based editor
– Simplified frame-style browser
XML/SQL ontology export
– Uses meta-predicates for physical DB structure
Merge existing ontology with SUMO

47 DBPedia
People content uses FOAF
– Lightweight, redundant, ad-hoc
– Only a tiny portion is used: birthdate, deathdate, birthplace, deathplace, names, firstname, lastname
– http://xmlns.com/foaf/spec/
– 16MB KIF content
Recent announcement of DBPedia now mapped to WordNet
– Which gets us links to SUMO

48 TPTP
Research effort in automated theorem proving
40+ different first order logic provers
Annual competition
Thousands of test problems
We will issue SUMO-based tests in TPTP format next month
Sigma connected to TPTP prover suite

49 Controlled English to Logic Translation
Automated translation from English to Logic
Uses WordNet-SUMO mappings for 100,000 word sense vocabulary
Domain-independent
Development process
– Start with a highly restricted language and gradually add linguistic features

