March 15, 2006 Dr. Douglas B. Lenat, 3721 Executive Center Drive, Suite 100, Austin, TX 78731 Phone: (512) 342-4001 2 July 2005 C.


1 March 15, 2006
Dr. Douglas B. Lenat, 3721 Executive Center Drive, Suite 100, Austin, TX 78731
Email: Lenat@cyc.com
Phone: (512) 342-4001
2 July 2005
CYC Custodian Communiqué Comments
Upper Ontology Symposium

2 OpenCyc
Open Source release of:
- the entire 300k-term Cyc Ontology
- + 1M Simple Relations
- + 720 Inference Engines, NL Lex/Parser/Generator
ResearchCyc
All of Cyc publicly available free for R&D purposes

3 What Needs to be Shared?
- bits/bytes/streams/network…
- alphabet, special characters,…
- words, morphological variants,…
- syntactic meta-level markups (HTML)
- semantic meta-level markups (SGML, XML)
- content (logical representation of doc/page/...)
- context (common sense, recent utterances, and n dimensions of microtheory-space: time, space, level of granularity, the source’s purpose, etc.)
Semantic Web

4 What Needs to be Shared?
- bits/bytes/streams/network…
- alphabet, special characters,…
- words, morphological variants,…
- syntactic meta-level markups (HTML)
- semantic meta-level markups (SGML, XML)
- content (logical representation of doc/page/...)
- context (common sense, recent utterances, and n dimensions of metadata: time, space, level of granularity, the source’s purpose, etc.)
Tiny vocabulary (# distinctions) of standard relations: rdf:type, subclass, label, domain, range, comment,…
Beyond which diversity is tolerated
Which means divergence is inevitable
“What do you mean we have no standard, we have lots of standards!”
DAML+OIL adds a few more distinctions: inverses, unambiguous properties, unique properties, lists, restrictions, cardinalities, pairwise disjoint lists, datatypes, …
To do the logical/arithmetic combination across information sources, we need tens of thousands of relations, not tens
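
To make the "tiny vocabulary" point concrete, here is a minimal Python sketch of what can be inferred using only rdf:type and subclass links. The triples (Penguin, Bird, Animal, opus) and the helper names are invented for illustration and do not come from the slides; the sketch uses plain tuples rather than any RDF library.

```python
# Hypothetical triples using only the small standard vocabulary named on the slide.
TRIPLES = {
    ("Penguin", "rdfs:subClassOf", "Bird"),
    ("Bird", "rdfs:subClassOf", "Animal"),
    ("opus", "rdf:type", "Penguin"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(instance, triples):
    """Return the asserted types of an instance plus all inherited ones."""
    direct = {o for s, p, o in triples if s == instance and p == "rdf:type"}
    inherited = set()
    for cls in direct:
        inherited |= superclasses(cls, triples)
    return direct | inherited


print(types_of("opus", TRIPLES))
```

Class membership is about all this vocabulary supports. Combining quantities or rules across sources, which the slide calls logical/arithmetic combination, needs relations that both sources already agree on.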

5 What Needs to be Shared?
- bits/bytes/streams/network…
- alphabet, special characters,…
- words, morphological variants,…
- syntactic meta-level markups (HTML)
- semantic meta-level markups (SGML, XML)
- content (logical representation of doc/page/...)
- context (common sense, recent utterances, and n dimensions of metadata: time, space, level of granularity, the source’s purpose, etc.)

6 What Needs to be Shared?
- bits/bytes/streams/network…
- alphabet, special characters,…
- words, morphological variants,…
- syntactic meta-level markups (HTML)
- semantic meta-level markups (SGML, XML)
- content (logical representation of doc/page/...)
- context (common sense, recent utterances, and n dimensions of formal ontological knowledge: time, space, level of granularity, the source’s purpose, etc.)
I.e., share a formal ontology, including a full upper ontology, large portions of a middle ontology, and relevant slivers of a lower (domain-dependent) ontology.

7 Cyc: A Large Formal Ontology
[Diagram: ontology pyramid. Topic labels, duplicates removed:] Thing; Intangible Thing; Individual; Temporal Thing; Spatial Thing; Partially Tangible Thing; Paths; Sets, Relations; Logic, Math; Human Artifacts; Social Relations, Culture; Human Anatomy & Physiology; Emotion, Perception, Belief; Human Behavior & Actions; Products, Devices; Conceptual Works; Vehicles, Buildings, Weapons; Mechanical & Electrical Devices; Software, Literature, Works of Art; Language; Agent Organizations; Organizational Actions; Organizational Plans; Types of Organizations; Human Organizations; Nations, Governments, Geo-Politics; Business, Military Organizations; Law; Business & Commerce; Politics, Warfare; Professions, Occupations; Purchasing, Shopping; Travel, Communication; Transportation & Logistics; Social Activities; Everyday Living; Sports, Recreation, Entertainment; Artifacts; Movement; State Change, Dynamics; Materials, Parts, Statics; Physical Agents; Borders, Geometry; Events, Scripts; Spatial Paths; Actors, Actions; Plans, Goals; Time; Agents; Space; Physical Objects; Human Beings; Organization; Human Activities; Living Things; Social Behavior; Life Forms; Animals; Plants; Ecology; Natural Geography; Earth & Solar System; Political Geography; Weather
Lower layers of the pyramid: General Knowledge about Various Domains; Specific data, facts, and observations
Cyc contains:
- 15,000 Predicates
- 300,000 Concepts
- 3,200,000 Assertions
Represented in:
- First Order Logic
- Higher Order Logic
- Context Logic
- Micro-theories

8 Cyc is…
Millions of facts, rules of thumb, etc. that capture human common sense about our everyday world
– The typical bird has 1 beak, 1 heart, lots of feathers,…
– Hearts are internal organs; feathers are external protrusions
– Most vehicles are steered by an awake, sane, adult,… human
– Tangible objects can’t be in 2 (disjoint) places at once
– Badly injuring a child is much worse than killing a dog
– Causes temporally precede (i.e., start before) their effects
– A stabbing requires 2 cotemporal and proximate actors
– etc.

9 Cyc is…
Millions of facts, rules of thumb, etc. that capture human common sense about our everyday world
- Each of these represented in formal logic
- Info. about a set of hundreds of thousands of terms
- Language-independent

10 Cyc is…
Millions of facts, rules of thumb, etc. that capture human common sense about our everyday world
- Each of these represented in formal logic
- Info. about a set of hundreds of thousands of terms
- Language-independent
[Diagram: words linked to concepts. Labels:] Penitentiary; EnglishWord-Plume; EnglishWord-Pen; FrenchWord-Plume; …; WritingPen; BirdFeather; …; Authoring; ArabicWord-Qalam; Corral
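
The language-independence point can be sketched as a many-to-many table between words and concepts. In this Python sketch the term names come from the slide, but the specific word-to-concept links are assumptions, since the diagram's edges did not survive in the transcript; `DENOTES`, `words_for`, and `shared_concepts` are hypothetical names.

```python
# Assumed word-to-concept links; the slide's diagram edges are not recoverable.
DENOTES = {
    ("English", "pen"): {"WritingPen", "Penitentiary", "Corral"},
    ("English", "plume"): {"BirdFeather"},
    ("French", "plume"): {"WritingPen", "BirdFeather"},
    ("Arabic", "qalam"): {"WritingPen"},
}


def words_for(concept):
    """Return every (language, word) pair that can denote the concept."""
    return sorted(key for key, concepts in DENOTES.items() if concept in concepts)


def shared_concepts(word_a, word_b):
    """Return the concepts that two words, possibly in different languages, share."""
    return DENOTES[word_a] & DENOTES[word_b]


print(words_for("WritingPen"))
print(shared_concepts(("English", "pen"), ("French", "plume")))
```

Reasoning then runs over the concept terms, so an assertion about WritingPen applies regardless of which language's word led to it.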

11 Cyc
[Diagram: system components. Labels, duplicates removed:] Reasoning Modules; Interface to External Data Sources; Cyc API; Knowledge Entry Tools; User Interface (with Natural Language Dialog); Cyc Ontology & Knowledge Base; External Data Sources (Data Bases, Web Pages, Text Sources, Other KBs); Other Applications; Knowledge Authors; Knowledge Users

12 Lexical Entry Example: Eat
Constant: Eat-TheWord
isa: EnglishWord
Mt: EnglishMt
infinitive: “eat”
pastTense: “ate”
perfect: “eaten”
agentive-Sg: “eater”
(subcatFrame Eat-TheWord Verb 0 TransitiveNPCompFrame)
(verbSemTrans Eat-TheWord 0 TransitiveNPCompFrame
  (and (isa :ACTION EatingEvent)
       (performedBy :ACTION :SUBJECT)
       (inputsDestroyed :ACTION :OBJECT)))
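
One way to read the verbSemTrans entry is as a template whose keywords (:ACTION, :SUBJECT, :OBJECT) a parser fills in. The Python sketch below illustrates that reading only; it is not Cyc's actual parser, and the bound terms (Eating-001, Fred, Apple-001) and function names are hypothetical.

```python
# Template copied from the slide's verbSemTrans assertion, as nested tuples.
EAT_TEMPLATE = (
    "and",
    ("isa", ":ACTION", "EatingEvent"),
    ("performedBy", ":ACTION", ":SUBJECT"),
    ("inputsDestroyed", ":ACTION", ":OBJECT"),
)


def instantiate(template, bindings):
    """Recursively replace keyword placeholders with their bound terms."""
    if isinstance(template, tuple):
        return tuple(instantiate(part, bindings) for part in template)
    return bindings.get(template, template)


def to_cycl(expr):
    """Render nested tuples as a parenthesized CycL-style string."""
    if isinstance(expr, tuple):
        return "(" + " ".join(to_cycl(part) for part in expr) + ")"
    return expr


# Hypothetical bindings for a sentence such as "Fred ate an apple".
bindings = {":ACTION": "Eating-001", ":SUBJECT": "Fred", ":OBJECT": "Apple-001"}
result = to_cycl(instantiate(EAT_TEMPLATE, bindings))
print(result)
```

Unbound keywords are left in place, so a partially parsed sentence still yields a well-formed, if incomplete, formula.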

13 Lexical Entry Example: Coke
Constant: Coke-TheWord
isa: EnglishWord
Mt: EnglishMt
singular: “coke”
pnSingular: “Coke”
massNumber: “coke”
pnMassNumber: “Coke”
(denotation Coke-TheWord ProperCountNoun 0 (ServingFn CocaCola))
(denotation Coke-TheWord ProperMassNoun 0 CocaCola)
(denotation Coke-TheWord MassNoun 0 Cocaine-Powder)
(denotation Coke-TheWord MassNoun 2 ColaSoftDrink)
(denotation Coke-TheWord SimpleNoun 0 (ServingFn ColaSoftDrink))
[Slide annotation:] SLANG
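
The denotation assertions behave like rows in a lookup table keyed by word and part of speech. This Python sketch copies the five rows from the slide and filters them; the function name `candidates` and the tuple layout are illustrative assumptions, not part of the Cyc API.

```python
# Rows copied from the slide: (word, part of speech, sense number, denoted term).
DENOTATIONS = [
    ("Coke-TheWord", "ProperCountNoun", 0, "(ServingFn CocaCola)"),
    ("Coke-TheWord", "ProperMassNoun", 0, "CocaCola"),
    ("Coke-TheWord", "MassNoun", 0, "Cocaine-Powder"),
    ("Coke-TheWord", "MassNoun", 2, "ColaSoftDrink"),
    ("Coke-TheWord", "SimpleNoun", 0, "(ServingFn ColaSoftDrink)"),
]


def candidates(word, part_of_speech):
    """Return (sense number, term) pairs for a word used as a given part of speech."""
    return [
        (sense, term)
        for w, pos, sense, term in DENOTATIONS
        if w == word and pos == part_of_speech
    ]


print(candidates("Coke-TheWord", "MassNoun"))
```

The part of speech narrows the choice but does not always settle it: the mass-noun reading still leaves two candidates, which is where context has to decide.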

14 Cyc NL Lexicon
- English Words: 18,737
- Syntactic Frame Links: 13,922
- Single-word Denotation Mappings: 25,999
- Multi-word Phrase Denotation Mappings: 43,508
- Verbal Semantic Frame Links: 3,517
- Noun Semantic Frame Links: 2,396
- Pragmatic Assertions: 1,232
- Names (includes chemical symbols, person/place/organization names, acronyms, etc.): 171,093
- Predicate-based Phrasal Links (genTemplates for paraphrase): 10,327

15 OpenCyc
Open Source release of:
- the entire 300k-term Cyc Ontology
- + 1M Simple Relations
- + 720 Inference Engines, NL Lex/Parser/Generator
ResearchCyc
All of Cyc (free, for R&D purposes)
FACTory
Free online “match game” to check/add to the ontology and, more generally, to the Cyc KB

16 The OpenCyc Release
Runs on Windows, Linux
OpenCyc Knowledge Base
– LGPL license
– 47,000 terms → 300k
– 306,000 facts → 1M+
Cyc Inference Engine
– Free license for binary runtime engine
Application Programming Interface
– Java, SubL, Python
Extensive documentation
– Ontological Engineer’s Handbook
– Online Cyc 101 course
– Much more

17 OpenCyc Popularity
- Downloaded over 90,000 times
- Used for teaching university AI courses
- IEEE candidate for standard upper ontology
- Cited in numerous books/publications
- OpenCyc Users group (monthly meetings)

18 Support for Modular Use
CycL – documented and open-sourced
Ontology - selected regions of microtheory space
– Exportable as OWL, XML
Export
– Tuples exportable to OWL
– Query results exportable in user-defined XML formats
Inference Engine
– Support HL module creation
– CycL evaluatable functions using external Web services
– Cyc API accessible as Web services (in progress)

19 Integrating OpenCyc
- a server component in an integrated system
- provides a rich open-source API (Application Programming Interface) with Java bindings
- a FIPA-OS and DARPA CoABS Grid compatible agent
- The Cyc API is exposed as a Web Service
- Implements DQL (DAML Query Language) service to support the Semantic Web

20 73 Active ResearchCyc User Groups (approx. 500 active ResearchCyc users)
Legend categories shown on the slide: Government; Government-related; Commercial; NPOs
Groups, in slide order: Xerox PARC; Daxtron Labs; Lockheed Martin ATLD; Houston VA Medical Center; Air Force Rome Labs; Institute for the Study Of Accelerating Change; U of Maryland; Language Computer Corporation; NTT Communications Science Laboratories (Japan); Northwestern U; Stanford NLP Dept.; ANSER, Inc.; LBJ School of Public Affairs; Fraunhofer Institute; U of Illinois Urbana-Champaign; New Mexico Highlands Univ.; Harvard U; Linkoping U (Sweden); Radboud U (Netherlands); Tokyo Inst. of Technology; Terra Incognita University; Microfabrica, Inc.; U of Stuttgart; MIT Media Lab; Witan International; U of Pennsylvania; SRI; 21st Century Technologies; U of Minnesota; Stone’s Throw Technologies; ISI; Trimtab Consulting; U of Hawaii; Rensselaer AI and Reasoning Lab; TNO-DMV (Netherlands); Sapio Systems (Denmark); U of Toronto; Knowledge Media Institute, Open University; Austin Info Systems

21 Applying ResearchCyc
- Identify as large and representative a set of task queries/challenge problems as possible
- Write additional assertions. (Add new terms to the ontology as needed. In rare cases, add new specialized reasoner(s)).
- Map the relevant information sources’ schemas (used in this task) to Cyc

22 A few sample ResearchCyc Projects
- AIML interface - requested by other ResearchCyc users
- Oracle PL-SQL interface
- C/C++, PHP APIs
- Business Process Execution Language (BPEL) extended with CycL
- Auto-generated user interfaces
- “MUD” Role-playing game
- Knowledge collection applications
- MIT Open Mind Common Sense Project, Phase 2

23 Geospatial Knowledge Source Integrator
Query Formulator: What are all structures in the suburbs of Tikrit with walls impenetrable by small arms fire?
[Diagram: data sources feeding a Fused DB. Source labels:] Basic Encyc NIMA N-P CDE; Terrain Data; Satellite Data; Weather Data; USGS Sensor Data
GSK Transformation, Fusion and Retrieval
Comprehensive Geospatial Knowledge Base

24 Geospatial Classes
1100 Atomic types, 338 functionally specified ones
Examples: CrabFishery; LakeBed; MonsoonForest; MudFlat; USCS-Code-CL; Glacier; Ridge; Butte; Cave; MinedArea; PostalCodeRegion; Prefecture; TownSquare; Quarry; Atoll; Continent; TrueContinent; (FieldFn OliveTree); (CityInCountryFn Cuba); Protectorate; IndependentCountry; Colony; SchoolDistrict; Monarchy

25 Predicates of Geospatial Entities
Over 500
Examples: terrainType; maximumDepth; cloudCeiling; importsFrom; regionalPastimes; populationDensity; trafficableForVehicle; freightRailTrafficRate; internetCountryCode; hasClimateType; languagesSpokenHere; highestPointInRegion; waterAreaOfRegion; canopyClosureOfRegion

26 Construct a Comprehensive GS Ontology Spanning the following Areas
(Geo)Topography: metric and geometric concepts salient to describing regions of the earth’s surface (e.g. measurement of lengths along, and angles between, great circles and rhumb lines; geodetic models of the earth, their reference ellipsoids, and coordinate transformations between them; map projections and coordinate transformations between map projections).
Mereotopology: concepts such as spatial part of, overlapping, and connected to; applicable to any space, or things located in space, according to very general senses of these terms (e.g. where a space is a topology, in the mathematical sense).
(Geo)Cartography: concepts used to define natural or conventional boundaries of earth surface regions, such as continent, desert, prefecture, dam, highway, or nation. Also, the loosely connected cluster of attribute dimensions that are used by maps to characterize these, or arbitrary (e.g. regular polygon) spatial regions. For example: weather attributes, trafficability attributes, degrees of soil fertility, and numbers or types of a region’s inhabitants.

27 GSK Transformation and Fusion
Global Terrain Data
- I.A.5.N.d Semi permanently flooded tropical sclerophyllous leaved evergreen
- I.A.5.N.c Seasonally flooded tropical sclerophyllous leaved evergreen
- I.A.5.N.e Saturated tropical broad sclerophyllous leaved evergreen
- III.A.1.N.c. Sclerophyllous tropical or subtropical broad-leaved shrubland

28 GSK Transformation and Fusion
Is this shire a forest?
Global Terrain Data
- I.A.5.N.d Semi permanently flooded tropical sclerophyllous leaved evergreen
- I.A.5.N.c Seasonally flooded tropical sclerophyllous leaved evergreen
- I.A.5.N.e Saturated tropical broad sclerophyllous leaved evergreen
- III.A.1.N.c. Sclerophyllous tropical or subtropical broad-leaved shrubland

29 GSK Transformation and Fusion
From the cartography ontology:
- Three of the regular polygon regions are forests.
- The sum of a contiguous group of forests is itself a forest.
From the topography ontology:
- The shire is a subregion of a group of forests.
From the mereotopology ontology:
- Every part of the shire is part of a forest.
Therefore:
- The shire is a forest.
Australian Agricultural Data
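
The chain of reasoning on this slide can be sketched with regions modeled as sets of grid cells. This Python sketch is only an illustration of the argument's shape: the grid coordinates, the edge-adjacency test for "contiguous", and the function names are all assumptions, and Cyc itself performs this inference over logical assertions rather than cell sets.

```python
# Hypothetical grid cells standing in for regions; all coordinates are invented.
FOREST_POLYGONS = [
    {(0, 0), (0, 1)},
    {(1, 0), (1, 1)},
    {(2, 0), (2, 1)},
]
SHIRE = {(0, 1), (1, 0), (1, 1), (2, 0)}


def adjacent(a, b):
    """True if some cell of a shares an edge with some cell of b."""
    return any(abs(x1 - x2) + abs(y1 - y2) == 1 for x1, y1 in a for x2, y2 in b)


def contiguous(regions):
    """True if the regions form one connected group under adjacency."""
    if not regions:
        return False
    reached = {0}
    frontier = [0]
    while frontier:
        i = frontier.pop()
        for j in range(len(regions)):
            if j not in reached and adjacent(regions[i], regions[j]):
                reached.add(j)
                frontier.append(j)
    return len(reached) == len(regions)


def is_forest(region, forests):
    """Contiguous forests sum to a forest; a region lying wholly inside it is a forest."""
    if not contiguous(forests):
        return False
    merged = set().union(*forests)
    return region <= merged


print(is_forest(SHIRE, FOREST_POLYGONS))
```

Each step corresponds to one ontology on the slide: the polygon classification to cartography, the subregion test to topography, and the "every part" subset check to mereotopology.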

30 Advanced GS Question Answering
1. Which elevated areas in sector CX that provide good cover from aerial surveillance are nearest to trafficable roads in enemy territory?
2. What concrete structures have been occupied by enemy troops in the last 48 hours?
3. What are the coordinates of radio broadcast facilities in region X capable of supporting megawatt broadcasts, and what are the coordinates of power sources near these facilities?
4. List all structures with walls impenetrable by small arms fire in the suburbs of Tikrit.
5. List each facility in Angola of a type that typically contains drums of fuel oil, and the term for that facility’s type in the predominant local language.
6. List all two-story buildings with 20,000 feet or more of floor space, with no back entrance, and whose front entrance faces the prevailing wind.
7. List the locations of all underground bunkers adequate to contain all the members of a unit of type T.

31 Bottlenecks to Applying ResearchCyc
Several (NLU, NLG, Learning by…, Faster and easier knowledge entering/vetting by novices, etc.)
Many of these in turn have as a bottleneck: MAKING INFERENCE FASTER

32 Characterize and harness systems that employ 6 types of “tricks”
- Reasoners that exploit limitations in the expressivity of the repr. language they operate over
- Domain-specific (incl. Context-specific) reasoners
- Statistical/Bayesian Reasoners
- Unsound reasoners (analogy, abduction,…)
- Meta-reasoners (tacticians) and Meta² (strategists)
- Parallel Processing, Special-purpose hardware acceleration, Biotech, Nanotech, Quantum comp…

33 What Needs to be Shared?
- bits/bytes/streams/network…
- alphabet, special characters,…
- words, morphological variants,…
- syntactic meta-level markups (HTML)
- semantic meta-level markups (SGML, XML)
- content (logical representation of doc/page/...)
- context (common sense, recent utterances, and n dimensions of formal ontological knowledge: time, space, level of granularity, the source’s purpose, etc.)
I.e., share a formal ontology, including a full upper ontology, large portions of a middle ontology, and relevant slivers of a lower (domain-dependent) ontology.

