Published by Oswin Malone. Modified over 9 years ago.
1
Modular Ontology architecture for using human defined sets of concepts
Presentation by OntologyStream Inc
Paul Stephen Prueitt, PhD
Ontology Tutorial 5, copyright Paul S Prueitt, 2005
2
The best example of an ontology is the set of positive integers
Diagram labels: Set of positive integers; Instances in the world where the concepts of a counting number are essential; Mathematical models of natural systems; Arrow of time; Geographical positions; Accounting; Quantitative measurement
3
Diagram labels: Set of positive integers; Instances in the world where the concepts of a counting number are essential
The concept of an integer is used without any specific use affecting the definition of the concept (of “two-ness”, for example).
The existence of this set of concepts allows a great diversity of human activities.
The “ontology standard” is enforced by the correctness of the concepts and by the ease with which new applications can be found.
The standard is ultra-stable and resilient because the concepts are correct.
The standard is not owned by anyone.
4
Modular ontology is used to measure the properties of events with sets of concepts.
Diagram labels: processes; Semantic extraction; Discrete analysis; { e(i) }; { w(i) }; { s(i) }
Notation: e(i) = w(i)/s(i). The measurement of an event has a weakly structured part and a structured part.
Events occur in a real world as part of complex processes. Largely because events are seen as having patterns and structure, software engineers can build relational databases or XML repositories to help us understand and interact with information that is situation specific.
With ontology, human communities will be able to reveal a set of concepts and define regular relationships between concepts. We call this “Ontology mediation of information flow”. The formal representations of the concepts are used to organize data and to move data from one place to another. This has to be demonstrated.
We will illustrate Ontology mediation of information flow with an example: the development and use of Harmonized Tariff Schedule (HTS) Administrative Rulings. An HTS Administrative Ruling is a short public document that ties a commodity to the code used to determine duties on imported or exported commodities.
A second example is suggested, in which Selectivity and Targeting reports are seen as measurements of selectivity and targeting events by Customs and Border Protection.
5
Diagram labels: processes; Semantic extraction; Ontology Framework
A framework holds a higher level abstraction representing an analysis of how things follow each other.
Example: the event-Structure Ontology Framework (e-SOF) has 18 cells developed from the cross product of three dimensions: { past, present, future }; { people, places, things }; { how, why }
Example: the risk/gains Ontology Framework (rg-OF) has 40 cells developed from the cross product of three dimensions: { Risk, Gain }; { Anomaly, Trend }; and ten concepts in five pairs, { measurement/assessment, name/group, event/context, rule/policy, component/(function/behavior) }
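The cell counts quoted for the two frameworks can be checked by enumerating the cross products. The following is a minimal sketch, not part of the original presentation; the variable names are illustrative, and the third rg-OF dimension is taken to be the ten individual concepts listed on the Gain/Risk framework slide.

```python
from itertools import product

# e-SOF dimensions, as listed on the slide.
TIME = ["past", "present", "future"]
ENTITY = ["people", "places", "things"]
MODE = ["how", "why"]

# rg-OF dimensions. The ten concepts are assumed to be the individual
# items listed on the Gain/Risk framework slide.
OUTCOME = ["Risk", "Gain"]
PATTERN = ["Anomaly", "Trend"]
CONCEPTS = [
    "measurement", "assessment", "name", "group", "event",
    "context", "rule", "policy", "component", "function/behavior",
]

# Each cell of a framework is one tuple from the cross product.
e_sof_cells = list(product(TIME, ENTITY, MODE))
rg_of_cells = list(product(OUTCOME, PATTERN, CONCEPTS))

print(len(e_sof_cells))  # 3 * 3 * 2 = 18
print(len(rg_of_cells))  # 2 * 2 * 10 = 40
```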
6
Diagram labels: processes; Explicit ontology such as OWL DL; Ontology Framework
By aligning the internal (implicit) set of concepts in a semantic extraction computation with the explicit form of concept representation provided by the OWL DL standard, one is able to organize information expressed as concepts in free form text.
One is able to use lookup tables, lists, controlled vocabularies and taxonomies to expand the statement of these conceptual expressions so that the expression is as clear, complete and consistent as possible.
One is able to move the information from a single event into a computational space where specific structure is available to bring relevant information to the report development process.
One is able to create, after the fact, a better report about an event, such as an administrative ruling or a selectivity and targeting action.
One is able to develop long term trending and analogy detection using specific information about how things are related to each other in the real world.
7
A modular ontology management infrastructure provides various services in the context of field reporting over transactions
Diagram labels: upper level ontologies; “other” upper level ontology; Law governing US Customs; Advanced Trade Data; Economic Supply Chain Data; Findings ontology; Entities ontology; Gain/Risk ontology; sources of data; Location ontology; Later application areas; HTS Ontology
8
Diagram labels: Written reports; Structural Event Ontology Framework; Gain / Risk Ontology Framework
In our work, human knowledge is captured separately in two computable forms: implicit (semantic extraction ontology) and explicit (OWL DL ontology).
9
Structural Event Ontology Framework
{ who, where, what, how, why } x { past, present, future }
The six classical interrogatives, dating from Greek times, are partitioned into three parts: { people, places, things } + { event structure with causality } + time
Diagram labels: { people, places, things }; event structure
18 questions from frames: (past, who, how), (past, who, why), (present, who, how), (present, who, why), (future, who, how), (future, who, why), etc.
event Structure Ontology Framework (e-SOF) **
** e-SOF was “discovered” by Dr. Paul S. Prueitt while thinking about a US Customs ontology prototype in March 2005.
10
Diagram labels: Ontology Framework; Ontology Reasoner; Scoped Ontology Individuals; Knowledge Management visualization; Knowledge Engineer visualization
By internally adjusting the rules within any one of the commercially available semantic extraction (implicit) ontologies, we measure text, or structured data in a single record, using a three element frame ( y, x, z )
where x is from the set { people, places, things }
where y is from the set { past, present, future }
and where z is from the set { how, why }
There are 3*3*2 = 18 of these three element frames, each of which can be seen to ask a question. The measurement uses linguistic and structural knowledge to answer those questions that can be answered. Those that are not answered are left blank.
Other semantic extraction tools can be similarly manipulated to produce an alignment between internal ontology (not often OWL) and external OWL DL ontology (which is our standard).
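One way to picture such a measurement is as a table of 18 cells keyed by the frame ( y, x, z ), where unanswered cells stay blank. The sketch below is an assumption about a convenient data structure, not the representation used by any particular extraction tool; the function names and the sample answer string are hypothetical.

```python
from itertools import product

TIME = ("past", "present", "future")      # y
ENTITY = ("people", "places", "things")   # x
MODE = ("how", "why")                     # z

def blank_measurement():
    """Return all 18 frames (y, x, z), each initially unanswered (None)."""
    return {frame: None for frame in product(TIME, ENTITY, MODE)}

def record(measurement, y, x, z, answer):
    """Store an answer for one frame; reject frames outside the framework."""
    frame = (y, x, z)
    if frame not in measurement:
        raise KeyError(f"not a valid frame: {frame}")
    measurement[frame] = answer

# Hypothetical usage: one question is answered, the other 17 stay blank.
m = blank_measurement()
record(m, "future", "people", "why", "example text span")
answered = {frame: a for frame, a in m.items() if a is not None}
print(len(m), len(answered))
```

Keeping blank cells explicit makes it easy to see which of the 18 questions a given report leaves open.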
11
Diagram labels: High Risk Ontology Expression; Bio-systems; Weapon-systems; Commodity history analysis; Entry Reports and Findings; { concepts }; Ontology Framework with Differential Ontology Expressions; informs; aligns; Ontology expression about the risks measured from historical analysis of commodities; US Customs cultural viewpoints expressed as sets of concepts; Shipping manifests; Entity histories
12
Rapid knowledge acquisition and reporting about a transaction
Diagram labels: High Risk Ontology Expression; Bio-systems; Weapon-systems; Commodity history analysis; Entry Reports and Findings; { concepts }; Ontology expression about the risks measured from historical analysis of commodities; US Customs cultural viewpoints expressed as sets of concepts; Semi-automated generation of Reports
A transaction: Nautilus Explorer (“Nautilus”) owns and operates the M/V NAUTILUS EXPLORER, a 116-foot Canadian-flagged long-range dive boat. Nautilus would like to embark passengers in San Diego, California, on two separate occasions, for three days of diving in Mexican waters before returning to San Diego. The passengers would be embarked and disembarked at the same location in San Diego.
13
risk/gains Ontology Framework (rg-OF) **
We take the first two dimensions of the framework to be { Anomaly, Trend } and { Gain, Risk }, and the other dimension to be:
{ measurement, assessment, name, group, event, context, rule, policy, component, function/behavior }
Then, in the cross product, we have four sets of ten concepts (2 x 2 x 10 = 40 cells). In fact the ten concepts are five sets of two concepts, each with an interesting “oppositional scale type” relationship.
** This Gain/Risk Ontology Framework was “discovered” by Dr Prueitt in March 2005 while thinking about possible US Customs Selectivity and Targeting enhancements. Dr Peter Stephenson and Dr Prueitt are extending this in the context of Cyber Security ontology mediation data analysis.
14
Detailed work with tools over available data
Diagram labels: Semantic Extraction; Link Analysis; Pattern recognition; Ontology Tools; Statistics; Advanced Trade Data; Harmonized Tariff Schedule
Practical problem: Provide the three Cs (clarity, consistency, and completeness) in EACH judicial review of a commodity in passage across national borders.
Integrated collection of reified ontologies with some specific inferences and some information organization and retrieval
Possible deployment as a U.S. Customs Total Information Awareness (TIA) capability
15
Diagram labels: Data Transfer Object; (SOI) Scoped Ontology Individual; Transactions; Findings; Entry; Entry Summary; Script; SOI pushes information; Portal pulls information; databases; Script pulls information; Ontology Framework; Ontology Reasoner; Scoped Ontology Individuals; Human machine interface; Knowledge Management visualization; Knowledge Engineer visualization; client visualization; An event
Ontology Individuals have a subsumption relationship to upper abstract ontologies
16
SOI design by-passes the critical “visualization” choke point
Diagram labels: Scoped Ontology Individuals; Human machine interface; SOI; Stack of SOIs supporting analysis of analysis; Ontology Framework; Ontology reasoning
The mental event is the model for the Scoped Ontology Individual (SOI). The SOI is a minimal formal ontology (defined in OWL DL) that binds the concepts and data together about a single event.
The Framework’s small number of concepts organizes everything that is known about the data elements that occur in a Harmonized Tariff Schedule administrative ruling.
Once the data elements have been used as the initial conditions for SOI formation, additional SQL queries or additional ontology subsetting may be made so as to bring new information, or information that was not initially known, “into the visualized frame”.
17
Visualization of ontology
Diagram labels: Scoped Ontology Individuals; Human machine interface; SOI; Stack of SOIs supporting analysis of analysis; Ontology Framework; Ontology reasoning
The concept of a Scoped Ontology Individual (SOI) opens up a visualization paradigm that has never been exposed before (it is an original concept that is based on decades of work in cognitive neuroscience).
SOI design by-passes the critical “visualization” choke point that occurs when Ontology Systems are built on the relational database model (as is done in our ontology augmentation of rule engines).
This by-pass is created when data elements in a report are used to subset upper ontologies and domain ontologies to produce the minimal set of “concepts” needed to frame the data. If Framework Ontology is being used, then this subsetting process has an expansion / contraction cycle that produces very small SOI objects. (see previous slide)
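The subsetting step can be sketched as follows, under strong simplifying assumptions: a plain Python dictionary of child-to-parent links stands in for an OWL DL upper ontology, and the concept names, the `build_soi` function, and the mapping from data elements to concepts are all hypothetical. The sketch only shows how the data elements of one event select a small set of concepts plus their subsuming ancestors; it does not model the expansion / contraction cycle.

```python
# Hypothetical toy ontology: each concept maps to its parent concept
# (None marks the root). A real system would use OWL DL instead.
PARENT = {
    "Thing": None,
    "Vessel": "Thing",
    "Location": "Thing",
    "Port": "Location",
    "Organization": "Thing",
}

def subset_ontology(parent, concepts):
    """Keep only the given concepts and the ancestors that subsume them."""
    keep = set()
    for concept in concepts:
        # Walk up the subsumption chain until the root or a kept concept.
        while concept is not None and concept not in keep:
            keep.add(concept)
            concept = parent[concept]
    return {c: parent[c] for c in keep}

def build_soi(parent, data_elements):
    """Bind the data elements of one event to the minimal concept set.

    data_elements maps each data value to the concept it instantiates.
    """
    concepts = set(data_elements.values())
    return {
        "concepts": subset_ontology(parent, concepts),
        "data": dict(data_elements),
    }

# Data elements taken from the transaction text; their concept
# assignments are illustrative.
soi = build_soi(PARENT, {"M/V NAUTILUS EXPLORER": "Vessel", "San Diego": "Port"})
print(sorted(soi["concepts"]))  # ['Location', 'Port', 'Thing', 'Vessel']
```

Concepts that no data element touches (here `Organization`) never enter the SOI, which is what keeps the object small.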
18
Diagram labels: Ontology Framework; Ontology Reasoner; Scoped Ontology Individuals; Knowledge Management visualization; Readware; Customs analyst
MITi Inc and InOrb Technologies have teamed to develop a demonstration capability based on the use of the Readware internal ontology API to create text elements that populate the 18 cells of the e-SOF.
We use the triple ( y, x, z )
where x is from the set { people, places, things }
where y is from the set { past, present, future }
and where z is from the set { how, why }
This involves three steps:
1) Coding eight probes that use the internal Readware stem-based text understanding computations to find information and classify this information as answers to people, places, things, past, present, future, how or why questions.
2) There are some options, but the one we are investigating first is to use the People, Places and Things probes first. This is a well-known “named entity extraction” approach.
3) Then, when one of these three probes “finds” something, the local neighborhood (in the Readware stem structure) is examined to see if one or more of the 18 questions can be answered.
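The three steps can be illustrated with a deliberately crude sketch. It does not use the Readware API, whose calls are not described here; plain keyword sets stand in for the eight probes, and a window of neighboring tokens stands in for the local neighborhood in the stem structure. All names, keyword lists, and the sample sentence are assumptions made for illustration.

```python
import re

# Step 1: eight probes, here approximated by hypothetical keyword sets.
ENTITY_PROBES = {
    "people": {"passengers", "analyst"},
    "places": {"diego", "california"},
    "things": {"boat", "vessel"},
}
TIME_PROBES = {
    "past": {"was", "were", "had"},
    "present": {"is", "are", "owns", "operates"},
    "future": {"will", "would"},
}
MODE_PROBES = {
    "how": {"by", "using", "via"},
    "why": {"because", "for"},
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def match(probes, tokens):
    """Return the labels of all probes that fire on the given tokens."""
    present = set(tokens)
    return [label for label, words in probes.items() if words & present]

def measure(text, window=5):
    """Fill e-SOF frames (y, x, z) for one piece of text."""
    tokens = tokenize(text)
    frames = {}
    for i, token in enumerate(tokens):
        # Step 2: run the people / places / things probes first.
        for x, words in ENTITY_PROBES.items():
            if token not in words:
                continue
            # Step 3: examine the local neighborhood of the hit.
            hood = tokens[max(0, i - window): i + window + 1]
            for y in match(TIME_PROBES, hood):
                for z in match(MODE_PROBES, hood):
                    frames.setdefault((y, x, z), " ".join(hood))
    return frames

sample = "Nautilus would like to embark passengers in San Diego for three days of diving."
frames = measure(sample)
print(sorted(frames))  # [('future', 'people', 'why')]
```

Frames for which no time or how/why evidence appears near an entity hit are simply absent, matching the rule that unanswered questions are left blank.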
19
The other choke point is dependency on a relational database
Diagram labels: Data Transfer Object; (SOI) Scoped Ontology Individual; Transactions; Findings; Entry; Entry Summary; Script; databases; Script pulls information; ILOG Rulebase Reasoner; An event; Ontology Augmentation of a rule based engine
20
For complex reasons, demonstrations of how to use ontology have often used a fixed data set with doctored data to pretend that scalability issues have been solved or are not relevant. These demonstrations fall far short of correctness and hide specific known weaknesses of classical IT architecture.
The scalability issue comes from the need to extend ontology or XML: to add, delete or modify concepts. These extension requirements come from many different origins and different communities of practice, and arise as circumstances change. Extensibility is the key contribution that XML has brought.
For example, without a common data encoding paradigm, the scalability issue creates a second choke point. The relational database must have a fixed data schema.
The work on such a solution is under the XML MetaData Repository standards process: http://hpcrd.lbl.gov/SDM/XMDR/arch/
XMDR, RDF or OWL DL may, or may not, solve this problem. Modular ontology helps, but the principles developed in differential ontology, formative ontology and Framework Ontology seem essential to solving the whole problem as completely as possible. With these approaches, we find by-passes to technology problems that are seen now by the XMDR standards committee as being unsolvable. The definition of an event specific Scoped Ontology Individual is one of those by-passes.
On the relational database dependency
21
There are some existing software products (Convera, AeroText, MITi, Semagix, Autonomy, and others) where a common data encoding solution exists. A data encoding solution is generally protected by patents, and is used to provide computational efficiency; one of the best examples is PriMentia's Hilbert engine, where a key-less hash table type data encoding allows contextual search in the most natural fashion. Autonomy also has the technology that Michael Lynch developed in the Autonomy spin-off N-Corp. Semagix, Applied Technical Systems, and 15 or 20 others have excellent data encoding solutions.
If a government agency selected the two or three best technologies, communication between the internal representations would be required. This may or may not be easy, depending on the specific technologies.
In Summary: These software products create an integration of classically understood methods using a common data encoding. Each COTS product uses a different internal data representation, and so the use of more than one COTS product will create binding issues.
A modular ontology management architecture can be used to integrate technologies like:
semantic extraction and related knowledge discovery in data technology (implicit ontology)
ontology development and editing (explicit ontology)
advanced algorithms related to risk definition and decision support
visualization technology
22
So government agencies really have two solution paths:
1) Choose one or two vendors after actually understanding what each vendor provides, and create a complete solution with that tool set. This requires an integration architecture.
2) Learn from a Trade Study process what the methods are that make COTS semantic extraction work, move around the patents and other IP, and develop a unique application that is specific to that government agency.
In either case, the greater challenge is the technology transition challenge. If the technology is not a LOT better than the current beta sites and doctored demonstrations, then the transition effort will fail.
But, leaving transition issues aside, let us look closer at these two options.
High level view of integration architecture
23
So we have two solution paths:
1) Choose one vendor after actually understanding what each vendor provides, and create a complete solution with that tool set. But how to select?
CoreSystem CoreOntology first takes on the underlying stability issue by moving forward a design time Iconic language that may revolutionize how society uses computers.
Current generation best of breed technology
The list of possible qualified candidates for offering a complete solution might be less than 20 companies. In many cases, these companies are highly capitalized and would provide stability for some period of time. However, the underlying XML and ontology standards are not stable. One would expect that better global solutions will exist within five years. So one needs to know that the sets of concepts can be exported and transformed as the market matures.
Current generation may not solve all problems in an optimal fashion
Next generation tools are not yet ready to produce systems
24
So we have two solution paths:
2) Learn from a Trade Study process what the methods are that make COTS semantic extraction work, move around the patents and other IP, and develop a unique application that is specific to Customs.
These two diagrams are from OntologyStream Inc. There is no suggestion that this non-capitalized small company has the management skills required to build out an application specifically designed from the principles discussed by Prueitt and his colleagues. So we have sought support and guidance from SAIC or IBM to bring a small team together to develop a government owned system based on these principles and at the smallest possible cost.
25
Summary
Current contractors almost always treat ontology and XML technology as if they were the same as relational database technology.
Current contractors are gaming the contracts so that maximum Time and Materials resources can be expended.
Ontology and XML standards committees struggle with the issues of private intellectual property and hidden agendas.
Ontology visualization by users is required to find optimal solutions consistent with cultural expectations.
Ontology and XML standards have not been able to address ontology visualization or process models that place Ontology and XML into complex work flow.
A single payer entity is needed to bind together the best technology and to resolve IP and philosophical differences.