Data Model vs. Ontology Dr. Tatiana Malyuta Associate Professor, CUNY Consultant for DoD Dr. Barry Smith UB, NCOR.



Data Model - Purpose
To provide a consistent and efficiently functioning data store for one or more particular business applications
– Represents specific business concepts in a way that determines the organization of data in the store
– Commonly used representations are relational and graph; they are supported by data management technologies, e.g. relational: Oracle and MySQL; graph: Neo4j and RDF/OWL stores
Efficiency requires
– Application-specific representations
– Storing only the data needed by the application
An objective (shared) representation of the domain is not the purpose
– Multiple data models exist for the same domain to accommodate different business applications

Data Silos
Numerous partial, idiosyncratic representations of the domain in data models, and numerous versions of data in data stores
– No re-usability
– No single version of truth
Figure labels: Accounts Receivable, Accounts Payable, Budget

Ontology – Purpose
Objectivity of representation of reality
The commonly used representation is a graph, supported by RDF-based semantic technologies
Objective (shared) representation of the domain: one authoritative ontology for the domain of reality, meant for re-use
Storing vast volumes of data is not the purpose

Financial Ontology
A single domain ontology (or a collection of ontologies)
To be re-used in different applications
Single version of truth (as we know it today)
Note: we discuss ontologies built in accordance with the methodology and architecture pioneered by Dr. Smith.

Comparison
Although some technologies support a particular paradigm better than others, technology is not the defining factor in distinguishing between a data model and an ontology
We compare paradigms, not technologies
Figure labels: Ontology, Data Model

Data Model – Types
Types are general or repeatable entities capable of being instantiated by indefinitely many particulars
Data model types and instances are abstractions embodying efficient ways of describing the data about reality that is needed by an application (efficient both for reasoning and for storage)
– Different abstractions depending on the business need
The data model term ‘person’ is used to define an efficient storage solution for data about persons needed by a particular application

Ontology – Types
Ontology types and instances are on the side of reality
They must provide one term, and one definition, for each salient type of entity in each domain of interest
The ontology term ‘person’, when it is used to represent data about persons, is designed to establish a link between these data and persons in reality.

Data Model – Organization
Arbitrary combination of selected types suited for efficient data processing
The data model view of reality is flat and rigid
One of the example Person models would need to be changed to accommodate multiple skills of a person. Such changes require significant effort because of the relative rigidity of data representation languages and the need to re-arrange the physical data store

Ontology - Organization
Each type appears only once in the ontology hierarchy.
The ontology view of reality is synoptic: it represents, in non-redundant fashion, an entire hierarchy of types at different levels of generality.
Each term is associated in an intelligible way with its subsuming and subsumed terms (and thus with the ancestor and descendant types) in the hierarchy of more and less general types
Representation is more flexible, changes are easier to make, and changes are not as disruptive

Questions?

Data Model vs. Ontology – Types and Individuals

Person (data model, first table)
Name    Skill
John    Computer Skill
Mary    Sewing Skill

Skill (ontology hierarchy)
Skill
  Computer Skill
    Programming Skill
      Java
      C++

Person (data model, second table)
Name    Skill
John    Java
Mary    C++
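
The contrast between the flat Person tables and the Skill hierarchy can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the original slides: the names `PARENT`, `PERSON_TABLE`, `ancestors`, and `persons_with` are hypothetical, and placing Sewing Skill directly under Skill is an assumption.

```python
# Hypothetical encoding of the Skill hierarchy from the slide:
# each type points to its single parent type (None marks the root).
PARENT = {
    "Skill": None,
    "Computer Skill": "Skill",
    "Sewing Skill": "Skill",  # assumed placement, not shown on the slide
    "Programming Skill": "Computer Skill",
    "Java": "Programming Skill",
    "C++": "Programming Skill",
}

# Flat data-model table: one skill value per person, no hierarchy.
PERSON_TABLE = [("John", "Java"), ("Mary", "C++")]


def ancestors(term):
    """Return the types that subsume `term`, nearest first."""
    result = []
    parent = PARENT[term]
    while parent is not None:
        result.append(parent)
        parent = PARENT[parent]
    return result


def persons_with(skill_type):
    """Find persons whose recorded skill is `skill_type` or a subtype of it."""
    return [
        name
        for name, skill in PERSON_TABLE
        if skill == skill_type or skill_type in ancestors(skill)
    ]


if __name__ == "__main__":
    # A plain equality match on the table finds nobody with 'Computer Skill'.
    print([n for n, s in PERSON_TABLE if s == "Computer Skill"])
    # A query that walks the hierarchy finds both persons.
    print(persons_with("Computer Skill"))
```

Because each type appears once and knows its parent, a question asked at any level of generality can be answered without restructuring the table.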

Data Model – Labels
Are not as important because databases are not directly exposed to users; they are presented via an application that exposes the database content using the specific vocabulary of a narrow community of users
Can be anything, e.g. ‘PN’, ‘PName’, ‘PersName’, ‘PersonN’, etc. for the person name
The meaning of the label is often derived from the context (e.g. Name for both the name of the Person and the name of the Skill in one of the examples)

Ontology - Labels
Are exposed to users
Are nouns and noun phrases from natural language; each type has a unique name that designates the type unambiguously regardless of the context in which the type might be used, e.g. PersonName, SkillName

Closed and Open World Assumptions (impact of technologies)
Database reasoning is confined to search based on the closed world assumption. If we do not find something in the database, then this means that this something does not exist in the world that is defined by the database.
Ontologies are based on the idea that we can never describe entities in the real world completely. This means that, from the absence in an ontology of a particular term ‘A’, we cannot infer that As do not exist. It also means that ontologies are constructed in a way that allows easy addition of new types and relations.
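
The two assumptions can be contrasted in a small Python sketch. It is a simplification under stated assumptions: it works on recorded facts rather than ontology terms, and `FACTS`, `Answer`, `holds_closed_world`, and `holds_open_world` are hypothetical names introduced for illustration.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Hypothetical recorded facts: (person, skill) pairs.
FACTS = {("John", "Java"), ("Mary", "C++")}


def holds_closed_world(fact):
    """Closed world: anything that is not recorded is taken to be false."""
    return Answer.TRUE if fact in FACTS else Answer.FALSE


def holds_open_world(fact, known_false=frozenset()):
    """Open world: an absent fact is unknown unless explicitly ruled out."""
    if fact in FACTS:
        return Answer.TRUE
    if fact in known_false:
        return Answer.FALSE
    return Answer.UNKNOWN


if __name__ == "__main__":
    query = ("John", "C++")
    print(holds_closed_world(query))  # Answer.FALSE
    print(holds_open_world(query))    # Answer.UNKNOWN
```

The open-world variant needs a third answer, which is why absence alone never licenses a negative conclusion.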

Life Span
Data models are created in ad hoc ways to capture a targeted selection of features; a data model is usually not reused, which results in numerous data silos for a domain
Ontologies grow and expand as new knowledge is gained over time

Summary of Comparison

Closeness to reality
– Traditional data model: Variable, application-specific
– Ontologies: Reality is always the prime focus

Conceptualization of the domain
– Traditional data model: Plain and partial (always at the level of detail needed for a particular implementation)
– Ontologies: Hierarchical, simultaneously describing the same domain at different levels of detail

Vocabulary
– Traditional data model: Application-specific, not intended for sharing
– Ontologies: Application-independent, intended to support sharing and reuse

Structures or organization of types
– Traditional data model: Groupings of types to accommodate data access patterns
– Ontologies: Taxonomies (type hierarchies) always used to describe/classify the domain

Combinability
– Traditional data model: Can rarely be combined; even if possible, this will typically require significant manual effort
– Ontologies: If the ontology building methodology is followed, then the results will be combinable automatically

Flexibility
– Traditional data model: Rigid, changes normally require significant effort
– Ontologies: Flexible, changes can normally be effected very easily

Semantic Enhancement of Data Models by Ontology
Semantic Enhancement (SE) is realized with the help of ontologies that are used to explicate data models and annotate data instances
– The vocabulary of the ontologies used for explications and annotations provides agile horizontal integration
– Ontologies, by virtue of their nature and organization, provide semantic enhancement of data

Example table
PersonID    Name    Description
111         Java    Programming
222         SQL     Database

Ontology terms shown in the figure: SQL, Java, C++; ProgrammingSkill; ComputerSkill; Skill; Education; Technical Education

The Meaning of ‘Enhancement’
Semantic enhancement/enrichment of data = arm’s length approach (no change to data)
– Through simple explication we associate an entire knowledge system with a database field
– This enables analytics to process data, e.g. about computer skills, “vertically” along the Skill hierarchy, as well as “horizontally” via relations between Skill and Education
– Further, while the data in the database does not change, its analysis can become richer and richer as our understanding of reality changes
For this richness to be leveraged by different communities, persons, and applications, the ontology needs to have the properties mentioned above and be constructed in accordance with the principles of SE (see References)
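
The arm's length idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the subtype links in `SUBTYPE_OF` are read off the flattened figure, the `ACQUIRED_THROUGH` relation between the Skill and Education branches is hypothetical, and all function names are invented for the example.

```python
# Database rows stay exactly as stored (values taken from the example table).
ROWS = [
    {"PersonID": 111, "Name": "Java", "Description": "Programming"},
    {"PersonID": 222, "Name": "SQL", "Description": "Database"},
]

# Explication kept outside the database: field -> ontology type (assumed).
EXPLICATION = {"Name": "Skill"}

# Fragment of the ontology, subtype -> parent type (assumed from the figure).
SUBTYPE_OF = {
    "Java": "ProgrammingSkill",
    "SQL": "ProgrammingSkill",
    "C++": "ProgrammingSkill",
    "ProgrammingSkill": "ComputerSkill",
    "ComputerSkill": "Skill",
    "Technical Education": "Education",
}

# Hypothetical relation linking the Skill branch to the Education branch.
ACQUIRED_THROUGH = {"ComputerSkill": "Technical Education"}


def lineage(term):
    """Return `term` followed by all of its ancestor types."""
    chain = [term]
    while chain[-1] in SUBTYPE_OF:
        chain.append(SUBTYPE_OF[chain[-1]])
    return chain


def enhance(row):
    """Return an annotated copy of `row`; the stored row is not modified."""
    annotated = dict(row)
    for field, onto_type in EXPLICATION.items():
        types = lineage(row[field])
        if onto_type in types:
            # "Vertical" enrichment along the type hierarchy.
            annotated["types"] = types
            # "Horizontal" enrichment via the assumed Skill-Education relation.
            annotated["education"] = [
                ACQUIRED_THROUGH[t] for t in types if t in ACQUIRED_THROUGH
            ]
    return annotated


if __name__ == "__main__":
    for row in ROWS:
        print(enhance(row))
```

Extending `SUBTYPE_OF` or `ACQUIRED_THROUGH` later enriches every subsequent analysis while `ROWS` stays untouched.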

SE and Data Integration
Traditional integration approaches involve creation of a new model used in
– A new physical store (data warehouse)
    Expensive, resource- and time-consuming
    Another data store: rigid (potential data silo), not interoperable with other stores
– Querying the data sources via it
    Fragile
Both entail loss and/or distortion of data and semantics, and provide only ‘local’ integration (do not lead to interoperability with other sources)
SE of a store
– Does not require data reorganization and creation of another store
– Changes to it are non-intrusive
– Leads to integration of the store with other stores, enhanced previously or in the future
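
As a rough sketch of how SE could integrate stores without building a warehouse, the Python below maps the idiosyncratic labels of two hypothetical stores to shared ontology terms. The store contents, the field names `Sk` and `Competence`, and the helper functions are assumptions for illustration; `PN`, `PersName`, `PersonName`, and `SkillName` are labels mentioned in the slides.

```python
# Two independent stores with their own labels; neither is modified.
STORE_A = [{"PN": "John", "Sk": "Java"}]
STORE_B = [{"PersName": "Mary", "Competence": "C++"}]

# Explications: per-store mapping from local field labels to ontology terms.
EXPLICATIONS = {
    "store_a": {"PN": "PersonName", "Sk": "SkillName"},
    "store_b": {"PersName": "PersonName", "Competence": "SkillName"},
}


def as_ontology_view(store_name, rows):
    """Re-express rows using ontology terms, leaving the rows themselves alone."""
    mapping = EXPLICATIONS[store_name]
    return [
        {mapping[field]: value for field, value in row.items() if field in mapping}
        for row in rows
    ]


def integrated_view(stores):
    """Combine any number of explicated stores into one uniform view."""
    merged = []
    for store_name, rows in stores.items():
        merged.extend(as_ontology_view(store_name, rows))
    return merged


if __name__ == "__main__":
    print(integrated_view({"store_a": STORE_A, "store_b": STORE_B}))
```

Adding a third store later only requires a new entry in `EXPLICATIONS`, which mirrors the claim that a store integrates with others enhanced previously or in the future.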

References
Barry Smith, et al., “IAO-Intel – An Ontology of Information Artifacts in the Intelligence Domain”, STIDS Conference, 2013.
Barry Smith, Tatiana Malyuta, William S. Mandrick, Chia Fu, Kesny Parent, Milan Patel, “Horizontal Integration of Warfighter Intelligence Data: A Shared Semantic Resource for the Intelligence Community”, STIDS Conference, 2012.
Barry Smith, Tatiana Malyuta, David Salmen, William Mandrick, Kesny Parent, Shouvik Bardhan, Jamie Johnson, “Ontology for the Intelligence Analyst”, Crosstalk: The Journal of Defense Software Engineering.
David Salmen, Tatiana Malyuta, Alan Hansen, Shaun Cronen, Barry Smith, “Integration of Intelligence Data through Semantic Enhancement”, STIDS Conference.

Questions?