Developing an Ontology-based Metadata Management System for Heterogeneous Clinical Databases By Quddus Chong Winter 2002.



Outline
- Towards a clinical data warehouse
- Integrating heterogeneous data sources
- Clinical abstractions as Ontologies
- Managing database metadata
- The data mediator approach
- Using Protégé-2000

Towards a Clinical Data Warehouse
- Clinical data warehousing applies data warehousing concepts so that clinical data about a large patient population can be analyzed for clinical quality management and medical research.
- In a data warehouse environment, data has the following properties:
  - Data is organized by subject, or domain-level concepts, rather than by function.
  - Data from various operational systems is integrated, by definition or by content.
  - Data is archived in non-volatile storage to allow temporal analysis.
  - Data is recorded with a temporal dimension (e.g. a timestamp).
  - Data is optimized for decision making (DSS) or analysis (OLAP).

Integrating Heterogeneous Data Sources
- The main challenge in integrating data from heterogeneous sources is resolving schema and data conflicts.
- Approaches to this problem include using a federated database architecture or providing a multi-database interface. These approaches are geared more towards providing query access to the data sources than towards supporting analysis.
- Types of data integration:
  - Physical integration: convert records from heterogeneous data sources into a common format (e.g. XML).
  - Logical integration: relate all data to a common process model (e.g. a medical service like ‘diagnose patient’ or ‘analyze outcomes’).
  - Semantic integration: allow cross-referencing of, and possibly inferencing over, data with regard to a common metadata standard or ontology (e.g. HL7 RIM, DAML+OIL).
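Physical integration, as described above, can be illustrated with a minimal sketch. It assumes two hypothetical sources (a comma-separated flat file and a relational row already fetched as a column/value mapping) and an invented `<record>`/`<field>` XML layout; none of these names come from the presentation.

```python
import csv
import io
import xml.etree.ElementTree as ET


def record_to_xml(source: str, record: dict) -> ET.Element:
    """Wrap one source record in a common <record> element (hypothetical layout)."""
    elem = ET.Element("record", {"source": source})
    for name, value in record.items():
        child = ET.SubElement(elem, "field", {"name": name})
        child.text = str(value)
    return elem


# Hypothetical flat-file source: comma-separated text with a header row.
flat_file = io.StringIO("patient_id,wt_lb\n17,154\n")
flat_records = list(csv.DictReader(flat_file))

# Hypothetical relational source: one row as a column -> value mapping.
relational_row = {"pid": 17, "weight_kg": 69.9}

# Collect records from both sources under one common root element.
root = ET.Element("records")
for rec in flat_records:
    root.append(record_to_xml("flat_file", rec))
root.append(record_to_xml("relational", relational_row))

print(ET.tostring(root, encoding="unicode"))
```

Both records now share one format, but the field names and units (`wt_lb` versus `weight_kg`) still disagree. Resolving that kind of conflict is the job of logical and semantic integration.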

Clinical abstractions as Ontologies
- An ontology is an explicit specification of the conceptualization of a domain. Information models (such as the HL7 RIM) and standardized vocabularies (such as UMLS) can be part of an ontology. An ontology is a core component of a knowledge-based system.
- In the clinical research field, ontologies have been used in computerized guideline modeling. This allows the development of applications that provide recommendations (e.g. indications for the use of surgical procedures), identify deviations in practice, and offer screening services (e.g. evaluating patient eligibility).
- Benefits of using ontologies include:
  - Facilitating sharing between systems and reuse of knowledge
  - Aiding new knowledge acquisition
  - Improving the verification and validation of knowledge-based systems

Managing database metadata
- Metadata is the detailed description of the instance data: the format and characteristics of the populated instance data. Which instances and values it covers depends on the requirements and role of the metadata recipient.
- Metadata is used in locating information, interpreting information, and integrating/transforming data.
- Maintaining a well-organized and up-to-date collection of the organization’s metadata is a major step towards improving overall data quality and usage. However, this task is complicated by the varying quality and formats of the metadata available (or not) from the heterogeneous data sources, and by inconsistency in how existing metadata is updated.
- A common metadata architecture is essential to keeping data manageable.

The Data Mediator approach
- In this project, we will attempt to develop an extensible and adaptable architecture that integrates heterogeneous data sources into a data warehouse environment using an ontology-based data mediator approach.
- The components of this architecture include:
  - Knowledge base: stores the ontology, which consists of:
    - The abstraction model: domain-level concepts
    - The database description model: metadata record of the data sources
    - The mappings model: how data elements relate to attributes in the abstraction model
    - The transformations model: metadata of the available methods to transform data elements from one data source to another
  - Data mediators: provide each data source with an interface to the warehouse and resolve data conflicts between different representations; the necessary classes are generated from the ontology.
  - Data warehouse: provides access to integrated data for analysis and decision-making.
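The sketch below shows one way the four knowledge-base models and a data mediator could fit together. It is an illustration under stated assumptions, not the project's implementation: the `KnowledgeBase` layout, the `mediate` function, the source name `clinic_a`, and the `lb_to_kg` transformation are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class KnowledgeBase:
    # Abstraction model: domain-level concept -> its attributes.
    abstractions: Dict[str, Tuple[str, ...]] = field(default_factory=dict)
    # Database description model: source name -> its data elements.
    descriptions: Dict[str, Tuple[str, ...]] = field(default_factory=dict)
    # Mappings model: (source, element) -> (attribute, transformation name or None).
    mappings: Dict[Tuple[str, str], Tuple[str, Optional[str]]] = field(default_factory=dict)
    # Transformations model: transformation name -> conversion function.
    transformations: Dict[str, Callable[[Any], Any]] = field(default_factory=dict)


def mediate(kb: KnowledgeBase, source: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one source record into abstraction-level attributes."""
    result: Dict[str, Any] = {}
    for element, value in record.items():
        mapping = kb.mappings.get((source, element))
        if mapping is None:
            continue  # element has no mapping, so it is left out of the result
        attribute, transform_name = mapping
        if transform_name is not None:
            value = kb.transformations[transform_name](value)
        result[attribute] = value
    return result


# Hypothetical knowledge base for a single source that stores weight in pounds.
kb = KnowledgeBase(
    abstractions={"Patient": ("patient_id", "weight_kg")},
    descriptions={"clinic_a": ("pid", "wt_lb", "room")},
    mappings={
        ("clinic_a", "pid"): ("patient_id", None),
        ("clinic_a", "wt_lb"): ("weight_kg", "lb_to_kg"),
    },
    transformations={"lb_to_kg": lambda lb: round(float(lb) * 0.45359237, 1)},
)

integrated = mediate(kb, "clinic_a", {"pid": 17, "wt_lb": 154, "room": "B2"})
print(integrated)
```

Keeping the mappings and transformations as data rather than code is what would let a new source be added by extending the knowledge base instead of rewriting the mediator.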

Patient model (adapted from SMI Dharma model)
- The patient-data information model defines the classes and attributes of patient data for an Electronic Patient Record (EPR).
- The patient-data model consists of:
  - a Patient class whose instances hold demographic information about specific patients
  - a Note_Entry class that describes qualitative observations about patients
  - a Numeric_Entry class that represents results of quantitative measurements
  - an Adverse_Event class that models adverse reactions to specific substances
  - a Condition class that represents medical conditions that persist over time
  - two intervention classes, Medication and Procedure, that model drugs and other medical procedures that have been recommended, authorized, or used
- The defining characteristic of entities in the patient-data model is that they are assertions about demographic and clinical conditions of specific patients.
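As a rough illustration, the classes listed above could be written as Python dataclasses. Only the class names come from the slide. Every attribute, and the `PatientAssertion` base class, is an assumption made for this sketch rather than part of the Dharma model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    """Demographic information about one patient (attributes are assumed)."""
    patient_id: str
    name: str
    birth_date: date


@dataclass
class PatientAssertion:
    """Hypothetical base class: each entry is an assertion about one patient."""
    patient_id: str
    recorded_on: date


@dataclass
class Note_Entry(PatientAssertion):
    text: str  # qualitative observation


@dataclass
class Numeric_Entry(PatientAssertion):
    quantity: str  # what was measured
    value: float
    unit: str


@dataclass
class Adverse_Event(PatientAssertion):
    substance: str
    reaction: str


@dataclass
class Condition(PatientAssertion):
    name: str
    end_date: Optional[date] = None  # None while the condition persists


@dataclass
class Medication(PatientAssertion):
    drug: str
    status: str  # "recommended", "authorized", or "used"


@dataclass
class Procedure(PatientAssertion):
    name: str
    status: str  # "recommended", "authorized", or "used"


# Example instances for a single hypothetical patient.
patient = Patient("17", "Jane Doe", date(1960, 5, 1))
weight = Numeric_Entry("17", date(2002, 1, 15), "weight", 69.9, "kg")
allergy = Adverse_Event("17", date(2002, 1, 15), "penicillin", "rash")
print(patient, weight, allergy, sep="\n")
```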

Database metadata model (adapted from Critchlow et al.)
- The metadata model here contains the information needed for the data integration process.
- The database description model contains language-independent class definitions that closely mirror the physical layout of a source database. In our prototype model, the database description is simply a class containing a set of database entries. A model is provided for two distinct entry types: field-entries (from flat-file data sources) and column-entries (from relational data sources). Entries are essentially instances of the attribute class.
- Modeling the database metadata as an ontology provides flexibility when describing heterogeneous data sources. For instance, the model can easily be extended to describe native XML databases.
- How the models are used in data integration:
  - The source database attributes are mapped to the appropriate abstraction characteristic through mappings.
  - When an abstraction defines multiple representations for the same characteristic attribute, transformation functions are defined to convert between them.
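A minimal sketch of the database description model follows. The entry kinds (field-entry, column-entry) and the idea of extending the model to XML sources come from the slide. The class layout, the attribute names, and the `XPathEntry` extension are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Attribute:
    """Base class: one described data element of a source database."""
    name: str
    data_type: str


@dataclass(frozen=True)
class FieldEntry(Attribute):
    """Entry for a flat-file source, assumed here to be fixed-width."""
    start: int   # zero-based offset of the field within a line
    length: int  # field width in characters

    def extract(self, line: str) -> str:
        return line[self.start:self.start + self.length].strip()


@dataclass(frozen=True)
class ColumnEntry(Attribute):
    """Entry for a relational source."""
    table: str
    nullable: bool = True


@dataclass(frozen=True)
class XPathEntry(Attribute):
    """Hypothetical extension for a native XML database."""
    xpath: str


@dataclass
class DatabaseDescription:
    """A source database described as a collection of entries."""
    source: str
    entries: List[Attribute] = field(default_factory=list)

    def entry_names(self) -> List[str]:
        return [entry.name for entry in self.entries]


flat = DatabaseDescription("clinic_flat_file", [
    FieldEntry("pid", "int", start=0, length=5),
    FieldEntry("wt_lb", "int", start=5, length=3),
])
relational = DatabaseDescription("clinic_rdbms", [
    ColumnEntry("weight_kg", "REAL", table="patient"),
])

line = "00017154"
print({entry.name: entry.extract(line) for entry in flat.entries})
```

Adding `XPathEntry` required no change to `DatabaseDescription`, which is the kind of extensibility the slide attributes to modeling metadata as an ontology.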

A prototype architecture
Diagram components:
- Source db 1 (Relational DBMS, e.g. MySQL)
- Source db 2 (Object-Relational DBMS, e.g. PostgreSQL)
- Target db (Data Warehouse environment, e.g. SQL Server)
- Mediator Interface 1, Mediator Interface 2, Warehouse Mediator
- Ontology Server: Abstractions, Data Descriptions, Data Mappings, Transformation Descriptions
Notes:
* possible use of JDBC metadata to obtain db descriptions
* possible use of XSLT to perform data transformations
* alternatively, a common metadata exchange standard such as XMI could be used
* abstraction model in the ontology is extensible to any domain
* XML data binding could be used to generate APIs for data validation or transformation
* key goal: develop the ontology server as a component, use EJB or .NET
* ontologies can be created and modified via Protégé-2000 tool; underlying format is RDF
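The note about obtaining database descriptions from JDBC metadata can be illustrated by reading a database catalog programmatically. The sketch below swaps in Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info` in place of JDBC (in Java, `java.sql.DatabaseMetaData.getColumns` plays a similar role). The `patient` table and the dictionary layout of each column-entry are hypothetical.

```python
import sqlite3
from typing import Dict, List


def describe_table(conn: sqlite3.Connection, table: str) -> List[Dict[str, object]]:
    """Build one column-entry per column from the database catalog."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [
        {"table": table, "column": name, "type": col_type, "nullable": not notnull}
        for _cid, name, col_type, notnull, _default, _pk in rows
    ]


def describe_database(conn: sqlite3.Connection) -> List[Dict[str, object]]:
    """Describe every user table listed in the catalog."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return [entry for table in tables for entry in describe_table(conn, table)]


# Hypothetical source database, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (pid INTEGER PRIMARY KEY, name TEXT NOT NULL, weight_kg REAL)"
)

column_entries = describe_database(conn)
for entry in column_entries:
    print(entry)
```

Entries gathered this way could populate the Data Descriptions part of the ontology server without hand-writing a description for each relational source.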

Using Protégé-2000
- Protégé-2000 is an experimental knowledge-acquisition tool, written in Java, that allows users to import, export and create their own ontologies.
- The tool itself is extensible; a developer kit with instructions on creating plug-ins is available. Plug-in types:
  - ‘tabs’: user interface between an ontology model in Protégé and another knowledge-based application.
  - ‘slot-widget’: user interface for viewing and acquiring slot values for new instances.
  - backend plug-ins: specify the mechanism that Protégé-2000 will use to store the ontology.

Screenshot: Creating the classes and slots of an ontology

Screenshot: Viewing the newly created ontology model

References
- Pedersen T. B., Jensen C. S., “Research Issues in Clinical Data Warehousing”, in Proceedings of the 10th International Conference on Scientific and Statistical Database Management, July 1998 (available online).
- Critchlow T., Ganesh M., Musick R., “Meta-Data Based Mediator Generation”, in Proceedings of the 3rd IFCIS Conference on Cooperative Information Systems, August 1998 (available online).
- Tu S. et al., “A Flexible Approach to Guideline Modeling”, AMIA Annual Symposium, 1999 (available online: web.stanford.edu/pubs/SMI_Abstracts/SMI html).