1 Topic: Identifying the Data Schema behind SNOMED CT Jon Patrick, Centre for Health Informatics Research & Development, University of Sydney Ming Zhang,


1 Topic: Identifying the Data Schema behind SNOMED CT
Jon Patrick, Centre for Health Informatics Research & Development, University of Sydney
Ming Zhang, Donna Truran
National Centre for Classification in Health

2 Outline
- Project description
- Research methodology
- Experiments and results
- Conclusion
- Limitations
- Recommendations for future work

3 Project Description
- Project background: SNOMED CT. The core content is stored in simple tables.
- Project objective: to discover the conceptual model of SNOMED CT by reverse engineering.

4 Research methodology
- Data preparation: transfer the SNOMED CT core content tables into an RDBMS, that is, from text files into MySQL.
- Ontology structure investigation
  - Database querying: explicit characteristics
  - Programming: implicit characteristics
- Data modelling: analysis of the different characteristics and features so as to generate the conceptual data model.
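The data preparation step could look roughly like the sketch below. It is only an illustration: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the sample rows, the column names, and the `load_concepts` helper are all assumptions rather than the layout of the actual release files.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for the tab-delimited concepts text file.
# Column names are assumptions, not the official release layout.
SAMPLE_TEXT = (
    "concept_id\tstatus\tfully_specified_name\tis_primitive\n"
    "1\t0\tRoot concept (hypothetical)\t1\n"
    "2\t0\tClinical finding (finding)\t1\n"
    "3\t0\tExample disorder (disorder)\t0\n"
)


def load_concepts(conn, text_stream):
    """Create a concepts table and bulk-insert rows from a tab-delimited stream."""
    conn.execute(
        "CREATE TABLE concepts ("
        "concept_id INTEGER PRIMARY KEY, status INTEGER, "
        "fully_specified_name TEXT, is_primitive INTEGER)"
    )
    reader = csv.DictReader(text_stream, delimiter="\t")
    rows = [
        (int(r["concept_id"]), int(r["status"]),
         r["fully_specified_name"], int(r["is_primitive"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO concepts VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, discarded on exit
loaded = load_concepts(conn, io.StringIO(SAMPLE_TEXT))
print(loaded, "rows loaded")
```

With a real file, `io.StringIO(SAMPLE_TEXT)` would be replaced by an opened text file; the descriptions and relationships tables could be loaded the same way.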

5 Experiment and Result
- Explicit characteristics of the ontology
  - Original data overview
  - Fully defined and primitive
  - Relationship types
  - Hierarchy structure
  - Multiple inheritance
  - Full structure
- Implicit characteristics of the ontology
  - Classification principles
  - Relationship patterns

6 Original Data model
Three data tables:
- Concepts: one clinical idea is recorded as one concept.
- Descriptions: one clinical idea can have more than one description in this table.
- Relationships: each row represents a relationship between two concepts.
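A minimal sketch of how the three tables relate, again using sqlite3 in place of MySQL. The column names, identifiers, and sample terms are assumptions chosen for illustration and are not copied from the release files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concepts (
    concept_id INTEGER PRIMARY KEY,
    fully_specified_name TEXT
);
CREATE TABLE descriptions (
    description_id INTEGER PRIMARY KEY,
    concept_id INTEGER REFERENCES concepts(concept_id),
    term TEXT
);
-- Each row links two concepts through a relationship type.
CREATE TABLE relationships (
    relationship_id INTEGER PRIMARY KEY,
    concept_id1 INTEGER REFERENCES concepts(concept_id),
    relationship_type TEXT,
    concept_id2 INTEGER REFERENCES concepts(concept_id)
);
""")

# Hypothetical identifiers and terms, for illustration only.
conn.execute("INSERT INTO concepts VALUES (10, 'Myocardial infarction (disorder)')")
conn.executemany(
    "INSERT INTO descriptions VALUES (?, ?, ?)",
    [(100, 10, "Myocardial infarction"), (101, 10, "Heart attack")],
)

# One concept, several descriptions.
terms = [
    row[0]
    for row in conn.execute(
        "SELECT term FROM descriptions WHERE concept_id = ? ORDER BY description_id",
        (10,),
    )
]
print(terms)
```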

7 Fully defined and primitive concepts
- Primitive concepts: a concept is primitive if its defining characteristics are insufficient to define it; that is, it has more content than is indicated by its attributes and relationships, e.g. clinical finding.
- Fully defined concepts: a concept is fully defined if its defining characteristics are sufficient to define it.
- Whether the characteristics are "sufficient" or "insufficient" is determined by SNOMED experts.
- Currently 41244 (11%) concepts are fully defined.
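A count like the one above could be obtained with a single aggregate query. The sketch below assumes a hypothetical `is_primitive` flag column (1 = primitive, 0 = fully defined) and uses four made-up rows, so the percentage it prints is not the real figure.

```python
import sqlite3


def definition_status_counts(conn):
    """Return counts of fully defined and primitive concepts and the fully defined share."""
    fully_defined, primitive, total = conn.execute(
        "SELECT SUM(is_primitive = 0), SUM(is_primitive = 1), COUNT(*) FROM concepts"
    ).fetchone()
    return {
        "fully_defined": fully_defined,
        "primitive": primitive,
        "percent_fully_defined": round(100.0 * fully_defined / total, 1),
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE concepts (concept_id INTEGER PRIMARY KEY, is_primitive INTEGER)")
# Hypothetical sample: one fully defined concept out of four.
conn.executemany("INSERT INTO concepts VALUES (?, ?)", [(1, 1), (2, 1), (3, 0), (4, 1)])

status = definition_status_counts(conn)
print(status)
```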

8 Relationship types
- Relationships hold between two concepts.
- "Laterality" is a "relationship type".
- According to the statistics, there are 1.4 million relationship records.
- There are 62 relationship types currently used to represent the relationships between two concepts.
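Statistics of this kind are a typical "database querying" result. A minimal sketch, assuming a hypothetical relationships table in which the type is stored as text (sqlite3 stands in for MySQL, and the rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    "concept_id1 INTEGER, relationship_type TEXT, concept_id2 INTEGER)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?)",
    [
        (3, "Is a", 2),
        (4, "Is a", 2),
        (3, "Finding site", 7),
        (4, "Laterality", 9),
    ],
)

# Total number of relationship records and number of distinct types in use.
n_records, n_types = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT relationship_type) FROM relationships"
).fetchone()
print(n_records, "records,", n_types, "relationship types")
```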

9 Relationship types
Time aspect, Access instrument, Laterality, Revision status
WAS A, Has specimen, Interprets, Procedure context
Indirect device, After, MAY BE A, Associated with
Measurement method, Has focus, Has active ingredient, Due to
Specimen source identity, Approach, Causative agent, Specimen source topography
Scale type, REPLACED BY, SAME AS, Associated procedure
Specimen source morphology, Using, Access, Has intent
Property, Has dose form, Procedure site, Associated finding
Recipient category, Direct device, Part of, Direct morphology
Procedure morphology, Finding context, Priority, Has definitional manifestation
Specimen substance, Procedure site - Direct, Method, Occurrence
Pathological process, Has interpretation, Associated morphology, Component
Procedure device, Direct substance, Episodicity, Onset
Indirect morphology, Procedure site - Indirect, Severity, Is a
Specimen procedure, Temporal context, Course
MOVED TO, Subject relationship context, Finding site

10 Hierarchy structure
- In the collection of relationship types, "IS_A" represents the hierarchical relationship.
- 485,335 records in the Relationships table store the hierarchical information of SNOMED CT.
- The main hierarchical features:
  - Root level (no parents): one root, "SNOMED CT CONCEPT"
  - Middle node level (have parents and children): 80895 (22%) concepts, of which 25687 nodes have only 1 child
  - Leaf node level (no children): 285283 (78%) concepts
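The root, middle, and leaf levels can be derived from the IS_A records alone. The sketch below is one possible way to do it, not necessarily the program used in the study; the `classify_nodes` helper and the sample edges are hypothetical, and each edge is assumed to be a (child, parent) pair.

```python
from collections import Counter


def classify_nodes(is_a_edges):
    """Split concepts into root, middle, and leaf levels from (child, parent) pairs."""
    has_parent = {child for child, _ in is_a_edges}
    has_child = {parent for _, parent in is_a_edges}
    child_counts = Counter(parent for _, parent in set(is_a_edges))
    return {
        "root": has_child - has_parent,      # no parents
        "middle": has_child & has_parent,    # parents and children
        "leaf": has_parent - has_child,      # no children
        "single_child": {n for n, c in child_counts.items() if c == 1},
    }


# Hypothetical hierarchy: R is the root, D has two parents (A and B).
edges = [("A", "R"), ("B", "R"), ("C", "A"), ("D", "A"), ("D", "B"), ("E", "C")]
levels = classify_nodes(edges)
print({name: sorted(nodes) for name, nodes in levels.items()})
```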

11 Multiple inheritance
- One concept in SNOMED CT may have many children and many parents.
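Concepts with more than one parent can be found with a grouped query over the IS_A records. This is a sketch under assumptions: the table and column names are hypothetical, `concept_id1` is taken to be the child and `concept_id2` the parent, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    "concept_id1 INTEGER, relationship_type TEXT, concept_id2 INTEGER)"
)
# Hypothetical sample: concept 5 has two IS_A parents (2 and 3).
conn.executemany(
    "INSERT INTO relationships VALUES (?, 'IS_A', ?)",
    [(2, 1), (3, 1), (4, 2), (5, 2), (5, 3)],
)

# Children that appear in more than one IS_A record have multiple parents.
multi_parent = conn.execute(
    "SELECT concept_id1, COUNT(*) AS n_parents "
    "FROM relationships WHERE relationship_type = 'IS_A' "
    "GROUP BY concept_id1 HAVING COUNT(*) > 1 "
    "ORDER BY concept_id1"
).fetchall()
print(multi_parent)
```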

12 Multiple inheritance

13 Hierarchy structure - example
[Figure labels: Root, Middle nodes, Leaf, Multiple parents]

14 Full structure

15 Experiment and Result
- Explicit characteristics of the ontology
  - Original data overview
  - Fully defined and primitive
  - Relationship types
  - Hierarchy structure
  - Multiple inheritance
  - Full structure
- Implicit characteristics of the ontology
  - Classification principle
  - Relationship patterns

16 Classification principle
Top level categories: the 18 direct children of the root.
Each concept belongs to only one top level category.
So all concepts in SNOMED CT can be divided into 18 groups.
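One way to check this principle programmatically is to walk up the IS_A hierarchy from each concept and collect every ancestor that is a direct child of the root. The sketch below is an assumption about how such a check might be written; the `top_level_categories` helper, the `parents` mapping, and the sample concepts are hypothetical.

```python
def top_level_categories(concept, parents, root):
    """Return the set of top level categories (direct children of root) above a concept."""
    found, seen, stack = set(), set(), [concept]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        node_parents = parents.get(node, ())
        if root in node_parents:
            found.add(node)  # node itself is a top level category
        stack.extend(p for p in node_parents if p != root)
    return found


# Hypothetical fragment; "Heart disease" has two parents in the same category.
ROOT = "SNOMED CT CONCEPT"
parents = {
    "Clinical finding": [ROOT],
    "Body structure": [ROOT],
    "Disorder": ["Clinical finding"],
    "Cardiac finding": ["Clinical finding"],
    "Heart disease": ["Disorder", "Cardiac finding"],
}

categories = top_level_categories("Heart disease", parents, ROOT)
print(categories)
```

If the principle holds, the returned set has exactly one element for every concept other than the root, even where multiple inheritance gives several paths upward.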

17 Implicit
Top level category                          Number of concepts
Physical force                              200
Specimen                                    1044
Staging and scales                          1108
Linkage concept                             1129
Events                                      1642
Environments and geographical locations     1666
Physical object                             4355
Social context                              5188
Context-dependent categories                6836
Observable entity                           7568
Qualifier value                             8266
Pharmaceutical / biologic product           19639
Substance                                   23022
Organism                                    26134
Body structure                              31760
Procedure                                   52741
Special concept                             62014
Clinical finding                            111866

18 Relationship patterns
The specific relationship type between any two top level categories

19 Relationship patterns
- Pattern: {C1, type, C2}
  - C1 is one of the 18 top level categories
  - type is one of the 62 relationship types
  - C2 is one of the 18 top level categories
  - There are 18 x 62 x 18 = 20088 possible patterns
- Each record in the 1.4 million relationship records matches one pattern.
- To avoid ambiguity, the scope of this study covers only "active" concepts.
- The results show that only 78 patterns have instances in the Relationships table.
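Pattern extraction can be sketched as mapping both ends of every relationship record to its top level category and counting the distinct triples. This is an illustration under assumptions, not the study's actual program: the `pattern_counts` helper, the `category_of` mapping, the `active` set, and the sample records are all hypothetical.

```python
from collections import Counter


def pattern_counts(relationships, category_of, active):
    """Count {C1, type, C2} patterns over records whose two concepts are both active."""
    counts = Counter()
    for c1, rel_type, c2 in relationships:
        if c1 in active and c2 in active:
            counts[(category_of[c1], rel_type, category_of[c2])] += 1
    return counts


# Upper bound from the slide: 18 categories x 62 types x 18 categories.
possible_patterns = 18 * 62 * 18

# Hypothetical sample data.
category_of = {
    "flu": "Clinical finding",
    "cough": "Clinical finding",
    "lung": "Body structure",
    "old finding": "Clinical finding",
}
active = {"flu", "cough", "lung"}
relationships = [
    ("flu", "Finding site", "lung"),
    ("cough", "Finding site", "lung"),
    ("cough", "Due to", "flu"),
    ("old finding", "Due to", "flu"),  # skipped: inactive concept
]

counts = pattern_counts(relationships, category_of, active)
print(possible_patterns, "possible;", len(counts), "observed in the sample")
```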

20 Data modelling based on patterns
For example: to find the relationships between "clinical finding" and other top level categories.
Clinical finding (finding) | Causative agent (attribute) | Pharmaceutical / biologic product (product)
Clinical finding (finding) | Course (attribute) | Qualifier value (qualifier value)
Clinical finding (finding) | Due to (attribute) | Clinical finding (finding)
Clinical finding (finding) | Episodicity (attribute) | Qualifier value (qualifier value)
Clinical finding (finding) | Finding site (attribute) | Body structure (body structure)
…
Clinical finding (finding) | Has definitional manifestation (attribute) | Clinical finding (finding)

21 Conceptual Data Model

22 Future Work
- Design a method of defining real-world constraints over the relationships, e.g. suicide can have slow onset.
- Develop storage and maintenance procedures for managing the data, e.g. there is no constraint over the data model as it exists at the moment.
- Design a terminology server to deliver SCT to vendors.
- Work with vendors to define a transport mechanism so that vendors are able to install SCT.
- Create Internet access to SCT content for ad hoc users.
- Start working on systems that demonstrate the value of SCT for clinical and administrative work.

