An Introduction to SNOMED CT® Using Release Format 2: A Technical Overview. Presented by Denise Downs with Ed Cheetham and Chris Morris. 16th January, 11:00-12:00
Welcome
Audience: Technical overview for anyone needing to use the SNOMED CT release files
–Walks through the release packs and folders
–Illustrates how the basics of the SNOMED CT data model are provided in the release files
–Does not require detailed prior knowledge of the previous release format (RF1), though it assumes a basic understanding of SNOMED CT
The SNOMED CT release The SNOMED CT terminology is distributed via a set of release files. Those release files are provided in two different formats: –Release format 1 (RF1) –Release format 2 (RF2)
Release Formats
Release Format 1 (RF1)
–Current primary data source for the UK Edition (RF2 is created from RF1)
Release Format 2 (RF2)
–Primary data source for the International Release: the international RF1 release is produced by converting RF2 to RF1 (the plan is to provide only RF2 at some time in the future)
The UK currently provides both RF1 and RF2; notice will be given when the UK intends to move to RF2 as its primary source
These are distribution formats rather than implementation formats. Technical users of the terminology are expected to load the data into their own implementation schema.
Why two formats? The core terminology provided by these release formats is essentially the same, i.e. a concept and its relationships are the same in RF1 as in RF2; the fundamental clinical content and knowledge are the same. –RF1 and RF2 provide the same concepts, descriptions and relationships as each other. There are, however, some differences in additional features: for example, timestamps and full history in RF2. RF2 was developed to address some of the issues perceived within RF1, including its lack of flexibility to accommodate new features.
This webex focuses on RF2 It will give an insight into the detail of the file content
UK Edition of SNOMED CT
Consists of:
–International Release
–UK Clinical Extension
–UK Drug Extension (provides the same concepts as dm+d, with some additional items such as family name and relationships to the UK Clinical Edition)
The files for each are provided separately to accommodate different supplier needs and different release cycle schedules. Files containing the same type of component need appending to each other to form the UK Edition.
The International Release is provided along with the UK Clinical Extension; together they form the UK Clinical Edition. Adding the UK Drug Extension gives the UK Edition. The UK Clinical Edition can be used on its own, but ideally the Drug Extension is needed for full UK descriptions.
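The appending step can be sketched in Python. This is a minimal sketch, assuming each release file is tab-delimited with a single header row; the function name `append_component_files` is hypothetical, and the in-memory rows are placeholders rather than real release content.

```python
import csv
import io


def append_component_files(sources):
    """Append files of the same component type, keeping a single header row."""
    header = None
    rows = []
    for source in sources:
        reader = csv.reader(source, delimiter="\t", quoting=csv.QUOTE_NONE)
        file_header = next(reader)
        if header is None:
            header = file_header
        elif file_header != header:
            # Guard against mixing, say, a concepts file with a descriptions file.
            raise ValueError("files do not share the same columns")
        rows.extend(row for row in reader if row)
    return header, rows


HEADER = "id\teffectiveTime\tactive\tmoduleId\tdefinitionStatusId\n"
# Placeholder rows (not real release data), one per source of the UK Edition.
international = io.StringIO(HEADER + "INT-1\t20130731\t1\tINT-MODULE\tPRIMITIVE\n")
uk_clinical = io.StringIO(HEADER + "UKC-1\t20131001\t1\tUKC-MODULE\tPRIMITIVE\n")
uk_drug = io.StringIO(HEADER + "UKD-1\t20131001\t1\tUKD-MODULE\tPRIMITIVE\n")

header, concepts = append_component_files([international, uk_clinical, uk_drug])
print(len(concepts), "concept rows in the combined file")
```

In practice the sources would be the opened concepts (or descriptions, or relationships) files from each of the three downloads.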
UK Edition of SNOMED CT
UK Clinical Extension: six-monthly releases
–April and October
International Release
–January and July, but not implemented in the UK until it is part of the UK Edition in April and October (respectively)
UK Release contains UK mappings and other resources
UK Drug Extension: 4-weekly, based on dm+d
–Every 6 months, an additional UK Drug Extension provides relationships to the latest International Release
dm+d (weekly, XML)
RF2 releases will be on TRUD in April and October from April 2014
UK Terminology Centre ‘Collection’ TRUD Main Page: http://www.uktcregistration.nss.cfh.nhs.uk/trud3/user/guest/group/0/home ‘UK Terminology Centre Collection’
UK Terminology Centre ‘Collection’ SNOMED CT release files: http://www.uktcregistration.nss.cfh.nhs.uk/trud3/user/guest/group/2/pack/26 ‘The UK Edition of SNOMED CT’ Download
Packs
–UK Clinical Edition, RF1 (UK Clinical Extension & International Release)
–UK Drug Extension, RF1 (UK Drug Extension only)
–UK Clinical Edition, RF2: Full, Snapshot and Delta (UK Clinical Extension & International Release)
–UK Clinical Edition, RF2: Full only (UK Clinical Extension & International Release)
–UK Clinical Edition, RF2: Snapshot only (UK Clinical Extension & International Release)
–UK Clinical Edition, RF2: Delta only (UK Clinical Extension & International Release)
–UK Drug Extension, RF2: Full, Snapshot and Delta (UK Drug Extension only)
–UK Drug Extension, RF2: Full only (UK Drug Extension only)
–UK Drug Extension, RF2: Snapshot only (UK Drug Extension only)
–UK Drug Extension, RF2: Delta only (UK Drug Extension only)
Our aim was for users to only have to download one pack for clinical and one for drugs
RF2 file types
The SNOMED CT release has three different release file types:
–Full Release: containing the complete history of every component
–Snapshot Release: containing the current state of every component
–Delta Release: containing only the additions and changes since the previous release
Full
The filename contains _Full, e.g. SCT2_Description_Full..
–Contains every version of every component ever released
–Uses a ‘log style’ audit approach to capturing change
–Hence has a row for each change to each component, with an effectiveTime timestamp
–Once issued, a row does not change
–To inactivate a component, a new row is created with a timestamp and inactive status
–To change a component, a full new row is added containing the updated fields
The UK Full includes releases of the UK Extension from 1st April 2004
The International Release includes all releases from 31st January 2002
Snapshot
The filename contains _Snapshot, e.g. SCT2_Description_Snapshot..
–A Snapshot release contains only the most recent version of every component ever released (both active and inactive components)
–The Snapshot can be derived from the Full
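Deriving a Snapshot from a Full can be sketched as "keep the row with the latest effectiveTime for each id". This is a minimal sketch under stated assumptions: rows are already parsed into dictionaries, effectiveTime is a YYYYMMDD string (so string comparison orders it correctly), and the ids and dates are placeholders, not real components. The function name `snapshot_from_full` is hypothetical.

```python
def snapshot_from_full(full_rows):
    """Keep only the most recent version of each component."""
    latest = {}
    for row in full_rows:
        kept = latest.get(row["id"])
        # YYYYMMDD strings sort chronologically, so plain comparison is enough.
        if kept is None or row["effectiveTime"] > kept["effectiveTime"]:
            latest[row["id"]] = row
    return list(latest.values())


# Placeholder Full content: component 101 was released and later inactivated.
full = [
    {"id": "101", "effectiveTime": "20040401", "active": "1"},
    {"id": "102", "effectiveTime": "20040401", "active": "1"},
    {"id": "101", "effectiveTime": "20130401", "active": "0"},
]

snapshot = snapshot_from_full(full)
for row in snapshot:
    print(row["id"], row["effectiveTime"], row["active"])
```

Note that the inactivated component stays in the Snapshot with its inactive row, matching the "both active and inactive components" point above.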
Delta
The filename contains _Delta, e.g. SCT2_Description_Delta..
A Delta release contains only the component versions created since the last release. Each component version represents:
–a new component, or
–a change in an existing component, or
–the inactivation of a component
The Delta file needs combining with the previous release files to give the full terminology
Full + Delta: Full UK Clinical April 2013 + Delta UK Clinical Oct 2013 = Full UK Clinical Oct 2013
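The combination can be sketched for both release types: appending a Delta to the previous Full, and replacing superseded rows in the previous Snapshot. This is a sketch under assumptions, not a documented procedure: rows are dictionaries keyed by id and effectiveTime, the data is placeholder content, and both function names are hypothetical.

```python
def add_delta_to_full(previous_full, delta):
    """Previous Full + Delta = new Full (rows are only ever added)."""
    seen = {(row["id"], row["effectiveTime"]) for row in previous_full}
    # Skip rows already present so that re-applying a Delta is harmless.
    new_rows = [r for r in delta if (r["id"], r["effectiveTime"]) not in seen]
    return previous_full + new_rows


def add_delta_to_snapshot(previous_snapshot, delta):
    """Previous Snapshot + Delta = new Snapshot (newer rows replace older ones)."""
    by_id = {row["id"]: row for row in previous_snapshot}
    for row in delta:
        kept = by_id.get(row["id"])
        if kept is None or row["effectiveTime"] >= kept["effectiveTime"]:
            by_id[row["id"]] = row
    return list(by_id.values())


# Placeholder data standing in for the April 2013 and October 2013 files.
full_april = [
    {"id": "101", "effectiveTime": "20040401", "active": "1"},
    {"id": "102", "effectiveTime": "20130401", "active": "1"},
]
delta_october = [
    {"id": "101", "effectiveTime": "20131001", "active": "0"},  # inactivation
    {"id": "103", "effectiveTime": "20131001", "active": "1"},  # new component
]

full_october = add_delta_to_full(full_april, delta_october)
snapshot_october = add_delta_to_snapshot(full_april, delta_october)
print(len(full_october), len(snapshot_october))
```

Here the April Full happens to hold one row per component, so it doubles as the April Snapshot in the second call.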
Snapshot: the April 2013 RF1 release corresponds to the April 2013 RF2 Snapshot when the Snapshot is filtered on active components only
How do you decide which you need?
Depends on your requirements:
–Snapshot is in effect the ‘current’ state (similar to RF1)
–Full provides the complete history, and is thus probably required for things like analytics
–Delta: you could load the Full once and then add the Delta to your application at each release (our QA includes checks in this respect). You would need a robust process to ensure you do not miss a Delta
See section 7.2 of the IHTSDO Technical Implementation Guide (TIG) for more details
UK Baseline
The UK Baseline is the first RF2 release.
–It provides a consolidation of all previous UK releases of SNOMED CT from 1st Jan 2004 in the Full release type
–It has only the Full and Snapshot release types for the UK Extensions; it does not have a set of Delta files
–It aligns with the April 2013 UK Release
The UK Candidate Baseline (beta version for review) issued previously has now been replaced by the UK Baseline. If you have a copy of the Candidate Baseline, it should not be used.
The UK Baseline will not now change; any issues identified will be dealt with through the RF2 change mechanism.
Reference sets (refsets)
Refsets are the RF2 mechanism for extending the information related to core components, for example:
–Subsets of SNOMED CT
–Mapping tables, e.g. SNOMED CT to data dictionary codes, SNOMED CT to ICD-10
–Historical relationships such as Had_VMP
Refsets The folder structure reflects the way we have released our RF1 subsets, so that users can find the ones they require. Within a folder, all refsets of the same pattern are released within a single file. NB. The Resources folder has a file which identifies, for each RF1 subset, its equivalent RF2 refset
Filename formats All filenames conform to a standard specification, which enables the production of a load script for each release.
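As an illustration of how a load script might use the naming standard, the sketch below splits a filename into its underscore-separated elements. It assumes the five-element pattern visible in the refset filename quoted elsewhere in this presentation; the element labels (`file_type`, `content_type` and so on) and the function name are this sketch's own, so check them against the TIG before relying on them.

```python
import re

# Assumed pattern: five underscore-separated elements, then a file extension.
FILENAME_PATTERN = re.compile(
    r"^(?P<file_type>[^_]+)_(?P<content_type>[^_]+)_(?P<content_sub_type>[^_]+)"
    r"_(?P<country_namespace>[^_]+)_(?P<version_date>\d{8})\.(?P<extension>\w+)$"
)


def parse_release_filename(name):
    """Split a release filename into labelled elements."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised release filename: {name}")
    parts = match.groupdict()
    # The release type (Full, Snapshot or Delta) is embedded in the third element.
    release = re.search(r"(Full|Snapshot|Delta)", parts["content_sub_type"])
    parts["release_type"] = release.group(1) if release else None
    return parts


parts = parse_release_filename(
    "xder2_cRefset_NHSRealmDescriptionLanguageSnapshot_GB1000000_20131001.txt"
)
print(parts["release_type"], parts["version_date"])
```

A load script could use `release_type` and `content_type` to choose the target table and `version_date` to confirm that all files belong to the same release.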
Documentation UKTC aims to supplement, NOT duplicate, the IHTSDO documentation. You therefore need the TIG! (on the web) Various UKTC materials: http://systems.hscic.gov.uk/data/uktc/snomed/training Index to documentation: SNOMED CT Documentation Catalog
Documentation Folder (anything that is doc1 is the same as the RF1 release) Inventory of Documentation ‘Current’ – provides the nuances of the RF2 Release Should look at the release notes for both UK and international for full overview (NB. the UK Baseline contains only the ‘Current’ document)
UK Resources Provides a table linking each subset in the UK Edition RF1 release to its equivalent refset in the UK Edition RF2 release. The names of the refsets are not the same as those of the subsets, as refsets now have components in the metadata hierarchy and thus must conform to editorial naming principles
A Concept in SNOMED CT
Concept Id: 56265001
FSN (fully specified name): Heart disease (disorder)
Relationships:
–Is a: cardiac finding
–Is a: disorder of mediastinum
–Is a: disorder of cardiovascular system
–Finding site: heart structure
–Severity, Episodicity, Courses
Descriptions:
–Heart Disease
–Cardiopathy
–Disorder of Heart
–Morbus Cordis
–Cardiac Disorder
–Cardiac Diseases
–Heart Diseases
High level Schema for SNOMED CT core
–Concepts
–Descriptions: 1 concept has many descriptions (min 2): FSN and synonyms
–Relationships: 1 concept has many relationships
–Relationship form: Source | type | destination
Very simplified: see the TIG for full details
What’s where?
–All concepts, with their id, effectiveTime, status and definitionStatus (primitive or fully defined), are in the concepts file
–ALL descriptions (FSNs and synonyms) are in the descriptions file
–All relationships (IS_A, finding site etc.) are in the relationships file
NB. All content is captured as a concept. For example, a definitionStatus of ‘primitive’ for a concept in the concepts table is recorded as 900000000000074008. Table 47 in the TIG provides a list of these and their respective conceptIds.
Example – Concepts File
id                  effectiveTime  active  moduleId            definitionStatusId
999001681000000107  20131001       1       999000011000000103  900000000000074008
999001691000000109  20131001       1       999000011000000103  900000000000074008
999001701000000109  20131001       1       999000011000000103  900000000000074008
999001711000000106  20131001       1       999000011000000103  900000000000074008
999001721000000100  20131001       1       999000011000000103  900000000000074008
999000011000000103  20040131       1       999000011000000103  900000000000074008
Section 5.5.3 gives the details of the file format for each of the core files: concepts, descriptions and relationships
Notes on the columns:
–id: the concept id
–moduleId: a SctId, indicates the module the concept version is in; 999000011000000103 is the SNOMED CT United Kingdom clinical extension module (core metadata concept)
–definitionStatusId: a SctId; 900000000000074008 indicates the concept is primitive
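Reading the concepts file can be sketched as follows. The sketch assumes a tab-delimited file with the header row shown above, uses two rows taken from this slide, and relies on the ‘primitive’ definitionStatusId quoted earlier; the function name `read_concepts` is hypothetical.

```python
import csv
import io

PRIMITIVE = "900000000000074008"  # definitionStatusId for 'primitive', from the slides

# Two rows from the example above, joined with tabs as in the release file.
CONCEPTS_FILE = (
    "id\teffectiveTime\tactive\tmoduleId\tdefinitionStatusId\n"
    "999001681000000107\t20131001\t1\t999000011000000103\t900000000000074008\n"
    "999000011000000103\t20040131\t1\t999000011000000103\t900000000000074008\n"
)


def read_concepts(handle):
    """Parse concept rows, converting the flag columns to booleans."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    concepts = []
    for row in reader:
        concepts.append({
            "id": row["id"],  # kept as text: SCTIDs are identifiers, not numbers
            "effectiveTime": row["effectiveTime"],
            "active": row["active"] == "1",
            "moduleId": row["moduleId"],
            "primitive": row["definitionStatusId"] == PRIMITIVE,
        })
    return concepts


concepts = read_concepts(io.StringIO(CONCEPTS_FILE))
for concept in concepts:
    print(concept["id"], concept["active"], concept["primitive"])
```

Keeping the ids as strings avoids any risk of an 18-digit identifier being altered by numeric conversion in downstream tools.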
Example – descriptions file
id                effectiveTime  active  moduleId            conceptId        languageCode  typeId              term                                                    caseSignificanceId
1292371000000116  20100401       1       999000011000000103  582261000000107  en            900000000000003001  Operation on aneurysm of celiac artery NEC (procedure)  900000000000020002
1292381000000119  20100401       1       999000011000000103  582261000000107  en            900000000000013009  Operation on aneurysm of coeliac artery NEC             900000000000020002
1292391000000117  20100401       1       999000011000000103  582271000000100  en            900000000000013009  Other nonteratogenic anomaly NOS                        900000000000020002
1292401000000119  20100401       1       999000011000000103  582271000000100  en            900000000000003001  Other nonteratogenic anomaly NOS (disorder)             900000000000020002
129241000000119   20040131       1       999000011000000103  388681002        en            900000000000013009  Tomato RAST test                                        900000000000017005
1292411000000117  20100401       1       999000011000000103  582281000000103  en            900000000000003001  Other musculoskeletal deformity (disorder)              900000000000020002
1292421000000111  20100401       1       999000011000000103  582281000000103  en            900000000000013009  Other musculoskeletal deformity                         900000000000020002
Notes:
–caseSignificanceId 900000000000020002 | only initial character case insensitive |
–typeId 900000000000003001 | fully specified name |
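To show how the typeId column separates the FSN from the other descriptions of a concept, the sketch below groups a few of the rows above by conceptId. It assumes that typeId 900000000000013009 denotes a synonym (the slide annotates only the FSN typeId); the function name `terms_by_concept` is hypothetical.

```python
from collections import defaultdict

FSN = "900000000000003001"      # fully specified name (annotated on the slide)
SYNONYM = "900000000000013009"  # assumed here to be the synonym description type

# (id, active, conceptId, typeId, term) taken from the example rows above.
DESCRIPTIONS = [
    ("1292371000000116", "1", "582261000000107", FSN,
     "Operation on aneurysm of celiac artery NEC (procedure)"),
    ("1292381000000119", "1", "582261000000107", SYNONYM,
     "Operation on aneurysm of coeliac artery NEC"),
    ("1292411000000117", "1", "582281000000103", FSN,
     "Other musculoskeletal deformity (disorder)"),
    ("1292421000000111", "1", "582281000000103", SYNONYM,
     "Other musculoskeletal deformity"),
]


def terms_by_concept(descriptions):
    """Group active descriptions by concept, split into FSN and synonyms."""
    grouped = defaultdict(lambda: {"fsn": None, "synonyms": []})
    for _id, active, concept_id, type_id, term in descriptions:
        if active != "1":
            continue
        if type_id == FSN:
            grouped[concept_id]["fsn"] = term
        elif type_id == SYNONYM:
            grouped[concept_id]["synonyms"].append(term)
    return dict(grouped)


terms = terms_by_concept(DESCRIPTIONS)
print(terms["582261000000107"]["fsn"])
```

This grouping says nothing about which synonym is preferred; that information comes from the refsets discussed next.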
Preferred Descriptions As each concept can have more than one description, the UK Edition provides mechanisms to identify UK recommended descriptions: The Realm Description Refset (RDR) contains the preferred UK description for clinical use and the UK FSN for each concept It also provides synonyms acceptable for UK use This enables the RDR to be placed in a look-up table to quickly identify either the FSN or preferred UK term for a concept The UK currently also provide a language refset which holds all the current RF1 preferred terms from the UK RF1 Edition The intention is to evolve the RDR (also a subset in RF1) as the place to obtain the UK FSN and the UK recommended terms for both RF1 and RF2.
A quick peek into the data (1) All fields where data type=SCTID are primarily machine-readable. For many of us, early experimentation and familiarisation will benefit from these fields being human-readable. So... a walk-through and application of a ‘simple look-up’ to find the human-readable FSN and/or PT for any conceptId...
A quick peek into the data (2)
You will need:
–Any conceptId of interest (could be just one id, could be from several tables, could be all of them)
–The Descriptions table(s)
–The NHS Realm Description RefSet table(s)
–Here using Snapshots
A quick peek into the data (3) Description tables: Provide a link from each conceptId to its associated descriptions:
A quick peek into the data (4) Realm Description Refset tables - To identify the appropriate description types: Two parts of NHS RDR – together provide information on FSN and term preferences for all UK content. xder2_cRefset_NHSRealmDescriptionLanguageSnapshot_GB1000000_20131001.txt xder2_cRefset_NHSRealmDescriptionLanguageSnapshot_GB1000001_20131001.txt
A quick peek into the data (5)
To identify the preferred term: “For any conceptId, give me the active preferred term as specified by the NHS RDR”
SELECT Descriptions.conceptId, Descriptions.term
FROM Descriptions, NHSRDR
WHERE Descriptions.id = NHSRDR.referencedComponentId
AND NHSRDR.acceptabilityId = '900000000000548007'
AND Descriptions.typeId = '900000000000013009'
AND Descriptions.conceptId = 'any ConceptId'
AND Descriptions.active = 1 AND NHSRDR.active = 1;
A quick peek into the data (6)
To identify the FSN: “For any conceptId, give me the active FSN term as specified by the NHS RDR”
SELECT Descriptions.conceptId, Descriptions.term
FROM Descriptions, NHSRDR
WHERE Descriptions.id = NHSRDR.referencedComponentId
AND NHSRDR.acceptabilityId = '900000000000548007'
AND Descriptions.typeId = '900000000000003001'
AND Descriptions.conceptId = 'any ConceptId'
AND Descriptions.active = 1 AND NHSRDR.active = 1;
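The two look-ups can be tried end to end with Python's built-in sqlite3 module. This is a sketch under assumptions: the tables are cut down to the columns the queries use, the description rows come from the earlier descriptions example, and the NHSRDR rows are invented for illustration because the slides do not show the refset content. The helper name `lookup_term` is hypothetical.

```python
import sqlite3

FSN = "900000000000003001"
SYNONYM = "900000000000013009"
PREFERRED = "900000000000548007"  # acceptabilityId used in the queries above

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Descriptions (id TEXT, active INTEGER, conceptId TEXT, typeId TEXT, term TEXT);
CREATE TABLE NHSRDR (referencedComponentId TEXT, acceptabilityId TEXT, active INTEGER);
""")
conn.executemany("INSERT INTO Descriptions VALUES (?, ?, ?, ?, ?)", [
    ("1292371000000116", 1, "582261000000107", FSN,
     "Operation on aneurysm of celiac artery NEC (procedure)"),
    ("1292381000000119", 1, "582261000000107", SYNONYM,
     "Operation on aneurysm of coeliac artery NEC"),
])
# Assumed refset members: illustrative only, not taken from a real release.
conn.executemany("INSERT INTO NHSRDR VALUES (?, ?, ?)", [
    ("1292371000000116", PREFERRED, 1),
    ("1292381000000119", PREFERRED, 1),
])

LOOKUP = """
SELECT d.term
FROM Descriptions d
JOIN NHSRDR r ON d.id = r.referencedComponentId
WHERE r.acceptabilityId = ? AND d.typeId = ? AND d.conceptId = ?
  AND d.active = 1 AND r.active = 1
"""


def lookup_term(concept_id, type_id):
    """Return the preferred active term of the given type, or None."""
    row = conn.execute(LOOKUP, (PREFERRED, type_id, concept_id)).fetchone()
    return row[0] if row else None


print(lookup_term("582261000000107", SYNONYM))
print(lookup_term("582261000000107", FSN))
```

Passing the conceptId as a bound parameter rather than pasting it into the SQL text keeps the same statement reusable for every concept.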
A quick peek into the data (7)
Application
–By, for example, the use of additional Tables or Views based on these joins for all Concepts, it should be easier to inspect the contents of any tables.
Create view:
–e.g. CREATE VIEW PT AS SELECT Concepts.id, Descriptions.term FROM Concepts, Descriptions, NHSRDR WHERE...
As required, join ‘PT.id’ to each SCTID type field in:
–RefSets
–Relationships
–etc.
A quick peek into the data (8) For Relationships: SELECT r.sourceId, pt1.term, r.typeId, pt2.term, r.destinationId, pt3.term FROM Relationships r, PT pt1, PT pt2, PT pt3 WHERE r.sourceId = pt1.id AND r.typeId = pt2.id AND r.destinationId = pt3.id AND r.active = 1;
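A runnable sketch of the view and the relationships join, again using sqlite3. The WHERE clause of the view is elided on the slide, so the conditions used here are an assumption modelled on the preferred-term query; the ids C1 to C5 and D1 to D5 are placeholders, with terms borrowed from the heart disease example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Concepts (id TEXT, active INTEGER);
CREATE TABLE Descriptions (id TEXT, active INTEGER, conceptId TEXT, typeId TEXT, term TEXT);
CREATE TABLE NHSRDR (referencedComponentId TEXT, acceptabilityId TEXT, active INTEGER);
CREATE TABLE Relationships (sourceId TEXT, typeId TEXT, destinationId TEXT, active INTEGER);

-- Assumed view body: one preferred, active synonym per concept.
CREATE VIEW PT AS
SELECT c.id AS id, d.term AS term
FROM Concepts c
JOIN Descriptions d ON d.conceptId = c.id
JOIN NHSRDR r ON r.referencedComponentId = d.id
WHERE d.typeId = '900000000000013009'
  AND r.acceptabilityId = '900000000000548007'
  AND d.active = 1 AND r.active = 1;
""")

# Placeholder ids; the terms come from the heart disease example.
terms = {"C1": "Heart disease", "C2": "Is a", "C3": "Cardiac finding",
         "C4": "Finding site", "C5": "Heart structure"}
for concept_id, term in terms.items():
    description_id = "D" + concept_id[1:]
    conn.execute("INSERT INTO Concepts VALUES (?, 1)", (concept_id,))
    conn.execute("INSERT INTO Descriptions VALUES (?, 1, ?, '900000000000013009', ?)",
                 (description_id, concept_id, term))
    conn.execute("INSERT INTO NHSRDR VALUES (?, '900000000000548007', 1)",
                 (description_id,))

conn.executemany("INSERT INTO Relationships VALUES (?, ?, ?, ?)", [
    ("C1", "C2", "C3", 1),  # Heart disease | Is a | Cardiac finding
    ("C1", "C4", "C5", 1),  # Heart disease | Finding site | Heart structure
    ("C1", "C2", "C5", 0),  # inactive row, filtered out by the query
])

rows = conn.execute("""
SELECT pt1.term, pt2.term, pt3.term
FROM Relationships r
JOIN PT pt1 ON r.sourceId = pt1.id
JOIN PT pt2 ON r.typeId = pt2.id
JOIN PT pt3 ON r.destinationId = pt3.id
WHERE r.active = 1
ORDER BY pt2.term
""").fetchall()
for row in rows:
    print(" | ".join(row))
```

The same view can be joined to the referencedComponentId of a refset table in exactly the same way.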
A quick peek into the data (9) For RefSets (e.g. the ‘Family history simple reference set’ – refsetId='999000771000000106'): SELECT s.refsetId, s.referencedComponentId, PT.term FROM simplerefset s, PT WHERE s.referencedComponentId = PT.id AND s.refsetId = '999000771000000106' AND s.active = 1;
MetaData Refsets
Module Dependency Refset
–Provides data on which releases a particular release requires
Refset Descriptor
–Provides details on the pattern of a particular refset
Refset Metadata language
–Preferred terms for the metadata concepts
Some of the RF2 characteristics Concepts can now change who manages them without changing conceptId –Concept Origin (namespace within sctId) –Managing organisation (moduleId) Metadata: namespaces, statuses etc all are concepts Refsets – flexible way of providing additional characteristics on a concept
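The "namespace within sctId" point can be illustrated by splitting an identifier into its parts. This sketch assumes the usual SCTID layout (final digit is a check digit, the two digits before it are the partition identifier, and long-format identifiers carry a 7-digit namespace before the partition); it does not validate the check digit, and the function name `parse_sctid` is hypothetical.

```python
COMPONENT_TYPES = {"0": "concept", "1": "description", "2": "relationship"}


def parse_sctid(sctid):
    """Split an SCTID into item id, namespace, partition and check digit."""
    if not sctid.isdigit() or not 6 <= len(sctid) <= 18:
        raise ValueError(f"not a plausible SCTID: {sctid}")
    partition = sctid[-3:-1]
    if partition[0] == "1":
        # Long format: extension content, with a namespace before the partition.
        namespace = sctid[-10:-3]
        item_id = sctid[:-10]
    else:
        # Short format: International Release content, no namespace.
        namespace = None
        item_id = sctid[:-3]
    return {
        "item_id": item_id,
        "namespace": namespace,
        "partition": partition,
        "component": COMPONENT_TYPES.get(partition[1], "unknown"),
        "check_digit": sctid[-1],
    }


uk_concept = parse_sctid("999001681000000107")      # from the concepts example
core_concept = parse_sctid("56265001")              # Heart disease (disorder)
uk_description = parse_sctid("1292371000000116")    # from the descriptions example
print(uk_concept["namespace"], core_concept["namespace"], uk_description["component"])
```

The namespace records where an identifier was originally issued, whereas the moduleId column records who manages the component now, which is why a concept can change manager without changing conceptId.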