Frank Hartel, PhD Enterprise Vocabulary Services National Cancer Institute NCI Enterprise Vocabulary Services (EVS) and Semantic Integration at NCI - An Overview


1 Frank Hartel, PhD Enterprise Vocabulary Services National Cancer Institute NCI Enterprise Vocabulary Services (EVS) and Semantic Integration at NCI - An Overview -

2 2 Outline: Terminology management and semantic integration at NCI NCI Enterprise Vocabulary Services NCI Thesaurus (NCIt) NCI Metathesaurus (NCI Meta) Collaborations

3 3 NCI biomedical informatics Goal: A virtual web of interconnected data, individuals, and organizations redefines how research is conducted, care is provided, and patients/participants interact with the biomedical research enterprise

4 4 in·ter·op·er·a·bil·i·ty: ability of a system...to use the parts or equipment of another system (Source: Merriam-Webster web site)
interoperability: ability of two or more systems or components to exchange information and to use the information that has been exchanged (Source: IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries, IEEE, 1990)
Interoperability: Semantic interoperability, Syntactic interoperability
Courtesy: Charlie Mead

5 5 No Controlled Terminology? No Interoperability Systems cannot exchange or use information if they use incompatible codes or tokens to signify meaning Terminology services provide token and codes Proper use of them assures consistent meaning across the enterprise
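The point of this slide can be illustrated with a minimal sketch. The two systems, their local tokens ("METS", "17"), and the mapping tables are hypothetical and invented for illustration; only the code C19151 (Metastasis) is taken from the NCI Thesaurus example later in this presentation.

```python
# Hypothetical local-token tables for two systems. Only the shared code
# C19151 (Metastasis) comes from the NCI Thesaurus example in this deck.
SYSTEM_A_TO_SHARED = {"METS": "C19151"}
SYSTEM_B_TO_SHARED = {"17": "C19151"}


def same_meaning_raw(token_a, token_b):
    """Compare local tokens directly, as systems without shared terminology must."""
    return token_a == token_b


def same_meaning_shared(token_a, token_b):
    """Compare tokens after translating both to the shared terminology code."""
    code_a = SYSTEM_A_TO_SHARED.get(token_a)
    code_b = SYSTEM_B_TO_SHARED.get(token_b)
    return code_a is not None and code_a == code_b


print(same_meaning_raw("METS", "17"))     # False: incompatible tokens
print(same_meaning_shared("METS", "17"))  # True: both resolve to C19151
```

Without the shared code, the two records look unrelated even though they signify the same thing; with it, the comparison is a simple equality check.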

6 6 Can it be done? caCORE - An Example [diagram; labels as extracted] Vocabulary for CDE specification; Dictionary, thesaurus, ontology services; via caBIO API; Domain object metadata; Common data elements; Public APIs; Common data elements (CDEs); via downloads

7 7 Information integration Cross-discipline reasoning cancer Common Ontologic Representation Environment (caCORE) biomedical objects common data elements controlled vocabulary

8 8 Common Data Elements Structured data reporting elements Precisely defining the questions and answers What question are you asking, exactly? What are the possible answers, and what do they mean?
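A common data element can be thought of as a question paired with a closed set of answers, each with a stated meaning. The sketch below is one way to picture that, not the caDSR data model: the class name, the question text, and the permissible values are hypothetical, and only the concept code C19151 (Metastasis) comes from this presentation.

```python
from dataclasses import dataclass


@dataclass
class CommonDataElement:
    """Hypothetical CDE: one precisely worded question plus its allowed answers."""
    question: str
    permissible_values: dict  # answer -> what that answer means

    def meaning_of(self, answer):
        """Return the meaning of an answer, rejecting anything outside the list."""
        if answer not in self.permissible_values:
            raise ValueError(f"{answer!r} is not a permissible value")
        return self.permissible_values[answer]


# Illustrative content only; not an actual registered CDE.
metastasis_cde = CommonDataElement(
    question="Has metastasis been observed?",
    permissible_values={
        "Yes": "Metastasis (NCI Thesaurus code C19151) is present",
        "No": "Metastasis (NCI Thesaurus code C19151) is absent",
        "Unknown": "Not assessed or not recorded",
    },
)

print(metastasis_cde.meaning_of("Yes"))
```

Tying each answer's meaning to a vocabulary code is what lets two data collections that ask the same question be compared directly.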

9 9 Biomedical Information Objects Data service infrastructure developed using OMG’s Model Driven Architecture approach Object models expressed in UML represent actual biomedical research entities such as genes, sequences, chromosomes, cellular pathways, ontologies, clinical protocols, etc. The object models form the basis for uniform APIs (Java, SOAP, HTTP-XML, Perl) that provide an abstraction layer and interfaces for developers to access information without worrying about the back-end data stores

10 10 Binding Data, Metadata to Terminology - caCORE SDK
UML Modeling Tool (provided by user): Information model that will define data classes, attributes and relationships
Semantic Connector: Annotate UML model with ontology concepts: bridges the world of databases to that of structured semantics.
UML Loader (run by NCI staff): Loads model into the caDSR metadata registry. Model and associated semantics are available at runtime
Code Generator: Model and a code template are inputs into generator. Creates the ‘caCORE-like’ n-tier software system with Java and Web Services APIs

11 11 caCORE SDK

12 12 Extending Interoperability Beyond the Enterprise cancer Biomedical Informatics Grid (caBIG) Common, widely distributed infrastructure permits cancer research community to focus on innovation Shared vocabulary, data elements, data models facilitate information exchange Collection of interoperable applications developed to common standard Raw cancer research data is available for mining and integration

13 13 caBIG - facilitate sharing of infrastructure, applications, and data

14 14 Cancer Center Cancer Center NCI caGrid OTHER caBIG SERVICE PROVIDERS OTHER TOOLKITS

15 15 caGRID [architecture diagram; labels as extracted] GUI, Admin, Security, caBIO, caDSR, EVS, caBIG Data Resource … caARRAY, Other caBIG Data Resource. Data source exposed as objects. Well-defined objects using caDSR / EVS. Mobius GME for schemas. Metadata identifies services, objects exposed, relationships between objects, relationships between services. Standard Grid interfaces. Standard query language and interface. Advertisement and Discovery. Security. Invocation / Schedule. Execution / coordination. Identifiers. rProteomics, Other caBIG Analysis tool, Grid client API, Globus Resource API, OGSA-DAI, caBIG Analytical Service, Registry, Query, Invocation, GRAM

16 16 caGrid Standard Service Metadata
• Common Metadata describes generic information about service providing Cancer Center
• Data Service Metadata describes the data exposed using terminology and objects from caDSR/EVS
• Analytical Service Metadata describes the supported operations and their inputs and outputs using terminology and objects from caDSR/EVS

17 17 Enterprise Vocabulary
NCI Metathesaurus (Cross-map standard vocabularies/ontologies, e.g. SNOMED, MedDRA, ICD)
Semantic integration, inter-vocabulary mapping
UMLS Metathesaurus extended with cancer-oriented vocabularies
• 930,000 Concepts, 2,200,000 terms and phrases
• Mappings among over 50 vocabularies
NCI Thesaurus
Description logic-based
48,000 “Concepts”
• Concept is the semantic unit
• Terms are Concept labels – synonymy
• Semantic relationships between Concepts
Other standard terminologies
MedDRA, MGED, SNOMED, GO, etc.

18 18 NCI builds on EVS via caCORE Infrastructure

19 19 Production EVS Servers in caCORE

20 20 Enterprise Vocabulary Services
Services and resources that address NCI's needs for controlled vocabulary: http://www.nci.nih.gov/EVS
A collaboration:
NCI Office of Communications
• Physician Data Query (PDQ), Cancer Information Service and the NCI web portal www.cancer.gov
NCI Center for Bioinformatics
• Bioinformatics Core Infrastructure (caCORE), including metadata repository (caDSR) and object models built using EVS terminology for core semantics

21 21 NCI EVS Goal – Integration by Meaning
Clinical, translational, and basic research terminologies have overlapping but specialized needs, so EVS helps to:
• Integrate different conceptual frameworks
• Create terminological and taxonomic conventions across systems
Vocabulary Products
NCI Thesaurus – an ontology-like terminology
NCI Metathesaurus – maps vocabularies
External vocabularies maintained and served: MedDRA, HL7, NDF-RT, LOINC, etc.

22 22 Terminology Development Guidelines Develop a content model Leverage existing sources where appropriate (VA NDF-RT, RxNorm, LOINC, etc. …) Develop unique content where needed (Cancer genes and diagnoses, drugs and therapies, molecular abnormalities, clinical trial standard terminology etc.) Link to other information sources and standards using URLs as possible (GO, Swissprot, drug formularies, trial protocols) Federate, merge or map with other standard terminology for semantic integration

23 23 NCI Thesaurus (NCIt) Reference Terminology for NCI, Partners A Federal Standard Terminology Broad coverage of the cancer research and clinical domain including prevention and treatment trials Neoplastic and other Diseases Findings and Abnormalities Anatomy, Tissues, Subcellular Structures Agents, Drugs, Chemicals Genes, Gene Products, Biological Processes Animal Models – Mouse, other Research techniques and management, apparatus, clinical and lab, radiology, imagery

24 24 NCI Thesaurus (2) Published Monthly Public domain, open content license Available on-line and by download (OWL, Ontylog XML, flat files) 48,000+ “Concepts” hierarchically organized Description-logic based “Roles” establish machine readable semantic relationships between Concepts, ex.: “Carcinoma” Clinically_associated_with “Lytic Bone Lesions,” “TP53” Gene_associated_with_Disease “Breast Carcinoma”
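The two role examples on this slide can be pictured as subject/role/target triples. This is a minimal sketch only: it stores and queries the triples and does not attempt description-logic classification, and the `Role` tuple and `targets` helper are hypothetical names, not part of any EVS API.

```python
from collections import namedtuple

# Hypothetical in-memory form of a role assertion between two Concepts.
Role = namedtuple("Role", ["source", "role", "target"])

# The two role examples given on the slide, keyed by concept name.
ROLES = [
    Role("Carcinoma", "Clinically_associated_with", "Lytic Bone Lesions"),
    Role("TP53", "Gene_associated_with_Disease", "Breast Carcinoma"),
]


def targets(concept, role_name):
    """Return every concept that `concept` points to through `role_name`."""
    return [r.target for r in ROLES
            if r.source == concept and r.role == role_name]


print(targets("TP53", "Gene_associated_with_Disease"))  # ['Breast Carcinoma']
```

Because the role name is data rather than free text, software can follow the relationship without parsing a definition.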

25 25 NCI Thesaurus is Deployed: http://nciterms.nci.nih.gov http://www.nci.nih.gov/EVS (full documentation) API: caCORE public access Fulfills NCI and collaborators’ needs for controlled vocabulary Public domain, open content license

26 26 Example Concept Details
Concept Details URI: http://nciterms.nci.nih.gov:80/NCIBrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&code=C19151
Version: August 2005 (05.09e)
Metastasis
Identifiers:
  name: Metastasis
  code: C19151
Relationships to other concepts:
  Biological_Process_Has_Result_Biological_Process: Tumor Expansion
  Biological_Process_Has_Initiator_Process: Pathologic Process
Information about this concept:
  Synonym: MET
  Synonym: metastasis
  Synonym: Tumor Cell Migration
  Synonym with source data: Metastasis|PT|CADSR
  Synonym with source data: MET|AB|CADSR
  Synonym with source data: Tumor Cell Migration|SY|NCI
  Synonym with source data: Metastasis|PT|NCI
  Synonym with source data: metastasis|SY|NCI-GLOSS|CDR0000046710
  NCI_META_CUI: CL001192
  Semantic_Type: Phenomenon or Process
  Related_Lash_Concept: metastasis
  Preferred_Name: Metastasis
  DEFINITION: NCI|Metastasis is the spread or migration of cancer cells from one part of the body (the organ in which it first appeared) to another. The secondary tumor contains cells that are like those in the original (primary) tumor. For example, breast cancer cells may spread (metastasize) to the lungs and cause the growth of a new tumor. When this happens, the disease is called metastatic breast cancer. (NCI)
  Synonym: Metastasis
  DEFINITION: NCI-GLOSS|(meh-TAS-ta-sis) The spread of cancer from one part of the body to another. A tumor formed from cells that have spread is called a secondary tumor, a metastatic tumor, or a metastasis. The secondary tumor contains cells that are like those in the original (primary) tumor. The plural form of metastasis is metastases (meh-TAS-ta-seez).
Superconcepts: Cancer Progression
Subconcepts: Distant Metastasis, Intravascular Metastasis
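The "Synonym with source data" values on this slide are pipe-delimited strings. A minimal parsing sketch follows; the field interpretation (term, term-type tag such as PT/AB/SY, source, optional source-specific code) is an assumption inferred from the five examples shown, and `SourcedTerm` and `parse_sourced_term` are hypothetical names.

```python
from typing import NamedTuple, Optional


class SourcedTerm(NamedTuple):
    term: str
    term_type: str              # assumed: a term-type tag such as PT, AB, SY
    source: str                 # assumed: contributing source, e.g. NCI, CADSR
    source_code: Optional[str]  # assumed: identifier within that source, if any


def parse_sourced_term(raw):
    """Split one 'term|type|source[|code]' string into named fields."""
    parts = raw.split("|")
    if len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 pipe-separated fields: {raw!r}")
    source_code = parts[3] if len(parts) == 4 else None
    return SourcedTerm(parts[0], parts[1], parts[2], source_code)


print(parse_sourced_term("metastasis|SY|NCI-GLOSS|CDR0000046710"))
print(parse_sourced_term("MET|AB|CADSR"))
```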

27 27 Other Examples:
Use URI to view Details of a Drug Concept: http://nciterms.nci.nih.gov:80/NCIBrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&code=C620
Use GUI to search for and view hierarchy: http://nciterms.nci.nih.gov
Fluvastatin Sodium
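The concept-report URIs in this deck follow a visible pattern: a fixed base plus `dictionary` and `code` query parameters. The sketch below only builds and parses such strings; it makes no network request, and the server as it existed at the time of this presentation may no longer respond. The helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base address copied from the URIs shown in the slides.
BASE = "http://nciterms.nci.nih.gov:80/NCIBrowser/ConceptReport.jsp"


def concept_report_uri(code, dictionary="NCI_Thesaurus"):
    """Build a concept-report URI in the form used on the slides."""
    return BASE + "?" + urlencode({"dictionary": dictionary, "code": code})


def code_from_uri(uri):
    """Extract the concept code from a concept-report URI."""
    return parse_qs(urlsplit(uri).query)["code"][0]


print(concept_report_uri("C620"))
print(code_from_uri(concept_report_uri("C19151")))  # C19151
```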

28 28 NCI Metathesaurus: Filtered UMLS Metathesaurus extended with additional required vocabularies 930,000+ concepts, 2,200,000 terms and phrases with definitions Mappings among over 50 vocabularies Extensive synonymy: Over 40,000 terms for neoplasms mapped to 7,000 concepts Used as online dictionary and thesaurus, for mapping and document indexing
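The "extensive synonymy" idea, many terms from several sources resolving to one concept, can be sketched as a small index. The rows reuse the Metastasis synonyms, their sources, and the NCI_META_CUI value CL001192 from the concept-details slide; treating them as rows of a term table under that identifier is an assumption made for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# (concept identifier, term, source) rows; values taken from the Metastasis
# example in this deck, arranged here as an assumed, simplified term table.
ROWS = [
    ("CL001192", "Metastasis", "CADSR"),
    ("CL001192", "MET", "CADSR"),
    ("CL001192", "Tumor Cell Migration", "NCI"),
    ("CL001192", "Metastasis", "NCI"),
    ("CL001192", "metastasis", "NCI-GLOSS"),
]

term_index = defaultdict(set)     # lower-cased term -> concept identifiers
source_index = defaultdict(set)   # concept identifier -> contributing sources
for cui, term, source in ROWS:
    term_index[term.lower()].add(cui)
    source_index[cui].add(source)


def lookup(term):
    """Return the concept identifiers a term maps to, ignoring case."""
    return set(term_index.get(term.lower(), set()))


print(lookup("tumor cell migration"))    # {'CL001192'}
print(sorted(source_index["CL001192"]))  # ['CADSR', 'NCI', 'NCI-GLOSS']
```

Document indexing and cross-vocabulary mapping both reduce to this kind of lookup: any synonym from any source leads to the same concept.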

29 29 NCI Metathesaurus (2) Minor releases monthly, major releases twice a year. Provides a mapped overlap and partial inter-relation of current versions of NCI and partner required vocabularies, e.g. the ICDs, MedDRA, SNOMED, MeSH (NLM Medical Subject Headings), HCPCS (procedures), LOINC (lab values), drug terminologies (VA NDF-RT, AOD, RxNorm, Multum, NCI Thesaurus drugs, etc.)

30 30

31 31 EVS Products & Services Are Open
NCI Thesaurus is Open Content: ftp://ftp1.nci.nih.gov/pub/cacore/EVS/ThesaurusTermsofUse.htm
NCI Metathesaurus is Mostly Open Source; See Each Source’s License: http://ncimeta.nci.nih.gov/MetaServlet/GenerateSourcesServlet
NCI EVS Servers Are Freely Accessible
On the Web: http://nciterms.nci.nih.gov and http://ncimeta.nci.nih.gov
Via API: http://ncicb.nci.nih.gov/core/caBIO
All Software Developed by NCI EVS is Public Open Source and Free for the Asking: http://ncicb.nci.nih.gov/core

32 32 EVS Collaborations Many Active Collaborations Federal: FDA, VA, CDC, and Various NIH Institutes such as NHLBI, NIDCR Major Standards Organizations: HL7, CDISC, W3C, FHA Cancer Centers and Cancer Cooperative Groups (caBIG, caGRID) Numerous Research collaborators such as the Microarray Gene Expression Data Society (MGED Ontology, FuGO)

33 33 Areas of Collaboration FDA (Terminology for Drugs, Devices, and Clinical Trial Terminology Initiatives) VA (Drugs, Common Clinical Trials Semantics, Terminology Operations) CDC (Cancer Incidence and Prevention, Terminology Operations) Cancer Centers (Clinical Trials, Experimental Organism Terminology, Micronutrients, Open Terminology Servers, other (caBIG)) CDISC/HL7 RCRIM (Clinical Research Data Standards)

34 34 Contact: Frank Hartel, PhD NCI Center for Bioinformatics hartel@mail.nih.gov

