Semantic Interoperability: caCORE and the Cancer Data Standards Repository (caDSR). Jennifer Brush.


1 Semantic Interoperability: caCORE and the Cancer Data Standards Repository (caDSR)
Jennifer Brush

2 Session Outline
• Audience Interview
  – What do you want to learn?
• Standard Vocabularies
  – NCI Thesaurus as part of EVS
• Metadata and Data Elements
  – Their differences and why we use them
• caCORE Infrastructure and caDSR
  – How it all fits together
• caDSR Tools
  – CDE Browser
  – CDE Curation Tool
  – Sentinel Tool
• Semantic Interoperability
  – UML Model Browser / Semantic Integration Workbench

3 Standard Vocabularies
• Facilitate translational research
• Integrate diverse data systems
• Improve the links between clinical research and the healthcare delivery system

4 Enterprise Vocabulary Services (EVS)
• Address NCI’s needs for controlled vocabulary and semantics
• Components:
  – NCI Thesaurus (http://nciterms.nci.nih.gov)
    Stand-alone reference terminology
    One definition for cancer research
    Designed for annotation and database coding to facilitate data analysis and retrieval
  – NCI Metathesaurus (http://ncimeta.nci.nih.gov)
    Relational: links to multiple terminologies
    One or more definitions from multiple sources
    Designed for mapping cancer terms across terminologies throughout the cancer research community to facilitate integration

5 Use EVS
• Check and compare dictionary definitions.
• Find synonyms.
• Determine relationships to other concepts/terms.
• Identify and evaluate potential options when curating new CDEs or adding terms to permissible value lists.
• Provide links to related research publications.
• If you can’t find a term, you can submit a new one.

6 Exercise 1 - Examine an EVS Term
• Complete: Exercise 1 from the “Semantic Interoperability” exercise handout.
• Time: 2 minutes

7 Exercise 1 - Examine an EVS Term
1. Navigate to the NCI Terminology Browser: http://nciterms.nci.nih.gov
2. Select the NCI Thesaurus
3. Select “Connect”
4. Enter “gene” in the Quick Search entry field, then select “Gene” from the results

8 [Figure callout: concept code]

9 Define Metadata
• Metadata is data about data
• Metadata describes the content, quality, condition, and other characteristics of data
• Example: If a question on a form reads: “What is your age?”
  – What is the data?
  – What is the metadata?

10 caDSR Overview: Metadata Example: Age
• Question: What is your age?
• Data: 33 (stored in a local database)
• Metadata (describes the data; stored in the caDSR metadata repository):
  – Person Self Reported Age (data element)
  – Person Self Reported Age (data element concept)
  – Age Values (value domain)
  – Person (object class)
  – Self Reported Age (property)
  – Datatype: Numeric
  – Max length: 10
  – Version: 2.0
  – High Value: 999
  – Low Value: 0
  – Type: Non-enumerated
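
As a rough illustration of how the Age metadata on this slide could constrain the data it describes, the Python sketch below checks a collected answer against the value domain attributes from the slide (datatype, max length, low and high value). The names `ValueDomain`, `is_valid` and `AGE_VALUES` are assumptions made for this example; they are not part of caDSR or its APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    """Non-enumerated value domain, limited to the attributes shown on the slide."""
    name: str
    datatype: str
    max_length: int
    low_value: int
    high_value: int


def is_valid(raw: str, vd: ValueDomain) -> bool:
    """Check one collected answer (the data) against its metadata."""
    if len(raw) > vd.max_length:
        return False
    if vd.datatype == "Numeric":
        # Accept plain ASCII digits only, then apply the low/high bounds.
        if not (raw.isascii() and raw.isdigit()):
            return False
        return vd.low_value <= int(raw) <= vd.high_value
    return True


# Values taken from the slide: Numeric, max length 10, low 0, high 999.
AGE_VALUES = ValueDomain("Age Values", "Numeric", 10, 0, 999)

print(is_valid("33", AGE_VALUES))    # the answer used on the slide
print(is_valid("1000", AGE_VALUES))  # above the high value
```

The point of the sketch is the separation the slide describes: "33" is the data in a local database, while the rules it is checked against are metadata kept elsewhere.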

11 Data Elements
• A data element is a standard way of describing and representing metadata
  – e.g. caDSR contains metadata based on the ISO/IEC 11179 metadata standard
• “Semantically Immutable Metadata” are data elements that are made up of one or more terms from a standard vocabulary
• “Semantically Interoperable Systems” base their data models (in our case, UML Class Diagrams) on metadata that is semantically immutable

12 Data Element Fundamentals
• Data Element Concept (DEC) + Value Domain (VD) = Data Element (DE)
• DEC: Object Class + Property
• VD: Representation Term
• Object Class + Property + Rep Term = Data Element

13 Data Element Fundamentals
• DEC: Person Address
• VD: Zip Code
• DEC + VD = DE: Person Address Zip Code
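
The composition rule on these two slides (Object Class + Property + Rep Term = Data Element) can be sketched in a few lines of Python. This is only a minimal model for illustration: the class names and the `long_name` properties are assumptions of this example, not the caDSR data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    """DEC = Object Class + Property."""
    object_class: str
    property_name: str

    @property
    def long_name(self) -> str:
        return f"{self.object_class} {self.property_name}"


@dataclass(frozen=True)
class ValueDomain:
    """VD, identified here only by its representation term."""
    representation_term: str


@dataclass(frozen=True)
class DataElement:
    """DE = DEC + VD."""
    dec: DataElementConcept
    vd: ValueDomain

    @property
    def long_name(self) -> str:
        return f"{self.dec.long_name} {self.vd.representation_term}"


# The example from the slide: Person + Address + Zip Code.
zip_code = DataElement(DataElementConcept("Person", "Address"), ValueDomain("Zip Code"))
print(zip_code.long_name)
```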

14 Libraries of Reusable Components
[Diagram: a library of DECs and a library of VDs feeding a DE. Labels: ID# 106; Person Address Zip Code]

15 Libraries of Reusable Components
[Diagram: a library of DECs and a library of VDs feeding a DE. Labels: ID# 106; ID# 77; Person Address State Code]
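
One way to picture the reuse shown on these two slides is a single DEC paired with more than one VD. The sketch below assumes small in-memory libraries named `DECS` and `VDS` and a hypothetical `compose` helper; the component names come from the slides, but the grouping and the function are illustrative only and say nothing about how caDSR stores or assigns IDs.

```python
from itertools import product

# Hypothetical component libraries built from the names on the slides.
DECS = ["Person Address"]
VDS = ["Zip Code", "State Code"]


def compose(decs, vds):
    """Pair every DEC with every VD and return the resulting DE long names."""
    return [f"{dec} {vd}" for dec, vd in product(decs, vds)]


for long_name in compose(DECS, VDS):
    print(long_name)
```

Here the "Person Address" concept is defined once and reused for both data elements, which is the benefit the slides illustrate.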

16 How Data Elements are Used
• On Forms for data collection (CRFs)
• In Databases to describe database field attributes and constraints
• In information/UML Modeling
• Support APIs
• To describe application user interface components, validation rules, display name and format

17 Cancer Data Standards Repository (caDSR)
• Metadata repository and registry
• Based on the ISO/IEC 11179 standard for metadata registries
• Designed to integrate caCORE infrastructure
• Supports the development and deployment of Data Elements that are used as metadata descriptors

18 ISO: International Organization for Standardization
• ISO is a non-governmental network of the national standards institutes of 151 countries
• ISO has standards for mathematics, manufacturing, electrical, mechanical and civil engineering, imaging, electronics, and information technology
• Benefits of using ISO/IEC 11179:
  – Metadata model fully supports the variations needed for biomedical applications
  – Easier to understand and share cancer research information
• http://www.iso.org/
• http://ncicb.nci.nih.gov/NCICB/core/caDSR/ISO11179

19 caCORE Components
• Enterprise Vocabulary
• Data Standards
• Bioinformatics Objects

20 caCORE Infrastructure
[Diagram labels: Vocabulary for CDE specification; Dictionary, thesaurus services; Domain object metadata; Common data elements (CDEs); Public APIs]

21 caDSR Tools: Purpose
• caDSR Tools are designed to:
  – Create, consume, distribute and promote ISO/IEC 11179 compliant metadata
  – Enable semantic consistency across research domains
  – Support the metadata life-cycle and governance processes

22 caDSR Tools
• CDE Browser / FormBuilder
  – Search for and Download Data Elements
  – Collect Data Elements onto Forms and Download Forms
• CDE Curation Tool
  – Curate (Create and Edit) Data Element Concepts, Value Domains and Data Elements
• Sentinel Tool
  – Create Alert Definitions to monitor changes to caDSR metadata

23 CDE Browser (Search & Download)
• caDSR Search Tree: Displays all the current caDSR Contexts. Users can search for groups of DEs by navigating the tree.
• Data Element Search Pane: This is the main search window. Users looking for Data Elements can enter a keyword or phrase.
• Navigation Menu: Use these buttons to navigate to the CDE Cart, Form Builder, or back to Home (that is, back to this page).

24 Exercise 2 – Examine a Data Element in the CDE Browser
• Complete: Exercise 2 from the “Semantic Interoperability” exercise handout.
• Time: 5 minutes

25 Exercise 2 – Examine a Data Element in the CDE Browser
• Navigate to the CDE Browser
  – http://cdebrowser.nci.nih.gov
• Select the third option, “At least one of the terms”
• Enter “gene” in the search term field
• Scroll down to “Gene Identifier java.lang.Long” in the results list; select the Long Name to open the Data Element details window

26 Exercise 2 – Examine a Data Element in the CDE Browser

27 Exercise 2 – Examine a Data Element in the CDE Browser
• Answer the following questions:
  – What is the Long Name of the Data Element?
  – What is the Public ID of the Data Element?
  – What context owns the Data Element?
  – What is the Data Element Concept Long Name?
  – Are there permissible values for this Data Element?

28 Exercise 2 – Examine a Data Element in the CDE Browser
• Answers:
  – What is the Long Name of the Data Element? Gene Name java.lang.String
  – What is the Public ID of the Data Element? 2223839
  – What context owns the Data Element? caCORE
  – What is the Data Element Concept Long Name? Gene Name
  – Are there permissible values for this Data Element? No

29 CDE Curation Tool (Create/Edit Metadata Using EVS)

30 CDE Curation Tool (Create/Edit Existing Metadata)

31 Sentinel Tool (Monitor Changes to Metadata)
[Figure callouts: What to watch; When to watch; What to watch for; What to report]

32 Sentinel Tool Reports (View Changes Made to Metadata)
[Figure callouts: Change Blocks; Associated Blocks]
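
The idea of monitoring metadata and reporting what changed can be illustrated with a small comparison of two snapshots of the same record. This is a conceptual sketch only: `changed_fields`, the snapshot dictionaries and the earlier version number "1.0" are assumptions made for the example, and nothing here reflects how the Sentinel Tool is actually implemented.

```python
def changed_fields(before: dict, after: dict) -> dict:
    """Return {field: (old, new)} for every field whose value differs."""
    keys = before.keys() | after.keys()
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(keys)
        if before.get(key) != after.get(key)
    }


# Hypothetical snapshots of one data element's metadata.
BEFORE = {"long_name": "Person Self Reported Age", "version": "1.0", "high_value": 999}
AFTER = {"long_name": "Person Self Reported Age", "version": "2.0", "high_value": 999}

for field, (old, new) in changed_fields(BEFORE, AFTER).items():
    print(f"{field}: {old} -> {new}")
```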

33 Semantic Integration Tools
• UML Model Browser
  – Browse administered items that are part of registered UML Models
  – Supports browsing, searching, and exporting the classes, attributes and relationships between classes of a UML domain model
• Semantic Integration Workbench
  – Guides users through the workflow process required for annotating a UML domain model
  – Tags UML Models with matching semantic concepts from the NCI Thesaurus

34 UML Model Browser
• Web-based
  – http://umlmodelbrowser.nci.nih.gov/
• Designed for UML model owners
• Search for and view UML model components in caDSR
  – classes
  – class attributes
  – associations between classes and attributes
  – ISO Components (metadata) related to those classes and attributes

35 UML Model Browser Interface
• UML Model Search Tree: Search for model components.
• Basic Class/Attribute Search Pane: Users looking for classes and attributes can enter search criteria here.
• Navigation Menu: Access other caDSR tools and resources.

36 UML Model Browser: UML Model Search Tree
• Displays current caDSR Contexts
• For each Context,
  – lists all the UML classes
  – grouped by project, subproject and package
• Search for classes by navigating the tree and clicking on a context, project, subproject or package
• Search for attributes by clicking on a class
[Figure callouts: project; subproject; package; class]
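
The tree described on this slide (context, project, subproject, package, class) behaves like a nested lookup: clicking any node lists every class beneath it. The sketch below models that with a plain nested dictionary. The `TREE` contents use the names from Exercise 3 and list only the Gene class; the structure and the `classes_under` helper are assumptions for illustration, not the UML Model Browser's data model.

```python
# context -> project -> subproject -> package -> list of classes
TREE = {
    "caCORE": {
        "caCORE": {
            "Cancer Bioinformatic Infrastructure Objects": {
                "gov.nih.nci.cabio.domain": ["Gene"],
            }
        }
    }
}


def classes_under(node):
    """Collect every class name below a context, project, subproject or package."""
    if isinstance(node, list):
        return list(node)
    found = []
    for child in node.values():
        found.extend(classes_under(child))
    return found


# "Clicking" the project node lists the classes in all of its packages.
print(classes_under(TREE["caCORE"]["caCORE"]))
```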

37 UML Model Browser: UML Class - Model Tree Search Results
[Figure callouts: # Matches; ‘crumb trail’; Class Search Results; Package]

38 Exercise 3 – View Classes & Attributes in the UML Model Browser
• Complete: Exercise 3 from the “Semantic Interoperability” exercise handout.
• Time: 5 minutes

39 Exercise 3 – View Classes & Attributes in the UML Model Browser
1. Navigate to the UML Model Browser: http://umlmodelbrowser.nci.nih.gov
2. Use the tree to navigate to the caCORE Project: caCORE > Projects > caCORE > Cancer Bioinformatic Infrastructure Objects > gov.nih.nci.cabio.domain
3. Scroll down the list of classes and select the “Gene” class
4. Answer the following:
   1. What are the two attributes in the Gene class?
   2. What project does the Gene class belong to?
   3. What context is this project in?
   4. What is the Public ID of the “Gene Name” data element?

40 Exercise 3 – View Classes & Attributes in the UML Model Browser
• Answers:
  – What are the two attributes in the Gene class? Gene clusterId, Gene fullName
  – What project does the Gene class belong to? caCORE
  – What context is this project in? caCORE
  – What is the Public ID of the “Gene Name” data element? 2223839

41 Semantic Integration Workbench
• http://cadsrsiw.nci.nih.gov
• Audience: caCORE SDK UML Model developers/users performing semantic annotation
• Performs the tasks associated with semantic annotation and review for loading of UML Models into caDSR
• Benefits:
  – Users select NCI Thesaurus concepts or existing metadata for UML model annotation
• Recommended Prerequisites
  – EVS terms
  – Enterprise Architect
  – UML Class Diagram as your domain model
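
To make "semantic annotation" concrete, the sketch below splits a UML class or attribute name into words and looks each word up in a vocabulary, leaving unmatched words flagged for human review. Everything in it is hypothetical: `VOCABULARY` is a tiny stand-in for a terminology lookup, its codes are placeholders rather than real NCI Thesaurus concept codes, and `split_words` and `annotate` are invented for this example and do not describe how the Semantic Integration Workbench works internally.

```python
import re

# Placeholder concept codes; NOT real NCI Thesaurus codes.
VOCABULARY = {
    "gene": "CONCEPT-GENE",
    "name": "CONCEPT-NAME",
    "cluster": "CONCEPT-CLUSTER",
}


def split_words(identifier: str) -> list:
    """Split a camelCase or PascalCase identifier into lowercase words."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", identifier)
    return [part.lower() for part in parts]


def annotate(entity: str) -> dict:
    """Map each word of a UML entity name to a concept code, or None if unmatched."""
    return {word: VOCABULARY.get(word) for word in split_words(entity)}


for entity in ("Gene", "fullName"):
    print(entity, annotate(entity))
```

A word that maps to `None` marks the spot where a curator would have to choose an existing concept or request a new term.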

42 SIW in the caCORE SDK Workflow
1. Design system and draw model (UML tool)
2. Perform Semantic Integration (SIW - Semantic Integration Workbench)
3. Register metadata (UML Loader)
4. Generate and deploy system (Code Generator)

43 Using the Semantic Integration Workbench
[Figure callouts: SIW Viewer Window; UML Entities; Mapped Concept]

44 NCICB Application Support
• Live Support: Monday – Friday, 8 am – 8 pm Eastern Time
  – Telephone support is available Monday to Friday, 8 am – 8 pm Eastern Time, excluding government holidays.
  – You may leave a message, send an email or submit a support request via the Web at any time.
• Email: ncicb@pop.nci.nih.gov
• Phone: 301-451-4384
• Toll-free: 888-478-4423
• Web: http://ncicbsupport.nci.nih.gov

45 Questions

