1 caBIO ECCF Pilot Konrad Rokicki ICR Workspace Call July 28, 2010.



2 What is caBIO?
  - Repository of molecular annotations loaded with data from many different sources
  - Currently exposes data using many APIs and interfaces:
      - SDK-generated: Java API, REST API, SOAP API
      - Grid Data Service
      - Python API
      - Portlet
  - History as a pilot project:
      - First caCORE generated system
      - First silver-level grid service
      - First caGrid portlet
      - First CBIIT iPhone app

3 Goals for the Pilot Project
  - Leverage caBIO as a reference implementation of the NCI CBIIT ECCF
  - Develop a set of ECCF-based Molecular Annotation Service specifications
  - Implement and deploy a service based on service specifications
  - Provide guidelines to assist other NCI CBIIT products in leveraging ECCF processes and developing ECCF artifacts
  - Provide input on the ECCF Implementation Guide
  - Develop guidelines that are pragmatic and useful
  - Identify a list of tools and infrastructure that will assist in the development of services and specifications

4 Team
  caBIO Team: Juli Klemm, Sharon Gaheen, Jim Sun, Liqun Qi, Konrad Rokicki
  ECCF Mentoring: Baris Suzek, Brian Davis, Raghu Chintalapati

5 ECCF Artifact Matrix
  Columns: Enterprise/Business Viewpoint, Information Viewpoint, Computational Viewpoint, Engineering Viewpoint
  Rows: Computation Independent Model (CIM), Platform Independent Model (PIM), Platform Specific Model (PSM)

6 RM-ODP Viewpoints
  Enterprise/Business Viewpoint:
    - Purpose / Scope
    - Business cases / Storyboards
    - Industry standards
  Information Viewpoint:
    - Information Models (DAM, PIM, PSM)
    - Semantic Profiles
  Computational Viewpoint:
    - Capabilities / Operations
    - Functional Profiles
  Engineering Viewpoint:
    - Non-functional Requirements
    - Deployment model

7 Levels of Abstraction
  Computation Independent Model (CIM):
    - Service Scope and Description Document
    - CIM Service Specification Document (CIMSS)
  Platform Independent Model (PIM):
    - PIM Service Specification Document (PIMSS)
  Platform Specific Model (PSM):
    - PSM Service Specification Document (PSMSS)
    - Service Integration Guide
  Implementation:
    - Deployable System

8 Enterprise Service Specification Process

9 Project Plan

10 Scope and Service Description
  “The Molecular Annotation Service provides a set of interfaces for the annotation of experimental or other types of data with molecular information.”
  “The purpose of Molecular Annotations service is to provide specifications for a set of molecular annotations that may be integrated with user-facing applications.”
  “The development of a common, reusable set of interfaces provided by this service will facilitate standardization, integration, and interoperability between various systems that provide and consume molecular annotations.”

11 Mapping to LSBAM
  LS BAM Use Case: Characterize/Organize the Data
    Service Mapping Description: The molecular annotations service supports the Characterize/Organize the Data use cases by providing annotations for molecular entities associated with data. For example, in characterizing experimental data, a researcher may look up reference annotations with the service to find which genes are mapped to the microarray used in the experiment.
  LS BAM Use Case: Integrate Data Sets
    Service Mapping Description: The molecular annotations service supports the Integrate Data Sets use case as it will provide the capability of retrieving annotations from the service to use as join points, or to display as an additional reference.
  LS BAM Use Case: Annotate Findings/Results
    Service Mapping Description: The molecular annotations service supports the Annotate Findings/Results use case as the service provides direct support for obtaining information associated with molecular entities to assist in annotating findings/results.
  LS BAM Use Case: Identify and Review Knowledge Bases and/or Databases
    Service Mapping Description: The molecular annotations service supports the Identify and Review Knowledge Bases and/or Databases use case as the service provides support for knowledge discovery via the integration of annotations across disparate data sources.

12 Business Storyboards
  Outline: Bioinformatics developer wants to retrieve all diseases and agents associated with a target gene
  Detail: John Smith is developing a web site that allows researchers to find all of the diseases associated with a specific gene. The site will also allow researchers to select a gene and obtain a list of agents (drugs) used to target that gene. By querying the molecular annotations service, John’s web application can retrieve a list of diseases and agents associated with a gene.

13 Scope Items
  Scope Item: Provide the ability to retrieve molecular annotations
    Scope / Out of Scope: Scope
    Source: Molecular Annotation Service Scope and Description
  Scope Item: Provide the ability to retrieve functional associations, cellular locations, and biological processes associated with a gene
    Scope / Out of Scope: Scope
    Source: Molecular Annotation Service Scope and Description
  Scope Item: Provide the ability to retrieve disease and agents associated with a gene
    Scope / Out of Scope: Scope
    Source: Molecular Annotation Service Scope and Description
  Scope Item: Provide the ability to retrieve variations associated with a gene
    Scope / Out of Scope: Scope
    Source: Molecular Annotation Service Scope and Description
  …

14 Semantic Profiles
  Semantic Profile No.: MA-SP1
  Semantic Profile Name: Molecular Annotation Domain Analysis Model
  Constrained Information Model: LSDAM v1.1
  Semantic Profile Description: The molecular annotation service will use semantics from the Life Science DAM. The following classes are included in the project-specific DAM (grouped by sub-domain):
    - Gene
    - NucleicAcidSequenceFeature
    - MolecularSequenceAnnotation
    - GeneticVariation
    - SingleNucleotidePolymorphism
    - NucleicAcidPhysicalLocation
    - …

15 Project Analysis Model

16 Capabilities
  Get Gene By Symbol or Alias: Returns the gene named by the specified gene symbol or gene alias
  Get Gene By Microarray Reporter: Returns the gene associated with the specified microarray reporter
  Get Functional Associations: Returns annotations describing a gene's molecular function
  Get Cellular Locations: Returns annotations describing a gene's location within a cell
  Get Biological Processes: Returns annotations describing a gene's role in biological processes
  Get Disease Associations: Returns findings about a gene's role in diseases
  Get Agent Associations: Returns findings about agents which target a given gene
  Get Structural Variations: Returns variations which are located on a given gene
  Get Homologous Gene: Returns a gene’s homologous gene in a specified organism

17 Capability Details
  Name [M]: Get Gene By Symbol or Alias
  Description [M]: Returns the gene named by the specified gene symbol or gene alias and the gene’s organism
  Pre-Conditions [M]: None
  Security Pre-Conditions [M]: None
  Inputs [M]: Gene Symbol or Alias; Organism Identifier
  Outputs [M]: A collection of Gene objects
  Post-Conditions [O]: None
  Exception Conditions [M]: No matching genes found
  Aspects left for Technical Bindings [O]: Format and data type for the Organism Identifier
  Notes [O]: NA

18 Functional Profiles
  Functional Profile No.: MA-FP1
  Functional Profile Name: Gene Annotation Query Profile
  Functional Profile Description: Contains all the capabilities for retrieving gene annotations
  Capability Names:
    - Get Gene By Symbol or Alias
    - Get Gene By Microarray Reporter
    - Get Functional Associations
    - Get Cellular Locations
    - Get Biological Processes
    - Get Disease Associations
    - Get Agent Associations
    - Get Structural Variations
    - Get Homologous Gene
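
A functional profile is effectively a named set of capabilities, so support for MA-FP1 can be checked as a set comparison. The following is a minimal sketch, not part of the pilot's artifacts: the capability names come from the slide, while the function names `missing_capabilities` and `supports_ma_fp1` are hypothetical.

```python
# Capability names as listed for functional profile MA-FP1.
MA_FP1_CAPABILITIES = frozenset({
    "Get Gene By Symbol or Alias",
    "Get Gene By Microarray Reporter",
    "Get Functional Associations",
    "Get Cellular Locations",
    "Get Biological Processes",
    "Get Disease Associations",
    "Get Agent Associations",
    "Get Structural Variations",
    "Get Homologous Gene",
})


def missing_capabilities(implemented):
    """Return the MA-FP1 capabilities absent from `implemented`, sorted by name.

    Hypothetical helper; `implemented` is any iterable of capability names.
    """
    return sorted(MA_FP1_CAPABILITIES - set(implemented))


def supports_ma_fp1(implemented):
    """True only when every MA-FP1 capability is implemented."""
    return not missing_capabilities(implemented)


if __name__ == "__main__":
    partial = ["Get Gene By Symbol or Alias", "Get Homologous Gene"]
    print(supports_ma_fp1(partial))
    print(missing_capabilities(partial))
```

Extra capabilities beyond the profile do not affect the result; only omissions are reported.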

19 Conformance Profiles
  Conformance No: MA-CP1
  Conformance Name: LSDAM-based Gene Annotation Conformance Profile
  Description: This conformance profile defines the functionality for the Gene Annotation Service using LSDAM semantics
  Usage Context: This profile would be used by a researcher wishing to access gene annotations
  Mandatory: No
  Functional Profile(s): MA-FP1 : Gene Annotation Query Profile
  Semantic Profile(s): MA-SP1 : LSDAM v1.1

20 Activity Diagrams

21 Conformance Statements
  Name: Query Performance
    Type: Obligation
    Viewpoint: Engineering
    Description: The MA service should provide a response within 0.5 seconds to support a synchronous UI based client
    Test method: Test cases to include performance testing.
  Name: Additional Functionality
    Type: Permission
    Viewpoint: Computational
    Description: The MA service can provide additional functionality other than specified in these specifications
    Test method: Design Review
  Name: Semantic Model
    Type: Obligation
    Viewpoint: Informational
    Description: The MA service must provide traceability to classes in the LSDAM where applicable.
    Test method: Design Review
  Name: Data Types
    Type: Obligation
    Viewpoint: Informational
    Description: The MA service must conform to NCI’s constrained list of ISO 21090 data types.
    Test method: Design Review
  Name: Functional Profiles
    Type: Obligation
    Viewpoint: Computational
    Description: Functional Profiles shall be deployed as functional wholes. Ignoring or omitting functional behavior defined within a functional profile is not permitted, nor is diverging from the detailed functional specifications provided in Section 4.
    Test method: 1. Design Review 2. Test cases
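
The Query Performance statement calls for test cases that include performance testing. One way such a test case might look is sketched below, assuming a single wall-clock measurement is acceptable. Only the 0.5 second threshold comes from the slide; `timed_call` and `meets_query_performance` are hypothetical helper names, and the operation being timed is a stand-in for a real service call.

```python
import time

# Threshold from the Query Performance conformance statement.
MAX_RESPONSE_SECONDS = 0.5


def timed_call(operation, *args, **kwargs):
    """Run `operation` once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = operation(*args, **kwargs)
    return result, time.perf_counter() - start


def meets_query_performance(operation, *args, **kwargs):
    """True when a single call completes within MAX_RESPONSE_SECONDS."""
    _, elapsed = timed_call(operation, *args, **kwargs)
    return elapsed <= MAX_RESPONSE_SECONDS


if __name__ == "__main__":
    # Stand-in for a service operation; a real test would call the deployed service.
    def fake_lookup(symbol):
        return [symbol.upper()]

    print(meets_query_performance(fake_lookup, "brca1"))
```

A single measurement is noisy; a fuller test would likely repeat the call and compare a percentile against the threshold.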

22 ECCF Artifact Matrix
  Columns: Enterprise/Business Viewpoint, Information Viewpoint, Computational Viewpoint, Engineering Viewpoint
  Rows: Computation Independent Model (CIM), Platform Independent Model (PIM), Platform Specific Model (PSM)

23 Relation to CIM
  Conceptual Functional Service Specification Name: Molecular Annotation Computation Independent Service Specification
  Conceptual Functional Service Specification Version: 0.0.4
  Description & Link to the Conceptual Functional Service Specification: https://gforge.nci.nih.gov/svnroot/cabiodb/ECCF/artifacts/conceptual/CIMSS_Molecular_Annotation_Service.doc
  Deviation from the Conceptual Functional Service Specification / Reason for Deviation: None

24 Relationship to Standards
  LSBAM v1.0: Service conforms to NCI’s Life Science Business Architecture Model
  LSDAM v1.1: Service conforms to the Life Sciences DAM version 1.1
  LSPIM v0.1: Service conforms to the Life Sciences PIM version 0.1
  ISO 21090: Service conforms to NCI’s version of ISO 21090 data types
  HUGO Gene Symbols: Service leverages gene symbols from the Human Genome Organization
  MGI Gene Symbols: Service leverages gene symbols from the International Committee on Standardized Genetic Nomenclature for Mice

25 What is the LSPIM?
  - Life Science Platform Independent Model
  - Based on the LSDAM v1.1
  - A PIM is derived from a DAM by following some guidelines:
      - Constrain to relevant classes and attributes
      - Localize by adding attributes as needed
      - Semantics are made explicit
      - Enumeration of value domains
      - Resolution of type codes into class hierarchies
      - Resolution of all many-to-many relationships
      - All required compliance with other models needs to be expressed at PIM level
  - The LSPIM is currently being developed by the Information Representation Working Group (IRWG)

26 Platform Independent Model
  PIM is based on the LSPIM but it may be constrained and localized:
    - Add any attributes that are needed
    - Remove attributes which are unnecessary
    - Add associations
    - Add new classes
  (Diagram labels: LSPIM, MAPIM)

27 NucleicAcidPhysicalLocation
  Attribute Name: startCoordinate
    Trace: LSPIM
    Type: INT
    Description: The start coordinate of the range (inclusive), given as an integer offset from the start of the sequence.
  Attribute Name: endCoordinate
    Trace: LSPIM
    Type: INT
    Description: The end coordinate of the range (inclusive), given as an integer offset from the start of the sequence.
  Attribute Name: featureType
    Trace: New
    Type: CD
    Description: The type of gene feature located, e.g. GENE, CDS, UTR, RNA, PSEUDO.
  Attribute Name: assembly
    Trace: New
    Type: ST
    Description: The genome assembly which this location is defined in reference to.
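
The attribute table above can be read as a small value class. The sketch below is illustrative only: it maps the ISO 21090 types INT, CD, and ST onto plain Python `int` and `str`, which is a simplification, and the range check and `length` method are assumptions that follow from both coordinates being inclusive rather than anything stated in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NucleicAcidPhysicalLocation:
    start_coordinate: int  # startCoordinate (INT, traced to LSPIM), inclusive
    end_coordinate: int    # endCoordinate (INT, traced to LSPIM), inclusive
    feature_type: str      # featureType (CD, new), e.g. "GENE", "CDS", "UTR"
    assembly: str          # assembly (ST, new), the reference genome assembly

    def __post_init__(self):
        # Assumed sanity check; the slides do not state this constraint.
        if self.end_coordinate < self.start_coordinate:
            raise ValueError("end_coordinate must not precede start_coordinate")

    def length(self):
        """Number of positions covered, counting both inclusive ends."""
        return self.end_coordinate - self.start_coordinate + 1


if __name__ == "__main__":
    # Coordinates and assembly name here are made-up sample values.
    location = NucleicAcidPhysicalLocation(100, 199, "CDS", "GRCh37")
    print(location.length())
```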

28 Traceability for Information Models

29 Operations
  Operation No.: MA-INF1-OP1
    Operation Name: getGeneBySymbol
    Interface Name: MAGeneAnnotationQuery
    Operation Description: Returns the gene named by the specified gene symbol or gene alias
  Operation No.: MA-INF1-OP2
    Operation Name: getGeneByMicroarrayReporter
    Interface Name: MAGeneAnnotationQuery
    Operation Description: Returns the gene associated with the specified microarray reporter
  Operation No.: MA-INF1-OP3
    Operation Name: getFunctionalAssociations
    Interface Name: MAGeneAnnotationQuery
    Operation Description: Returns annotations describing a gene’s molecular function
  Operation No.: MA-INF1-OP4
    Operation Name: getCellularLocations
    Interface Name: MAGeneAnnotationQuery
    Operation Description: Returns annotations describing a gene’s location within a cell
  …
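
The four operations listed for the MAGeneAnnotationQuery interface could be expressed as an abstract interface along the following lines. This is a sketch under assumptions: the operation and interface names come from the slide and keep its camelCase spelling, but every parameter name is hypothetical, and only getGeneBySymbol's input (a GeneSearchCriteria) is described elsewhere in the presentation. Operations elided on the slide are not filled in.

```python
from abc import ABC, abstractmethod


class MAGeneAnnotationQuery(ABC):
    """Abstract interface for the operations named on the slide (MA-INF1)."""

    @abstractmethod
    def getGeneBySymbol(self, criteria):
        """MA-INF1-OP1: return the gene named by a gene symbol or alias."""

    @abstractmethod
    def getGeneByMicroarrayReporter(self, criteria):
        """MA-INF1-OP2: return the gene associated with a microarray reporter."""

    @abstractmethod
    def getFunctionalAssociations(self, criteria):
        """MA-INF1-OP3: return annotations describing a gene's molecular function."""

    @abstractmethod
    def getCellularLocations(self, criteria):
        """MA-INF1-OP4: return annotations describing a gene's location within a cell."""
```

Because the methods are abstract, a subclass that omits any of them cannot be instantiated, which loosely mirrors the requirement that a profile be implemented as a whole.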

30 Operation Behavior Description: getGeneBySymbol
  Returns the gene named by the specified gene symbol or gene alias and the gene’s organism.
  Behavior Description:
    - Client supplies a GeneSearchCriteria instance with a gene symbol or alias and an Organism to search within
    - The case of the symbol or alias is ignored
    - If the Organism is null then all Organisms are searched
    - The system returns the matching Gene object(s), if any
  Pre-Conditions: None
  Security Pre-Conditions: None
  Inputs:
    - GeneSearchCriteria
  Outputs:
    - Return: Fully-populated instance(s) of the Gene class
  Post-Conditions: None
  Exception Conditions:
    - No matching genes found
  Additional Details: None
  Notes: None
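
The behavior described for getGeneBySymbol can be illustrated with an in-memory lookup. The sketch below is not the pilot's implementation: the field names on `Gene` and `GeneSearchCriteria`, the `NoMatchingGenesError` class, and the choice to raise it for an empty result (one reading of the "No matching genes found" exception condition) are all assumptions, and the gene list stands in for the MA database.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gene:
    symbol: str
    organism: str
    aliases: Tuple[str, ...] = ()


@dataclass(frozen=True)
class GeneSearchCriteria:
    symbol_or_alias: str
    organism: Optional[str] = None  # None: search all organisms


class NoMatchingGenesError(Exception):
    """Hypothetical exception for the 'No matching genes found' condition."""


def getGeneBySymbol(genes, criteria):
    """Return every gene whose symbol or alias matches, ignoring case."""
    wanted = criteria.symbol_or_alias.lower()
    matches = []
    for gene in genes:
        if criteria.organism is not None and gene.organism != criteria.organism:
            continue
        names = {gene.symbol.lower()} | {alias.lower() for alias in gene.aliases}
        if wanted in names:
            matches.append(gene)
    if not matches:
        raise NoMatchingGenesError(criteria.symbol_or_alias)
    return matches


if __name__ == "__main__":
    # Small illustrative data set, not taken from the presentation.
    sample = [
        Gene("BRCA1", "Homo sapiens", aliases=("RNF53",)),
        Gene("Brca1", "Mus musculus"),
    ]
    print(getGeneBySymbol(sample, GeneSearchCriteria("brca1")))
    print(getGeneBySymbol(sample, GeneSearchCriteria("rnf53", "Homo sapiens")))
```

Organism matching is exact here because the format of the organism identifier is left to the technical binding.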

31 Search Criteria (Inputs)

32 Dynamic Model

33 Relationship with other services

34 ECCF Artifact Matrix
  Columns: Enterprise/Business Viewpoint, Information Viewpoint, Computational Viewpoint, Engineering Viewpoint
  Rows: Computation Independent Model (CIM), Platform Independent Model (PIM), Platform Specific Model (PSM)

35 Relation to PIM
  Platform Independent Model and Service Specification Name: Molecular Annotation Service Platform Independent Model and Service Specification
  Platform Independent Model and Service Specification Version: 0.1.0
  Description & Link to the Platform Independent Model and Service Specification: http://gforge.nci.nih.gov/svnroot/cabiodb/ECCF/artifacts/logical/PIMSS_Molecular_Annotation_Service.doc

36 PSM Information Model

37 Service Interface
  Implemented Interface No.: MA-INF1
    Supported Interface Name: MAGeneAnnotationQuery
    Interface Description: Includes all operations for retrieving gene annotations.
    Link: N/A
  Implemented Interface No.: DS-INF1
    Supported Interface Name: Data Service Query
    Interface Description: Contains the CQL query operation
    Link: https://ncisvn.nci.nih.gov/svn/cagrid/branches/caGrid-1_3_release/cagrid-1-0/caGrid/projects/data/schema/Data/DataService.wsdl

38 Implementation
  1. Leverage new releases of:
      - ISO 21090 Common Library
      - caCORE SDK
      - caGrid / Introduce
  2. Create new MA database and map to MA PSM
  3. Populate MA database with data from caBIO database
  4. Generate caCORE-like system from the MA PSM
  5. Generate data service with Introduce
  6. Add custom methods to implement service operations

39 Deployment Plan

40 Resources
  caBIO: https://wiki.nci.nih.gov/display/caBIO/caBIO+Wiki+Home+Page
  caBIO ECCF Pilot Project: https://wiki.nci.nih.gov/display/caBIO/caBIO+ECCF
  SAIF Book: http://gforge.hl7.org/gf/download/docmanfileversion/5503/6972/saeaf_02_19_10.pdf

41 Questions?

