caBIG: the cancer Biomedical Informatics Grid
Ken Buetow, NCICB/NCI/NIH/DHHS


NCI biomedical informatics
- Goal: A virtual web of interconnected data, individuals, and organizations redefines how research is conducted, care is provided, and patients/participants interact with the biomedical research enterprise

[Diagram labels: trials; animal models; states; context; pathways; ontologies; agents; therapeutics; probes; components; genes; genotypes; gene expression; proteins; protein expression; etiology, treatment, prevention]

Building common architecture, common tools, and common standards
[Diagram labels: caCORE; access portals; participating group nodes; Molecular Pathology; Clinical Trials; Cancer Genomics; Mouse Models]

Interoperability
- in·ter·op·er·a·bil·i·ty: ability of a system...to use the parts or equipment of another system (Source: Merriam-Webster web site)
- interoperability: ability of two or more systems or components to exchange information and to use the information that has been exchanged (Source: IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries, IEEE, 1990)
[Diagram labels: semantic interoperability; syntactic interoperability. Courtesy: Charlie Mead]

Enterprise Vocabulary
- NCI Meta-Thesaurus (cross-maps standard vocabularies/ontologies, e.g. SNOMED, MedDRA, ICD)
  - Semantic integration, inter-vocabulary mapping
  - UMLS Metathesaurus extended with cancer-oriented vocabularies
    - 800,000 Concepts, 2,000,000 terms and phrases
    - Mappings among over 50 vocabularies
- NCI Thesaurus
  - Description logic-based
  - 18,000 “Concepts”
    - Concept is the semantic unit
    - One or more terms describe a Concept (synonymy)
    - Semantic relationships between Concepts
[Diagram labels: biomedical objects; common data elements; controlled vocabulary]
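
The Concept model described on this slide (one semantic unit, several synonymous terms, typed relationships to other Concepts) can be sketched as a small data structure. This is only an illustration: the class names, the `is_a` role, and the codes `X001`/`X002` are hypothetical and are not taken from the NCI Thesaurus or the EVS APIs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A semantic unit: one code, one or more terms, typed links to other concepts."""
    code: str                      # hypothetical identifier, not a real thesaurus code
    preferred_name: str
    synonyms: set[str] = field(default_factory=set)
    relations: dict[str, set[str]] = field(default_factory=dict)  # role -> target codes

    def terms(self) -> set[str]:
        """All terms that describe this concept (synonymy)."""
        return {self.preferred_name, *self.synonyms}


class Thesaurus:
    """Toy in-memory vocabulary: resolves any term to its concept."""

    def __init__(self) -> None:
        self._by_code: dict[str, Concept] = {}
        self._by_term: dict[str, str] = {}  # lower-cased term -> concept code

    def add(self, concept: Concept) -> None:
        self._by_code[concept.code] = concept
        for term in concept.terms():
            self._by_term[term.lower()] = concept.code

    def lookup(self, term: str) -> Concept | None:
        """Case-insensitive lookup of a concept by any of its terms."""
        code = self._by_term.get(term.lower())
        return self._by_code.get(code) if code is not None else None

    def related(self, code: str, role: str) -> list[Concept]:
        """Follow one kind of semantic relationship from a concept."""
        targets = self._by_code[code].relations.get(role, set())
        return [self._by_code[t] for t in sorted(targets) if t in self._by_code]


# Illustrative content only.
thesaurus = Thesaurus()
thesaurus.add(Concept("X001", "Neoplasm", {"Tumor", "Tumour"}))
thesaurus.add(Concept("X002", "Melanoma", relations={"is_a": {"X001"}}))

print(thesaurus.lookup("tumour").preferred_name)
print([c.preferred_name for c in thesaurus.related("X002", "is_a")])
```

Because lookups go through the term index, two applications that use different synonyms still arrive at the same concept code.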

Common Data Elements
- Structured data reporting elements
- Precisely defining the questions and answers
  - What question are you asking, exactly?
  - What are the possible answers, and what do they mean?
[Diagram labels: biomedical objects; common data elements; controlled vocabulary]
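
One way to picture a common data element is as a precisely worded question paired with an enumerated set of permissible answers, each with a stated meaning. The sketch below assumes that shape. The class, the example question, and the value codes are hypothetical and do not reproduce any registered caDSR element.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """A question plus the answers it permits and what each answer means."""
    question: str
    permissible_values: dict[str, str]  # recorded value -> meaning

    def validate(self, answer: str) -> str:
        """Return the meaning of a permitted answer, or reject anything else."""
        if answer not in self.permissible_values:
            allowed = ", ".join(sorted(self.permissible_values))
            raise ValueError(f"{answer!r} is not a permissible value (allowed: {allowed})")
        return self.permissible_values[answer]


# Hypothetical element for illustration.
sex_cde = CommonDataElement(
    question="What is the participant's sex?",
    permissible_values={"F": "Female", "M": "Male", "U": "Unknown"},
)

print(sex_cde.validate("F"))
```

Rejecting free-text answers at capture time is what lets data collected under different protocols be pooled later without guessing what each value meant.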

Biomedical Information Objects
- Data service infrastructure developed using OMG’s Model Driven Architecture approach
- Object models expressed in UML represent actual biomedical research entities such as genes, sequences, chromosomes, cellular pathways, ontologies, clinical protocols, etc.
- The object models form the basis for uniform APIs (Java, SOAP, HTTP-XML, Perl) that provide an abstraction layer and interfaces for developers to access information without worrying about the back-end data stores
[Diagram labels: biomedical objects; common data elements; controlled vocabulary]
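
The abstraction-layer idea can be illustrated with a domain object and one interface that hides which data store sits behind it. This is a minimal sketch, not the caBIO API: `Gene`, `GeneStore`, the two store classes, and the `gene` table are all names invented here.

```python
from __future__ import annotations

import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Domain object: what client code sees, regardless of storage."""
    symbol: str
    chromosome: str


class GeneStore(ABC):
    """Uniform interface; callers never touch the back-end directly."""

    @abstractmethod
    def find_by_symbol(self, symbol: str) -> Gene | None: ...


class InMemoryGeneStore(GeneStore):
    def __init__(self, genes: list[Gene]) -> None:
        self._genes = {g.symbol: g for g in genes}

    def find_by_symbol(self, symbol: str) -> Gene | None:
        return self._genes.get(symbol)


class SqliteGeneStore(GeneStore):
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def find_by_symbol(self, symbol: str) -> Gene | None:
        row = self._conn.execute(
            "SELECT symbol, chromosome FROM gene WHERE symbol = ?", (symbol,)
        ).fetchone()
        return Gene(*row) if row else None


def chromosome_of(store: GeneStore, symbol: str) -> str | None:
    """Client code: identical for every back-end."""
    gene = store.find_by_symbol(symbol)
    return gene.chromosome if gene else None


# Two interchangeable back-ends holding the same record.
memory_store = InMemoryGeneStore([Gene("TP53", "17")])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (symbol TEXT PRIMARY KEY, chromosome TEXT)")
conn.execute("INSERT INTO gene VALUES (?, ?)", ("TP53", "17"))
sql_store = SqliteGeneStore(conn)

print(chromosome_of(memory_store, "TP53"), chromosome_of(sql_store, "TP53"))
```

`chromosome_of` is written once against the interface, so swapping the store requires no change to client code.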

Standards supporting infrastructure
- Enterprise Vocabulary Services (EVS)
  - Browsers
  - APIs
- cancer Bioinformatics Infrastructure Objects (caBIO)
  - Applications
  - APIs
- cancer Data Standards Repository (caDSR)
  - CDEs
  - Case Report Forms
  - Object models
  - ISO model

Integrating Architecture
[Diagram labels: tiers: Client, Presentation, Object, Data; clients: HTML/XML clients (browsers), SOAP clients, Java applications, Perl clients; web server (Tomcat): servlets, JSPs, SOAP, XML, XSL/XSLT, HTML; RMI; domain objects; object managers; data access objects; meta-data]

Semantic Integration: Modeling Time
- Object mapping to EVS Concepts is done at modeling time
[Diagram labels: class; attributes; EVS Concept for class ‘Agent’; EVS Concept for attribute ‘id’; EVS Concept for attribute ‘agentName’; ... etc.; EVS Concept for instance objects]
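
A rough sketch of what mapping a model to vocabulary concepts at modeling time might look like: each class and attribute is annotated with a concept identifier, and a check reports anything left unmapped. The `Agent` class with `id` and `agentName` follows the slide. The concept identifiers, the `Protocol` class, and the `unmapped_elements` helper are hypothetical placeholders, not EVS codes or caCORE tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class Agent:
    id: int
    agentName: str


@dataclass
class Protocol:  # deliberately left unmapped to show what the check reports
    title: str


# Hypothetical placeholder identifiers standing in for vocabulary concepts.
CLASS_CONCEPTS = {"Agent": "CONCEPT_AGENT"}
ATTRIBUTE_CONCEPTS = {
    ("Agent", "id"): "CONCEPT_IDENTIFIER",
    ("Agent", "agentName"): "CONCEPT_AGENT_NAME",
}


def unmapped_elements(cls: type) -> list[str]:
    """List the class and attribute names that have no concept annotation."""
    missing: list[str] = []
    if cls.__name__ not in CLASS_CONCEPTS:
        missing.append(cls.__name__)
    for f in fields(cls):
        if (cls.__name__, f.name) not in ATTRIBUTE_CONCEPTS:
            missing.append(f"{cls.__name__}.{f.name}")
    return missing


print(unmapped_elements(Agent))
print(unmapped_elements(Protocol))
```

Running the check before registration catches model elements whose meaning has not yet been tied to a vocabulary concept.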

Semantic Integration: Metadata Registration Time
[Diagram labels: UML model, including EVS Concept mappings; ISO11179 mapping; caDSR loading; Curation: Data standards registration for instance data]

Semantic Integration: Runtime
[Diagram labels: tiers: Client, Presentation, Object, Data; clients: Java applications, HTML/XML clients (browsers), SOAP clients, Perl clients; web server (Tomcat): servlets (XML, XSL/XSLT), JSPs, SOAP; RMI; domain objects [Gene, Disease, Concept, DataElement]; object managers; data access objects (OJB); research DBs]

caGRID
[Diagram labels: caCORE architecture extension; caBIO server; caBIO client; OGSA-DAI + Globus; OGSA-DAI; client; grid data source; caGRID extensions: metadata, caBIO adapter, query, Concept Discovery, Federated Query, Integration of Discovery and Query Services]
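
The federated query extension named on this slide can be pictured as fanning one query out to several data sources and merging the answers with their provenance. The sketch below shows only that pattern in plain Python. It does not use Globus or OGSA-DAI, and the source names, record fields, and functions are all hypothetical.

```python
from __future__ import annotations

from typing import Callable, Iterable

Record = dict[str, str]
DataSource = Callable[[str], Iterable[Record]]


def make_source(records: list[Record]) -> DataSource:
    """Wrap a local record list as a queryable source (stand-in for a grid node)."""
    def query(gene_symbol: str) -> Iterable[Record]:
        return [r for r in records if r["gene"] == gene_symbol]
    return query


def federated_query(sources: dict[str, DataSource], gene_symbol: str) -> list[Record]:
    """Send the same query to every source and tag each hit with its origin."""
    merged: list[Record] = []
    for name, source in sources.items():
        for record in source(gene_symbol):
            merged.append({**record, "source": name})
    return merged


# Hypothetical nodes and records for illustration.
sources = {
    "site_a": make_source([
        {"gene": "TP53", "assay": "expression"},
        {"gene": "KRAS", "assay": "expression"},
    ]),
    "site_b": make_source([
        {"gene": "TP53", "assay": "genotype"},
    ]),
}

hits = federated_query(sources, "TP53")
for hit in hits:
    print(hit)
```

A real grid deployment would also need service discovery, security, and shared metadata so that the fields returned by each node mean the same thing.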

NCICB applications:
- clinical trials support: C3DS
- molecular pathology: caArray
- cancer images: caImage
- pre-clinical models: caModelsDb
- laboratory support: caLIMS

Standards-based Data System for the conduct of clinical trials:
- C3D (Cancer Central Clinical Database)
  - WWW-based eCRF-based primary data capture by protocol
- C3PR (Cancer Central Clinical Participant Registry)
  - WWW-based central registration of participants across protocols
- C3PA (Cancer Central Clinical Protocol Administration)
  - Scientific management system for clinical protocols
- C3TR (Cancer Central Clinical Tissue Repository)
  - Tissue repository
- C3DW (Cancer Central Clinical Data Warehouse)
  - De-identified patient information accessed via caBIO

Image Portal
- The NCICB has developed an image portal to allow researchers to search for mouse and human images and annotations
  - Human and mouse images and annotations were provided by the MMHCC

Pathway Database
- Enhance value of imperfect, but available, pathway knowledge
- Make biological assumptions explicit
- Combine sources of data (e.g. KEGG, BioCarta,...)
- Merge data from separate pathways
- Build a causal framework to support (future) quantitative simulation/analysis

Cancer Biomedical Informatics Grid (caBIG)
- Common, widely distributed infrastructure permits cancer research community to focus on innovation
- Shared vocabulary, data elements, data models facilitate information exchange
- Collection of interoperable applications developed to common standard
- Raw published cancer research data is available for mining and integration

caBIG will facilitate sharing of infrastructure, applications, and data

caBIG action plan
- Establish pilot network of Cancer Centers
  - Groups agreeing to caBIG principles
  - Mixture of capabilities
  - Mixture of contributions
- Expanding collection of participants
- Establish consortium development process
  - Collecting and sharing expertise
  - Identifying and prioritizing community needs
  - Expanding development efforts
- Moving at the speed of the internet…

Three Domain Workspaces and two Cross Cutting Workspaces have been launched during the Pilot phase
- DOMAIN WORKSPACE 1, Clinical Trial Management Systems: addresses the need for consistent, open and comprehensive tools for clinical trials management.
- DOMAIN WORKSPACE 2, Integrative Cancer Research: provides tools and systems to enable integration and sharing of information.
- DOMAIN WORKSPACE 3, Tissue Banks & Pathology Tools: provides for the integration, development, and implementation of tissue and pathology tools.
- CROSS CUTTING WORKSPACE 1, Vocabularies & Common Data Elements: responsible for evaluating, developing, and integrating systems for vocabulary and ontology content, standards, and software systems for content delivery.
- CROSS CUTTING WORKSPACE 2, Architecture: developing architectural standards and architecture necessary for other workspaces.

Key deliverables of caBIG pilot
- Componentized, standards-based Clinical Trials Management System
  - e-IND filing/regulatory reporting with FDA
  - Electronic management of trials
  - Integration of diverse trials
- Tissue Management System
  - Systematic description and characterization of tissue resources
  - Ability to link tissue resources to clinical and molecular correlative descriptions
- “Plug and Play” analytic tool set
  - microarray
  - proteomics
  - pathways
  - data analysis and statistical methods
  - gene annotation
- Diverse library of raw, structured data

Cancer Molecular Analysis Project (CMAP) - a prototypic biomedical data integration effort
[Diagram labels: biomedical objects; common data elements; controlled vocabulary; Profiles, Targets, Agents, Clinical Trials; CGAP; NCBI; UCSC (via DAS); BioCarta; KEGG; Gene Ontologies; CTEP clinical trials; CGAP gene expression; NCI drug screening]

caBIG community contributions
- Infrastructure
  - Ontologies
  - Databases
- Applications
  - Clinical trials support
  - Analytic tools
  - Data mining
- Data
  - Trials
  - Experimental outcomes
    - Genomic
    - Microarray
    - Proteomic

Acknowledgements
- NCICB
  - Peter Covitz
  - Sue Dubman
  - Mary Jo Deering
  - Leslie Derr
  - Carl Schaefer
  - Christos Andonyadis
  - Mervi Heiskanen
  - Denise Hise
  - Kotien Wu
  - Fei Xu
  - Frank Hartel
- LPG/CCR
  - Michael Edmundson
  - Bob Clifford
  - Cu Nguyen