Research Project on Metadata Extraction, Exploration and Pooling: Challenges and Achievements
Ronald Steinhau (Entimo AG - Berlin/Germany)

Dimitri Kutsenko (Entimo AG)

Content
- Project Goals
- Pre-Requisites
- Work Packages
- Advanced Workflows
- Conclusions and Outlook

© Entimo AG | Stralauer Platz | Berlin

Project Goals (1)
- Main Goals
  - Support different metadata systems: SDTM, ADaM, BRIDG, custom
  - Explore items depending on their context
  - Accelerate the mapping process
  - Re-use information from comparable studies
  - Provide support in specification creation and issue resolution (full automation is illusory)

Project Goals (2)
- Additional Goals
  - Immediate usage and classification of metadata
  - Advanced metadata management based on ISO for Metadata Repositories
  - Cross-linking between metadata systems, including terminology/codelists
  - Smart search and recommendation of attributes and mappings
  - Preserve the history of user decisions after recommendations

Work Packages
1. Development Preparation
2. Specification / Modeling
3. Development
4. Test & Optimizations

Development Preparation
- Development Environment
  - Eclipse Helios / Scala IDE
  - Advanced Libraries
    - Statistical analysis
    - Machine (“adaptive”) learning
- Infrastructure: Clinical Repository
  - Based on relational database
  - Fully generic tables (free schema)
  - Fast, minimal redundancy
  - Audit trail, versioning, SAS compliance
(Figure labels: Missing Values, Codelists, Formats)
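
One common way to realize "fully generic tables (free schema)" on a relational database is an entity-attribute-value layout. The slides do not describe the actual table design, so the following is only a minimal sketch under that assumption; the table and column names (`item`, `item_value`, `version`) and the helper functions are hypothetical.

```python
import sqlite3


def create_generic_store(conn):
    """Create two generic tables that can hold any dataset without schema changes."""
    conn.executescript(
        """
        CREATE TABLE item (
            item_id INTEGER PRIMARY KEY,
            dataset TEXT NOT NULL,
            row_no  INTEGER NOT NULL
        );
        CREATE TABLE item_value (
            item_id   INTEGER NOT NULL REFERENCES item(item_id),
            attribute TEXT NOT NULL,
            value     TEXT,                        -- NULL models a missing value
            version   INTEGER NOT NULL DEFAULT 1,  -- hypothetical versioning column
            PRIMARY KEY (item_id, attribute, version)
        );
        """
    )


def store_rows(conn, dataset, rows):
    """Store a list of dictionaries; every attribute becomes one item_value row."""
    for row_no, row in enumerate(rows, start=1):
        cur = conn.execute(
            "INSERT INTO item (dataset, row_no) VALUES (?, ?)", (dataset, row_no)
        )
        item_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO item_value (item_id, attribute, value) VALUES (?, ?, ?)",
            [
                (item_id, name, None if value is None else str(value))
                for name, value in row.items()
            ],
        )


def load_rows(conn, dataset):
    """Rebuild rows as dictionaries, using the latest version of each value."""
    sql = """
        SELECT i.row_no, v.attribute, v.value
        FROM item i JOIN item_value v ON v.item_id = i.item_id
        WHERE i.dataset = ?
          AND v.version = (SELECT MAX(version) FROM item_value
                           WHERE item_id = v.item_id AND attribute = v.attribute)
        ORDER BY i.row_no
    """
    rows = {}
    for row_no, attribute, value in conn.execute(sql, (dataset,)):
        rows.setdefault(row_no, {})[attribute] = value
    return [rows[key] for key in sorted(rows)]
```

In this layout a new source attribute needs no `ALTER TABLE`, at the cost of storing every value as text and reassembling rows on read.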

Specification / Modeling
- Metadata management & rules
- Data analysis
- Smart recommendations & history usage
- Finding and applying mapping specs
- Mapping / meta generator

Specification / Modeling (1)
Example Workflow: Import Clinical Data
- Analyze Data
  - Analyze data and retrieve statistical profiles
  - Extract all available metadata/data attributes:
    - Name (synonym support)
    - Label / Comment (Google-like searches)
    - Profiles (statistics-based searches)
    - Codelist analysis (context-sensitive)…
  - Save all data in the clinical data repository
  - Save meta-information in the metadata repository
  - Keep links between data and metadata
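
A statistical profile of the kind mentioned above could look roughly like the sketch below. The profile fields, the function name `profile_column` and the codelist heuristic (few distinct values that repeat) are assumptions made for illustration, not the project's actual rules.

```python
from collections import Counter
from statistics import mean


def profile_column(values, max_codelist_size=10):
    """Return a simple statistical profile for one column of raw values."""
    # Treat None and blank strings as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(str(v).strip() for v in present)
    profile = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "max_length": max((len(s) for s in counts), default=0),
    }

    # Add numeric statistics only if every present value parses as a number.
    numeric = []
    for text in counts.elements():
        try:
            numeric.append(float(text))
        except ValueError:
            numeric = None
            break
    if numeric:
        profile.update(min=min(numeric), max=max(numeric), mean=mean(numeric))

    # Assumed heuristic: few distinct values that repeat hint at a codelist.
    if 0 < len(counts) <= max_codelist_size and len(counts) < len(present):
        profile["codelist_candidate"] = sorted(counts)
    return profile
```

Profiles like this can be stored next to the name and label of an attribute, so that later searches can compare columns by their value distribution as well as by their names.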

Specification / Modeling (2)
Example Workflow: Import Clinical Data
- Provide recommendations:
  - Data types and their type length
  - Primary keys
  - Code lists
  - References to existing metadata (SDTM, BRIDG, custom)
    - Find attributes used in mappings
    - SDTM/custom domain memberships
    - BRIDG references
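
Two of the recommendations listed above, data type with length and primary keys, can be illustrated with a small sketch. It assumes that a type is recommended when all non-missing values parse as that type, and that a primary key candidate is the smallest column combination with complete and unique values; both rules and the function names are assumptions, since the slides do not state the actual logic.

```python
from itertools import combinations


def recommend_type(values):
    """Suggest a data type and length for one column (values may be None)."""
    texts = [str(v).strip() for v in values if v is not None and str(v).strip() != ""]
    length = max((len(t) for t in texts), default=0)

    def all_parse(parser):
        # True only if there is at least one value and every value parses.
        try:
            for text in texts:
                parser(text)
        except ValueError:
            return False
        return bool(texts)

    if all_parse(int):
        return {"type": "integer", "length": length}
    if all_parse(float):
        return {"type": "float", "length": length}
    return {"type": "text", "length": length}


def recommend_primary_keys(rows, max_columns=2):
    """Return the smallest column combinations whose values are unique and complete."""
    if not rows:
        return []
    columns = list(rows[0])
    for size in range(1, max_columns + 1):
        found = []
        for combo in combinations(columns, size):
            keys = [tuple(row.get(c) for c in combo) for row in rows]
            if all(None not in key for key in keys) and len(set(keys)) == len(keys):
                found.append(combo)
        if found:
            return found
    return []
```

On small samples several combinations may qualify by coincidence, which fits the slides' point that such results are recommendations for the user to confirm rather than automatic decisions.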

Example: Schema Recommendation

Enhanced Data Import
Workflow diagram; elements as labeled on the slide:
- Schema Analysis
- Data Import
- File or external DB
- Types, Prim. Keys, Glob. Attr.
- Clin. Repository and/or SAS Datasets
- Statistics and Profiles
- MDR / Pool
- Questionnaires / Recommendations (applying rules)
- Similarity Analysis
- Source Selection
- Schema Completion & Verification
- Metadata Links
- Optional assignment of metadata
Thick lines indicate the enhanced workflow.

Mapping / Meta-Generator
- Finding mapping specifications
  - Find and recommend existing mappings
  - Support users with the completion (modification) of copied mappings
  - Tag mappings with metadata for smarter recognition
- Applying mappings
  - Generate mapping programs
  - Execute mapping programs with data
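
Finding and recommending existing mappings can be sketched as a similarity ranking. The example below assumes that each stored mapping carries a list of source attribute names and optional metadata tags, and it uses Jaccard similarity with a hypothetical `tag_weight`; the slides do not name the similarity measure actually used.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, 0.0 if both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_mappings(source_attributes, stored_mappings, context_tags=(),
                       top_n=3, tag_weight=0.3):
    """Rank stored mappings by overlap in attribute names and metadata tags."""
    wanted = {name.upper() for name in source_attributes}
    scored = []
    for mapping in stored_mappings:
        attr_score = jaccard(wanted, {n.upper() for n in mapping["source_attributes"]})
        tag_score = jaccard(context_tags, mapping.get("tags", ()))
        score = (1 - tag_weight) * attr_score + tag_weight * tag_score
        if score > 0:
            scored.append((score, mapping["name"]))
    # Highest score first; ties are ordered by name for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_n]
```

Tags such as drug or study type raise the rank of mappings from a comparable context, which is one way tagging mappings with metadata can lead to smarter recognition.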

Enhanced Data Mapping
Workflow diagram; elements as labeled on the slide:
- Select Mapping Source and Target
- Clin. Repository and/or SAS Datasets
- Find & Recommend similar Mappings
- MDR (Pool)
- Similarity Analysis
- Clone Mapping Task(s)
- Create To-Do List
- Mapping Completion and Execution
- Enhance Mapping with additional Metadata
- Pooling
- Derive Metadata From Dataset
- Direct Metadata Selection
- Metadata Links
Thick lines indicate the enhanced workflow.
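
The "Clone Mapping Task(s)" and "Create To-Do List" elements of the diagram can be pictured with the following sketch. It assumes a mapping is a dictionary of rules, each with a target and a list of source attributes; this structure, the `status` field and the function name are hypothetical.

```python
import copy


def clone_mapping(mapping, new_source_attributes):
    """Copy a stored mapping and list the rules that need manual completion."""
    available = {name.upper() for name in new_source_attributes}
    clone = copy.deepcopy(mapping)  # the stored mapping stays unchanged
    clone["name"] = mapping["name"] + "_copy"
    todo = []
    for rule in clone["rules"]:
        missing = [s for s in rule["sources"] if s.upper() not in available]
        rule["status"] = "open" if missing else "ok"
        if missing:
            todo.append(
                f"{rule['target']}: source attribute(s) not found: {', '.join(missing)}"
            )
    return clone, todo
```

Rules whose source attributes all exist in the new dataset can be reused directly, while the to-do list points the user to the rules that still need completion.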

Conclusions
- Providing a “smart” technical infrastructure is challenging but necessary for complex systems
- Once in place, its positive effects grow with usage and stored content
- Interconnected metadata systems and data provide better transparency and reusability
- Contextual knowledge (e.g. drug, study) leads to improved results

Outlook
- Define more metadata interconnections
- Collect time-saving statistics with larger studies
- Deeper integration into entimICE
Embrace the new principle “analyse, recommend, re-use”!

End
Thank you for your attention! Questions?