The Use of Metadata in Creating, Transforming and Transporting Clinical Data Gregory Steffens Director, Data Management and SAS Programming ICON Development.

Slides:



Advertisements
Similar presentations
Testing Relational Database
Advertisements

Communicating with Standards Keeping it Simple Pamela Ryley Vertex Pharmaceuticals, Inc. September 29, 2006.
Gregory Steffens Associate Director, Programming Novartis Row – Level Metadata.
John Porter Why this presentation? The forms data take for analysis are often different than the forms data take for archival storage Spreadsheets are.
1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,
Chapter 6 Methodology Conceptual Databases Design Transparencies © Pearson Education Limited 1995, 2005.
Methodology Conceptual Database Design
Mapping Physical Formats to Logical Models to Extract Data and Metadata Tara Talbott IPAW ‘06.
Software Documentation Written By: Ian Sommerville Presentation By: Stephen Lopez-Couto.
Gregory Steffens Novartis Associate Director, Programming NJ CDISC Users’ Group 17 April 2014 Supplemental Qualifiers.
Environment Change Information Request Change Definition has subtype of Business Case based upon ConceptPopulation Gives context for Statistical Program.
BUSINESS INTELLIGENCE/DATA INTEGRATION/ETL/INTEGRATION AN INTRODUCTION Presented by: Gautam Sinha.
Managing Data Interoperability with FME Tony Kent Applications Engineer IMGS.
23 August 2015Michael Knoessl1 PhUSE 2008 Manchester / Michael Knoessl Implementing CDISC at Boehringer Ingelheim.
MDC Open Information Model West Virginia University CS486 Presentation Feb 18, 2000 Lijian Liu (OIM:
Education Supported by Content Management Systems Milena Stanković, Milan Rajković, Ivan Petković, Petar Rajković Faculty of Electronic Engineering, Niš.
Database System Development Lifecycle © Pearson Education Limited 1995, 2005.
Overview of the Database Development Process
Chapter 6 System Engineering - Computer-based system - System engineering process - “Business process” engineering - Product engineering (Source: Pressman,
Financial Statement Modeling & Spreadsheet Engineering “Training in spreadsheet modeling improves both the efficiency and effectiveness with which analysts.
® IBM Software Group © 2009 IBM Corporation Rational Publishing Engine RQM Multi Level Report Tutorial David Rennie, IBM Rational Services A/NZ
Implementation of a harmonized, report-friendly SDTM and ADaM Data Flow General by Marie-Rose Peltier Experience by Marie Fournier Groupe Utilisateurs.
Antje Rossmanith, Roche 14th German CDISC User Group, 25-Sep-2012
Confidential - Property of Navitas Accelerate define.xml using defineReady - Saravanan June 17, 2015.
Methodology - Conceptual Database Design Transparencies
Chapter 9 Designing Databases Modern Systems Analysis and Design Sixth Edition Jeffrey A. Hoffer Joey F. George Joseph S. Valacich.
Methodology Conceptual Databases Design
1 Chapter 15 Methodology Conceptual Databases Design Transparencies Last Updated: April 2011 By M. Arief
Design Patterns 1 FME UC 2007 Design Patterns FME Workbench.
Interfacing Registry Systems December 2000.
MS Access: Creating Relational Databases Instructor: Vicki Weidler Assistant: Joaquin Obieta.
File Systems (1). Readings r Reading: Disks, disk scheduling (3.7 of textbook; “How Stuff Works”) r Reading: File System Implementation ( of textbook)
Methodology - Conceptual Database Design. 2 Design Methodology u Structured approach that uses procedures, techniques, tools, and documentation aids to.
Methodology: Conceptual Databases Design
Bus Architecture. Value Chain Identifies the natural logical flow of an organization’s primary activities Operational source systems produce snapshots.
Introduction to Databases Trisha Cummings. What is a database? A database is a tool for collecting and organizing information. Databases can store information.
Conceptual Database Design
Research Project on Metadata Extraction, Exploration and Pooling: Challenges and Achievements Ronald Steinhau (Entimo AG - Berlin/Germany)
New Ideas for IA Readings review - How to manage the process Content Management Process Management - New ideas in design Information Objects Content Genres.
Methodology - Conceptual Database Design
Environment Change Information Request Change Definition has subtype of Business Case based upon ConceptPopulation Gives context for Statistical Program.
Implementation Experiences METIS – April 2006 Russell Penlington & Lars Thygesen - OECD v 1.0.
Decision Support and Date Warehouse Jingyi Lu. Outline Decision Support System OLAP vs. OLTP What is Date Warehouse? Dimensional Modeling Extract, Transform,
Copyright 2006 Prentice-Hall, Inc. Essentials of Systems Analysis and Design Third Edition Joseph S. Valacich Joey F. George Jeffrey A. Hoffer Chapter.
3 1 Chapter 3 The Relational Database Model Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel.
Business and Information Technology Working Together for the Regulator Stephen Hord, Director of Product Development – UBmatrix.
Unit 18 Advanced Database Design
SDMX IT Tools Introduction
Flat Files Relational Databases
Session 1 Module 1: Introduction to Data Integrity
Databases Flat Files & Relational Databases. Learning Objectives Describe flat files and databases. Explain the advantages that using a relational database.
April ADaM define.xml - Metadata Design Analysis Results Metadata List of key analyses (as defined in change order) Analysis Results Metadata per.
Data Management: Data Processing Types of Data Processing at USGS There are several ways to classify Data Processing activities at USGS, and here are some.
Metadata Driven Clinical Data Integration – Integral to Clinical Analytics April 11, 2016 Kalyan Gopalakrishnan, Priya Shetty Intelent Inc. Sudeep Pattnaik,
1 Agenda TMA02 M876 Block 4. 2 Model of database development data requirements conceptual data model logical schema schema and database establishing requirements.
Database Principles: Fundamentals of Design, Implementation, and Management Chapter 1 The Database Approach.
Spreadsheet Engineering
Methodology Conceptual Databases Design
Advanced Programing practices
Greg Steffens Noumena Solutions
Methodology Conceptual Database Design
Accelerate define.xml using defineReady - Saravanan June 17, 2015.
An Approach to Standard Programming in a Clinical Data Repository
Relational Database Model
A SAS macro to check SDTM domains against controlled terminology
Flat Files & Relational Databases
Metadata The metadata contains
Methodology Conceptual Databases Design
The ultimate in data organization
Presentation transcript:

The Use of Metadata in Creating, Transforming and Transporting Clinical Data Gregory Steffens Director, Data Management and SAS Programming ICON Development Solutions

Data Flow Constituents Data flow is comprised of transformation and derivation –Transformation – restructures, re-attributes and integrates –Derivation – creates new values in a target database that do not exist in the source

Data Transformations Data transformations comprise a large portion of what we do –Extract from data collection/cleaning applications to a standard –Integrate ancillary data like lab, randomization, QOL, etc. –Creation of analysis database –Integrate multiple studies into an integrated database

The DTE Therefore, there is a need to define, create, validate and describe databases efficiently and with a consistently high level of quality A combination of reusable code driven by standardized metadata is the answer – the Data Transformation Engine (DTE) The code must not make any assumptions of the source and target databases, this is communicated to the code via metadata The metadata structures must be absolutely standardized across all applications and data standards Metadata is populated prescriptively and drives the data flow

Metadata Constituents A standard list of attributes to include in any description of a database or of a data standard Put in a standard set of data structures that can be read by the DTE code The attributes must be highly structured in order to be usable by DTE code

Standard Database Attributes Data Set Level –Short/long names, data set location, order Variable level –Short/long name, type, length label, primary key flag, format, value list name, suppqual flag, code/decode relationship, order, Valid values –Value list name, start/end value, short decode, long decode Descriptions –Source name, derivation description Row-level attributes –Identical to variable level attributes but for subsets of rows defined by a parameter variable value

Row-Level Metadata Necessary to fully describe tall-thin data set structures

Metadata Structure Structured content to enable programmatic access Storage structure is separate from publication structure – maximize programmatic access and user friendly access by people Also separate from entry format Maximize sharing of information within the metadata, e.g. values lists and descriptions

Some Principles of Metadata Design Rigorously standardized for all database and standard descriptions, no metadata design change for database types! Maximize structured information and programmatic access, e.g. primary keys flagged instead of listed Enter once; use many. e.g. descriptions and values Derivation logic in descriptions though

Objectives of Metadata It is critical to explicitly define the objectives. Many disagreements arise from an unstated difference in assumed objectives Objectives allow evaluation of the success of the metadata design; e.g. retrospective description for esubmission or prescriptive enabler of automation.

Objectives … Single metadata design for everything Minimize duplicate entry Include enough attributes to enable automation Separate metadata design from publication and from entry Generalize SAS macros, don’t specialize for any specific database standard

Some Automation that is Enabled Creating study data requirements from data standards Publish data requirements in several formats – pdf, xml, html, etc. Publish data requirements with different levels of detail for different user types Implement all database attributes from the requirements with one macro call Create version 5 transport files with renames and length changes of names and labels Create integrated databases Create all decode variables with one macro call

More Automation Examples Move variables in and out of suppqual domains Validate that a database follows a standard and a study requirement Compare a new data requirement to a standard or other studies Create 0-obs data sets Automate TFL shells Transform databases from any structure to any other structure … but what else do we need to do that?

Map Metadata The Next Kind of Metadata So far, we have metadata that describes databases The next step is standard map metadata to define transformation types from one database to another Map metadata associates one observation of source metadata with one or more observations of target metadata This mapping information is read by SAS macros that generate all the SAS code to create a target database from a source database Map metadata should be separate but integrated with metadata

The DTE Code Examples %dtmap –(source_mdlib=m,source_prefix=raw_,target_mdlib=m,target_prefix =target_,maplib=me,inlib=raw,outlib=sdtm,suppqual_make=yes) %mdprint –(mdlib=m,inprefix=target_,html=yes,htmlfile=c:\metadata\target_defi ne.html) %md2odm –(mdlib=m,outxml=c:\metadata\SDTM.xml,StudyName=ICR_STDM_ Standard 3.1.2,ProtocolName=SDTM,StandardName=SDTM 3.1.2,StudyDescription=SDTM Standard,defineVersion=1.0.0,odmversion=1.2,crt_prefix=def)

Advantages of the DTE Standards are a means to automation and automation is a means to efficiency and quality Much less code invented and validated for individual studies Change in skill set requirements for data flow – separate coding skills from database and clinical skills More productivity from junior staff Faster, better and cheaper.