Greg Steffens Noumena Solutions

Presentation transcript:

Meta-Programming
Greg Steffens, Noumena Solutions, noumena.solutions@gmail.com

What’s it all about?
- Understand the objectives … or fail.
- The objectives: attain higher levels of data quality, more quickly, more consistently and more transparently, in a way that protects patients’ privacy and enables meta-analyses across studies, TAs and drug companies to evaluate safety and efficacy.
- Data standards are a means to that end, not the end itself; they are just one component of a larger and much more important plan to reach our goals.
- Automation is another key component of the solution, and automation uses metadata and meta-programming.
- We need to get out of the 1980s world of single-use programming copied from study to study.

An Example Project: Creating Define Files
- Recently completed a project to create define files for a company submitting to the FDA.
- Outstanding feedback about the quality and the time required, from a company organizing the eCTD … best defines they saw!
- Trained the programmer in a one-hour phone call and a follow-up call.
- Goes far beyond commercial define file generators in content and automation.
- An enhanced version of the technology I provided for both CDISC pilot projects.

Steps in the Process
- Create metadata describing each study data library.
- Enter a small amount of information in the update metadata: primary key flags and derivation descriptions.
- Apply the updates and generate the define files, versions 1.0 and 2.0.
- Validate the define files using Xerces and Pinnacle 21.
- Had the first define files on the first day of the project.

Meta-programming to Create Technology
- I defined “meta-programming” in my previous presentation, “Maximizing Code Reuse with Meta-Programming”.
- Multi-use code that can be used across a wide scope of projects without modification, using very minimal assumptions about the data.
- Write code to underlying patterns and problems, not to surface-level operational processes. This is what “noumena” means, as contrasted with “phenomena”.
- Multi-use meta-programs generate single-use programs that meet all the objectives stated earlier.
- Multi-use validation also saves study programming time by reducing the need for study-level validation (double programming, etc.).

Required Skills to be a Meta-Programmer
- Need to be one of the best study programmers.
- Need to know systems design principles of structured programming, and relational data design for metadata and definition data.
- The ability to see those underlying patterns and write code to those patterns.
- A much deeper knowledge of SAS than is required by study programming. Don’t expect study programming techniques to be able to meet the requirements of meta-programming.

Names
- Single-use programs hard-code the names of variables and data sets:
    if aestrtdt >= randdt then teae = 1; else teae = 0;
- The next step is to parameterize the names and use macro variable substitution in the code. This enables the use of your code on a wider scope of data, even when variable names change:
    if &aestrtdt >= &randdt then &teae = 1; else &teae = 0;
- Now assign a default value to each macro parameter so that the parameter does not need to be specified in typical calls:
    %macro teae (aestrtdt=aestrtdt, randdt=randdt, teae=teae);
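The three steps on this slide can be combined into one small macro. This is a minimal sketch, not the author's actual macro: the DATA= and OUT= parameters, the data set names AE and AE_TEAE, and the non-standard names AESTDT and TRTEMFL in the second call are hypothetical and only there to make the example complete.

```sas
/* Sketch only: DATA= and OUT= are assumed parameters, not from the slides. */
%macro teae (data=, out=, aestrtdt=aestrtdt, randdt=randdt, teae=teae);
  data &out;
    set &data;
    /* Same logic as the hard-coded version, with every name substituted */
    if &aestrtdt >= &randdt then &teae = 1;
    else &teae = 0;
  run;
%mend teae;

/* Standard variable names: the defaults apply, nothing else to specify */
%teae (data=ae, out=ae_teae)

/* Non-standard names (hypothetical): override only what differs */
%teae (data=ae, out=ae_teae, aestrtdt=aestdt, teae=trtemfl)
```

The second call shows the cost the next slide addresses: every call against non-standard data has to repeat the same name overrides.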

Names Even Better
- Not quite to the end yet …
- This approach is a good step, but when variable names are non-standard the programmer calling the macro needs to specify the same information repeatedly.
- So, save the names in a data set … now we are starting some simple metadata:
    data _null_;
      set metadata (where = (dsn = 'AE'));
      /* copy the stored variable name into a local macro variable */
      call symputx('aestrtdt', aestrtdt, 'L');
    run;

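Confirmation and “Bullet-Proofing”
- What if the user is wrong? Check that the named variable really exists in the data set:
    proc contents data = &data out = contents noprint;
    run;
    data _null_;
      if _n_ = 1 then has_aestrtdt = 0;
      /* EOF is tested before SET, so this runs on the iteration after the last row is read */
      if eof then call symputx('has_aestrtdt', has_aestrtdt, 'L');
      set contents end = eof;
      if upcase(name) = upcase("&aestrtdt") then has_aestrtdt = 1;
      retain has_:;
    run;
    /* variable not found: blank the parameter so later code can skip it */
    %if ^ &has_aestrtdt %then %let aestrtdt =;
- Take extra time to add these kinds of checks to make the meta-program easier to use.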

What Else Can Parameters and Metadata Do?
- Specify whether something should be done or not, with a Boolean parameter
    - Do you want to include SAS code snippets in the define file?
- Specify which of several variations you want
    - Define_version 1.0 or 2.0
- Where is the input?
    - MDLIB: libref where the metadata resides
- Where should the output be created?
    - OUTXML: location of the define file to be created
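A macro signature covering these four kinds of parameters might look like the sketch below. The parameter names MDLIB, OUTXML and DEFINE_VERSION follow the slide; the macro name MAKE_DEFINE, the INCLUDE_CODE parameter, its Y/N convention and the default values are assumptions made for illustration. The body only checks the parameters and reports what it would do.

```sas
/* Sketch only: macro name, INCLUDE_CODE and the defaults are hypothetical. */
%macro make_define (mdlib=, outxml=, define_version=2.0, include_code=N);
  %local do_code;

  /* Where is the input? Stop early if the metadata libref is unusable. */
  %if %length(&mdlib) = 0 %then %do;
    %put ERROR: MDLIB is required.;
    %return;
  %end;
  %if %sysfunc(libref(&mdlib)) ne 0 %then %do;
    %put ERROR: Libref &mdlib is not assigned.;
    %return;
  %end;

  /* Which variation? Accept only the two supported define versions. */
  %if "&define_version" ne "1.0" and "&define_version" ne "2.0" %then %do;
    %put ERROR: DEFINE_VERSION must be 1.0 or 2.0, not &define_version..;
    %return;
  %end;

  /* Should something be done or not? Normalize the Boolean to 1 or 0. */
  %let do_code = %eval(%upcase(&include_code) = Y);

  %put NOTE: Would write define &define_version to &outxml (code snippets=&do_code).;
%mend make_define;
```

Checking the parameters before any real work is the same "bullet-proofing" idea applied to the macro call itself: a wrong libref or version fails with a clear message instead of part-way through generation.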

Multi-Use of Metadata
- Remember to look to underlying patterns.
- Now that the metadata exists, what other uses can we put it to?
    - Creating the data sets with a DTE macro
    - Data edit checks as data is collected, by checking data extracts and iterations against the specifications

Processing codelist ranges
- The define file supports codelists that include numeric ranges.
- These ranges may incorrectly overlap each other.
- The programming to look for overlapping ranges requires a double SET statement that is not usually needed in study programming.
- Looking to underlying patterns, it can be realized that a range check macro can be used for several other data designs:
    - AE and CM data sets with start and end dates defining ranges that may have incorrect overlaps
    - PROC FORMAT CNTLIN data set, which includes the START and END variables defining ranges
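One common way to read "double SET" is a one-row look-ahead: the same data set is read twice in one DATA step, with the second read starting at row 2. The sketch below uses that reading. The data set RANGES, its variables CODELIST, LOW and HIGH, and the sample rows are hypothetical, and it assumes range endpoints are inclusive, so touching endpoints count as an overlap.

```sas
/* Hypothetical input: one row per codelist range */
data ranges;
  input codelist $ low high;
  datalines;
AGEGRP 0 17
AGEGRP 18 64
AGEGRP 60 120
BMIGRP 0 25
BMIGRP 25.1 40
;
run;

proc sort data = ranges;
  by codelist low high;
run;

/* Double SET: the second SET starts at row 2, so each row read by the
   first SET is paired with the row that follows it. */
data overlaps;
  set ranges end = eof;
  if not eof then
    set ranges (firstobs = 2
                keep = codelist low
                rename = (codelist = next_codelist low = next_low));
  else call missing(next_codelist, next_low);  /* last row has no successor */
  /* Overlap: the next range in the same codelist starts at or before this one ends */
  if codelist = next_codelist and next_low <= high then output;
run;
```

With the sample rows above, only the AGEGRP range 18 to 64 should be flagged, because the next AGEGRP range starts at 60. Comparing adjacent rows after sorting by the range start is enough to detect that an overlap exists within a group, though it does not list every overlapping pair. Renaming CODELIST, LOW and HIGH is all it would take to point the same step at AE or CM start and end dates, or at the START and END variables of a CNTLIN data set.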