DRAFT Richard Chandler-Mant – R Consultant
The Challenges of Validating R
Managing R in a Commercial Environment
Agenda
- What does validation mean
- Why validate R
- The challenges of validating R
- Validation tools
- ValidR
- Questions
What does validation mean?
“Establishing documented evidence which provides a high degree of assurance that a specific process will consistently produce a product, meeting its predetermined specifications and quality attributes”
U.S. Food and Drug Administration (2013). 21 CFR Part 11: Electronic Records, Electronic Signatures,
Why validate R?
- No guarantee that packages meet specifications
- Must comply with industry regulations to be used in regulated environments
Why validate R?
- R Foundation response to CFR Part 11: “R: Regulatory Compliance and Validation Issues. A Guidance Document for the Use of R in Regulated Clinical Trial Environments”
- Document relates only to base R and recommended packages
- R CMD check provides no guarantee that an R add-on package meets its specifications
- Authors do not have to write tests
Why validate R?
The recommended R package: ‘codetools’, version
## WARNING:
## This code is a complete hack,
## may or may not work, etc..
## Use your own risk. You have been warned.
What is validation?
“Establishing documented evidence which provides a high degree of assurance that a specific process will consistently produce a product, meeting its predetermined specifications and quality attributes”
U.S. Food and Drug Administration (2013). 21 CFR Part 11: Electronic Records, Electronic Signatures,
The challenges of validating R
Typical Software Development Process:
- Define Requirements
- Build Software
- Test Software Against Requirements
- Create Documentation
Investigation, Remediation, Certification
Challenge 1: Defining requirements
- Packages don’t typically come with a list of requirements
- How do we determine the intended use?
  - Package descriptions and help files
  - Package vignettes
  - Previous experience of using the packages
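The documentation sources listed above can be collected programmatically as a starting point for a requirements list. The following is a minimal sketch using only base R; the `stats` package is chosen purely because it ships with every R installation, and the `REQ-` identifiers and the `requirements` table layout are hypothetical, not part of any described methodology.

```r
# Gather documentation for one package as raw material for requirements.
pkg <- "stats"  # example package; any installed package name would do

# Package description: title and free-text description from DESCRIPTION.
desc <- utils::packageDescription(pkg)
cat("Title:      ", desc$Title, "\n")
cat("Description:", desc$Description, "\n")

# Exported objects: candidates that may each need a stated intended use.
exports <- sort(getNamespaceExports(pkg))

# Vignettes: a matrix with one row per vignette (zero rows if none exist).
vignettes <- utils::vignette(package = pkg)$results

# Hypothetical skeleton: one row per export, intended use still to be written
# by a reviewer after reading the help files and vignettes.
requirements <- data.frame(
  id           = sprintf("REQ-%03d", seq_along(exports)),
  object       = exports,
  intended_use = NA_character_,
  stringsAsFactors = FALSE
)
print(head(requirements))
```

The intended use itself still has to be written by someone who understands the package; the code only assembles what the package declares about itself.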
Challenge 2: Understanding a package
- Software is already built
- Source code is available
- How do we gain an understanding of the package structure?
  - The functionMap package
The functionMap package
- Describes the relationship of functions within an R package
- Parse R code
- Create dynamic graphics of function relationships
- Create a network object to show relationships
- Create an interactive graphic
- See which functions a given function calls
- Determine functions that are called within each function
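The idea of determining which functions are called within each function can be illustrated in base R. This is a minimal sketch and does not use or reproduce the functionMap API; the toy functions `clean`, `centre`, and `prepare` and the helper `calls_within` are hypothetical names introduced for illustration.

```r
# Toy "package": three functions held in an environment.
pkg_env <- new.env()
pkg_env$clean   <- function(x) x[!is.na(x)]
pkg_env$centre  <- function(x) x - mean(x)
pkg_env$prepare <- function(x) centre(clean(x))

# For one function, list the package functions named in its body.
# all.names() walks the parsed expression and returns every symbol used.
calls_within <- function(fun_name, env) {
  known <- ls(env)
  used  <- all.names(body(get(fun_name, envir = env)), unique = TRUE)
  intersect(known, used)
}

# Build an edge list (caller, callee): the raw material for a network graph.
edge_list <- lapply(ls(pkg_env), function(caller) {
  callee <- calls_within(caller, pkg_env)
  if (length(callee) == 0) return(NULL)
  data.frame(caller = caller, callee = callee, stringsAsFactors = FALSE)
})
edges <- do.call(rbind, Filter(Negate(is.null), edge_list))
print(edges)
```

Matching on symbol names is deliberately crude: a local variable that shares a name with a package function would be reported as a call, so a real tool needs more careful analysis.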
Challenge 3: Testing the requirements
- Package authors don’t have to write tests
- How do we test the requirements?
  - Writing specific unit tests for requirements
  - Understand the level of testing with testCoverage
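Writing a specific unit test per requirement can be sketched in base R as a named list of checks. The requirement identifiers and wording below are hypothetical examples written against `stats::median`; a real validation would likely use a testing framework, which this sketch does not assume.

```r
# Hypothetical requirements for median(), each paired with one check.
requirements <- list(
  "REQ-001: median of an odd-length vector is the middle value" =
    function() identical(median(c(3, 1, 2)), 2),
  "REQ-002: median of an even-length vector averages the two middle values" =
    function() isTRUE(all.equal(median(c(1, 2, 3, 4)), 2.5)),
  "REQ-003: NA values are ignored when na.rm = TRUE" =
    function() identical(median(c(1, NA, 3), na.rm = TRUE), 2)
)

# Run every check; an error inside a check counts as a failure.
results <- vapply(requirements, function(check) {
  isTRUE(tryCatch(check(), error = function(e) FALSE))
}, logical(1))

# One row per requirement, suitable for pasting into validation documentation.
report <- data.frame(
  requirement = names(results),
  passed      = unname(results),
  stringsAsFactors = FALSE
)
print(report)
```

Keeping the requirement text next to its check makes the traceability between requirement and test explicit.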
The testCoverage package
- Determines which code in a package is covered by unit tests
- Reads package code and adds tracepoints
- Runs package unit tests and marks points as hit
- Creates a full report of the code hit
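The instrument, run, and report cycle described above can be illustrated with a much coarser sketch that records hits per function rather than per tracepoint. This is not the testCoverage implementation or API; the toy functions and the `instrument` helper are hypothetical.

```r
# Toy package code held in an environment.
pkg_env <- new.env()
pkg_env$add      <- function(a, b) a + b
pkg_env$subtract <- function(a, b) a - b
pkg_env$multiply <- function(a, b) a * b

# Hit record: one flag per function, all starting as FALSE.
hits <- new.env()
for (nm in ls(pkg_env)) assign(nm, FALSE, envir = hits)

# Instrument: wrap each function so that calling it marks a hit first.
instrument <- function(nm) {
  original <- get(nm, envir = pkg_env)
  assign(nm, function(...) {
    assign(nm, TRUE, envir = hits)
    original(...)
  }, envir = pkg_env)
}
for (nm in ls(pkg_env)) instrument(nm)

# "Unit tests": note that subtract() is never exercised.
stopifnot(pkg_env$add(1, 2) == 3)
stopifnot(pkg_env$multiply(2, 3) == 6)

# Report which functions were hit and the overall proportion.
covered <- unlist(mget(ls(hits), envir = hits))
print(covered)
cat(sprintf("Function coverage: %.0f%%\n", 100 * mean(covered)))
```

The report shows `subtract` as uncovered, which is the kind of gap that prompts writing additional tests.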
Challenge 4: Validation documentation
- Creating consistent documentation for each R package
- Automatic report template generation
- knitr
- Audit trail
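One ingredient of an audit trail is a consistent record of what was validated and when. The following base R sketch writes such a record to a CSV file; the `audit_record` helper, the column names, and the choice of CSV are all assumptions for illustration and say nothing about how the reports in this talk are actually produced.

```r
# Build one audit row for an installed package (hypothetical layout).
audit_record <- function(pkg) {
  data.frame(
    package     = pkg,
    version     = as.character(utils::packageVersion(pkg)),
    r_version   = R.version.string,
    recorded_at = format(Sys.time(), "%Y-%m-%d %H:%M:%S", tz = "UTC"),
    stringsAsFactors = FALSE
  )
}

# The same function is applied to every package, so every entry has the
# same structure. "stats" and "utils" are used because they always exist.
trail <- do.call(rbind, lapply(c("stats", "utils"), audit_record))

# Persist the trail; a temporary file stands in for a controlled location.
log_file <- tempfile(fileext = ".csv")
write.csv(trail, log_file, row.names = FALSE)
print(read.csv(log_file, stringsAsFactors = FALSE))
```

A record like this could be embedded in a generated report so that each document states exactly which versions it refers to.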
Challenge 5: Deployment
- Installation Qualification (IQ)
- Operation Qualification (OQ)
- Performance Qualification (PQ)
Qualification process (VALIDATION)
- IQ: Is everything installed as expected?
  - Perform an audit of the installed files
- OQ: Does everything operate as expected?
  - Run the tests used in validation locally
- PQ: Does everything perform as expected?
  - Measure the performance of the installed software
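The three qualification steps can be sketched in base R against a stand-in installation directory. Everything here is illustrative: the directory, the files `f.R` and `g.R`, the `checksums` helper, and the five-second threshold are hypothetical, and the sketch is not the procedure ValidR itself uses.

```r
# Stand-in for an installation: two small files in a temporary directory.
install_dir <- tempfile("install_")
dir.create(install_dir)
writeLines("f <- function(x) x + 1", file.path(install_dir, "f.R"))
writeLines("g <- function(x) x * 2", file.path(install_dir, "g.R"))

# MD5 checksum of every file in a directory, named by file name.
checksums <- function(dir) {
  files <- list.files(dir, full.names = TRUE)
  sums  <- tools::md5sum(files)
  names(sums) <- basename(files)
  sums
}

# IQ: audit the installed files against a manifest recorded at validation.
manifest <- checksums(install_dir)
iq_pass  <- identical(checksums(install_dir), manifest)

# OQ: run a validation test locally against the installed code.
env <- new.env()
sys.source(file.path(install_dir, "f.R"), envir = env)
oq_pass <- identical(env$f(1), 2)

# PQ: measure performance against a hypothetical time limit in seconds.
elapsed <- system.time(for (i in 1:1000) env$f(i))[["elapsed"]]
pq_pass <- elapsed < 5

# A later modification of an installed file is caught by repeating the audit.
writeLines("g <- function(x) x * 3", file.path(install_dir, "g.R"))
current <- checksums(install_dir)
changed <- names(current)[current != manifest[names(current)]]
print(changed)
```

Re-running the checksum audit after any change to the installation is what turns IQ from a one-off check into ongoing evidence.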
What is ValidR?
- R (currently 3.0.2)
- Core and Recommended Packages
- 60+ additional packages
- Validation Packages
- Validation Documentation
- Windows Installer
- Linux RPMs
What are the benefits?
- Industry
  - Gives analysts the tools they want to use
  - Provides assurance to the IT department
  - Reduces risk and cost to business managers
  - Accelerated analytical cycle
- R community
  - Bug fixes and patches
  - Increased user base
The ValidR roadmap
- September 2014
  - EARL Conference
  - Release testCoverage and functionMap on CRAN
- Easter 2015
  - Release of validated R 3.1.x
- Summer 2015
  - Release of validated additional packages
Summary
- What does validation mean
  - Assurance, evidence
- Why validate R
  - Provide assurance, evidence
- The challenges of validating R
- Validation tools
- ValidR
  - Methodology, software, documentation
- Benefits
Questions?