Presentation on theme: "Richard Lewis Octagon Research Solutions 2008-10-17 Midwest User Network."— Presentation transcript:
Richard Lewis Octagon Research Solutions Midwest User Network
2 Agenda 9:00-9:20 Group Update 9:20-9:30 Portal Update 9:30-10:30 Discussion Questions 10:30-11:00 When to Integrate SDTM 11:00-12:00 Changes from to 3.1.2
3 Welcome! To existing members and (many) new members
4 Next Meeting Presentations Vote on external presentations for next meeting. –define.xml team –ADaM team –ISD Pilot team –Controlled Terminology –Others? Internal Presentation Volunteers –Implementing ADaM Overcoming pitfalls
5 Lunch Meetings Successfully Used by Other User Groups Timing –Once a month Topic Out Ahead of Time Casual –Discussion rather than presentation Interest outside the N. IL area? List of possible restaurants
6 CDISC Interchange October 28th – 30th, Arlington, VA
7 Portal Update CDISC Portal (Link from CDISC Website) SDS, ADaM, Global User Network, Asia, EU, North America, Bay-Area, Midwest
8 Portal Update
9 Now Midwest (Chicago) Adding Final Touches –Up and Fully Running Soon With the caveat that we have been saying that since 2006
10 Portal Questions?
11 Discussion Questions (Cathy) How are companies handling the release of new controlled terminology packages? Should the database match what is on the CRF, or is it OK to map values that are equivalent? Are companies using the new LBTESTCDs? For example, do you use GLUC for both serum glucose and urine glucose? Will we need separate TESTCDs (GLUC and UGLUC) in ADaM, based on the recent draft ADaM IG?
12 Discussion Questions Are companies using RELREC for PC and PP, and/or other domains? How does it get created?
13 Discussion Questions Are companies including Screen Failure data in SDTM (DM, DS, SC)?
14 Discussion Questions What methods are being used to control the quality of SDTM besides WebSDM? WebSDM checks (Sandy VanPelt Nguyen) Is the FDA using V1.5 or V2.6? CDISC Website points to V1.5
15 Discussion Questions What are companies doing for legacy conversions? Do they create: –study SDTM –study ADaM –integrated SDTM –integrated ADaM
16 Discussion Questions For integrated ADaM, are only data sets for SCE/SCS needed? Most likely, yes? For example, there is no need to have ADIE, ADPE…
17 Discussion Questions Are SDTM aCRFs needed for all studies, including those with screen shots only (i.e. EDC studies)? Usually, there are too many pages.
18 Discussion Questions For submissions with CDISC data sets, do we still prepare patient profiles? (According to CDISC documents, they will not be needed.)
19 Discussion Questions (Susan) When reviewing the ADaM model I was wondering how one documents the Source/Computational Method in the define.xml when 1) the *true* source of the collected data is an internal analysis dataset (not SDTM) 2) the *true* source of the derived variable/parameter is an internal analysis dataset (not SDTM). Should the ADaM define.xml describe the data in terms of the SDTM, even if the SDTM isn't the source of the data? What does one do if they produce SDTM and ADaM from their internal analysis datasets? Do you define ADaM in terms of SDTM to maintain transparency between ADaM and SDTM? Do you need to provide the computational method in the define.xml if the variable/parameter comes from another dataset?
20 Discussion Questions For reference here are some excerpts from the ADaM Document: 1.1 Purpose The Analysis Data Model describes key principles that apply to all analysis datasets, with the overall principle being that the design of analysis datasets and associated metadata facilitate explicit communication of the content, input, and purpose of submitted analysis datasets. 1.4 Definitions Input Data – The data used for the creation of analysis data sets. Traceability – The property that permits the user of an analysis dataset to understand the relationship of analysis values to the study tabulation datasets. 2.1 Introduction Analysis datasets should facilitate clear and unambiguous communication of the content of the datasets supporting the statistical analysis performed in a clinical study, should provide a level of traceability to allow an understanding of the relationship of analysis values to the input data, and...
21 From John In many respects, ADaM is still about philosophy and the idea that analysis dataset metadata should refer to the input data is logical. From the point of view of the recipient of the SDTM and ADaM data, it makes sense that ADaM refers to the study data tabulation in hand, rather than to input data that are not sent. Otherwise, I think it would be difficult for a recipient of ADaM to verify the derivation of the datasets or to perform sensitivity analyses. ADaM is a CDISC standard, and as such, I believe that ADaM metadata is about how you can get to ADaM from SDTM. That wasn't always recognized so clearly as now. This is partly the reason that there are some discrepancies perhaps between the documents/sections still. ADaM may be in fact generated from a different source than SDTM, however if so, I think the metadata should still refer to SDTM as if it had been the source; and this is difficult if SDTM is not really the source. CDISC now has a vision that CDISC metadata should be bidirectional and permit one to go from collected values through to analysis, and vice-versa. Implicit or explicit in this is that ADaM metadata refer to SDTM. The vision is CDASH-SDTM-ADaM-analysis (via ADaM results metadata). Another thing, besides data, some have said that "input" may also refer to SAP, Protocol, third party algorithms, thresholds, etc.
22 From Susan As you are probably aware, this has been a central issue for many ADaM discussions. It is probably worth noting that in earlier versions of the ADaM standards, we did have SDTM as our input box but some team members felt this was too restrictive and the oft cited adage that CDISC can not endorse any particular process was used to change the documentation to make it more generic. In addition, even in the Linear process (SDTM first, then ADaM), because of timing, there might be inputs to ADaM that are not yet in SDTM (PK spreadsheets come to mind, randomization schedules, protocol deviations, etc.). But we tried to reinforce the concepts in words that there must be some describable relationship between SDTM and ADaM, regardless of what the input to ADaM is. The way we approach this in the ADaM training is to emphasize that the FDA reviewer receives 1) the SDTM tabulation data and 2) ADaM data. They do not receive any raw data. If their task is to understand what you did in the analysis, then it follows that they must understand this with the data they have in hand. If they have questions about how you created a derived observation in ADaM, they are going to be asking this relative to the observations in SDTM. You will be hard pressed to answer their questions if you do not understand the relationship between SDTM and ADaM. Creating ADaM from something other than SDTM is not impossible and in fact there are more than a couple of large pharmas who are doing it this way. But it does add another layer of effort to create the trace between SDTM and ADaM.
23 Discussion: When to Integrate SDTM (Yen) Late Stage Conversions –Data collected in legacy format –SDTM created in final stages –Analysis datasets created independently of SDTM –CSR may be written
24 Discussion: When to Integrate SDTM (cont) Mid Stage Conversions –Data collected in legacy format –Converted to SDTM after collection –Analysis datasets created from SDTM Upstream –Data collected in SDTM (like) format –No or minimal conversion necessary
25 When to Integrate SDTM? Pros & Cons at each stage Late-stage: Pro: Minimum disruption of business process Pro: Fastest way to submit SDTM Con: Submitted data not source for analysis Con: Convert at time-critical point in project
26 When to Integrate SDTM? Mid-stage: Pro: Midrange disruption of business process Pro: SDTM data is source for analysis Pro: Efficient data exchange w vendors & partners Con: Convert at time-critical point in project
27 When to Integrate SDTM? Upstream, in collection systems: Pro: Build SDTM, not convert to SDTM Pro: Most efficient data exchange w vendors & partners Con: Maximum disruption of business process
28 Changes from SDTMIG to 3.1.2
29 Scope of Review Not domain by domain review Review of changes in Section 4 –Changes each impact many domains –Basic SDTM knowledge independent of SDTM domains –Although I couldn't resist adding a couple of domains which had major changes at the end
Order of the Variables Variable order no longer flexible 1) Identifiers 2) Topic 3) Qualifiers 4) Timing –Within each role order should be the order shown in of the SDTM
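The ordering rule above can be sketched as a sort by role. This is a minimal Python sketch, not an official tool: `VARIABLE_ROLES` is a hypothetical, hand-written subset of AE variables, and its dictionary order merely stands in for the within-role order shown in the SDTM.

```python
# Role order required by the rule above: Identifiers, Topic, Qualifiers, Timing.
ROLE_ORDER = ["Identifier", "Topic", "Qualifier", "Timing"]

# Hypothetical subset of AE variables and roles (illustration only).
# Dictionary order stands in for the within-role order shown in the SDTM.
VARIABLE_ROLES = {
    "STUDYID": "Identifier",
    "DOMAIN": "Identifier",
    "USUBJID": "Identifier",
    "AESEQ": "Identifier",
    "AETERM": "Topic",
    "AESEV": "Qualifier",
    "AESER": "Qualifier",
    "AESTDTC": "Timing",
    "AEENDTC": "Timing",
}


def order_variables(columns):
    """Return columns sorted by role, then by position in VARIABLE_ROLES."""
    unknown = [c for c in columns if c not in VARIABLE_ROLES]
    if unknown:
        raise ValueError(f"No role defined for: {unknown}")
    known = list(VARIABLE_ROLES)
    return sorted(
        columns,
        key=lambda c: (ROLE_ORDER.index(VARIABLE_ROLES[c]), known.index(c)),
    )


print(order_variables(["AESTDTC", "AETERM", "USUBJID", "AESEV", "STUDYID"]))
```

A variable with no assigned role raises an error rather than being silently placed, since the rule gives no position for it.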
Additional Guidance on Dataset Naming Custom domains beginning with X, Y or Z are reserved –Will not be used by SDTM in the future –Second letter can be any letter or number –Using X-, Y- or Z- is optional and not required
Splitting Domains Why sponsors will split is not addressed Two methods –General observation classes: split by --CAT, which must be populated in all cases –FA domain: split by --CAT, or split relative to the parent domain of the value in --OBJ –For example, FACM would store Findings About CM records.
Splitting Domains (cont) Other rules: 1) Values in DOMAIN remain the same 2) Domain prefixes use value in DOMAIN 3) --SEQ unique within USUBJID across domains 4) Variables with same name must have same length across datasets 5) Permissible variables do not have to be in all of the datasets
Splitting Domains (cont) Other Rules: (cont) 6) Up to 4 character dataset names First two letters are the same as the original domain 7) SUPPQUALs of split domains also split SUPPQS36, SUPPFACM 8) RELREC relationship defined for split FA domains may reference 4 character dataset name
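A few of the rules above lend themselves to an automated check. The following is a rough Python sketch covering only rule 3 (--SEQ unique within USUBJID across the split datasets) and rule 6 (dataset names of up to 4 characters that start with the original domain code). The function name, the input layout (a dict of record lists), and the dataset name `QSAB` are assumptions made for illustration.

```python
from collections import Counter


def check_split_datasets(datasets):
    """datasets maps a dataset name to a list of record dicts.

    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []
    seen = Counter()
    for name, records in datasets.items():
        # Rule 6: up to 4 characters, first two letters match the domain.
        if not 2 <= len(name) <= 4:
            problems.append(f"{name}: dataset name must be 2 to 4 characters")
        for domain in sorted({rec["DOMAIN"] for rec in records}):
            if name[:2] != domain:
                problems.append(f"{name}: name does not start with DOMAIN={domain}")
        # Rule 3: count each (domain, subject, sequence) across all splits.
        for rec in records:
            domain = rec["DOMAIN"]
            seen[(domain, rec["USUBJID"], rec[domain + "SEQ"])] += 1
    for (domain, usubjid, seq), count in seen.items():
        if count > 1:
            problems.append(f"{domain}SEQ={seq} used {count} times for {usubjid}")
    return problems


splits = {
    "QS36": [{"DOMAIN": "QS", "USUBJID": "001", "QSSEQ": 1}],
    "QSAB": [{"DOMAIN": "QS", "USUBJID": "001", "QSSEQ": 1}],  # duplicate QSSEQ
}
print(check_split_datasets(splits))
```

Note that DOMAIN stays `QS` in both datasets (rule 1), which is why the sequence check is keyed on the domain value rather than on the dataset name.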
35 Splitting Domains - Sample
Origin Metadata Origin Column of Define.xml –CRF –eDT –Derived –Assigned: determined by individual judgment (by an evaluator other than the subject or investigator) –Protocol: defined as part of the Trial Design preparation Multiple Sources –Variable-level metadata will list all types separated by commas, e.g. Derived, CRF –Value-level metadata will show origin at test level
Assigning Natural Keys in the Metadata Defines Natural Keys Keys may include SUPPQUAL –STUDYID, USUBJID, PEDTC, PETESTCD, PELOC, PEMETHOD, QNAM.PEMAKE, QNAM.PEMODEL Generic test codes rather than bunching
Use of Subject and USUBJID No two subjects can share the same USUBJID Conversely, every subject must retain the same USUBJID throughout the submission (if known) Format not specified –STUDY-SITE-SUBJID –000001
Convention for Missing Values Missing values represented by nulls Previously stated that convention used should be specified in the define file
Grouping Variables and Categorization (cont) --CAT/--SCAT –Subset groups within a domain –Known about the data before it is collected –Group data across subjects –May have controlled terminology
Grouping Variables and Categorization (cont) --GRPID –Groups data within a subject –Have no meaning across subjects –Assigned during or after data collection –Sponsor defined, not controlled terminology --REFID –Groups data within a subject –Example, sample identifier for blood sample
Submitting Free Text From the CRF Specify values for non-result qualifiers –When free-text information is collected to supplement a standard non-result qualifier, the free-text value goes into SUPPQUAL. Example CRF: Reason for Dose Adjustment [EXADJ], with a Describe field [SUPPQUAL] beside each choice: ___ Adverse Event ___ Insufficient Response ___ Non-medical Reason
Submitting Free Text From the CRF (cont) Specify values for non-result qualifiers (cont) –Location of Injection: Other, Specify: ____ Verbatim = UPPER RIGHT ABDOMEN Option 1: EXLOC=OTHER –Sponsor maintains original CT –Verbatim goes in SUPPQUAL Option 2: EXLOC=ABDOMEN –Sponsor has expanded CT based on their coding decision of the verbatim text –Verbatim goes in SUPPQUAL Option 3: EXLOC = UPPER RIGHT ABDOMEN –Sponsor does not care about CT for this variable
Submitting Free Text From the CRF Specify values for result qualifiers –Eye Color: Other, Specify________ Verbatim = BLUEISH GRAY Option 1: –SCORRES = BLUEISH GRAY –SCSTRESC = OTHER –Sponsor wishes to maintain CT Option 2: –SCORRES = BLUEISH GRAY –SCSTRESC = GRAY –Sponsor will expand CT based on their coding decision Option 3: –SCORRES = BLUEISH GRAY –SCSTRESC = BLUEISH GRAY –Sponsor does not care about maintaining CT
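The three options for the eye color example can be written out as a small function. This is only a sketch of the decision described on the slide; the function name and the `coded_value` parameter are hypothetical, and the sponsor's actual coding decision (GRAY here) is supplied by the caller rather than computed.

```python
def standardize_other_specify(verbatim, option, coded_value=None):
    """Return (SCORRES, SCSTRESC) for an 'Other, specify' result under options 1-3."""
    orres = verbatim.upper()
    if option == 1:
        # Sponsor keeps the original controlled terminology.
        stresc = "OTHER"
    elif option == 2:
        # Sponsor expands controlled terminology based on a coding decision.
        if coded_value is None:
            raise ValueError("Option 2 needs the sponsor's coded value")
        stresc = coded_value
    elif option == 3:
        # Sponsor does not maintain controlled terminology for this result.
        stresc = orres
    else:
        raise ValueError(f"Unknown option: {option}")
    return orres, stresc


for opt in (1, 2, 3):
    print(opt, standardize_other_specify("Blueish gray", opt, coded_value="GRAY"))
```

In all three options SCORRES keeps the verbatim text; only the standardized result differs.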
Submitting Free Text From the CRF Specify values for topic variable –Interventions Acetaminophen Aspirin Other:______ Verbatim will be entered into --TRT –Events Verbatim entered into --TERM –Findings Verbatim needs to be coded so that --TEST/--TESTCD are CT and not free text
Multiple Values for a Variable Topic variable (--TRT, --TERM) –Assumed sponsor will split or resolve for their data management procedures –DS is an exception Covered in Sponsor chooses primary Submit others in SUPPQUAL
Multiple Values for a Variable (cont) Findings result variable –Split into 2 rows EGORRES=ATRIAL FIBRILLATION EGORRES=ATRIAL FLUTTER Non-result qualifier variable –Variable value should be MULTIPLE –Individual values stored in SUPPQUAL AETERM=RASH, AELOC = MULTIPLE QNAM.AELOC1 = FACE QNAM.AELOC2 = NECK QNAM.AELOC3 = CHEST –UNLESS If one is considered of primary interest, that value can go into the variable, with the others stored in SUPPQUAL –Will reviewer know these are in SUPPQUAL? Document!
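The AELOC example can be sketched in code as follows. This is an illustrative Python sketch, not a complete SUPPQUAL builder: the function name is hypothetical, the supplemental records carry only a subset of SUPPQUAL variables, and numbering the remaining values AELOC1, AELOC2, ... when a primary value is chosen is an assumption.

```python
def split_multiple_locations(usubjid, aeseq, locations, primary=None):
    """Return (AELOC value, list of SUPPAE-style records) for one AE record."""
    if len(locations) == 1:
        return locations[0], []
    if primary is not None:
        # One value is of primary interest: it goes in AELOC, the rest in SUPPQUAL.
        aeloc = primary
        extras = [loc for loc in locations if loc != primary]
    else:
        # No primary value: AELOC is MULTIPLE and every value goes to SUPPQUAL.
        aeloc = "MULTIPLE"
        extras = list(locations)
    supp = [
        {
            "RDOMAIN": "AE",
            "USUBJID": usubjid,
            "IDVAR": "AESEQ",
            "IDVARVAL": str(aeseq),
            "QNAM": f"AELOC{i}",
            "QVAL": loc,
        }
        for i, loc in enumerate(extras, start=1)
    ]
    return aeloc, supp


aeloc, supp = split_multiple_locations("001", 1, ["FACE", "NECK", "CHEST"])
print(aeloc, [(rec["QNAM"], rec["QVAL"]) for rec in supp])
```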
Coding and Controlled Terminology Assumptions * if no controlled terminology exists List of the terms if the list is not maintained elsewhere Name of the external codelist –http://www.cancer.gov/cancertopics/terminologyresources/CDISC Full CT Discussion to be held in the future
Use of Relative Timing Variables Introduction to the new SDTM variables 1) --STRTPT 2) --STTPT –Examples: " " or "VISIT 2". 3) --ENRTPT 4) --ENTPT –Time points are not anchored to RFSTDTC and RFENDTC as in --STRF and --ENRF –Valid values in --STRTPT or --ENRTPT are: BEFORE, COINCIDENT, AFTER, U (--ENRTPT also allows ONGOING)
Use of Relative Timing Variables - Example If an AE is known to be ongoing at the end of a subject's study participation, which is on October 17th, 2008, then: –AEENRTPT = ONGOING –AEENTPT =
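To make the value set concrete, here is a small Python sketch that picks an --ENRTPT value relative to a reference time point. It is an assumption-laden illustration: in practice these values usually reflect what was collected on the CRF rather than a date comparison, and the function name and parameters are hypothetical.

```python
from datetime import date


def end_relative_to_timepoint(end_date, reference_date, ongoing=False):
    """Return an --ENRTPT value for an end date relative to a reference date."""
    if ongoing:
        # Known to be continuing at the reference time point.
        return "ONGOING"
    if end_date is None:
        # Nothing is known about the end relative to the reference time point.
        return "U"
    if end_date < reference_date:
        return "BEFORE"
    if end_date == reference_date:
        return "COINCIDENT"
    return "AFTER"


study_end = date(2008, 10, 17)
print(end_relative_to_timepoint(None, study_end, ongoing=True))
print(end_relative_to_timepoint(date(2008, 10, 1), study_end))
```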
Other Assumptions --ORRES should generally not be populated for derived records –Still not required but highly encouraged If a symbol is collected with the original result, for example <10,000, then this gets copied into --STRESC, but --STRESN is null –Also applies to values such as TRACE, 1+, etc. –Discouraging derivations in SDTM –Recommended that this be done in ADaM
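The --STRESC/--STRESN rule above can be sketched in Python. This is a simplified illustration that assumes no unit conversion or decoding is needed between the original and standardized result; the function name is hypothetical.

```python
def standardize_result(orres):
    """Return (--STRESC, --STRESN) for a collected result, with no unit conversion."""
    stresc = orres.strip()
    try:
        # Only a purely numeric result gets a numeric standardized value.
        stresn = float(stresc)
    except ValueError:
        # Results such as "<10,000", "TRACE" or "1+" stay character-only.
        stresn = None
    return stresc, stresn


for value in ["5.2", "<10,000", "TRACE", "1+"]:
    print(value, standardize_result(value))
```

The value is copied as collected, symbol included, and no numeric value is derived from it, which matches the slide's point that such derivations belong in ADaM.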
Other Assumptions (cont) If --TEST (except for IETEST and TI.IETEST) values are longer than 40 characters, then --TEST should be either: –The first 40 characters –A shortened but meaningful version –In either case, if the full text is on the CRF, then link to that from the Origin column. If it is not on the CRF, then link to another PDF which contains the full test name –Also applies to QLABEL in SUPPQUAL
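The 40-character rule can be sketched as a helper that applies either approach. The function name, the `short_name` parameter, and the sample test name are assumptions for illustration only; linking the full text from the Origin column is a metadata step that the sketch does not cover.

```python
MAX_TEST_LENGTH = 40


def shorten_test_name(full_name, short_name=None):
    """Return a --TEST value of at most 40 characters."""
    if len(full_name) <= MAX_TEST_LENGTH:
        return full_name
    if short_name is not None:
        # Preferred: a shortened but meaningful version chosen by the sponsor.
        if len(short_name) > MAX_TEST_LENGTH:
            raise ValueError("short_name is itself longer than 40 characters")
        return short_name
    # Fallback: the first 40 characters of the full test name.
    return full_name[:MAX_TEST_LENGTH]


long_name = "How often did you feel tired during the past four weeks"
print(shorten_test_name(long_name))
print(shorten_test_name(long_name, short_name="Tired in Past 4 Weeks"))
```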
Other Assumptions (cont) Clinical Significance –Should all go to SUPPQUAL –3.1.1 had EG examples with CS in the results field. --REAS standard QNAM for reason test was performed
Other Assumptions (cont) Introduction to the new SDTM variable --PRESP –Indicates that an event or intervention was prespecified on the CRF –Values are Y or null
Situation                               --PRESP  --OCCUR  --STAT
Spontaneously reported event occurred   null     null     null
Pre-specified event occurred            Y        Y        null
Pre-specified event did not occur       Y        N        null
Pre-specified event has no response     Y        null     NOT DONE
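The four situations listed above can be expressed as a small lookup. This is a minimal Python sketch; the function name and the way a response is passed in (`"Y"`, `"N"`, or `None` when no response was collected) are assumptions, and null is represented by `None`.

```python
def prespecified_flags(prespecified, response=None):
    """Return --PRESP, --OCCUR and --STAT values for an event or intervention."""
    if not prespecified:
        # Spontaneously reported: all three variables are null.
        return {"PRESP": None, "OCCUR": None, "STAT": None}
    if response in ("Y", "N"):
        # Pre-specified and the question was answered.
        return {"PRESP": "Y", "OCCUR": response, "STAT": None}
    if response is None:
        # Pre-specified but no response was collected.
        return {"PRESP": "Y", "OCCUR": None, "STAT": "NOT DONE"}
    raise ValueError(f"Unexpected response: {response!r}")


print(prespecified_flags(False))
print(prespecified_flags(True, "Y"))
print(prespecified_flags(True, "N"))
print(prespecified_flags(True))
```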
56 Domain Models New assumptions with most tables listing what variables would generally not be added into the domain Examples moved from Section 9 to Section 6, under corresponding domain table Variables dropped/added Variable order changes Label changes Assumptions added/clarified/dropped
57 DM Multiple race should be handled as multiple response for non-result qualifier Additional race data now goes into SUPPDM, instead of SC
58 CO No longer restricts the addition of Identifiers and Timing variables –When not related to other domain records
59 SE / SV Moved from Trial Design to Special Purpose
60 EX Assumption that EX is required for all studies which include investigational product –Observed by Investigator –Automated dispensing device records –Subject Recall (e.g. via diary) –Derived from DA (pill count) –Derived from the protocol
61 AE Removed AEOCCUR AE is only for AEs that actually occurred
62 CE Clinical events of interest that would not be classified as adverse events
63 FA Not subclass of the findings domain Only domain that can use the --OBJ SDTM variable Previously CF domain (3.1.2 draft)