Presentation on theme: "Midwest User Network Richard Lewis Octagon Research Solutions"— Presentation transcript:
1 Midwest User Network
  Richard Lewis, Octagon Research Solutions
2 Agenda
  9:00-9:20 Group Update
  9:20-9:30 Portal Update
  9:30-10:30 Discussion Questions
  10:30-11:00 When to Integrate SDTM
  11:00-12:00 Changes from 3.1.1 to 3.1.2
3 Welcome!
  To existing members and (many) new members
4 Next Meeting Presentations
  Vote on external presentations for next meeting:
    define.xml team
    ADaM team
    ISD Pilot team
    Controlled Terminology
    Others?
  Internal presentation volunteers:
    Implementing ADaM
    Overcoming pitfalls
5 Lunch Meetings
  Successfully used by other user groups
  Timing: once a month
  Topic out ahead of time
  Casual
  Discussion rather than presentation
  Interest outside the N IL area?
  List of possible restaurants
6 CDISC Interchange
  October 28th-30th
  Arlington, VA
11 Discussion Questions (Cathy)
  How are companies handling the release of new controlled terminology packages?
  Should the database match what is on the CRF, or is it OK to map values that are equivalent?
  Are companies using the new LBTESTCDs? For example, do you use "GLUC" for both serum glucose and urine glucose?
  Will we need separate test codes (GLUC and UGLUC) in ADaM, based on the recent draft ADaMIG?
12 Discussion Questions
  Are companies using RELREC for PC and PP, and/or other domains? How does it get created?
13 Discussion Questions
  Are companies including Screen Failure data in SDTM (DM, DS, SC)?
14 Discussion Questions
  What methods are being used to control the quality of SDTM besides WebSDM?
  WebSDM checks (Sandy VanPelt Nguyen)
    Is the FDA using V1.5 or V2.6?
    The CDISC website points to V1.5
15 Discussion Questions
  What are companies doing for legacy conversions? Do they create:
    study SDTM
    study ADaM
    integrated SDTM
    integrated ADaM
16 Discussion Questions
  For integrated ADaM, are only datasets for SCE/SCS needed? Most likely, yes? For example, there is no need to have ADIE, ADPE…
17 Discussion Questions
  Are SDTM aCRFs needed for all studies, including those with screen shots only (i.e. EDC studies)? Usually, there are too many pages.
18 Discussion Questions
  For submissions with CDISC datasets, do we still prepare patient profiles? (According to CDISC documents, they will not be needed.)
19 Discussion Questions (Susan)
  When reviewing the ADaM model, I was wondering how one documents the Source/Computational Method in the define.xml when:
    1) the *true* source of the collected data is an internal analysis dataset (not SDTM)
    2) the *true* source of the derived variable/parameter is an internal analysis dataset (not SDTM)
  Should the ADaM define.xml describe the data in terms of the SDTM, even if the SDTM isn't the source of the data?
  What does one do if they produce SDTM and ADaM from their internal analysis datasets?
  Do you define ADaM in terms of SDTM to maintain transparency between ADaM and SDTM?
  Do you need to provide the computational method in the define.xml if the variable/parameter comes from another dataset?
20 Discussion Questions
  For reference, here are some excerpts from the ADaM document:
  1.1 Purpose
    The Analysis Data Model describes key principles that apply to all analysis datasets, with the overall principle being that the design of analysis datasets and associated metadata facilitate explicit communication of the content, input, and purpose of submitted analysis datasets.
  Definitions
    Input Data – The data used for the creation of analysis datasets.
    Traceability – The property that permits the user of an analysis dataset to understand the relationship of analysis values to the study tabulation datasets.
  Introduction
    Analysis datasets should facilitate clear and unambiguous communication of the content of the datasets supporting the statistical analysis performed in a clinical study, should provide a level of traceability to allow an understanding of the relationship of analysis values to the input data, and ...
21 From John
  In many respects, ADaM is still about philosophy, and the idea that analysis dataset metadata should refer to the input data is logical.
  From the point of view of the recipient of the SDTM and ADaM data, it makes sense that ADaM refers to the study data tabulation in hand, rather than to input data that are not sent. Otherwise, I think it would be difficult for a recipient of ADaM to verify the derivation of the datasets or to perform sensitivity analyses.
  ADaM is a CDISC standard, and as such, I believe that ADaM metadata is about how you can get to ADaM from SDTM. That wasn't always recognized as clearly as it is now. This is partly the reason that there are perhaps still some discrepancies between the documents/sections.
  ADaM may in fact be generated from a different source than SDTM; however, if so, I think the metadata should still refer to SDTM as if it had been the source, and this is difficult if SDTM is not really the source.
  CDISC now has a vision that CDISC metadata should be bidirectional and permit one to go from collected values through to analysis, and vice versa. Implicit or explicit in this is that ADaM metadata refer to SDTM. The vision is CDASH-SDTM-ADaM-analysis (via ADaM results metadata).
  Another thing: besides data, some have said that "input" may also refer to SAP, protocol, third-party algorithms, thresholds, etc.
22 From Susan
  As you are probably aware, this has been a central issue for many ADaM discussions. It is probably worth noting that in earlier versions of the ADaM standards, we did have SDTM as our 'input' box, but some team members felt this was too restrictive, and the oft-cited adage that "CDISC cannot endorse any particular process" was used to change the documentation to make it more generic. In addition, even in the linear process (SDTM first, then ADaM), because of timing, there might be inputs to ADaM that are not yet in SDTM (PK spreadsheets come to mind, randomization schedules, protocol deviations, etc.). But we tried to reinforce the concept in words that there must be some describable relationship between SDTM and ADaM, regardless of what the input to ADaM is.
  The way we approach this in the ADaM training is to emphasize that the FDA reviewer receives 1) the SDTM tabulation data and 2) ADaM data. They do not receive any 'raw' data. If their task is to understand what you did in the analysis, then it follows that they must understand this with the data they have in hand. If they have questions about how you created a derived observation in ADaM, they are going to be asking this relative to the observations in SDTM. You will be hard pressed to answer their questions if you do not understand the relationship between SDTM and ADaM.
  Creating ADaM from something other than SDTM is not impossible, and in fact more than a couple of large pharmas are doing it this way. But it does add another layer of effort to create the trace between SDTM and ADaM.
23 Discussion: When to Integrate SDTM (Yen)
  Late-Stage Conversions
    Data collected in 'legacy' format
    SDTM created in final stages
    Analysis datasets created independently of SDTM
    CSR may be written
24 Discussion: When to Integrate SDTM (cont)
  Mid-Stage Conversions
    Data collected in 'legacy' format
    Converted to SDTM after collection
    Analysis datasets created from SDTM
  Upstream
    Data collected in SDTM (like) format
    No or minimal conversion necessary
25 When to Integrate SDTM? Pros & Cons at each stage
  Late-stage:
    Pro: Minimum disruption of business process
    Pro: Fastest way to submit SDTM
    Con: Submitted data not source for analysis
    Con: Convert at time-critical point in project
26 When to Integrate SDTM?
  Mid-stage:
    Pro: Midrange disruption of business process
    Pro: SDTM data is source for analysis
    Pro: Efficient data exchange with vendors & partners
    Con: Convert at time-critical point in project
27 When to Integrate SDTM?
  Upstream, in collection systems:
    Pro: Build SDTM, not convert to SDTM
    Pro: Most efficient data exchange with vendors & partners
    Con: Maximum disruption of business process
29 Scope of Review
  Not a domain-by-domain review
  Review of changes in Section 4
    Each change impacts many domains
    Basic SDTM knowledge independent of SDTM domains
  Although I couldn't resist adding, at the end, a couple of domains which had major changes
30 Order of the Variables
  Variable order no longer flexible
    1) Identifiers
    2) Topic
    3) Qualifiers
    4) Timing
  Within each role, the order should be the order shown in the SDTM
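The role ordering on this slide can be illustrated with a short Python sketch. The role assignments for the AE variables below are a hypothetical lookup table written for this example, not taken from the slides, and the sketch only preserves the incoming order within each role rather than looking up the order defined in the SDTM.

```python
# Role order stated on the slide: Identifiers, Topic, Qualifiers, Timing.
ROLE_ORDER = {"Identifier": 0, "Topic": 1, "Qualifier": 2, "Timing": 3}

# Hypothetical role assignments for a few AE variables (illustration only).
AE_ROLES = {
    "AESTDTC": "Timing",
    "AETERM": "Topic",
    "USUBJID": "Identifier",
    "AESEV": "Qualifier",
    "STUDYID": "Identifier",
    "AESEQ": "Identifier",
    "AESER": "Qualifier",
}


def order_by_role(variables, roles):
    """Sort variables by role; Python's stable sort keeps the incoming
    order of variables that share a role."""
    return sorted(variables, key=lambda name: ROLE_ORDER[roles[name]])


ordered = order_by_role(list(AE_ROLES), AE_ROLES)
print(ordered)
```

In practice the within-role order would come from the SDTM variable tables, so the input list would already be arranged that way before the role sort is applied.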
31 Additional Guidance on Dataset Naming
  Domain codes beginning with X, Y or Z are reserved for custom domains
    Will not be used by SDTM in the future
    Second letter can be any letter or number
  Using X-, Y- or Z- is optional, not required
32 Splitting Domains
  Why sponsors will split is not addressed
  Two methods
    General observation classes
      Split by --CAT, which must be populated in all cases
    FA domain
      Split by --CAT
      Split relative to the parent domain of the value in --OBJ
      For example, FACM would store Findings About CM records
33 Splitting Domains (cont)
  Other rules:
    1) Values in DOMAIN remain the same
    2) Domain prefixes use the value in DOMAIN
    3) --SEQ unique within USUBJID across domains
    4) Variables with the same name must have the same length across datasets
    5) Permissible variables do not have to be in all of the datasets
34 Splitting Domains (cont)
  Other rules: (cont)
    6) Up to 4-character dataset names
       First two letters are the same as the original domain
    7) SUPPQUALs of split domains are also split
       SUPPQS36, SUPPFACM
    8) RELREC relationship defined for split FA domains may reference the 4-character dataset name
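Two of the splitting rules above (the dataset-name rule and the --SEQ uniqueness rule) lend themselves to a simple automated check. The Python sketch below is only an illustration under assumptions: records are plain dictionaries, the function names are invented for this example, and the dataset name QSXX and the subject IDs are made up (QS36 is implied by the SUPPQS36 example on the slide).

```python
from collections import Counter


def check_split_names(domain, dataset_names):
    """Return dataset names that break the naming rule: at most four
    characters, starting with the two-letter DOMAIN value."""
    return [name for name in dataset_names
            if len(name) > 4 or not name.startswith(domain)]


def duplicate_seq(records, seq_var):
    """Return (USUBJID, --SEQ) pairs that occur more than once when the
    records of all split datasets are pooled."""
    counts = Counter((rec["USUBJID"], rec[seq_var]) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical records pooled from two split QS datasets.
pooled = [
    {"DATASET": "QS36", "DOMAIN": "QS", "USUBJID": "S1-001", "QSSEQ": 1},
    {"DATASET": "QS36", "DOMAIN": "QS", "USUBJID": "S1-001", "QSSEQ": 2},
    {"DATASET": "QSXX", "DOMAIN": "QS", "USUBJID": "S1-001", "QSSEQ": 2},  # clash
    {"DATASET": "QSXX", "DOMAIN": "QS", "USUBJID": "S1-002", "QSSEQ": 1},
]

print(check_split_names("QS", ["QS36", "QSXX", "QSF36", "Q36"]))
print(duplicate_seq(pooled, "QSSEQ"))
```

The same pooled view could be extended to compare variable lengths across the split datasets (rule 4), which is not covered here.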
36 Origin Metadata
  Origin column of define.xml
    CRF
    eDT
    Derived
    Assigned: determined by individual judgment (by an evaluator other than the subject or investigator)
    Protocol: defined as part of the Trial Design preparation
  Multiple sources
    Variable-level metadata will list all types separated by commas, e.g. 'Derived, CRF'
    Value-level metadata will show origin at test level
37 Assigning Natural Keys in the Metadata
  Defines 'Natural Keys'
  Keys may include SUPPQUAL
    STUDYID, USUBJID, PEDTC, PETESTCD, PELOC, PEMETHOD, QNAM.PEMAKE, QNAM.PEMODEL
  Generic test codes rather than bunching
38 Use of "Subject" and USUBJID
  No two subjects can share the same USUBJID
  Conversely, every subject must retain the same USUBJID throughout the submission (if known)
  Format not specified
    STUDY-SITE-SUBJID
    000001
39 Convention for Missing Values
  Missing values are represented by nulls
  Previously, the guide stated that the convention used should be specified in the define file
40 Grouping Variables and Categorization
  STUDYID
  DOMAIN
  --CAT
  --SCAT
  USUBJID
  --GRPID
  --REFID
41 Grouping Variables and Categorization (cont)
  --CAT/--SCAT
    Subset groups within a domain
    Known about the data before it is collected
    Group data across subjects
    May have controlled terminology
42 Grouping Variables and Categorization (cont)
  --GRPID
    Groups data within a subject
    Has no meaning across subjects
    Assigned during or after data collection
    Sponsor defined, not controlled terminology
  --REFID
    Example: sample identifier for a blood sample
43 Submitting Free Text From the CRF
  'Specify' values for non-result qualifiers
    When free-text information is collected to supplement a standard non-result qualifier, the free-text value goes into SUPPQUAL.
    CRF example:
      Reason for Dose Adjustment [EXADJ]    Describe [SUPPQUAL]
      ___ Adverse Event                     ________
      ___ Insufficient Response             ________
      ___ Non-medical Reason                ________
44 Submitting Free Text From the CRF (cont)
  'Specify' values for non-result qualifiers (cont)
    Location of Injection: Other, Specify: ____
    Verbatim = UPPER RIGHT ABDOMEN
    Option 1: EXLOC = OTHER
      Sponsor maintains original CT
      Verbatim goes in SUPPQUAL
    Option 2: EXLOC = ABDOMEN
      Sponsor has expanded CT based on their coding decision of the verbatim text
    Option 3: EXLOC = UPPER RIGHT ABDOMEN
      Sponsor does not care about CT for this variable
45 Submitting Free Text From the CRF
  'Specify' values for result qualifiers
    Eye Color: Other, Specify: ________
    Verbatim = BLUEISH GRAY
    Option 1: SCORRES = BLUEISH GRAY, SCSTRESC = OTHER
      Sponsor wishes to maintain CT
    Option 2: SCSTRESC = GRAY
      Sponsor will expand CT based on their coding decision
    Option 3: SCSTRESC = BLUEISH GRAY
      Sponsor does not care about maintaining CT
46 Submitting Free Text From the CRF
  'Specify' values for topic variable
    Interventions
      Acetaminophen
      Aspirin
      Other: ______
      Verbatim will be entered into --TRT
    Events
      Verbatim entered into --TERM
    Findings
      Verbatim needs to be coded so that --TEST/--TESTCD are CT and not free text
47 Multiple Values for a Variable
  Topic variable (--TRT, --TERM)
    Assumed sponsor will split or resolve for their data management procedures
    DS is an exception
      Covered in
      Sponsor chooses primary
      Submit others in SUPPQUAL
48 Multiple Values for a Variable (cont)
  Findings result variable
    Split into 2 rows
      EGORRES = ATRIAL FIBRILLATION
      EGORRES = ATRIAL FLUTTER
  Non-result qualifier variable
    Variable value should be MULTIPLE
    Individual values stored in SUPPQUAL
      AETERM = RASH, AELOC = MULTIPLE
      QNAM.AELOC1 = FACE
      QNAM.AELOC2 = NECK
      QNAM.AELOC3 = CHEST
    UNLESS one is considered of primary interest: that value can go into the variable, with the others stored in SUPPQUAL
    Will the reviewer know these are in SUPPQUAL? Document!
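The AELOC example above can be sketched in Python as follows. This is a minimal illustration, not a conversion tool: the function name and its arguments are invented for the example, the subject ID is made up, and only a subset of the SUPPQUAL variables (RDOMAIN, USUBJID, IDVAR, IDVARVAL, QNAM, QVAL) is populated.

```python
def split_multiple(usubjid, domain, seq_var, seq, variable, values):
    """Return (parent_value, suppqual_rows) for a non-result qualifier.

    A single collected value stays in the parent variable. Several values
    put MULTIPLE in the parent variable and one SUPPQUAL row per value,
    named <variable>1, <variable>2, ... as in the slide's AELOC example.
    """
    if len(values) <= 1:
        return (values[0] if values else ""), []
    rows = [
        {
            "RDOMAIN": domain,
            "USUBJID": usubjid,
            "IDVAR": seq_var,
            "IDVARVAL": str(seq),
            "QNAM": f"{variable}{i}",
            "QVAL": value,
        }
        for i, value in enumerate(values, start=1)
    ]
    return "MULTIPLE", rows


aeloc, supp_rows = split_multiple(
    "S1-001", "AE", "AESEQ", 3, "AELOC", ["FACE", "NECK", "CHEST"]
)
print(aeloc)
for row in supp_rows:
    print(row["QNAM"], "=", row["QVAL"])
```

The "primary interest" alternative from the slide would instead keep one chosen value in AELOC and write only the remaining values to SUPPQUAL; that branch is left out of the sketch.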
49 4.1.3 Coding and Controlled Terminology Assumptions
  '*' if no controlled terminology exists
  List of the terms if the list is not maintained elsewhere
  Name of the external codelist
  Full CT discussion to be held in the future
50 Use of Relative Timing Variables
  Introduction to the new SDTM variables
    --STRTPT
    --STTPT
    --ENRTPT
    --ENTPT
  Reference time point examples: " " or "VISIT 2"
  Time points are not anchored to RFSTDTC and RFENDTC as in --ENRF and --STRF
  Valid values in --STRTPT or --ENRTPT are:
    BEFORE
    COINCIDENT
    AFTER
    U
51 Use of Relative Timing Variables - Example
  If an AE is known to be ongoing at the end of a subject's study participation, which is on October 17th, 2008, then:
    AEENRTPT = ONGOING
    AEENTPT =
52 4.1.5 Other Assumptions
  --ORRES should generally not be populated for derived records
    Still not required but highly encouraged
  If a symbol is collected with original results, for example <10,000, then this gets copied into --STRESC, but --STRESN is null
    Also applies to values such as TRACE, 1+, etc.
  Discouraging derivations in SDTM
    Recommended that this be done in ADaM
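The rule for results collected with a symbol can be shown with a small Python sketch. The function name is invented for this illustration, and the sketch deliberately ignores unit conversion and any other standardization: it copies the collected text to the character result and fills the numeric result only when the whole string parses as a number.

```python
def standardize(orres):
    """Return (STRESC, STRESN) for a collected result string.

    Values such as "<10,000", "TRACE" or "1+" do not parse as numbers,
    so the text is carried into the character result and the numeric
    result stays null (None).
    """
    text = orres.strip()
    try:
        number = float(text)
    except ValueError:
        return text, None
    return text, number


for collected in ["<10,000", "TRACE", "1+", "5.4"]:
    print(collected, "->", standardize(collected))
```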
53 4.1.5 Other Assumptions (cont)
  If --TEST values (except for IETEST and TI.IETEST) are > 40 characters, then --TEST should be either:
    The first 40 characters
    A shortened but meaningful version
  In either case, if the full text is on the CRF, then link to that from the Origin column. If it is not on the CRF, then link to another PDF which contains the full test name
  Also applies to QLABEL in SUPPQUAL
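A length check for the 40-character rule is straightforward to sketch in Python. The function name and the sample test names below are hypothetical; the sketch only flags values that need attention and does not decide between truncating and shortening, since that choice is left to the sponsor.

```python
MAX_TEST_LEN = 40
EXEMPT_VARIABLES = {"IETEST"}  # IETEST (in IE and TI) is exempt per the slide


def long_test_names(variable, values):
    """Return the values of a --TEST variable that exceed 40 characters."""
    if variable in EXEMPT_VARIABLES:
        return []
    return [value for value in values if len(value) > MAX_TEST_LEN]


# Hypothetical test names for illustration.
names = ["Glucose", "Time From Last Dose To First Occurrence Of Symptom"]
print(long_test_names("QSTEST", names))
print(long_test_names("IETEST", names))
```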
54 4.1.5 Other Assumptions (cont)
  Clinical Significance
    Should all go to SUPPQUAL
    3.1.1 had EG examples with CS in the results field
  --REAS: standard QNAM for the reason a test was performed
55 4.1.5 Other Assumptions (cont)
  Introduction to the new SDTM variable --PRESP
    Indicates that an event or intervention was prespecified on the CRF
    Values are Y or null
  Situation                                --PRESP  --OCCUR  --STAT
  Spontaneously reported event occurred
  Pre-specified event occurred             Y        Y
  Pre-specified event did not occur        Y        N
  Pre-specified event has no response      Y                 NOT DONE
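The four situations on this slide can be written as a small Python mapping. This is a sketch under assumptions: the function name and its arguments are invented, nulls are shown as empty strings, and the --PRESP = Y values for the pre-specified rows follow from the slide's statement that --PRESP is Y when the event was prespecified on the CRF.

```python
def presp_flags(prespecified, response=None):
    """Return --PRESP, --OCCUR and --STAT values for one event record.

    prespecified: True if the event was listed on the CRF.
    response: "Y", "N", or None when no answer was collected.
    """
    if not prespecified:
        # Spontaneously reported event: all three variables are null.
        return {"PRESP": "", "OCCUR": "", "STAT": ""}
    if response is None:
        return {"PRESP": "Y", "OCCUR": "", "STAT": "NOT DONE"}
    return {"PRESP": "Y", "OCCUR": response, "STAT": ""}


print(presp_flags(False))
print(presp_flags(True, "Y"))
print(presp_flags(True, "N"))
print(presp_flags(True))
```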
56 Domain Models
  New assumptions with most tables, listing which variables would generally not be added to the domain
  Examples moved from Section 9 to Section 6, under the corresponding domain table
  Variables dropped/added
  Variable order changes
  Label changes
  Assumptions added/clarified/dropped
57 DM
  Multiple race should be handled as a multiple response for a non-result qualifier
  Additional race data now goes into SUPPDM, instead of SC
58 CO
  No longer restricts the addition of Identifiers and Timing variables
    When not related to other domain records
59 SE / SV
  Moved from Trial Design to Special Purpose
60 EX
  Assumption that EX is required for all studies which include investigational product
    Observed by Investigator
    Automated dispensing device records
    Subject recall (e.g. via diary)
    Derived from DA (pill count)
    Derived from the protocol
61 AE
  Removed AEOCCUR
  AE is only for AEs that actually occurred
62 CE
  Clinical events of interest that would not be classified as adverse events
63 FA
  Not a subclass of the Findings domain
  Only domain that can use the --OBJ SDTM variable
  Previously the CF domain (3.1.2 draft)