1 © CDISC 2014 CDISC Pharmacogenomics Standards
Joyce Hernandez (Joyce Hernandez Consulting, LLC)

2 Agenda

- Project background
- Domains, Relationships & Molecular Concepts
- Variables
- Specimen Genealogy
- Specimen Hierarchy
- Pharmacogenomics (PGx) Examples:
  - Biospecimen events and findings
  - Genetic Variation
  - Gene Expression
- Next steps and team

3 Background

Initial Data Focus for Version 1.0
- Specimen Collection and Handling
- Specimen Hierarchy
- Genetic Variation utilizing well-known standards (HGVS)
- Genotyping data (common formats currently used)
- Viral Genetics (includes some viral classification variables)

Special sections to enhance understanding
- Glossary of genetic and genomic terms
- Nomenclatures (HGVS, HLA)
- CMAPS to document common processes

4 New Domains to support PGx

5 Domain Relationships within a STUDYID

6 Molecular concepts represented in the domains
[Figure: diagram with callouts 1, 2, 3, 4 (4a, 4b, 4c), and 5]

7 PGx Specific Variables - Specimen (RELSPEC)

| Name | Label | Notes |
|---|---|---|
| --REFID | Reference ID | Specimen identifier. |
| --PARENT | Specimen Parent | When the specimen in question has been obtained from another specimen (e.g., via resectioning or aliquoting), --PARENT holds the --REFID of the “parent specimen”; that is, the specimen from which the current specimen has been obtained. |
| --SPCLVL | Specimen Level | Any specimen obtained directly from the subject has a specimen level of 1. Specimens obtained from a level 1 specimen have a specimen level of 2; those obtained from a level 2 specimen have a specimen level of 3; and so on. A level 4 specimen, therefore, is a specimen (4) obtained from a specimen (3) obtained from a specimen (2) obtained from a specimen (1) obtained from the subject. |
| --DTC | Date/Time Collected | Date/time of specimen collection. For specimens with a specimen level greater than 1, --DTC refers specifically to the date/time of collection for the originating specimen, i.e., the specimen obtained directly from the subject. |

A specimen is a sample of the subject which undergoes a test in place of the subject when the test cannot be performed on the subject directly, with the understanding that any results obtained thereby may be treated as pertaining to the subject. However, once the specimen has been separated from the subject, any changes in the subject’s state will not be reflected by the specimen. Therefore, when a test is performed on a specimen, the results cannot be guaranteed to pertain to the subject as they are at the time of the test, only to the subject as they were at the time of specimen collection.
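The --DTC rule above (a derived specimen reports the collection date/time of its originating specimen) can be sketched as a walk up the --PARENT chain. This is a minimal illustration, not part of the standard: the `specimens` dictionary, the function names, and the collection date are hypothetical, and the specimen identifiers are only illustrative.

```python
from typing import Optional

# Hypothetical in-memory records: REFID -> (PARENT, DTC).
# Only the level 1 specimen carries a collection date in this sketch.
specimens = {
    "SPC-001": (None, "2005-03-20"),
    "SPC-001-B": ("SPC-001", None),
    "SPC-001-B-1": ("SPC-001-B", None),
}


def originating_specimen(refid: str) -> str:
    """Follow PARENT links up to the specimen taken directly from the subject."""
    seen = set()
    while True:
        if refid in seen:
            raise ValueError(f"cycle in specimen genealogy at {refid}")
        seen.add(refid)
        parent = specimens[refid][0]
        if parent is None:
            return refid
        refid = parent


def collection_dtc(refid: str) -> Optional[str]:
    """Return the collection date/time of the originating specimen."""
    return specimens[originating_specimen(refid)][1]


print(collection_dtc("SPC-001-B-1"))
```

A level 3 aliquot such as `SPC-001-B-1` therefore reports the date on which `SPC-001` was taken from the subject, which matches the point that results pertain to the subject as they were at collection time.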

8 List of IG Use Cases

BE/BS – Biospecimen Domains
- Specimen handling such as freeze/thaw cycles and transportation.
- Steps in obtaining cell-free RNA from blood plasma.
- Types of quality evaluation.

RELSPEC – Related Specimens
- Specimen genealogy and hierarchy.

PF – Pharmacogenomics Findings
- Protein variation in viral genetics.
- Protein and nucleic acid variation in viral genetics.
- Frame shifts, both viral and subject.
- Nucleotide reads.
- Zygosity.
- Single-nucleotide polymorphism (SNP) reads.
- HLA allelic records.
- Observed somatic vs. germline variations.
- Observed levels of somatic variations in a biopsy sample.
- Gene expression measured via qRT-PCR.
- Gene expression measured via microarray.

PG – PGx Methods and Supporting Information
- Run parameters for PCR.
- Details of SNP probe assays.

PB/SB – PGx Marker Domains
- Simple and complex genetic markers for drug resistance.

Relating PGx Domains
- A somatic variation and its related medical diagnosis.
- Germline variations and related inherited risk of cancer.
- Genetic variations relating to drug metabolism.

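9 Specimen Genealogy (RELSPEC)

| Row | STUDYID | USUBJID | REFID | SPEC | PARENT | SPCLVL |
|---|---|---|---|---|---|---|
| 1 | ABC-123 | 001-01 | SPC-001 | TISSUE | | 1 |
| 2 | ABC-123 | 001-01 | SPC-001-A | TISSUE | SPC-001 | 2 |
| 3 | ABC-123 | 001-01 | SPC-001-B | TISSUE | SPC-001 | 2 |
| 4 | ABC-123 | 001-01 | SPC-001-B-1 | DNA | SPC-001-B | 3 |
| 5 | ABC-123 | 001-01 | SPC-003 | BRAIN | | 1 |
| 6 | ABC-123 | 001-01 | SPC-003-A | RNA | SPC-003 | 2 |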
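As a rough consistency check on the genealogy above, the specimen level can be derived from the PARENT column alone and compared with the recorded SPCLVL. The sketch below uses the six example rows; the helper names (`parent_of`, `derive_level`, `mismatches`) are illustrative and not defined by the standard.

```python
# Rows from the RELSPEC example: (REFID, SPEC, PARENT, SPCLVL).
rows = [
    ("SPC-001", "TISSUE", None, 1),
    ("SPC-001-A", "TISSUE", "SPC-001", 2),
    ("SPC-001-B", "TISSUE", "SPC-001", 2),
    ("SPC-001-B-1", "DNA", "SPC-001-B", 3),
    ("SPC-003", "BRAIN", None, 1),
    ("SPC-003-A", "RNA", "SPC-003", 2),
]

parent_of = {refid: parent for refid, _, parent, _ in rows}


def derive_level(refid):
    """Level 1 has no parent; each PARENT hop adds one level."""
    level = 1
    while parent_of[refid] is not None:
        refid = parent_of[refid]
        level += 1
        if level > len(parent_of):  # more hops than specimens implies a loop
            raise ValueError("cycle in PARENT references")
    return level


# Collect rows whose recorded SPCLVL disagrees with the derived level.
mismatches = [
    (refid, recorded, derive_level(refid))
    for refid, _, _, recorded in rows
    if derive_level(refid) != recorded
]
print(mismatches)
```

For the example rows the list of mismatches is empty, since every recorded level equals one plus the number of PARENT hops to a level 1 specimen.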

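10 Biospecimen Events and Findings

BE:

| Row | STUDYID | DOMAIN | USUBJID | SPDEVID | BESEQ | BEREFID | BETERM | BEDECOD | BEPARTY | BEPRTYID | BECAT | BESCAT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC134 | BE | 43871 | TS409871 | 1 | 1148.267 | Excision | EXCISION | | | COLLECTION | SOFT TISSUE |
| 2 | ABC134 | BE | 43871 | | 2 | 1148.267 | Flash Frozen | FLASH FROZEN | | | PREP | |
| 3 | ABC134 | BE | 43871 | 309827 | 3 | 1148.267 | Stored in Freezer | STORED | | | STORING | |
| 4 | ABC134 | BE | 43871 | | 4 | 1148.267 | Thaw | THAW | | | PREP | |
| 5 | ABC134 | BE | 43871 | LN43871 | 5 | 1148.267 | Shipped | SHIPPED | ABC LAB | 01 | TRANSPORT | |

| Row | BEBODSYS | BELOC | VISITNUM | VISIT | BEDTC | BESTDTC | BEENDTC |
|---|---|---|---|---|---|---|---|
| 1 (cont) | Nervous System [A08] | BRAIN | 1 | BASELINE | 2005-03-20 | 2005-03-20T15:07 | |
| 2 (cont) | Nervous System [A08] | BRAIN | 1 | BASELINE | 2005-03-20 | 2005-03-20T15:07 | 2005-03-20T13:22 |
| 3 (cont) | Nervous System [A08] | BRAIN | 1 | BASELINE | 2005-03-20 | 2005-03-20T13:22 | 2005-03-21T10:29 |
| 4 (cont) | Nervous System [A08] | BRAIN | 1 | BASELINE | 2005-03-20 | 2005-03-21T10:29 | 2005-03-21T10:36 |
| 5 (cont) | Nervous System [A08] | BRAIN | 1 | BASELINE | 2005-03-20 | 2005-03-21T11:00 | 2005-03-21T15:00 |

BS:

| Row | STUDYID | DOMAIN | USUBJID | BSSEQ | BSREFID | BSTESTCD | BSTEST | BSCAT | BSORRES | BSORRESU | BSSTRESC | BSSTRESN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC134 | BS | 43871 | 1 | 1148.267 | VOLUME | Volume | SPECIMEN MEASUREMENT | 2 | cm3 | 2 | 2 |
| 2 | ABC134 | BS | 43871 | 2 | 1148.267 | FFRZTMP | Flash Frozen Temp | SPECIMEN HANDLING | -80 | C | | |

| Row | BSSTRESU | BSSPEC | BSANTREG | BSBLFL | VISITNUM | BSDTC |
|---|---|---|---|---|---|---|
| 1 (cont) | cm3 | BRAIN | CEREBRAL AQUEDUCT | | | |
| 2 (cont) | C | BRAIN | CEREBRAL AQUEDUCT | Y | 1 | 2005-03-20 |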
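One simple check that the start and end timestamps on biospecimen events invite is that no event ends before it starts. The sketch below is an assumption-laden illustration: it takes the BESTDTC/BEENDTC values as they appear in the BE example, assumes minute-precision ISO 8601 strings, and uses hypothetical helper names.

```python
from datetime import datetime

# (BESEQ, BETERM, BESTDTC, BEENDTC) as transcribed from the BE example;
# None means the end date/time is not populated.
events = [
    (1, "Excision", "2005-03-20T15:07", None),
    (2, "Flash Frozen", "2005-03-20T15:07", "2005-03-20T13:22"),
    (3, "Stored in Freezer", "2005-03-20T13:22", "2005-03-21T10:29"),
    (4, "Thaw", "2005-03-21T10:29", "2005-03-21T10:36"),
    (5, "Shipped", "2005-03-21T11:00", "2005-03-21T15:00"),
]


def parse(dtc):
    """Parse a minute-precision ISO 8601 date/time (assumed format)."""
    return datetime.strptime(dtc, "%Y-%m-%dT%H:%M")


def end_before_start(rows):
    """Return the BESEQ values whose end precedes their start."""
    return [
        seq
        for seq, _term, start, end in rows
        if end is not None and parse(end) < parse(start)
    ]


def duration_minutes(start, end):
    """Elapsed whole minutes between two date/time strings."""
    return int((parse(end) - parse(start)).total_seconds() // 60)


print(end_before_start(events))
print(duration_minutes("2005-03-21T10:29", "2005-03-21T10:36"))
```

With the values as transcribed, the check flags BESEQ 2, whose end time (13:22) is earlier than its start time (15:07) on the same day; the thaw step lasts 7 minutes.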

11 Variables – (Pathogens)

| Name | Label | Notes |
|---|---|---|
| --MSPCES*** | Microorganism Species | In findings domains, --SPCIES holds the species of the pathogen to which the subject is a host when the pathogen is the focus of the test. In instances when both the subject and the pathogen are tested, records for the pathogen are distinguished from records for the subject by the use of the --SPCIES variable. Not to be confused with DMSPCIES, which holds the species of the subject. |
| --MSTRN | Microorganism Strain | As --SPCIES. --STRAIN holds the strain of the pathogen to which the subject is a host when the pathogen is the focus of the test. |

*** SDTMIG omits --SPCIES because all subjects in most human clinical trials must be Homo sapiens; the nature of the study obviates the need for this information to be included in SDTM datasets. The exception is Virology, when a viral species must be identified.

12 Variables – (Genetics/Genomics Test related)

| Name | Label | Notes |
|---|---|---|
| --TEST | Test Name | For genetic variation, usually the level of granularity and/or molecular component of interest. Examples: Nucleotide, Amino Acid, Allele |
| --REFSEQ | Reference Sequence | Depending on the type of test method, the reference sequence is most likely to be either the rsID from dbSNP (for targeted tests) or a GenBank accession number (for non-targeted tests). |
| --GENTYP | Type of Genetic Region of Interest | The type of the portion of the genome serving as a locus for the experiment/test. Examples: GENE, SECTOR, PROTEIN |
| --GENRI | Genetic Region of Interest | The portion of the genome serving as a locus for the experiment/test. Often the name of a gene. Examples: EGFR, KRAS, CYP2D6 |
| --GENLI | Genetic Location of Interest | The numeric position within the sequence for the targeted read. Compare with --GENLOC. --GENLI and --GENTGT should be used only when the test specifies a single genetic read to the exclusion of all other possibilities, and the result is a matter of occurrence, either as a percentage or as a Boolean observation. |
| --GENTGT | Genetic Target | The genetic read targeted by the probe at the position specified by --GENLI. |
| --ALLELC | Allele | Humans are diploid: they have two homologous copies of each chromosome. However, the two copies are not necessarily identical, since one chromosome is inherited from each parent. Therefore, in tests that compare chromosomes, or parts of chromosomes (alleles), the --ALLELE variable is used to denote results for one or the other of the two alleles (chromosomes). |

13 Variables – (Genetics/Genomics Result related)

| Name | Label | Notes |
|---|---|---|
| --GENSR | Genetic Sub-Region | The sub-region within the genetic region of interest in which the observed variation at the position given in --GENLOC is located, if relevant. Because exon numbers can be variable and are not regulated, caution should be exercised when populating this variable. |
| --GENLOC | Genetic Location | One of the three variables used to define a genetic read. --GENLOC holds the numeric position within the sequence for the observed result. |
| --ORRES | Result or Finding in Original Units | One of the three variables used to define a genetic read. --ORRES holds the observed result at the position specified by --GENLOC. When --GENLI is populated, --ORRES follows the standard rules. |
| --ORREF | Reference Result | One of the three variables used to define a genetic read. --ORREF holds the expected result at the position specified by --GENLOC according to the reference sequence specified by --REFSEQ. |
| --STRESC | Result or Finding in Standard Format | When --GENLOC is populated, --STRESC holds the observed variation, given in HGVS nomenclature. When --GENLI is populated and --ORRES=Y, --STRESC holds the observed variation as targeted, given in HGVS nomenclature. Otherwise, --STRESC is copied or derived from --ORRES. |
| --RSNUM | Reference SNP | Reference identifier for previously identified instances of the variation, such as the rs# in dbSNP. |
| --MUTYP | Mutation Type | The type of mutation, usually either GERMLINE (inherited) or SOMATIC (arising only in parts of the individual, as in cancer). |
| --ANMETH | Analysis Method | Analysis method applied to obtain a summarized result. Analysis method describes the method of secondary processing applied to a complex observation result (e.g., an image or a genetic sequence). |
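To illustrate how the three variables that define a genetic read relate to --STRESC, the sketch below composes an HGVS-style string from --GENLOC, --ORREF, and --ORRES. It is deliberately narrow: it assumes a single-nucleotide substitution described against a coding DNA reference (the `c.` prefix), and the function name `hgvs_substitution` is hypothetical. Insertions, deletions, and protein-level descriptions are out of scope.

```python
def hgvs_substitution(genloc, orref, orres, prefix="c"):
    """Build an HGVS-style substitution string such as c.2156G>C.

    genloc: numeric position of the observed result (--GENLOC)
    orref:  expected base from the reference sequence (--ORREF)
    orres:  observed base (--ORRES)
    """
    bases = {"A", "C", "G", "T"}
    if orref not in bases or orres not in bases:
        raise ValueError("this sketch handles single DNA bases only")
    if orref == orres:
        raise ValueError("observed base matches the reference; no variation")
    return f"{prefix}.{genloc}{orref}>{orres}"


# Reference G, observed C at position 2156.
print(hgvs_substitution(2156, "G", "C"))
```

The reference base comes first and the observed base follows the `>` sign, so a read with --ORREF of G and --ORRES of C at position 2156 is written `c.2156G>C`.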

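14 Genetic Variation Example

PG:

| Row | STUDYID | DOMAIN | USUBJID | PGSEQ | PGTESTCD | PGTEST | PGGENTYP | PGGENRI | PGCAT | PGORRES | PGSTRESC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC-01234 | PG | 17C0154 | 1 | EXON | Exons Sequenced | GENE | EGFR | GENETIC VARIATION | 13-21 | |
| 2 | ABC-01234 | PG | 17C0154 | 2 | SEQSTART | Sequence Start | GENE | EGFR | GENETIC VARIATION | 1499 | |
| 3 | ABC-01234 | PG | 17C0154 | 3 | SEQLONG | Sequence Length | GENE | EGFR | GENETIC VARIATION | 1127 | |

PF:

| Row | STUDYID | DOMAIN | USUBJID | PFSEQ | PFREFID | PFTESTCD | PFTEST | PFGENRI | PFREFSEQ | PFCAT | PFORRES | PFORREF | PFGENLOC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABX-01256 | PF | XX7-154 | 1 | 5493283 | NUC | Nucleotide | EGFR | NM_005228.3 | GENETIC VARIATION | C | G | 2156 |
| 2 | ABX-01256 | PF | XX7-212 | 1 | 8970343 | NUC | Nucleotide | EGFR | NM_005228.3 | GENETIC VARIATION | T | C | 2369 |
| 3 | ABX-01256 | PF | XX7-220 | 1 | 7629230 | NUC | Nucleotide | EGFR | NM_005228.3 | GENETIC VARIATION | T | A | 2073 |

| Row | PFGENSR | PFSTRESC | PFXNAM | PFNAM | PFMETHOD | PFRUNID | VISITNUM | PFDTC |
|---|---|---|---|---|---|---|---|---|
| 1 (cont) | Exon 18 | c.2156G>C | 5.23.445.1.4.165008.1.8:86175 | Biotech ABC | Massively Parallel Sequencing | 8970723 | 1 | 2012-10-23T10:06 |
| 2 (cont) | Exon 20 | c.2369C>T | 5.23.445.1.4.165008.1.8:87952 | Biotech ABC | Massively Parallel Sequencing | 8925000 | 1 | 2012-10-23T12:50 |
| 3 (cont) | Exon 16 | c.2073A>T | 5.23.445.1.4.165008.1.8:87970 | Biotech ABC | Massively Parallel Sequencing | 8925018 | 1 | 2012-10-23T13:03 |

PB:

| Row | STUDYID | DOMAIN | PBSEQ | PBMRKRID | PBGENTYP | PBGENRI | PBDRUG | PBDIAG | PBMRKR | PBSTMT |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC-01234 | PB | 1 | 2073A>T | GENE | EGFR | | Astrocytoma | 2073A>T | Decreased risk of diffusely infiltrating astrocytoma |
| 2 | ABC-01234 | PB | 2 | G719A | GENE | EGFR | EGFR TKIs | | G719A | Increased sensitivity |
| 3 | ABC-01234 | PB | 3 | T790M | GENE | EGFR | EGFR TKIs | | T790M | Decreased sensitivity |

SB:

| Row | STUDYID | DOMAIN | USUBJID | SBSEQ | SBREFID | SBMRKRID | SBGENTYP | SBGENRI | SBNAM | VISITNUM | SBDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC-01234 | SB | 17C0154 | 1 | 5493283 | G719A | GENE | EGFR | Biotech ABC | 1 | 2012-10-23T10:06 |
| 2 | ABC-01234 | SB | 17C0212 | 1 | 8970343 | T790M | GENE | EGFR | Biotech ABC | 1 | 2012-10-23T10:06 |
| 3 | ABC-01234 | SB | 17C0220 | 1 | 7629230 | 2073A>T | GENE | EGFR | Biotech ABC | 1 | 2012-10-23T10:06 |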
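The example suggests that a subject-level marker record (SB) can be related to its study-level marker definition (PB) through the marker identifier. The sketch below shows one way to perform that lookup. It rests on assumptions: the subject identifiers and specimen reference IDs are split as 17C0154/5493283, 17C0212/8970343, and 17C0220/7629230, the join key is the marker ID, and the dictionary layout and function name are hypothetical.

```python
# Study-level marker definitions keyed by marker ID:
# marker ID -> (drug, diagnosis, statement). None means not populated.
pb = {
    "2073A>T": (None, "Astrocytoma",
                "Decreased risk of diffusely infiltrating astrocytoma"),
    "G719A": ("EGFR TKIs", None, "Increased sensitivity"),
    "T790M": ("EGFR TKIs", None, "Decreased sensitivity"),
}

# Subject-level marker records: (USUBJID, SBREFID, SBMRKRID).
# The split between subject ID and specimen reference ID is an assumption.
sb = [
    ("17C0154", "5493283", "G719A"),
    ("17C0212", "8970343", "T790M"),
    ("17C0220", "7629230", "2073A>T"),
]


def marker_statements(sb_rows, pb_lookup):
    """Attach the study-level statement to each subject-level marker record."""
    joined = []
    for usubjid, refid, marker_id in sb_rows:
        drug, diagnosis, statement = pb_lookup[marker_id]
        joined.append({
            "USUBJID": usubjid,
            "REFID": refid,
            "MRKRID": marker_id,
            "context": drug or diagnosis,
            "statement": statement,
        })
    return joined


for record in marker_statements(sb, pb):
    print(record["USUBJID"], record["MRKRID"], record["context"],
          record["statement"])
```

Keeping the interpretive statement in one study-level place and referencing it by marker ID avoids repeating the same text on every subject record.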

15 Gene Expression Example – Arrays

PF:

| Row | STUDYID | DOMAIN | USUBJID | SPDEVID | PFSEQ | PFGRPID | PFREFID | PFTESTCD | PFTEST | PFCAT | PFORRES |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A12345 | PF | 43871 | AGS-G4900DA | 2 | 1 | 2287.09443 | NINT1VAL | Normalized Intensity 1 Value | Analytic | 1.16279 |
| 2 | A12345 | PF | 43871 | AGS-G4900DA | 3 | 1 | 2287.09443 | NINT2VAL | Normalized Intensity 2 Value | Analytic | 0.96469 |
| 3 | A12345 | PF | 43871 | MANAN03 | 4 | 1 | 2287.09443 | PVAL | P Value | Post-Analytic | 0.05391 |
| 4 | A12345 | PF | 43871 | MANAN03 | 5 | 1 | 2287.09443 | FOLDCHG | Fold Change | Post-Analytic | 1.8 |

| Row | PFSTRESC | PFSTRESN | PFXFN | PFNAM | PFSPEC | PFMETHOD | PFRUNID | PFANMETH | PFBLFL | VISITNUM | PFDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 (cont) | 1.16279 | | 64.3.4:7280912 | Deluxe Central Labs | RNA | Microarray | 1000450001 | LOWESS | | 2 | 2005-03-21T11:28:17 |
| 2 (cont) | 0.96469 | | 64.3.4:7280912 | Deluxe Central Labs | RNA | Microarray | 1000450001 | LOWESS | | 2 | 2005-03-21T11:28:17 |
| 3 (cont) | 0.05391 | | 64.3.4:7280912 | Deluxe Central Labs | RNA | Microarray | 1000450001 | | | 2 | 2005-03-21T11:28:17 |
| 4 (cont) | 1.8 | | 64.3.4:7280912 | Deluxe Central Labs | RNA | Microarray | 1000450001 | | | 2 | 2005-03-21T11:28:17 |

DI:

| Row | STUDYID | DOMAIN | SPDEVID | DISEQ | DIPARMCD | DIPARM | DIVAL |
|---|---|---|---|---|---|---|---|
| 1 | A12345 | DI | AGM-G4851B | 1 | TYPE | Device Type | Microarray Kit |
| 2 | A12345 | DI | AGM-G4851B | 2 | MANUF | Manufacturer | Agilent |
| 3 | A12345 | DI | AGM-G4851B | 3 | MODEL | Model | G4851B |
| 4 | A12345 | DI | AGS-G4900DA | 1 | TYPE | Device Type | Microarray Scanner |
| 5 | A12345 | DI | AGS-G4900DA | 2 | MANUF | Manufacturer | Agilent |
| 6 | A12345 | DI | AGS-G4900DA | 3 | MODEL | Model | G4900DA |
| 7 | A12345 | DI | MANAN03 | 1 | TYPE | Device Type | Workstation |
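In this example the findings records carry only a device identifier (SPDEVID), while the device's type, manufacturer, and model live in DI as one parameter per row. A minimal sketch of resolving a finding to its device description follows; the row tuples mirror the example values, and the helper names are hypothetical.

```python
from collections import defaultdict

# DI rows from the example: (SPDEVID, DIPARMCD, DIVAL).
di_rows = [
    ("AGM-G4851B", "TYPE", "Microarray Kit"),
    ("AGM-G4851B", "MANUF", "Agilent"),
    ("AGM-G4851B", "MODEL", "G4851B"),
    ("AGS-G4900DA", "TYPE", "Microarray Scanner"),
    ("AGS-G4900DA", "MANUF", "Agilent"),
    ("AGS-G4900DA", "MODEL", "G4900DA"),
    ("MANAN03", "TYPE", "Workstation"),
]

# Pivot the one-parameter-per-row layout into one dict per device.
devices = defaultdict(dict)
for spdevid, parmcd, value in di_rows:
    devices[spdevid][parmcd] = value

# PF rows from the example, reduced to (PFSEQ, SPDEVID, PFTESTCD).
pf_rows = [
    (2, "AGS-G4900DA", "NINT1VAL"),
    (3, "AGS-G4900DA", "NINT2VAL"),
    (4, "MANAN03", "PVAL"),
    (5, "MANAN03", "FOLDCHG"),
]


def device_for(pfseq):
    """Return the device parameters for the finding with the given PFSEQ."""
    for seq, spdevid, _testcd in pf_rows:
        if seq == pfseq:
            return dict(devices[spdevid])
    raise KeyError(pfseq)


print(device_for(2))
print(device_for(4))
```

The intensity values resolve to the scanner, while the post-analytic values (p-value, fold change) resolve to the workstation, for which only the device type is recorded in the example.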

16 Next Steps

- Currently under CDISC internal review
- Public Review Posting – 2nd Quarter
- Final Posting – 3rd Quarter
- Next Project – 4th Quarter - Cytogenetics

17 Contact Information and Team

| Name | Company |
|---|---|
| Joyce Hernandez, Team Leader | Joyce Hernandez Consulting |
| Mohtaram Bahmanian | ImClone |
| Sally Cassals | Independent Consultant |
| Rhonda Facile | CDISC |
| Doris Li | ImClone |
| Cliona Molony | Merck |
| Mona Oakes | ImClone |
| Phil Pochon | Covance |
| Janet Reich | Amgen |
| Ellen Schatz | Eli Lilly |
| James Sullivan | Vertex |
| Richard Tyhach | Eli Lilly |
| Patricia Wesolowski | Vertex |
| Diane Wold | GSK |
| Darcy Wold | Independent Consultant |
| Fred Wood | Accenture |
| Helena Sviglin | FDA Liaison |
| Patrick Harrington | FDA |
| Joy Li | FDA |

Anyone who wishes to join the team, please contact Joyce:

