Presentation is loading. Please wait.

Presentation is loading. Please wait.

Interpreting of Patient-Reported Outcomes Joseph C. Cappelleri, PhD, MPH Pfizer Inc (

Similar presentations


Presentation on theme: "Interpreting of Patient-Reported Outcomes Joseph C. Cappelleri, PhD, MPH Pfizer Inc ("— Presentation transcript:

1 Interpreting of Patient-Reported Outcomes Joseph C. Cappelleri, PhD, MPH Pfizer Inc ( Presentation at the Meeting of the New Jersey Chapter of the American Statistical Association, Bridgewater, New Jersey, October 18, 2013

2 Disclaimer The views expressed here do not reflect the views of Pfizer Inc

3 Outline: Learning Objectives Part 1: To understand the methods for interpretation of patient-reported outcomes for label and promotional claims Part 1: To understand the methods for interpretation of patient-reported outcomes for label and promotional claims Part 2: To move beyond the 2009 FDA guidance – extended approaches Part 2: To move beyond the 2009 FDA guidance – extended approaches

4 Part 1: Interpretation of Patient- Reported Outcomes for Label and Promotional Claims

5 Introduction Key is to focus on prespecified patient- reported outcome (PRO) Key is to focus on prespecified patient- reported outcome (PRO) Important to report all prespecified PROs (not just those that are “significant”) Important to report all prespecified PROs (not just those that are “significant”) Also important to report all PROs, prespecified or not Also important to report all PROs, prespecified or not

6

7 Interpretation of PROs Not considered a measurement property Not considered a measurement property Interpretation of PRO endpoints follows similar considerations as for all other endpoint types used to evaluate treatment benefit of a medical product Interpretation of PRO endpoints follows similar considerations as for all other endpoint types used to evaluate treatment benefit of a medical product This presentation assumes adequate evidence of instrument development and validation on the PRO measure of interest This presentation assumes adequate evidence of instrument development and validation on the PRO measure of interest

8 PRO Guidance - Interpretation of Data

9 What is Different About PROs?

10 3. Unknown relevant change standards for PRO measures research tools little training or patient experiences 2. Developed familiarity with physiological measures professional training experience gained by observing changes among many patients 1. Known relevant change standards for physiological measures blood pressure serum creatinine level

11 11 Need to achieve statistically significant differences between the active treatment and placebo arms for clinical trials, but it’s just not enough Need to achieve statistically significant differences between the active treatment and placebo arms for clinical trials, but it’s just not enough Need a way to determine if statistically significant differences are meaningful and important to clinical trial participants Need a way to determine if statistically significant differences are meaningful and important to clinical trial participants Can’t rely on p<0.05 to demonstrate an interpretable difference —Many PRO scales are new to label readers and familiarity with what types of changes are important requires experience over time Can’t rely on p<0.05 to demonstrate an interpretable difference —Many PRO scales are new to label readers and familiarity with what types of changes are important requires experience over time Interpretation Is More Than p<0.05

12 12 How do you determine the Responder Definition for a PRO instrument?

13 Key to Interpretation: Responder Definition Defined as the trial-specific important difference standard or threshold applied at the individual level of analysis Defined as the trial-specific important difference standard or threshold applied at the individual level of analysis This represents the individual patient PRO score change over a predetermined time period that should be interpreted as a treatment benefit This represents the individual patient PRO score change over a predetermined time period that should be interpreted as a treatment benefit

14 Responder Definition Responder Definition The responder definition is determined empirically and may vary by target population or other clinical trial design characteristics The responder definition is determined empirically and may vary by target population or other clinical trial design characteristics FDA reviewers will evaluate a PRO instrument’s responder definition in the context of each specific clinical trial FDA reviewers will evaluate a PRO instrument’s responder definition in the context of each specific clinical trial

15 Anchor-Based Methods Anchor-Based Methods Anchor-based methods explore the associations between the targeted concept of the PRO instrument and the concept measured by the anchors Anchor-based methods explore the associations between the targeted concept of the PRO instrument and the concept measured by the anchors To be useful, the anchors chosen should be easier to interpret than the PRO measure itself and should bear an appreciable correlation with it To be useful, the anchors chosen should be easier to interpret than the PRO measure itself and should bear an appreciable correlation with it

16 Example of Responder Definition: Pain Intensity Numerical Rating Scale (PI-NRS) Farrar JT et al. Pain 2001; 94: Farrar JT et al. Pain 2001; 94: point pain scale: 0 = no pain to 10 = worst pain 11-point pain scale: 0 = no pain to 10 = worst pain Baseline score = mean of 7 diary entries prior to drug Baseline score = mean of 7 diary entries prior to drug Endpoint score = mean of last 7 diary entries Endpoint score = mean of last 7 diary entries Interest centers on change score Interest centers on change score Primary endpoint in pregabalin program Primary endpoint in pregabalin program 10 chronic pain studies with 2724 subjects 10 chronic pain studies with 2724 subjects Placebo-controlled trials of pregabalin Placebo-controlled trials of pregabalin Several conditions (e.g., fibromyalgia and osteoarthritis) Several conditions (e.g., fibromyalgia and osteoarthritis)

17 Example of Responder Definition: Pain Intensity Numerical Rating Scale (PI-NRS) Patient Global Impression of Change (anchor) Patient Global Impression of Change (anchor) Clinical improvement of interest Clinical improvement of interest Best change score for distinguishing ‘much improved’ or better on PGIC Best change score for distinguishing ‘much improved’ or better on PGIC Since the start of the study, my overall status is: Since the start of the study, my overall status is: 1.Very much improved 2.Much improved 3.Minimally improved 4.No change 5.Minimally worse 6.Much worse 7.Very much worse

18 Receiver operating characteristic curve Receiver operating characteristic curve Favorable: much or very much improved Favorable: much or very much improved Not favorable: otherwise Not favorable: otherwise ~30% reduction on PI-NRS ~30% reduction on PI-NRS Sensitivity = 78% and specificity = 78% Sensitivity = 78% and specificity = 78% Area under curve = 86% Area under curve = 86% Example of Responder Definition: Pain Intensity Numerical Rating Scale (PI-NRS)

19

20 Types of Anchors Clinical measure Clinical measure a 50% reduction in incontinence episodes might be proposed as the anchor for defining a responder a 50% reduction in incontinence episodes might be proposed as the anchor for defining a responder Clinician-reported outcome Clinician-reported outcome Clinician global rating of change (CGIC) in mental health conditions Clinician global rating of change (CGIC) in mental health conditions Patient global ratings Patient global ratings –Patient global rating of change –Patient global rating of concept

21 21 Cumulative Distribution Function Cumulative Distribution Function

22 Cumulative Distribution Function An alternative or supplement to responder analysis An alternative or supplement to responder analysis Display a continuous plot of the percent change (or absolute change) from baseline on the horizontal axis and the cumulative percent of patients experiencing up to that change on the vertical axis Display a continuous plot of the percent change (or absolute change) from baseline on the horizontal axis and the cumulative percent of patients experiencing up to that change on the vertical axis Such a cumulative distribution of response curve – one for each treatment group – would allow a variety of response thresholds to be examined simultaneously and collectively, encompassing all available data Such a cumulative distribution of response curve – one for each treatment group – would allow a variety of response thresholds to be examined simultaneously and collectively, encompassing all available data

23 Illustrative Cumulative Distribution Function: Experimental Treatment (solid line) better than Control Treatment (dash line) -- Negative changes indicate improvement

24 CDF results that do not demonstrate the comparative efficacy of Drug A or Drug B

25 Better result for demonstrating the efficacy of Drug A over Drug B

26 Aricept ® label from 10/13/2006

27 Cymbalta ® label from 11/19/2009 (x-axis reversed)

28 Part 2: Moving Beyond the FDA Guidance – Extended Approaches Part 2: Moving Beyond the FDA Guidance – Extended Approaches

29 Moving Beyond the 2009 PRO Guidance: Cumulative Proportions and Responder Analysis

30 Cumulative Proportion of Responders Analysis Variation of cumulative distribution function analysis and considers only subjects with improvement scores Variation of cumulative distribution function analysis and considers only subjects with improvement scores Cumulative proportion of patients who achieved a specific response rate (percentage) or better as improvement from baseline Cumulative proportion of patients who achieved a specific response rate (percentage) or better as improvement from baseline Such descriptive (cumulative) response profiles can be numerically enriched using area under the curve Such descriptive (cumulative) response profiles can be numerically enriched using area under the curve

31 Farrar et al. Journal of Pain and Symptom Management 2006; 31: Farrar et al. Journal of Pain and Symptom Management 2006; 31: Randomized, double-blind trial of patients with postherpetic neuralgia treated with pregabalin over an eight-week period Randomized, double-blind trial of patients with postherpetic neuralgia treated with pregabalin over an eight-week period Outcome: 11-point pain intensity rating scale (0 = no pain to 10 = worst possible pain) Outcome: 11-point pain intensity rating scale (0 = no pain to 10 = worst possible pain) Example: Cumulative Proportion of Responders Analysis

32 Example: Cumulative Proportion of Responders Analysis (CPRA) The proportions of patients with at least 30% decreases in mean pain scores were greater with pregabalin than with placebo (63% vs. 25%, P = 0.001)

33 Moving Beyond the 2009 PRO Guidance: Reference-group Interpretation: Variation of Anchor-based Approach

34 Reference-Group Interpretation Compare trial-based values with values from a reference (anchor) group Compare trial-based values with values from a reference (anchor) group Reference values can come from a general population or healthy population Reference values can come from a general population or healthy population

35 Example of Reference-Group Interpretation: Self-Esteem And Relationship (SEAR) Questionnaire Consider 93 men with erectile dysfunction who were measured before and after treatment with sildenafil Add independent study of men with no clinical diagnosis of ED in past year (control sample) Relationship with a partner 94 control subjects received no treatment SEAR assessments completed at a single visit

36 Traditional Statistical Test Significant Not Significant Cell I Clinically Equivalent, Statistically Different Cell II Clinically Equivalent, Not Statistically Different Cell III Not Clinically Equivalent, Statistically Different Significant Not Significant Clinical Equivalency Test Cell IV Not Clinically Equivalent, Not Statistically Different Example of Reference-Group Interpretation: SEAR Questionnaire Clinical and Statistical Significance Dysfunctional population vs. functional population (type of anchor-based method) Classification of tests using statistical significance and clinical equivalence

37 Clinical Significance Adding Control Group Clinical Significance Adding Control Group Confidence intervals were used to determine equivalence (or lack thereof) within a prespecified range Confidence intervals were used to determine equivalence (or lack thereof) within a prespecified range 0.5 SD of domain score in control group 0.5 SD of domain score in control group Rogers et al. Psychological Bulletin 1993; 113: Rogers et al. Psychological Bulletin 1993; 113: Cappelleri et al. Journal of Sexual Medicine 2006; 3: Cappelleri et al. Journal of Sexual Medicine 2006; 3: ED group before treatment vs. control sample ED group before treatment vs. control sample ED group after treatment vs. control sample ED group after treatment vs. control sample Example of Reference-Group Interpretation: SEAR Questionnaire

38 Descriptive Statistics* Sildenafil Trial (n=93) SEAR Component Control Sample (n=94) Pretreatment: Baseline Post-treatment: End of Treatment 1. Sexual Relationship 74 (24)42 (21.7)78 (21) 2. Confidence 82 (22)55 (25.5)81 (21) a) Self-Esteem 84 (23)52 (26.9)81 (22) b) Overall Relationship 80 (25)62 (29.9)80 (24) 3. Overall score 78 (22)48 (21.6)79 (20) *Data are mean (SD) Example of Reference-Group Interpretation: SEAR Questionnaire

39 Example of Reference-Group Interpretation: SEAR Questionnaire Sexual Relationship Satisfaction Control Mean (n=94) minus Pretreatment Mean (n=93): Statistically Different and Not Clinically Equivalent Mean and Confidence Interval Equivalency Interval 40 Control Mean (n=94) minus Posttreatment Mean (n=93): Clinically Equivalent and Not Statistically Different % CI % CI % CI % CI Note: Same conclusion, similar results for other domains

40 Example of Reference-Group Interpretation: Medical Outcomes Study (MOS) Sleep Scale Baseline MOS Sleep Scale scores taken from two double-blind placebo-controlled clinical trials (with pregabalin) for patients with fibromyalgia Baseline MOS Sleep Scale scores taken from two double-blind placebo-controlled clinical trials (with pregabalin) for patients with fibromyalgia Cappelleri et al. Sleep Medicine 2009; 10: Cappelleri et al. Sleep Medicine 2009; 10: These scores were compared using a one-sample Z test with scores (assumed fixed) obtained from a nationally representative sample in the United States These scores were compared using a one-sample Z test with scores (assumed fixed) obtained from a nationally representative sample in the United States Hays et al. Sleep Medicine 2005; 6:41-44 Hays et al. Sleep Medicine 2005; 6:41-44 Patients MOS Sleep Scale scores were statistically (P<0.001) and substantially poorer than general population normative values in the United States Patients MOS Sleep Scale scores were statistically (P<0.001) and substantially poorer than general population normative values in the United States

41 Example of Reference-Group Interpretation: MOS Sleep Scale MOS Sleep ScaleLIFT StudyRELIEF Study United States Normative Values nMean±SD95% CInMean±SD95% CI Sleep Disturbance ± , ± , Snoring ± , ± , Awaken Short of Breath or with Headache ± , ± , Quantity of Sleep (hours)7475.4±1.65.3, ±1.65.5, Optimal Sleep (% with 7 or 8 hours ) ± , ± , Sleep Adequacy ± , ± , Somnolence ± , ± , Sleep Problem Index II ± , ± ,

42 Moving Beyond the 2009 PRO Guidance: Content-based Interpretation to Enhance Interpretation of PROs: Variation of Anchor-based Approach

43 Content-based Interpretation Content-based Interpretation Uses a representative (anchor) item on multi-item PRO, along with its response categories, internal to the measure itself Uses a representative (anchor) item on multi-item PRO, along with its response categories, internal to the measure itself Item response theory Item response theory Logistic models with binary or ordinal outcomes Logistic models with binary or ordinal outcomes Observed proportions Observed proportions

44 Example of Content-based Interpretation: Self-Esteem on SEAR Cappelleri JC, Bell SS, Siegel RL. Interpretation of a self-esteem subscale for erectile dysfunction by cumulative logit model. Drug Information Journal 2007; 41: A non-treatment cross-sectional study with 98 men with erectile dysfunction and 94 controls. The ordinal response item “I had good self-esteem” over the past 4 weeks (1=almost never/never, 2=a few times, 3=sometimes, 4=most times, 5=almost always/always): “Good Self-esteem” was either Category 4 or 5.

45 Example of Content-based Interpretation: Enhanced interpretation of instrument scales using the Rasch model (Thompson et al. Drug Information Journal 2007; 41: )

46 Moving Beyond the 2009 PRO Guidance: Distribution-based Methods to Enhance Interpretation of PROs

47 Distribution-based Methods Distribution-based Methods Adjunct to, not substitute for, anchor-based methods Informs on meaning of change in PROs but not whether change is clinically significant to patients Mean change to standard deviation (SD) Mean change to standard deviation (SD) Signal-to-noise ratio Signal-to-noise ratio Effect size and standardized response mean Effect size and standardized response mean Small, moderate, large effects Small, moderate, large effects Standard error of measurement (reliability-adjusted SD) Standard error of measurement (reliability-adjusted SD) Probability of relative benefit Probability of relative benefit

48 Distribution-based Methods Distribution-based Methods Effect size = magnitude of effect relative to variability Effect size = magnitude of effect relative to variability 0.2 SD, ‘small’; 0.5 SD, ‘medium’; 0.8 SD, ‘large’ 0.2 SD, ‘small’; 0.5 SD, ‘medium’; 0.8 SD, ‘large’ Within group: before vs. after therapy Within group: before vs. after therapy Between groups: treatments A vs. B Between groups: treatments A vs. B Both types: responders vs. non-responders Both types: responders vs. non-responders

49 Distribution-based Methods Within group Within group Effect = average difference score on PRO Effect = average difference score on PRO Variability = baseline standard deviation (SD) Variability = baseline standard deviation (SD) Or variability = SD of individual changes Or variability = SD of individual changes Between groups Between groups Effect = average change between groups at follow-up Effect = average change between groups at follow-up Variability = pooled between-group SD at baseline Variability = pooled between-group SD at baseline Or variability = pooled between-group SD at follow-up Or variability = pooled between-group SD at follow-up Or variability = pooled SD of individual changes Or variability = pooled SD of individual changes

50 For an effect size of 0.42, the score of the average individual in the treated group would have exceeded that of 66.3% of controls [Pr (X < x) = Pr (X < 0.42) = 0.66 from standard normal table] Effect Size Interpretation: Graphical Depiction

51 Example: Effect Size Example: Effect Size Althof et al. Urology 2003; 61: Althof et al. Urology 2003; 61: Cappelleri et al. International Journal of Impotence Research 2004; 16:30-38 Cappelleri et al. International Journal of Impotence Research 2004; 16:30-38 Treatment responsiveness of the SEAR questionnaire in erectile dysfunction Treatment responsiveness of the SEAR questionnaire in erectile dysfunction Sexual relationship satisfaction, confidence Sexual relationship satisfaction, confidence Self-esteem, overall relationship satisfaction, overall Self-esteem, overall relationship satisfaction, overall Each component can range from 0 to 100 (best) Each component can range from 0 to 100 (best) Interest in change scores to gauge magnitude Interest in change scores to gauge magnitude 93 men with ED in a 10-week open-label trial 93 men with ED in a 10-week open-label trial 50-mg sildenafil (adjustable 25 mg or 100 mg) 50-mg sildenafil (adjustable 25 mg or 100 mg)

52 Example: Effect Size Example: Effect Size Effect size for all subjects Effect size for all subjects Effect size = Mean difference score SD at baseline Effect size = Mean difference score SD at baseline

53 Sexual Relationship 42  2278  2136  Confidence 55  2681   Self-esteem 52  2781  2229  Overall Relationship 62  3080  2418  Overall 48  2279   SEAR Baseline End Effect Component Mean ± SD Mean ± SD Difference Size__ Notes: i) Effect sizes of 0.2, 0.50, and 0.80 have been generally regarded, respectively, as “small,” “medium,” and “large” ii) For all scores, P= on the paired data (final – baseline) using a paired t-test iii) Similar results reported in two double-blind placebo controlled trials of sildenafil (Althof et al. J Gen Intern Med 2006; ) Example: Effect Size Example: Effect Size

54 Example: Probability of Relative Benefit Cappelleri et al. BJU International 2008; 101: Cappelleri et al. BJU International 2008; 101: Two 12-week, double-blind, placebo-controlled, flexible- dose sildenafil trials on Self-Esteem and Relationship (SEAR) questionnaire for men with erectile dysfunction Two 12-week, double-blind, placebo-controlled, flexible- dose sildenafil trials on Self-Esteem and Relationship (SEAR) questionnaire for men with erectile dysfunction Difference (sildenafil versus placebo) in SEAR from baseline to week 12 was evaluated with a Wilcoxon rank- sum test using ridit analysis Difference (sildenafil versus placebo) in SEAR from baseline to week 12 was evaluated with a Wilcoxon rank- sum test using ridit analysis

55 Example: Probability of Relative Benefit All p values < Across all items, average probability was 0.67 (standard deviation of 0.04)

56 Moving Beyond the 2009 PRO Guidance: Mediation Analysis Moving Beyond the 2009 PRO Guidance: Mediation Analysis

57 Mediation Models Seeks to identify or confirm the mechanism that underlies an observed relationship between a predictor (X) and an outcome (Y) via the inclusion of an intermediate or mediator variable (M) For example: X = treatment (vs. control), M = pain, Y = sleep disturbance Note: Direct effect = b, indirect effect = a*c, total effect = b + a*c

58 Example: Russell et al. Sleep Medicine 2009; 10: Reduced Sleep Disturbance Direct Effect (73%; p<0.01) Pregabalin 450 mg/day (vs Placebo) Decreased Pain Indirect Effect (27%; p=0.01) Total Effect = 12.7 reduction (improvement)

59 The total effect of an independent variable on a dependent variable can be divided into direct effects and indirect effects through one or more mediator variables A statistical mediation model estimates the relative contributions of direct and indirect effects of an independent variable on a dependent variable Independent variable Dependent variable Direct Indirect effect Mediator variable (all other effects) Mediation Modeling Depiction a b c

60 Let X = independent variable, Y = dependent variable and M = mediator variable Total effect of X on Y is measured by d in the simple regression equation: Y = intercept1 + d * X Consider the two simultaneous regression equations: Y = intercept2 + b * X + c * M M = intercept3 + a * X Complete mediation is the case in which the variable X no longer affects Y so the direct path coefficient b is zero Cross-product of a and c (a*c) refers to the indirect effect of X on Y No mediation occurs when the total effect of X on Y exists entirely through the direct effect, so that b is non-zero and a*c is zero Partial mediation is the case in which the direct path (b) and indirect path (a*c) are both non-zero Mediation Modeling Equations

61 Recent Special Issue on PROs: Statistical Methods in Medical Research (Published online 19 February 2013) Bell M, Fairclough D. Practical and statistical issues in missing data for longitudinal patient reported outcomes Bell M, Fairclough D. Practical and statistical issues in missing data for longitudinal patient reported outcomes Cappelleri JC, Bushmakin AG. Interpretation of patient-reported outcomes Cappelleri JC, Bushmakin AG. Interpretation of patient-reported outcomes Izem R, Kammerman LA, Komo S. Statistical challenges in drug approval trials that use patient-reported outcomes Izem R, Kammerman LA, Komo S. Statistical challenges in drug approval trials that use patient-reported outcomes Julious SA, Walters SJ. Estimating effect sizes for health related quality of life outcomes Julious SA, Walters SJ. Estimating effect sizes for health related quality of life outcomes Massof RW. A general theoretical framework for interpreting patient- reported outcomes estimated from ordinally scaled item responses Massof RW. A general theoretical framework for interpreting patient- reported outcomes estimated from ordinally scaled item responses

62 Some Noteworthy Books on PROs Cappelleri JC, Zou KH, Bushmakin AG, Alvir JMJ, Alemayehu D, Symonds T. Patient-Reported Outcomes: Measurement, Implementation and Interpretation. In press. Boca Raton, Florida: Chapman & Hall/CRC; December Cappelleri JC, Zou KH, Bushmakin AG, Alvir JMJ, Alemayehu D, Symonds T. Patient-Reported Outcomes: Measurement, Implementation and Interpretation. In press. Boca Raton, Florida: Chapman & Hall/CRC; December de Vet HCW, Terwee CB, Mokkink LB, Knol DL Measurement in Medicine: A Practical Guide. New York, NY: Cambridge University Press. de Vet HCW, Terwee CB, Mokkink LB, Knol DL Measurement in Medicine: A Practical Guide. New York, NY: Cambridge University Press. Fayers FM, Machin D. Quality of Life: The Assessment, Analysis and Interpretation of Patient-reported Outcomes. 2nd ed. Chichester, England: John Wiley & Sons Ltd.; Fayers FM, Machin D. Quality of Life: The Assessment, Analysis and Interpretation of Patient-reported Outcomes. 2nd ed. Chichester, England: John Wiley & Sons Ltd.; Fairclough DL. Design and Analysis of Quality of Life Studies in Clinical Trials. 2nd ed. Boca Raton, Florida: Chapman & Hall/CRC; Fairclough DL. Design and Analysis of Quality of Life Studies in Clinical Trials. 2nd ed. Boca Raton, Florida: Chapman & Hall/CRC; Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to Their Development and Use. 4th ed. New York, NY: Oxford University Press; Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to Their Development and Use. 4th ed. New York, NY: Oxford University Press; 2008.

63 Summary: Learning Objectives Part 1: To understand the methods for interpretation of patient-reported outcomes for label and promotional claims Part 1: To understand the methods for interpretation of patient-reported outcomes for label and promotional claims Responder analysis Responder analysis Anchor-based approaches Anchor-based approaches Cumulative distribution function Cumulative distribution function Examples Examples Part 2: To move beyond the 2009 FDA guidance – extended approaches Part 2: To move beyond the 2009 FDA guidance – extended approaches Cumulative proportion and responder analysis Cumulative proportion and responder analysis Reference-group interpretation Reference-group interpretation Content-based interpretation Content-based interpretation Distribution-based methods Distribution-based methods Mediation Models Mediation Models Examples Examples


Download ppt "Interpreting of Patient-Reported Outcomes Joseph C. Cappelleri, PhD, MPH Pfizer Inc ("

Similar presentations


Ads by Google