Clinical Significance for Quality of Life Endpoints in Clinical Trials FDA/Industry Statistics Workshop Washington, September 16, 2005 FDA/Industry Statistics.

Clinical Significance for Quality of Life Endpoints in Clinical Trials FDA/Industry Statistics Workshop Washington, September 16, 2005 FDA/Industry Statistics Workshop Washington, September 16, 2005 Jeff A. Sloan, Ph.D. Mayo Clinic, Rochester, MN, USA

Primary goal: advance the state of the science to help cancer patient QOL soar

Take home message: there is good news There are problems with using QOL assessments as indicators of efficacy in clinical trials. There are problems with using QOL assessments as indicators of efficacy in clinical trials. There are scientifically sound solutions to these problems. The problems have been disseminated widely and consistently. The solutions have not. There are scientifically sound solutions to these problems. The problems have been disseminated widely and consistently. The solutions have not. There are problems with using QOL assessments as indicators of efficacy in clinical trials. There are problems with using QOL assessments as indicators of efficacy in clinical trials. There are scientifically sound solutions to these problems. The problems have been disseminated widely and consistently. The solutions have not. There are scientifically sound solutions to these problems. The problems have been disseminated widely and consistently. The solutions have not.

It takes a certain amount of bravery to work in QOL research

Science is a candle in the dark - Carl Sagan - Carl Sagan We will use the candle of science to improve the QOL of cancer patients

How do you determine the clinical significance the clinical significance of QOL assessments? of QOL assessments?

What is a clinically meaningful QOL burden?

Why is it difficult to define clinical significance for QOL? Pain analogy 25 years ago physicians were the sole raters of patient pain JCAHO 2000 guideline: every patients pain to be assessed upon intake on a 0-10 scale Time and experience alleviates novelty and skepticism, and guidelines evolve Pain analogy 25 years ago physicians were the sole raters of patient pain JCAHO 2000 guideline: every patients pain to be assessed upon intake on a 0-10 scale Time and experience alleviates novelty and skepticism, and guidelines evolve

Why is it difficult to define clinical significance for QOL? Blood pressure analogy 100 years ago, clinical significance of BP scores was unknown (Lancet 1899) massage therapy was the gold standard present guidelines for BP clinical significance today redrawn (McCrory DC. Lewis SZ. Chest. 126(1 Suppl):11S- 13S, 2004) Blood pressure analogy 100 years ago, clinical significance of BP scores was unknown (Lancet 1899) massage therapy was the gold standard present guidelines for BP clinical significance today redrawn (McCrory DC. Lewis SZ. Chest. 126(1 Suppl):11S- 13S, 2004)

The solution found for tumor response cutoffs may provide guidance We call a reduction of 50% a response. We call a reduction of 50% a response. Have reductions of 49% all the time, but do not worry about misclassification. Have reductions of 49% all the time, but do not worry about misclassification. Moertel (1976) basis for 50% cutoff Moertel (1976) basis for 50% cutoff Find a cutoff and stick to it? (RECIST) Find a cutoff and stick to it? (RECIST) We call a reduction of 50% a response. We call a reduction of 50% a response. Have reductions of 49% all the time, but do not worry about misclassification. Have reductions of 49% all the time, but do not worry about misclassification. Moertel (1976) basis for 50% cutoff Moertel (1976) basis for 50% cutoff Find a cutoff and stick to it? (RECIST) Find a cutoff and stick to it? (RECIST)

What Clinical significance is NOT Statistical significance Statistical significance Example drawn from JCO 2001 (anonymous) Example drawn from JCO 2001 (anonymous) HSQ before / after scores on 1300 patients HSQ before / after scores on 1300 patients all p-values <0.0001 all p-values <0.0001 conclusion: all domains of QOL were significantly different across treatment groups conclusion: all domains of QOL were significantly different across treatment groups problem: 1300 patients provides 80% power to detect a change of 1 unit on 0-100 point scale problem: 1300 patients provides 80% power to detect a change of 1 unit on 0-100 point scale Statistical significance Statistical significance Example drawn from JCO 2001 (anonymous) Example drawn from JCO 2001 (anonymous) HSQ before / after scores on 1300 patients HSQ before / after scores on 1300 patients all p-values <0.0001 all p-values <0.0001 conclusion: all domains of QOL were significantly different across treatment groups conclusion: all domains of QOL were significantly different across treatment groups problem: 1300 patients provides 80% power to detect a change of 1 unit on 0-100 point scale problem: 1300 patients provides 80% power to detect a change of 1 unit on 0-100 point scale

EORTC QLQ-LC13 Itemn=537n=346Effect Size Itemn=537n=346Effect Size Coughing46.244.3small Coughing46.244.3small Dyspnea17.216.2small Dyspnea17.216.2small Pain26.925.5small Pain26.925.5small all p-values were statistically significant all p-values were statistically significant Itemn=537n=346Effect Size Itemn=537n=346Effect Size Coughing46.244.3small Coughing46.244.3small Dyspnea17.216.2small Dyspnea17.216.2small Pain26.925.5small Pain26.925.5small all p-values were statistically significant all p-values were statistically significant

The Six Papers 1) Methods used to date 2) Group versus individual differences 3) Single item versus multi-item 4) Patient, clinician, population perspectives 5) Changes over time 6) Practical considerations for specific audiences MCP, April, May, June 2002 MCP, April, May, June 2002 1) Methods used to date 2) Group versus individual differences 3) Single item versus multi-item 4) Patient, clinician, population perspectives 5) Changes over time 6) Practical considerations for specific audiences MCP, April, May, June 2002 MCP, April, May, June 2002

No single statistical decision rule or procedure can take the place of well- reasoned consideration of all aspects of the data by a group of concerned, competent, and experienced persons with a wide range of scientific backgrounds and points of view. Canner (1981) If it looks like a duck, sounds like a duck, and walks like a duck, the odds of it being a worm or an elephant in a clever disguise are small in the extreme. Sloan (2001)

Bottom Line Assessing the clinical significance of QOL can be as simple as a 10-point change on a 100-point scale, if that is consistent with the goals of the scientific enquiry. The real issue underlying the controversy over QOL is the relative novelty and lack of experience that presently exists with QOL. With time and familiarity this too shall pass. (Sloan, J Chronic Obs. Pul. Dis. 2: 57-62, 2005.) Assessing the clinical significance of QOL can be as simple as a 10-point change on a 100-point scale, if that is consistent with the goals of the scientific enquiry. The real issue underlying the controversy over QOL is the relative novelty and lack of experience that presently exists with QOL. With time and familiarity this too shall pass. (Sloan, J Chronic Obs. Pul. Dis. 2: 57-62, 2005.)

Presenting global solutions is always interesting me you

Two general methods for clinical significance Anchor-based methods requirements Anchor-based methods requirements independent interpretable measure (the anchor) which has appreciable correlation between anchor and target independent interpretable measure (the anchor) which has appreciable correlation between anchor and target Distribution-based methods Distribution-based methods rely on expression of magnitude of effect in terms of measure of variability of results (effect size) rely on expression of magnitude of effect in terms of measure of variability of results (effect size) Anchor-based methods requirements Anchor-based methods requirements independent interpretable measure (the anchor) which has appreciable correlation between anchor and target independent interpretable measure (the anchor) which has appreciable correlation between anchor and target Distribution-based methods Distribution-based methods rely on expression of magnitude of effect in terms of measure of variability of results (effect size) rely on expression of magnitude of effect in terms of measure of variability of results (effect size)

The MID method in one slide

The Empirical Rule Effect Size (ERES) Approach (Sloan et al, Cancer Integrative Medicine 1(1):41-47, 2003) QOL tool range = 6 standard Deviations QOL tool range = 6 standard Deviations SD Estimate = 100 percent / 6 SD Estimate = 100 percent / 6 = 16.7% of theoretical range = 16.7% of theoretical range Two-sample t-test effect sizes (J Cohen, 1988) : Two-sample t-test effect sizes (J Cohen, 1988) : small, moderate, large effect (0.2, 0.5, 0.8 SD shift) S,M,L effects = 3%, 8%, 12% of range S,M,L effects = 3%, 8%, 12% of range QOL tool range = 6 standard Deviations QOL tool range = 6 standard Deviations SD Estimate = 100 percent / 6 SD Estimate = 100 percent / 6 = 16.7% of theoretical range = 16.7% of theoretical range Two-sample t-test effect sizes (J Cohen, 1988) : Two-sample t-test effect sizes (J Cohen, 1988) : small, moderate, large effect (0.2, 0.5, 0.8 SD shift) S,M,L effects = 3%, 8%, 12% of range S,M,L effects = 3%, 8%, 12% of range

The Empirical Rule Tchebyshevs Theorem: at least 1-1/k 2 of any distribution will fall within k standard deviations (SDs) of the mean Tchebyshevs Theorem: at least 1-1/k 2 of any distribution will fall within k standard deviations (SDs) of the mean If the distribution is symmetric, 99% will fall within 3 standard deviations If the distribution is symmetric, 99% will fall within 3 standard deviations The pdf for the range is a function of the SD The pdf for the range is a function of the SD an estimate of the SD can be obtained via an estimate of the SD can be obtained via range = 6 SD range = 6 SD Tchebyshevs Theorem: at least 1-1/k 2 of any distribution will fall within k standard deviations (SDs) of the mean Tchebyshevs Theorem: at least 1-1/k 2 of any distribution will fall within k standard deviations (SDs) of the mean If the distribution is symmetric, 99% will fall within 3 standard deviations If the distribution is symmetric, 99% will fall within 3 standard deviations The pdf for the range is a function of the SD The pdf for the range is a function of the SD an estimate of the SD can be obtained via an estimate of the SD can be obtained via range = 6 SD range = 6 SD

Assumption Checking for ERES (Dueck, Sloan, 2006, J. Biopharm. Stats, in press) Tchebyshevs Inequality is conservative Tchebyshevs Inequality is conservative Tested the effect of various distributional assumptions Tested the effect of various distributional assumptions Only a uniform distribution results in deviation from the assumption of a 6 SD- based estimate (28% instead of 17%) Only a uniform distribution results in deviation from the assumption of a 6 SD- based estimate (28% instead of 17%) Tchebyshevs Inequality is conservative Tchebyshevs Inequality is conservative Tested the effect of various distributional assumptions Tested the effect of various distributional assumptions Only a uniform distribution results in deviation from the assumption of a 6 SD- based estimate (28% instead of 17%) Only a uniform distribution results in deviation from the assumption of a 6 SD- based estimate (28% instead of 17%)

All Methods Give Similar Answers Cohen - 1/2 SD is moderate effect Cohen - 1/2 SD is moderate effect MCID - 1/2 point on 7-point Likert MCID - 1/2 point on 7-point Likert 7-1 = 6 point range ==> SD of 1 unit 7-1 = 6 point range ==> SD of 1 unit so 1/2 point ==:> 1/2 SD so 1/2 point ==:> 1/2 SD Cella - 10 point on FACT-G Cella - 10 point on FACT-G 10/1.12 = 8.9% / 16.7% = 1/2 SD 10/1.12 = 8.9% / 16.7% = 1/2 SD Feinstein - correlation approach Feinstein - correlation approach Cohen was arbitrary, should be 0.6 SD Cohen was arbitrary, should be 0.6 SD Cohen - 1/2 SD is moderate effect Cohen - 1/2 SD is moderate effect MCID - 1/2 point on 7-point Likert MCID - 1/2 point on 7-point Likert 7-1 = 6 point range ==> SD of 1 unit 7-1 = 6 point range ==> SD of 1 unit so 1/2 point ==:> 1/2 SD so 1/2 point ==:> 1/2 SD Cella - 10 point on FACT-G Cella - 10 point on FACT-G 10/1.12 = 8.9% / 16.7% = 1/2 SD 10/1.12 = 8.9% / 16.7% = 1/2 SD Feinstein - correlation approach Feinstein - correlation approach Cohen was arbitrary, should be 0.6 SD Cohen was arbitrary, should be 0.6 SD

The Good News Statistical, Philosophical, Empirical, Clinical, Historical, Practical approaches to defining a clinically significant effect for symptom assessments are all in the same ballpark Statistical, Philosophical, Empirical, Clinical, Historical, Practical approaches to defining a clinically significant effect for symptom assessments are all in the same ballpark A 10 point difference on a 100-point scale (1/2 SD) is almost always going to be clinically significant A 10 point difference on a 100-point scale (1/2 SD) is almost always going to be clinically significant Smaller differences may also be meaningful (data) Smaller differences may also be meaningful (data) Applies to groups or individuals (just different SD) Applies to groups or individuals (just different SD) Norman GR, Sloan JA, Wyrwich KW. Expert Review of Pharmacoeconomics and Outcomes Research Sept 2004; 4(5): 515 – 519 Sloan JA, Cella D, Hays R. J Clin Epidemiol (in press). Statistical, Philosophical, Empirical, Clinical, Historical, Practical approaches to defining a clinically significant effect for symptom assessments are all in the same ballpark Statistical, Philosophical, Empirical, Clinical, Historical, Practical approaches to defining a clinically significant effect for symptom assessments are all in the same ballpark A 10 point difference on a 100-point scale (1/2 SD) is almost always going to be clinically significant A 10 point difference on a 100-point scale (1/2 SD) is almost always going to be clinically significant Smaller differences may also be meaningful (data) Smaller differences may also be meaningful (data) Applies to groups or individuals (just different SD) Applies to groups or individuals (just different SD) Norman GR, Sloan JA, Wyrwich KW. Expert Review of Pharmacoeconomics and Outcomes Research Sept 2004; 4(5): 515 – 519 Sloan JA, Cella D, Hays R. J Clin Epidemiol (in press).

Four Guidelines (Sloan, Cella, Hays, JCE 2005,in press) The method used to obtain an estimate of clinical significance should be scientifically supportable. The method used to obtain an estimate of clinical significance should be scientifically supportable. The ½ SD is a conservative estimate of an effect size that is likely to be clinically meaningful. An effect size greater than ½ SD is not likely to be one that can be ignored. In the absence of other information, the ½ SD is a reasonable and scientifically supportable estimate of a meaningful effect. The ½ SD is a conservative estimate of an effect size that is likely to be clinically meaningful. An effect size greater than ½ SD is not likely to be one that can be ignored. In the absence of other information, the ½ SD is a reasonable and scientifically supportable estimate of a meaningful effect. The method used to obtain an estimate of clinical significance should be scientifically supportable. The method used to obtain an estimate of clinical significance should be scientifically supportable. The ½ SD is a conservative estimate of an effect size that is likely to be clinically meaningful. An effect size greater than ½ SD is not likely to be one that can be ignored. In the absence of other information, the ½ SD is a reasonable and scientifically supportable estimate of a meaningful effect. The ½ SD is a conservative estimate of an effect size that is likely to be clinically meaningful. An effect size greater than ½ SD is not likely to be one that can be ignored. In the absence of other information, the ½ SD is a reasonable and scientifically supportable estimate of a meaningful effect.

Four Guidelines (Sloan, Cella, Hays, JCE 2005,in press) Effect sizes below ½ SD, supported by data regarding the specific characteristics of a particular QOL assessment or application, may also be meaningful. The minimally important difference may be below ½ SD in such cases. Effect sizes below ½ SD, supported by data regarding the specific characteristics of a particular QOL assessment or application, may also be meaningful. The minimally important difference may be below ½ SD in such cases. If feasible, multiple approaches to estimating a tools clinically meaningful effect size in multiple patient groups are helpful in assessing the variability of the estimates. However, the lack of multiple approaches with multiple groups should not preemptively restrict application of information gained to date. If feasible, multiple approaches to estimating a tools clinically meaningful effect size in multiple patient groups are helpful in assessing the variability of the estimates. However, the lack of multiple approaches with multiple groups should not preemptively restrict application of information gained to date. Effect sizes below ½ SD, supported by data regarding the specific characteristics of a particular QOL assessment or application, may also be meaningful. The minimally important difference may be below ½ SD in such cases. Effect sizes below ½ SD, supported by data regarding the specific characteristics of a particular QOL assessment or application, may also be meaningful. The minimally important difference may be below ½ SD in such cases. If feasible, multiple approaches to estimating a tools clinically meaningful effect size in multiple patient groups are helpful in assessing the variability of the estimates. However, the lack of multiple approaches with multiple groups should not preemptively restrict application of information gained to date. If feasible, multiple approaches to estimating a tools clinically meaningful effect size in multiple patient groups are helpful in assessing the variability of the estimates. However, the lack of multiple approaches with multiple groups should not preemptively restrict application of information gained to date.

SummarySummary Defining clinical significance for QOL assessments is today where pain was 25 years ago, tumor response was 50 years ago and blood pressure was 100 years ago Defining clinical significance for QOL assessments is today where pain was 25 years ago, tumor response was 50 years ago and blood pressure was 100 years ago Define clinical significance a priori, and use the definition in the analytical process Define clinical significance a priori, and use the definition in the analytical process Consensus is building as the answers from different approaches are similar and relatively robust Consensus is building as the answers from different approaches are similar and relatively robust Defining clinical significance for QOL assessments is today where pain was 25 years ago, tumor response was 50 years ago and blood pressure was 100 years ago Defining clinical significance for QOL assessments is today where pain was 25 years ago, tumor response was 50 years ago and blood pressure was 100 years ago Define clinical significance a priori, and use the definition in the analytical process Define clinical significance a priori, and use the definition in the analytical process Consensus is building as the answers from different approaches are similar and relatively robust Consensus is building as the answers from different approaches are similar and relatively robust

New ideas have enabled us to make advances in QOL science

A Mayo/FDA meeting regarding guidance on patient-reported outcomes (PRO) Discussion, Education, and Operationalization FDA to release guidances for assessing PROs in all clinical trials (3rd quarter 2005?) Meeting co-sponsored with FDA to: provide a focused process to facilitate discussion among all stakeholders educate stakeholders on background, content, and concerns provide an opportunity for input delineate ways to best operationalize the guidance into clinical trials February 23-25, 2006, DC (Westfields Marriott, Chantilly, VA, 7 miles from Dulles) FDA to release guidances for assessing PROs in all clinical trials (3rd quarter 2005?) Meeting co-sponsored with FDA to: provide a focused process to facilitate discussion among all stakeholders educate stakeholders on background, content, and concerns provide an opportunity for input delineate ways to best operationalize the guidance into clinical trials February 23-25, 2006, DC (Westfields Marriott, Chantilly, VA, 7 miles from Dulles)

The NCCTG QOL Team

Email: jsloan@mayo.edu Thank you

Clinical Significance for Quality of Life Endpoints in Clinical Trials FDA/Industry Statistics Workshop Washington, September 16, 2005 FDA/Industry Statistics.

Similar presentations

Presentation on theme: "Clinical Significance for Quality of Life Endpoints in Clinical Trials FDA/Industry Statistics Workshop Washington, September 16, 2005 FDA/Industry Statistics."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Clinical Significance for Quality of Life Endpoints in Clinical Trials FDA/Industry Statistics Workshop Washington, September 16, 2005 FDA/Industry Statistics.

Similar presentations

Presentation on theme: "Clinical Significance for Quality of Life Endpoints in Clinical Trials FDA/Industry Statistics Workshop Washington, September 16, 2005 FDA/Industry Statistics."— Presentation transcript:

Similar presentations

About project

Feedback