
1 STATISTICS 542 Introduction to Clinical Trials: SAMPLE SIZE ISSUES. Ref: Lachin, Controlled Clinical Trials 2:93-113, 1981.

2 Sample Size Issues. Fundamental point: the trial must have sufficient statistical power to detect differences of clinical interest. A high proportion of published negative trials do not have adequate power; Freiman et al. (NEJM, 1978) found that 50 of 71 such trials could have missed a 50% benefit.

3 Example: How many subjects? Compare a new treatment (T) with a control (C). Previous data suggest a control failure rate (P_C) of about 40%. The investigator believes treatment can reduce P_C by 25%, i.e. P_T = .30, P_C = .40. N = number of subjects per group?

4 Estimates are only approximate
–Uncertain assumptions
–Overoptimism about treatment
–Healthy screening effect
Need a series of estimates
–Try various assumptions
–Must pick the most reasonable
Be conservative, yet be reasonable.

5 Statistical Considerations
Null Hypothesis (H0): no difference in the response exists between the treatment and control groups
Alternative Hypothesis (HA): a difference of a specified amount (δ) exists between treatment and control
Significance Level (α), the Type I error: the probability of rejecting H0 given that H0 is true
Power = 1 − β (β = Type II error): the probability of rejecting H0 given that H0 is not true

6 Standard Normal Distribution. Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.

7 Standard Normal Table. Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.

8 Distribution of Sample Means (1). Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.

9 Distribution of Sample Means (2). Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.

10 Distribution of Sample Means (3). Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.

11 Distribution of Sample Means (4). Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.

12 Test Statistics

13 Distribution of Test Statistics. Many test statistics have a common form. When testing a population parameter (e.g. a difference in means), let Θ be the sample estimate of the parameter. Then Z = [Θ − E(Θ)] / √V(Θ), and Z has a Normal(0,1) distribution for large sample sizes.

14 If the statistic z is large enough (e.g. it falls into the red area of the scale), we believe the result is too large to have come from a distribution with mean 0 (i.e. P_C − P_T = 0). Thus we reject H0: P_C − P_T = 0, claiming there is less than a 5% chance this result could have come from a distribution with no difference.

15 Normal Distribution. Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.

16 Two Groups. The test statistic can be written with a pooled variance, z = (p̂_C − p̂_T) / √[ p̄(1 − p̄)(1/N_C + 1/N_T) ] with p̄ the pooled event rate, or with separate variances, z = (p̂_C − p̂_T) / √[ p̂_C(1 − p̂_C)/N_C + p̂_T(1 − p̂_T)/N_T ].

17 Test of Hypothesis: Two-sided vs. One-sided
Two-sided: H0: P_T = P_C. Reject H0 if |z| > z_α, where z is the test statistic. With α = .05, z_α = 1.96.
One-sided: H0: P_T ≥ P_C (alternative P_T < P_C). Reject H0 if z > z_α. With α = .05, z_α = 1.645.
Recommendation: use the same critical value z_α in both cases (e.g. 1.96), i.e. two-sided α = .05 or one-sided α = .025.

18 Typical Design Assumptions (1)
1. α = .05, .025, .01
2. Power = .80, .90 (should be at least .80 for design)
3. δ = smallest difference we hope to detect, e.g. δ = P_C − P_T = .40 − .30 = .10, a 25% reduction!

19 Typical Design Assumptions (2). Standard normal constants for a two-sided significance level and for power:
Two-sided α: .05 → Z_α = 1.960; .025 → Z_α = 2.241; .01 → Z_α = 2.576
Power (1 − β): .80 → Z_β = 0.842; .90 → Z_β = 1.282
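
To make these constants concrete, here is a minimal sketch (not part of the original slides) that reproduces them with scipy; the α and power values are the ones listed on the slides.

```python
# Sketch (not from the original slides): reproduce the usual design constants
# Z_alpha (two-sided) and Z_beta with the normal quantile function.
from scipy.stats import norm

for alpha in (0.05, 0.025, 0.01):
    z_alpha = norm.ppf(1 - alpha / 2)          # two-sided critical value
    print(f"alpha = {alpha:<5}  Z_alpha = {z_alpha:.3f}")

for power in (0.80, 0.90):
    z_beta = norm.ppf(power)                   # P{Z < Z_beta} = 1 - beta
    print(f"power = {power:<4}  Z_beta  = {z_beta:.3f}")
```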

20 Sample Size Exercise. "How many do I need?" The next question: what is the question? The sample size depends on the outcome being measured and on the method of analysis to be used.

21 Simple Case - Binomial
1. H0: P_C = P_T; HA: δ = P_C − P_T ≠ 0
2. Test statistic (normal approximation): z = (p̂_C − p̂_T) / √[ p̄(1 − p̄)(1/N_C + 1/N_T) ], with p̄ the pooled event rate
3. Sample size: assume N_T = N_C = N
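
A minimal sketch of step 2, the pooled-variance normal-approximation z statistic; the function name and the example counts (191 and 143 events in 478 patients per group) are hypothetical, chosen only to mirror the 40% vs. 30% running example.

```python
# Sketch (assumption: the pooled-variance normal approximation named on the slide).
import math

def two_prop_z(x_c, n_c, x_t, n_t):
    """Normal-approximation z statistic for H0: P_C = P_T."""
    p_c, p_t = x_c / n_c, x_t / n_t
    p_bar = (x_c + x_t) / (n_c + n_t)          # pooled event rate under H0
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n_c + 1 / n_t))
    return (p_c - p_t) / se

# Hypothetical data: ~40% vs ~30% failures with 478 patients per group
print(round(two_prop_z(191, 478, 143, 478), 2))   # |z| > 1.96 -> reject H0
```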

22 Sample Size Formula (1): Two Proportions, Simple Case
N per group = 2 p̄(1 − p̄)(Z_α + Z_β)² / (P_C − P_T)², where p̄ = (P_C + P_T)/2
Z_α = constant associated with α: P{|Z| > Z_α} = α, two-sided! (e.g. α = .05, Z_α = 1.96)
Z_β = constant associated with 1 − β: P{Z < Z_β} = 1 − β (e.g. 1 − β = .90, Z_β = 1.282)
The same relation can be solved for Z_β (i.e. the power 1 − β) or for δ.

23 Sample Size Formula (2): Two Proportions
N per group = { Z_α √[2 p̄(1 − p̄)] + Z_β √[P_C(1 − P_C) + P_T(1 − P_T)] }² / (P_C − P_T)²
Z_α = constant associated with α: P{|Z| > Z_α} = α, two-sided! (e.g. α = .05, Z_α = 1.96)
Z_β = constant associated with 1 − β: P{Z < Z_β} = 1 − β (e.g. 1 − β = .90, Z_β = 1.282)

24 Sample Size Formula. For power: solve the same relation for Z_β, which gives 1 − β. For the difference detected: solve for δ.

25 Simple Example (1)
H0: P_C = P_T; HA: P_C = .40, P_T = .30, so δ = .40 − .30 = .10
Assume α = .05 (two-sided), so Z_α = 1.96; 1 − β = .90, so Z_β = 1.282
p̄ = (.40 + .30)/2 = .35

26 Simple Example (2). Thus: (a) N = 476 per group, 2N = 952; (b) N = 478 per group, 2N = 956.
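
A short sketch that reproduces both answers from the slide's assumptions. Treating (a) as the separate-variance formula and (b) as the simple pooled formula is an assumption about how the slide labeled them, but both give the quoted numbers.

```python
# Sketch reproducing the slide's example (P_C = .40, P_T = .30, alpha = .05
# two-sided, power = .90). Rounding matches the slide's rounding.
import math

p_c, p_t = 0.40, 0.30
delta = p_c - p_t
p_bar = (p_c + p_t) / 2
z_alpha, z_beta = 1.96, 1.282

# (a) separate-variance form: N = 476 per group
n_a = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
       + z_beta * math.sqrt(p_c * (1 - p_c) + p_t * (1 - p_t))) ** 2 / delta ** 2

# (b) simple pooled form: N = 478 per group
n_b = 2 * p_bar * (1 - p_bar) * (z_alpha + z_beta) ** 2 / delta ** 2

print(round(n_a), round(n_b))   # 476 478 (per group)
```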

27 Approximate* Total Sample Size for Comparing Various Proportions in Two Groups with a Significance Level (α) of 0.05 and Power (1 − β) of 0.80 and 0.90

28 [Table of approximate total sample sizes, as described on the previous slide]

29 Comparison of Means. Some outcome variables are continuous:
–Blood pressure
–Serum chemistry
–Pulmonary function
The hypothesis is tested by comparing mean values between groups, or by comparing mean changes.

30 Comparison of Two Means
H0: μ_C = μ_T, i.e. μ_C − μ_T = 0; HA: μ_C − μ_T = δ
The sample means are approximately normally distributed, so the test statistic z = (x̄_C − x̄_T) / (σ √(2/N)) is ~N(0,1) under H0
Let N = N_C = N_T for design

31 Comparison of Means: Power Calculation. With equal groups of size N, N per group = 2 (Z_α + Z_β)² / (δ/σ)², where δ/σ is the standardized difference.

32 Example: IQ, with σ = 15. HA: δ/σ = 0.3, so δ = 0.3 × 15 = 4.5. Set 2α = .05, β = 0.10, 1 − β = 0.90. Sample size: N = 234 per group, 2N = 468.
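
A minimal sketch of this calculation, assuming the standard two-sample means formula N = 2(Z_α + Z_β)²/(δ/σ)² per group (the slide's own formula image is not in the transcript); it reproduces N = 234.

```python
# Sketch reproducing the slide's IQ example: standardized difference
# delta/sigma = 0.3, two-sided alpha = .05, power = .90.
import math

z_alpha, z_beta = 1.96, 1.282
std_diff = 4.5 / 15          # delta / sigma = 0.3

n_per_group = 2 * (z_alpha + z_beta) ** 2 / std_diff ** 2
print(math.ceil(n_per_group), 2 * math.ceil(n_per_group))   # 234 468
```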

33

34 Comparing Time to Event Distributions. The primary efficacy endpoint is the time to an event; compare the survival distributions for the two groups. The measure of treatment effect is the ratio of the hazard rates in the two groups (for exponential survival this is equivalent to the ratio of the medians, since the median equals ln 2 / λ). Must also consider the length of follow-up.

35 Assuming Exponential Survival Distributions. Then define the effect size by the standardized difference in hazards, i.e. the hazard ratio on the log scale, ln(λ_C / λ_T).

36 Time to Failure (1). Use a parametric model for sample size. A common model is the exponential:
–S(t) = e^(−λt), where λ is the hazard rate
–H0: λ_I = λ_C
–Estimate N: George & Desu (1974)
Assumes all patients are followed to an event (no censoring). Assumes all patients are entered immediately.

37 Assuming Exponential Survival Distributions: Simple Case. The statistical test is powered by the total number of events observed at the time of the analysis, d; in the simple case the required total is d = 4 (Z_α + Z_β)² / [ln(λ_C / λ_T)]² events.
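
A one-line check of that events formula under the same design constants, using the hazard rates from the later example (λ_C = .3, λ_T = .2); the formula shown is the standard one and is an assumption about the missing slide image.

```python
# Sketch: total events needed, d = 4 (Z_alpha + Z_beta)^2 / [ln(hazard ratio)]^2.
# With the later example's hazards this gives ~256 events, i.e. 128 per group
# when every patient is followed to an event.
import math

z_alpha, z_beta = 1.96, 1.282
hazard_c, hazard_t = 0.3, 0.2

d = 4 * (z_alpha + z_beta) ** 2 / math.log(hazard_c / hazard_t) ** 2
print(math.ceil(d))    # 256 total events
```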

38 Converting the Number of Events (d) to the Required Sample Size (2N)
d = 2N × P(event), so 2N = d / P(event)
P(event) is a function of the length of total follow-up at the time of analysis and the average hazard rate. Let
AR = accrual rate (patients per year)
A = period of uniform accrual (2N = AR × A)
F = period of follow-up after accrual is complete
A/2 + F = average total follow-up at the planned analysis
λ = average hazard rate
Then P(event) = 1 − P(no event) ≈ 1 − e^(−λ(A/2 + F)); exactly, under uniform accrual, P(event) = 1 − [e^(−λF) − e^(−λ(A+F))] / (λA).
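
A sketch of the conversion, assuming uniform accrual and a constant hazard as described above; the function name and the numbers in the usage line (256 events, 3 years accrual, 2 years added follow-up, hazard 0.25) are hypothetical.

```python
# Sketch (assumptions: uniform accrual over A years, extra follow-up F years,
# constant hazard lam). Uses the exact uniform-accrual P(event); the slide's
# average-follow-up shortcut would be 1 - exp(-lam * (A/2 + F)).
import math

def sample_size_from_events(d, accrual_years, followup_years, lam):
    A, F = accrual_years, followup_years
    p_no_event = (math.exp(-lam * F) - math.exp(-lam * (A + F))) / (lam * A)
    p_event = 1 - p_no_event
    total_n = d / p_event                      # 2N = d / P(event)
    return round(p_event, 3), math.ceil(total_n)

# Hypothetical numbers: 256 events, 3 years accrual, 2 years added follow-up,
# average hazard 0.25 per year.
print(sample_size_from_events(256, 3.0, 2.0, 0.25))   # (P(event), total 2N)
```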

39 Time to Failure (2). In many clinical trials:
1. Not all patients are followed to an event (i.e. censoring)
2. Patients are recruited over some period of time (i.e. staggered entry)
More general model (Lachin, 1981): N per group = (Z_α + Z_β)² [φ(λ_C) + φ(λ_I)] / (λ_C − λ_I)², where φ(λ) = λ² / g(λ) and g(λ) is defined as follows.

40 Censoring and recruitment patterns, with the corresponding g(λ):
1. Instant recruitment, study censored at time T: g(λ) = 1 − e^(−λT)
2. Continuous recruitment over (0, T), censored at T: g(λ) = 1 − [1 − e^(−λT)] / (λT)
3. Recruitment over (0, T0), study censored at T (T > T0): g(λ) = 1 − [e^(−λ(T − T0)) − e^(−λT)] / (λT0)

41 Example. Assume α = .05 (2-sided) and 1 − β = .90; λ_C = .3 and λ_I = .2; T = 5 years of follow-up; T0 = 3.
0. No censoring, instant recruitment: N = 128
1. Censoring at T, instant recruitment: N = 188
2. Censoring at T, continual recruitment: N = 310
3. Censoring at T, recruitment to T0: N = 233
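
A sketch of the Lachin (1981) calculation, assuming the g(λ) forms given on the previous slide; the no-censoring case uses the George & Desu events formula from slide 36. Run as written, it reproduces the slide's four answers to within rounding.

```python
# Sketch of the Lachin (1981) exponential-model sample size per group,
# N = (Z_a + Z_b)^2 [phi(lam_C) + phi(lam_I)] / (lam_C - lam_I)^2 with
# phi(lam) = lam^2 / g(lam). Case 0 uses the George & Desu (1974) formula.
import math

Z_A, Z_B = 1.96, 1.282
LAM_C, LAM_I = 0.3, 0.2
T, T0 = 5.0, 3.0

def lachin_n(g):
    phi = lambda lam: lam ** 2 / g(lam)
    return (Z_A + Z_B) ** 2 * (phi(LAM_C) + phi(LAM_I)) / (LAM_C - LAM_I) ** 2

# 0. no censoring, instant recruitment (George & Desu): ~128 per group
n0 = 2 * (Z_A + Z_B) ** 2 / math.log(LAM_C / LAM_I) ** 2
cases = {
    "1. instant recruitment, censored at T":
        lambda lam: 1 - math.exp(-lam * T),
    "2. continuous recruitment (0,T), censored at T":
        lambda lam: 1 - (1 - math.exp(-lam * T)) / (lam * T),
    "3. recruitment (0,T0), censored at T":
        lambda lam: 1 - (math.exp(-lam * (T - T0)) - math.exp(-lam * T)) / (lam * T0),
}
print("0. no censoring, instant recruitment", round(n0))   # ~128
for name, g in cases.items():
    print(name, round(lachin_n(g)))                        # ~188, ~310/311, ~233
```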

42 Sample Size Adjustment for Non-Compliance (1)
References:
1. Schork & Remington (1967) Journal of Chronic Diseases
2. Halperin et al. (1968) Journal of Chronic Diseases
3. Wu, Fisher & DeMets (1980) Controlled Clinical Trials
Problem: some patients may not adhere to the treatment protocol.
Impact: dilutes whatever true treatment effect exists.

43 Sample Size Adjustment for Non-Compliance (2)
Fundamental principle: analyze all subjects randomized, the Intent-to-Treat (ITT) principle.
–Noncompliance will dilute the treatment effect
A solution: adjust the sample size to compensate for the dilution effect (reduced power).
Definitions of noncompliance:
–Dropout: a patient in the treatment group stops taking therapy
–Dropin: a patient in the control group starts taking the experimental therapy

44 Comparing Two Proportions
–Assumes event rates will be altered by non-compliance
–Define P_T* = adjusted treatment group rate, P_C* = adjusted control group rate
If P_T < P_C, the adjusted rates lie between the originals: 0 ≤ P_T ≤ P_T* ≤ P_C* ≤ P_C ≤ 1.0

45 Adjusted Sample Size: Simple Model. Compute the unadjusted N, then:
–Assume no dropins
–Assume dropout proportion R
–Thus P_C* = P_C and P_T* = (1 − R) P_T + R P_C
–Then adjust N: N* = N / (1 − R)²
Example: R = .1 gives 1/(1 − R)² = 1.23, a 23% increase; R = .25 gives 1.78, a 78% increase.

46 Sample Size Adjustment for Non-Compliance. With both dropouts and dropins (R_0, R_I), the inflation factor is 1 / (1 − R_0 − R_I)².
Example: R_0 = R_I = .1 gives 1.56, a 56% increase; R_0 = R_I = .25 gives 4.0, i.e. four times the sample size.
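
A tiny sketch of the inflation factors from the last two slides; the function name is hypothetical.

```python
# Sketch of the slide's dilution adjustment: inflate the unadjusted N by
# 1 / (1 - R)^2 for dropouts only, or 1 / (1 - R0 - RI)^2 with dropins too.
def inflation(dropout, dropin=0.0):
    return 1.0 / (1.0 - dropout - dropin) ** 2

print(round(inflation(0.10), 2))          # 1.23  (23% increase)
print(round(inflation(0.25), 2))          # 1.78  (78% increase)
print(round(inflation(0.10, 0.10), 2))    # 1.56  (56% increase)
print(round(inflation(0.25, 0.25), 2))    # 4.0   (4 times the sample size)
```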

47 More Complex Model. Ref: Wu, Fisher, DeMets (1980). Further assumptions:
–Length of follow-up divided into intervals
–Hazard rate may vary
–Dropout rate may vary
–Dropin rate may vary
–Lag in time for treatment to become fully effective
Sample size adjustments are then computed under these assumptions.

48 Example: Beta-Blocker Heart Attack Trial (BHAT) (1). Used the complex model. Assumptions:
1. α = .05 (two-sided), 1 − β = .90
2. 3-year follow-up
3. P_C = .18 (control rate)
4. P_T = .13 (treatment assumed to give a 28% reduction)
5. Dropout 26% (12%, 8%, 6%)
6. Dropin 21% (7%, 7%, 7%)

49 Example: Beta-Blocker Heart Attack Trial (BHAT) (2)
Unadjusted: P_C = .18, P_T = .13 (28% reduction); N = 1100, 2N = 2200
Adjusted: P_C* = .175, P_T* = .14 (20% reduction); N* = 2000, 2N* = 4000

50 Multiple Response Variables. Many trials measure several outcomes (e.g. MILIS, NOTT). Force the investigator to rank them by importance, and do sample size calculations on a few outcomes (2-3). If the estimates agree, fine; if not, a compromise must be sought.

51 "Equivalency" or Non-Inferiority Trials. Compare a new therapy with a standard; wish to show the new therapy is "as good as" the standard. The rationale may be cost, toxicity, or profit.
Examples:
–Intermittent Positive Pressure Breathing Trial: expensive IPPB vs. a cheaper treatment
–Nocturnal Oxygen Therapy Trial (NOTT): 12 hours of oxygen vs. 24 hours
Problem: can't show H0: δ = 0.
A solution: specify a minimum difference δ_min.

52 Sample Size Formula: Two Proportions, Simple Case. Same form as before, with the margin δ_min in place of δ.
Z_α = constant associated with α; Z_β = constant associated with 1 − β.
Solve for Z_β (i.e. 1 − β) or for δ_min.
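
A hedged sketch of a non-inferiority sample size using the simple pooled form from the earlier slides with the margin δ_min in place of δ, under the assumption that both arms share the same event rate p; the example rate and margin are hypothetical, not from the slides.

```python
# Sketch (assumption, not the slide's exact formula): non-inferiority sample
# size per group when both arms are assumed to share event rate p and the
# margin is delta_min. z_alpha follows the earlier slides' recommendation
# to use 1.96 even for a one-sided question.
import math

def ni_sample_size(p, delta_min, z_alpha=1.96, z_beta=1.282):
    return math.ceil(2 * p * (1 - p) * (z_alpha + z_beta) ** 2 / delta_min ** 2)

# Hypothetical margin: accept the new therapy if it is within 5 percentage
# points of a 20% standard-therapy event rate.
print(ni_sample_size(p=0.20, delta_min=0.05))   # per-group N
```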

53 Difference in Events: Test Drug − Standard Drug

54 Mid-Stream Adjustments. Murphy's Law applies to sample size: early results may show that the event rate assumptions were far off and that the power of the study is very inadequate.
Problem:
–Quit?
–Continue toward almost certain doom?
–Adjust the sample size?
–Extend follow-up?
Early decision: best to decide early, and without looking at treatment comparisons.

55 Adaptive Designs. One class allows re-estimating the sample size once the trial is underway:
–Chung et al.
–Chen, Lan & DeMets
These methods have been criticized for allowing bias (e.g. Mehta & Tsiatis), so they are still not widely used.
–The A-HeFT trial is one example of their use.
Will be discussed later in the data monitoring lecture.

56 Event Rate Assumptions. It is challenging to get event rate assumptions correct:
–Inclusion/exclusion criteria effect
–Healthy volunteer effect
–Changing background therapy/standard of care
Rates can shift even when trials are conducted back to back.

57 PRAISE I vs PRAISE II Placebo arms

58 Event Driven Trials. For time-to-event trials, most of the information is in the events, and power is a function of the number of events. The target is really the number of observed events (D), not the total sample size (2N). Thus, target the number of events.

59 Event Driven Trials. The trial can be adjusted or adapted to hit the target number of events if the assumed event rate was too high. The steering committee can:
–Increase the sample size
–Increase follow-up
–A combination of both

60 Examples of Event Driven Trials
–PROMISE (based on the control arm)
–PRAISE I & II
–COPERNICUS
–CARS (based on the control arm)

61 Response Adaptive Designs. The observed treatment effect may differ from (i.e. be less than) the assumed effect:
–Treatment actually less effective
–Compliance worse than assumed
–Background therapy changed
A smaller observed effect may still be of clinical interest if it is real.

62 Response Adaptive Designs. When the observed effect is smaller than assumed, the probability of rejecting H0 is also small:
–Power
–Conditional power
The question is whether to
–quit and start over, or
–make a design modification and continue.

63 Response Adaptive Designs. Stopping and starting over is problematic:
–Waste of financial resources
–Ethical issues of wasting the contributions of patients who have already participated
We probably can't afford a policy of designing all trials for the minimum treatment effect of interest.

64 Response Adaptive Designs. Adjust/increase the sample size if the assumed treatment effect was too large. Traditionally this approach was discouraged, but recent methodology suggests possible approaches.

65 Response Adaptive Designs. These methods are relatively new and still controversial; many leading biostatisticians are very critical (e.g. Fleming, Emerson, Turnbull, Tsiatis). The issues are often more than statistical control of the Type I error, such as introducing other sources of bias.

66 Response Adaptive Designs. Increasing the sample size based on the observed treatment effect may inflate the false positive rate:
–by 30 to 40% (Cui et al.)
–it can double (Proschan et al.)
Inflation of the Type I error of that magnitude is not acceptable.

67 Response Adaptive Designs. Statistical adjustments to control alpha:
–Weighted z-statistic
–Adjustment to the critical value
–Enforcing rules for sample size recalculation

68 Weighted Z Statistic. References:
–Cui, Hung & Wang (1999, Biometrics)
–Fisher (1998, Stat Med)
–Shen & Fisher (1999, Biometrics)
–Tsiatis & Mehta (2003, Biometrika)

69 Weighted Z. Notation:
X_i = N(0,1)-distributed observations
n = current sample size at the interim look
N_0 = initially planned total sample size
δ_a = hypothesized treatment effect
t = n / N_0 (the information fraction)

70 Weighted Z. N = the proposed sample size based on the interim results. Reject H0 if the weighted statistic, which combines the first-stage and second-stage z statistics with the weights √t and √(1 − t) fixed at t = n/N_0, exceeds the usual critical value. Note: less weight is assigned to the new/additional observations than their number alone would give.
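
A minimal sketch of the weighted-Z combination as usually attributed to Cui, Hung & Wang (1999); the exact notation on the slide is not recoverable, so the function and the interim numbers below are illustrative assumptions.

```python
# Sketch (assumption: the Cui-Hung-Wang weighted statistic as usually stated).
# Stage-1 data carry weight sqrt(t) with t = n / N0 fixed at design time,
# no matter how large the second stage becomes, which is why any added
# observations get less weight.
import math

def weighted_z(z1, z2, n_stage1, n0_planned):
    """Combine stage-1 and stage-2 z statistics with pre-fixed weights."""
    t = n_stage1 / n0_planned
    return math.sqrt(t) * z1 + math.sqrt(1 - t) * z2

# Hypothetical interim: 300 of a planned 600 patients, then the trial is
# enlarged; z2 is computed from the post-interim patients only.
z_w = weighted_z(z1=1.2, z2=1.5, n_stage1=300, n0_planned=600)
print(round(z_w, 2), z_w > 1.96)   # reject H0 only if z_w exceeds z_alpha
```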

71 Weighted Z. It is possible to modify the design and increase the sample size based on an interim analysis while controlling the Type I error, but this flexibility has a price.

72 Tsiatis-Mehta Criticism. They argue that a properly designed group sequential trial is more efficient than these adaptive designs. The challenge is to design it "properly" (and that can be a bigger challenge than is often realized).

73 Weighted/Unweighted Modification. Both approaches keep the Type I error below α, with no real loss of power. Ref: Chen, DeMets & Lan.

74 P-Value Method. Reference: Proschan & Hunsberger (1995, Biometrics). The method:
–Requires a "promising" p-value before allowing an increase in sample size
–Requires stopping if the first-stage p-value is not promising
–Requires a larger critical value at the second stage to control the Type I error

75 P-Value Method, one-sided alpha = 0.05. Second-stage critical value Z(2) as a function of the first-stage p-value P(1), regardless of n_2:
P(1): .10   .15   .20   .25   .50
Z(2): 1.77  1.82  1.85  1.875 1.95

76 Proschan & Hunsberger Method. The simple method may make the Type I error substantially less than 0.05. They developed another method to obtain the exact Type I error as a function of Z_1 and n_2, using a conditional power type calculation (details to be discussed later).

77 Proschan & Hunsberger. Conditional power and the p-value required at stage 2 as a function of R = n_2/n_1, for the NHLBI Type II study example.

78 Proschan & Hunsberger. Allows for sample size adjustment based on the observed treatment effect, but requires increasing the final critical value.

79 Adaptive Design Remarks. A need exists for adaptive designs (even FDA statisticians agree), and technical advances have been made through several new methods. Adaptive designs are still not widely accepted and are subject to (strong) criticism; they may be useful for non-pivotal trials. Practice precedes theory here, and perhaps theory will catch up in time.

80 Sample Size Summary. Ethically, the size of the study must be large enough to achieve the stated goals with reasonable probability (power). Sample size estimates are only approximate because of uncertainty in the assumptions. Be conservative but realistic.

81 Demo of Sample Size Program (www.biostat.wisc.edu). The program covers comparison of proportions, means, and time to failure. One can vary the control group rates or responses, alpha and power, and the hypothesized differences. The program produces a sample size table and a power curve for a particular sample size.

82 Sample Size Program Output

83 Union Terrace/Lakefront

