Introduction to Biostatistics, Harvard Extension School © Scott Evans, Ph.D.
Sample Size and Power

Sample Size Considerations
A pharmaceutical company calls and says, “We believe we have found a cure for the common cold. How many patients do we need to study to get our product approved by the FDA?”

Where to begin?
- N = (Total Budget / Cost per patient)? Hopefully not!

Where to begin?
- Understand the research question.
  - Learn about the application and the problem.
  - Learn about the disease and the medicine.
- Crystal ball
  - Visualize the final analysis and the statistical methods to be used.

Where to begin?
- Analysis determines sample size. Sample size calculations are based upon the planned method of analysis.
- If you don’t know how the data will be analyzed (e.g., 2-sample t-test), then you cannot accurately estimate the sample size.

Sample Size Calculation
- Formulate a PRIMARY research question.
- Identify:
  1. A hypothesis to test (write down $H_0$ and $H_A$), or
  2. A quantity to estimate (e.g., using confidence intervals)

Sample Size Calculation
- Determine the endpoint or outcome measure associated with the hypothesis test or quantity to be estimated.
  - How do we “measure” or “quantify” the responses?
  - Is the measure continuous, binary, or a time-to-event?

Sample Size Calculation
- Based upon the PRIMARY outcome
- Other analyses (e.g., of secondary outcomes) may be planned, but the study may not be powered to detect effects for these outcomes.

Sample Size Calculation
- Two strategies:
  - Hypothesis testing
  - Estimation with precision

Sample Size Calculation Using Hypothesis Testing
- By far, the most common approach.
- The idea is to choose a sample size such that both of the following conditions hold simultaneously:
  - If the null hypothesis is true, then the probability of incorrectly rejecting it is (no more than) α.
  - If the alternative hypothesis is true, then the probability of correctly rejecting the null hypothesis is (at least) 1-β = power.

                      Reality
Test result           $H_0$ true           $H_0$ false
Reject $H_0$          Type I error (α)     Power (1-β)
Do not reject $H_0$   1-α                  Type II error (β)

Determinants of Sample Size: Hypothesis Testing Approach
- α
- β
- An “effect size” to detect
- Estimates of variability

What is Needed to Determine the Sample Size?
- α
  - Up to the investigator or FDA regulation (often = 0.05)
  - How much type I (false positive) error can you afford?

What is Needed to Determine the Sample Size?
- 1-β (power)
  - Up to the investigator (often 80%-90%)
  - How much type II (false negative) error can you afford?
  - Not regulated by FDA

Choosing α and β
- Weigh the cost of a Type I error versus a Type II error.
- In early phase clinical trials, we often do not want to “miss” a significant result and thus often consider designing a study for higher power (perhaps 90%) and may consider relaxing the α error (perhaps 0.10).
- In order to approve a new drug, the FDA requires significance in two Phase III trials strictly designed with α error no greater than 0.05 (power = 1-β is often set to 80%).

Effect Size
- The “minimum difference (between groups) that is clinically relevant or meaningful”.
  - Not readily apparent
  - Requires clinical input
  - Often difficult to agree upon
- Note: for noninferiority studies, we identify the “maximum irrelevant or non-meaningful difference”.

Estimates of Variability
- Often obtained from prior studies
- Explore the literature and data from ongoing studies for estimates needed in calculations
- Consider conducting a pilot study to estimate this
- May need to validate this estimate later

Other Considerations
- 1-sample vs. 2-sample
- Independent samples or paired
- 1-sided vs. 2-sided

Example: Cluster Headaches
- An experimental drug is being compared with placebo for the treatment of cluster headaches.
- The design of the study is to randomize an equal number of participants to the new drug and placebo.
- The participants will be administered the drug or matching placebo. One hour later, the participants will score their pain using the visual analog scale (VAS) for pain.
  - The VAS is a continuous measure ranging from 0 (no pain) to 10 (severe pain).

Example: Cluster Headaches
- The planned analysis is a 2-sample t-test (independent groups) comparing the mean VAS score between groups, one hour after drug (or placebo) initiation.
- $H_0: \mu_1 = \mu_2$ vs. $H_A: \mu_1 \neq \mu_2$

Example: Cluster Headaches
- It is desirable to detect differences as small as 2 units (on the VAS).
- Using α=0.05 and power=0.80 (i.e., β=0.20), and an assumed standard deviation (SD) of responses of 4 (in both groups), 63 participants per group (126 total) are required.
- STATA Command: sampsi 0 2, sd(4) a(0.05) p(.80)
- Note: only the difference of 2 between the first two numbers matters, not their individual values.
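
The 63-per-group figure can be checked by hand with the usual normal-approximation formula, $n = 2\,(z_{1-\alpha/2} + z_{1-\beta})^2 \sigma^2 / \delta^2$ per group. The sketch below is only an illustration: the function name `n_per_group_means` is hypothetical, and the slides do not say which formula the STATA command uses internally, although this approximation reproduces the same answer.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison of means.

    Normal-approximation formula (an assumption of this sketch):
    n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2)


n = n_per_group_means(delta=2, sd=4)
print(n, 2 * n)  # 63 126
```

Only `delta` enters the formula, which is why the two means themselves do not matter.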

Example: Part 2
- Let’s say that instead of measuring pain on a continuous scale using the VAS, we simply measured “response” (i.e., the headache is gone) vs. non-response.

Example: Part 2
- The planned analysis is a 2-sample test (independent groups) comparing the proportion of responders, one hour after drug (or placebo) initiation.
- $H_0: p_1 = p_2$ vs. $H_A: p_1 \neq p_2$

Example: Part 2
- It is desirable to detect a difference between response rates of 25% and 50%.
- Using α=0.05 and power=0.80 (i.e., β=0.20):
  - STATA Command: sampsi , a(0.05) p(.80)
  - 66 per group (132 total) with continuity correction
  - 58 per group (116 total) without continuity correction
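
Both figures can be reproduced with the standard large-sample formula for comparing two proportions, followed by the commonly used continuity-correction adjustment. This is a sketch under that assumption; `n_per_group_props` is a hypothetical name, and the slides do not state which formulas the software applies.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_props(p1, p2, alpha=0.05, power=0.80, continuity=True):
    """Per-group n for a two-sided test of two independent proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2          # pooled proportion under H0 (equal groups)
    d = abs(p1 - p2)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / d ** 2
    if continuity:
        # Continuity-correction inflation applied to the uncorrected n
        n = n / 4 * (1 + sqrt(1 + 4 / (n * d))) ** 2
    return ceil(n)


print(n_per_group_props(0.25, 0.50, continuity=False))  # 58
print(n_per_group_props(0.25, 0.50, continuity=True))   # 66
```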

Notes for Testing Proportions
- One does not need to specify a variability since it is determined from the proportion.
- The required sample size for detecting a difference between 0.25 and 0.50 is different from the required sample size for detecting a difference between 0.70 and 0.95 (even though both are 0.25 differences) because the variability is different.
- This is not the case for means.
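
A short illustration of this point, assuming the standard large-sample formula without continuity correction (α=0.05, power=0.80). The helper name `n_uncorrected` is hypothetical. The variance term $p_1(1-p_1) + p_2(1-p_2)$ is smaller for 0.70 vs. 0.95 than for 0.25 vs. 0.50, so fewer participants are needed for the same 0.25 difference.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_uncorrected(p1, p2, alpha=0.05, power=0.80):
    """Per-group n for two proportions, no continuity correction."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
           + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)


for p1, p2 in [(0.25, 0.50), (0.70, 0.95)]:
    variance_term = p1 * (1 - p1) + p2 * (1 - p2)
    print(p1, p2, round(variance_term, 4), n_uncorrected(p1, p2))
```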

Caution for Testing Proportions
- Some software computes the sample size for testing the null hypothesis of the equality of two proportions using a “continuity correction” while others calculate sample size without this correction.
- Answers will differ slightly, although either method is acceptable.
  - STATA uses a continuity correction
  - The website does not

Sample Size Calculation Using Estimation with Precision
- Not nearly as common, but equally valid.
- The idea is to estimate a parameter with enough “precision” to be meaningful.
  - E.g., the width of a confidence interval is narrow enough

Determinants of Sample Size: Estimation Approach
- α
- Estimates of variability
- Precision
  - E.g., the (maximum) desired width of a confidence interval

Example: Evaluating a Diagnostic Examination
- It is desirable to estimate the sensitivity of an examination by trained site nurses relative to an oral medicine specialist for the diagnosis of Oral Candidiasis (OC) in HIV-infected people.
- Precision: It is desirable to estimate the sensitivity such that the width of a 95% confidence interval is 15%.

Example: Evaluating a Diagnostic Examination
- Note: sensitivity is a proportion
- The (large sample) CI for a proportion is: $\hat{p} \pm z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$

Example: Evaluating a Diagnostic Examination
- We wish the width of the CI to be <0.15.
- Using an estimated proportion of 0.25 and α=0.05, we can calculate n=129.
- Since sensitivity is a conditional probability, we need 129 participants who are OC+ as diagnosed by the oral medicine specialist. If the prevalence of OC is ~20%, then we would need to enroll or screen ~129/(0.20)=645.
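
The n=129 figure follows from requiring the full width of the large-sample interval, $2\,z_{1-\alpha/2}\sqrt{p(1-p)/n}$, to be at most $w$, which gives $n \ge 4\,z_{1-\alpha/2}^2\,p(1-p)/w^2$. A minimal sketch, with hypothetical function names:

```python
from math import ceil
from statistics import NormalDist


def n_for_ci_width(p, width, alpha=0.05):
    """Smallest n so that the large-sample CI for a proportion has
    total width no greater than `width`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(4 * z ** 2 * p * (1 - p) / width ** 2)


def n_to_screen(n_positive, prevalence):
    """Number to enroll so that about n_positive have the condition."""
    # The small tolerance guards against floating-point noise in the division.
    return ceil(n_positive / prevalence - 1e-9)


n_pos = n_for_ci_width(p=0.25, width=0.15)
print(n_pos)                     # 129
print(n_to_screen(n_pos, 0.20))  # 645
```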

Sensitivity Analyses
- Sample size calculations require assumptions and estimates.
- It is prudent to investigate how sensitive the sample size estimates are to changes in these assumptions (as they may be inaccurate).
- Thus, provide numbers for a range of scenarios (e.g., for various combinations of α, β, estimates of variance, effect sizes, etc.)

Example: Sample Size Sensitivity Analyses for the Study of Cluster Headaches
Table columns: μ1, μ2, SD, Power=80%, Power=90%
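
A sensitivity table of this kind can be generated with a short loop. The grid of differences and SDs below is an assumption chosen for illustration, not the values from the original slide, and the calculation uses the normal-approximation formula for a two-sided, two-sample comparison of means at α=0.05.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group n, two-sided two-sample comparison of means (normal approx.)."""
    z = NormalDist()
    total_z = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * total_z ** 2 * sd ** 2 / delta ** 2)


# Hypothetical scenario grid (illustrative only).
deltas = [1.5, 2.0, 2.5]   # difference in mean VAS score to detect
sds = [3, 4, 5]            # assumed common standard deviation

rows = []
for delta in deltas:
    for sd in sds:
        n80 = n_per_group_means(delta, sd, power=0.80)
        n90 = n_per_group_means(delta, sd, power=0.90)
        rows.append((delta, sd, n80, n90))

print(f"{'diff':>5} {'SD':>3} {'n/group @80%':>13} {'n/group @90%':>13}")
for delta, sd, n80, n90 in rows:
    print(f"{delta:>5} {sd:>3} {n80:>13} {n90:>13}")
```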

Effects of Determinants
- In general, the following increase the required sample size (with all else being equal):
  - Lower α
  - Lower β
  - Higher variability
  - Smaller effect size to detect
  - More precision required
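
These directions can be checked numerically by changing one input at a time. The sketch below starts from the cluster headache assumptions (difference 2, SD 4, α=0.05, power 0.80); the alternative values in each scenario are illustrative choices, not taken from the slides.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group n, two-sided two-sample comparison of means (normal approx.)."""
    z = NormalDist()
    total_z = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * total_z ** 2 * sd ** 2 / delta ** 2)


baseline = dict(delta=2, sd=4, alpha=0.05, power=0.80)
scenarios = {
    "baseline": {},
    "lower alpha (0.01)": {"alpha": 0.01},
    "lower beta (power 0.90)": {"power": 0.90},
    "higher variability (SD 5)": {"sd": 5},
    "smaller effect (1 unit)": {"delta": 1},
}

# Override one baseline input per scenario and recompute n.
results = {
    name: n_per_group_means(**{**baseline, **change})
    for name, change in scenarios.items()
}
for name, n in results.items():
    print(f"{name:<28} {n}")
```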

Caution
- In general, higher sample size implies higher power.
- Does this mean that a higher sample size is always better?
  - Not necessarily. Studies can be very costly. It is wasteful to power studies to detect between-group differences that are clinically irrelevant.

Sample Size Adjustments
- Complications (e.g., loss-to-follow-up, poor adherence, etc.) during clinical trials can impact study power.
  - This may be less of a factor in lab experiments.
- Expect these complications and plan for them BEFORE the study begins.
- Adjust the sample size estimates to account for these complications.

Complications that Decrease Power
- Missing data
- Poor adherence
- Multiple tests
- Unequal group sizes
- Use of nonparametric testing (vs. parametric)
- Noninferiority or equivalence trials (vs. superiority trials)
- Inadvertent enrollment of ineligible subjects or subjects that cannot respond

Adjustment for Lost-to-Follow-up
- Loss-to-Follow-Up (LFU) refers to when a participant’s endpoint status is not available (missing data).
- If one assumes that the LFU is non-informative or ignorable (i.e., random and not related to treatment), then a simple sample size adjustment can be made.
  - This is a very strong assumption as LFU is often associated with treatment. The assumption is also difficult to validate.
- Researchers need to consider the potential bias of examining only subjects with non-missing data.

Adjustment for Lost-to-Follow-up
- Calculate the sample size $N$.
- Let $x$ = proportion expected to be lost-to-follow-up.
- $N_{adj} = N/(1-x)$
- Note: no LFU adjustment is necessary if you plan to impute missing values. However, if you use imputation, an adjustment for a “dilution effect” may be warranted.
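
As a quick illustration of the adjustment, the sketch below applies it to the cluster headache total of 126 with an assumed 10% loss-to-follow-up. The 10% figure and the function name `adjust_for_lfu` are hypothetical.

```python
from math import ceil


def adjust_for_lfu(n, x):
    """Inflate sample size n for an expected loss-to-follow-up proportion x."""
    if not 0 <= x < 1:
        raise ValueError("x must be in [0, 1)")
    # The small tolerance guards against floating-point noise in the division.
    return ceil(n / (1 - x) - 1e-9)


print(adjust_for_lfu(126, 0.10))  # 140
```

With 140 enrolled and 10% lost, about 126 participants remain with endpoint data.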

Adjustment for Poor Adherence
- Adjustment for the “dilution effect” due to poor adherence or the inclusion (perhaps inadvertently) of subjects that cannot respond:
  - Calculate the sample size $N$.
  - Let $x$ = proportion expected to be non-adherent.
  - $N_{adj} = N/(1-x)^2$

Inflation Factor for Non-adherence
Table columns: Proportion non-adherent, Inflation factor
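
The inflation factor implied by the non-adherence adjustment is $1/(1-x)^2$. The sketch below tabulates it for a few proportions; the particular values of $x$ are assumptions chosen for illustration, not the rows of the original slide.

```python
def inflation_factor(x):
    """Sample size inflation factor for a non-adherent proportion x."""
    if not 0 <= x < 1:
        raise ValueError("x must be in [0, 1)")
    return 1 / (1 - x) ** 2


print("non-adherent  inflation")
for x in (0.05, 0.10, 0.20, 0.30, 0.50):
    print(f"{x:>12.2f}  {inflation_factor(x):>9.2f}")
```

The factor grows quickly: at 50% non-adherence the required sample size quadruples.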

Adjustment for Unequal Allocation
- When comparing groups, power is maximized when group sizes are equal (with all else being equal).
- There may be other reasons, however, to have some group sizes larger than others.
  - E.g., having more people on an experimental therapy (rather than placebo) to obtain more safety information about the product

Adjustment for Unequal Allocation
- Adjustment for unequal allocation in two groups:
  - Let $Q_E$ and $Q_C$ be the sample fractions such that $Q_E + Q_C = 1$.
  - Note power is optimized when $Q_E = Q_C = 0.5$.
  - Calculate sample size $N_{bal}$ for equal sample sizes (i.e., $Q_E = Q_C = 0.5$).
  - $N_{unbal} = N_{bal}\,(Q_E^{-1} + Q_C^{-1})/4$
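
A small worked example of this adjustment, assuming a hypothetical 2:1 allocation (experimental:control) applied to the balanced total of 126 from the cluster headache example. The function names are illustrative.

```python
from math import ceil


def allocation_factor(q_e):
    """Inflation factor (1/Q_E + 1/Q_C) / 4 for experimental fraction q_e."""
    if not 0 < q_e < 1:
        raise ValueError("q_e must be in (0, 1)")
    q_c = 1 - q_e
    return (1 / q_e + 1 / q_c) / 4


def n_unbalanced(n_balanced, q_e):
    """Total sample size under unequal allocation."""
    # The small tolerance guards against floating-point noise.
    return ceil(n_balanced * allocation_factor(q_e) - 1e-9)


print(allocation_factor(0.5))    # 1.0, no inflation for equal groups
print(n_unbalanced(126, 2 / 3))  # 142 (factor 1.125)
```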

Adjustment for Nonparametric Testing
- Most sample-size calculations are performed expecting use of parametric methods (e.g., t-test).
  - This is often done because formulas (and software) for these methods are readily available.
- However, parametric assumptions (e.g., normality) do not always hold.
- Thus nonparametric methods may be required.

Adjustment for Nonparametric Testing
- Pitman Efficiency
  - Applicable for 1- and 2-sample t-tests
- Method
  - Calculate sample size $N_{par}$.
  - $N_{nonpar} = N_{par}/0.864$

Example: Cluster Headaches
- Recall the cluster headache example in which the required sample size was 126 (total) for detecting a 2-unit (VAS) difference in means.
- If we expect 10% of the participants to be non-adherent, then an appropriate inflation is needed:
  - $126/(1-0.1)^2 = 156$
- If we further expect that we will have to perform a nonparametric test (instead of a t-test) due to non-normality, then further inflation is required:
  - $156/0.864 = 181$
  - Round to 182 to have an equal number (91) in each group
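
The chain of adjustments in this example can be written out step by step. This sketch rounds up after each step, as the slide does; the variable names are illustrative.

```python
from math import ceil

n_total = 126            # unadjusted total from the t-test calculation
non_adherent = 0.10      # expected proportion non-adherent
pitman_efficiency = 0.864

# Step 1: inflate for the dilution effect of non-adherence.
n_adherence = ceil(n_total / (1 - non_adherent) ** 2)

# Step 2: inflate for use of a nonparametric test.
n_nonpar = ceil(n_adherence / pitman_efficiency)

# Step 3: round up to an even total so both groups are equal in size.
n_final = n_nonpar + (n_nonpar % 2)
per_group = n_final // 2

print(n_adherence, n_nonpar, n_final, per_group)  # 156 181 182 91
```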

Adjustment: Noninferiority/Equivalence Studies
- Calculate sample size for a standard superiority trial but reverse the roles of α and β.
- Works for large sample binary and continuous data.
- Does not work for time-to-event data.

More Adjustments?
- Adjustments are needed if:
  - You plan interim analyses
    - Group sequential designs
  - You have more than one primary test to be conducted
    - Multiple comparison adjustments
    - E.g., Bonferroni (if 2 tests or comparisons are to be made, then power each at α/2).
- Additional adjustments may be needed for stratification, blocking, or matching.
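
To see what a Bonferroni adjustment costs, the sketch below recomputes the cluster headache sample size with each of two tests powered at α/2 = 0.025. It assumes the normal-approximation formula for a two-sided, two-sample comparison of means; applying it to this example is an illustration, not something the slides do.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group n, two-sided two-sample comparison of means (normal approx.)."""
    z = NormalDist()
    total_z = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * total_z ** 2 * sd ** 2 / delta ** 2)


n_tests = 2  # number of primary comparisons (hypothetical)
n_single = n_per_group_means(delta=2, sd=4, alpha=0.05)
n_bonferroni = n_per_group_means(delta=2, sd=4, alpha=0.05 / n_tests)

print(n_single, n_bonferroni)  # 63 77
```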

Sample Size Re-estimation
- Hot topic in clinical trials
- Re-estimating sample size based on interim data
- Complicated
- Must be done carefully to maintain scientific integrity and blinding.