Presentation on theme: "Sample Size And Power I Jean B. Nachega, MD, PhD Department of Medicine & Centre for Infectious Diseases Stellenbosch University"— Presentation transcript:
Sample Size And Power I Jean B. Nachega, MD, PhD Department of Medicine & Centre for Infectious Diseases Stellenbosch University firstname.lastname@example.org
Sample Size And Power Warren Browner and Stephen Hulley The ingredients for sample size planning, and how to design them An example, with strategies for minimizing sample size
Sampling and Inference A sample is designed to represent a larger population Therefore, findings in the sample allow inferences about events in the population Problem: what if the inferences are wrong? Finding something in the sample that isn’t “real” in the population Missing something that is “real”
Inference Use a random sample to learn something about a larger population
SUMMARY OF HOW RESEARCH WORKS (Hulley S, Cummings S, Newman T et al., 2001) [Diagram] Research question (target population, phenomena of interest) = TRUTH IN THE UNIVERSE. Design leads to the study plan (intended sample, intended variables) = TRUTH IN THE STUDY. Implementation leads to the actual study (actual subjects, actual measurements) = FINDINGS IN THE STUDY. Inference runs back from the findings to the truth in the study and then to the truth in the universe; random and systematic error enter at each step.
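Parameter vs. Statistic Parameter (from entire population): mean $\mu$, standard deviation $\sigma$, proportion $\pi$. Statistic (from sample; estimates the parameter): mean $\bar{X}$, standard deviation $s$, proportion $p$.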
How Do We Generalize? [Diagram: a sample is drawn from the population, and findings in the sample are generalized back to the population.]
Inference Two ways to make inference: Estimation of parameters * Point estimation ($\bar{X}$ or $p$) * Interval estimation Hypothesis testing
Preventing Wrong Inferences Difficult when caused by systematic error (bias) Easy when caused by Random error (chance) Solution: increase sample size Problem: cost, feasibility Goldilocks solution: a sample size that is big enough but not too big
Population: the mean, $\mu$, is unknown. Sample: mean $\bar{X} = 50$ (point estimate). Interval estimate: I am 95% confident that $\mu$ is between 40 and 60.
Standard Error Quantitative variable: $SE(\bar{X}) = s / \sqrt{n}$ Qualitative variable: $SE(p) = \sqrt{p(1-p)/n}$
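Confidence Interval (mean) [Figure: sampling distribution on the Z-axis with limits at $\bar{X} - 1.96\,SE$ and $\bar{X} + 1.96\,SE$; the central area $1 - \alpha$ covers 95% of samples, with $\alpha/2$ in each tail.]
Confidence Interval (proportion) [Figure: sampling distribution on the Z-axis with limits at $p - 1.96\,SE$ and $p + 1.96\,SE$; the central area $1 - \alpha$ covers 95% of samples, with $\alpha/2$ in each tail.]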
Interpretation of CI Probabilistic: In repeated sampling, $100(1-\alpha)\%$ of all intervals around sample means will in the long run include $\mu$. Practical: We are $100(1-\alpha)\%$ confident that the single computed CI contains $\mu$.
Example (Sample size ≥ 30) An epidemiologist studied the blood glucose level of a random sample of 100 patients. The mean was 170, with a SD of 10. $SE = 10/\sqrt{100} = 10/10 = 1$ Then the 95% CI is $\bar{X} \pm Z \cdot SE = 170 \pm 1.96 \times 1$, so $168.04 \le \mu \le 171.96$.
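The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `mean_confidence_interval` is made up for illustration, and it assumes the large-sample (z-based) interval used on the slide.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, confidence=0.95):
    """Large-sample (z-based) confidence interval for a mean."""
    se = sd / math.sqrt(n)  # SE(mean) = s / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return mean - z * se, mean + z * se


# Blood glucose example from the slide: n = 100, mean = 170, SD = 10
low, high = mean_confidence_interval(mean=170, sd=10, n=100)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

For samples smaller than about 30, a t-based critical value would normally replace the z value.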
Study design [Flowchart] Did the investigator assign exposures? Yes: experimental study. Random allocation? Yes: randomised controlled trial; No: non-randomised controlled trial. No: observational study. Comparison group? No: descriptive study; Yes: analytical study. Direction? Exposure -> outcome: cohort study; Exposure <- outcome: case-control study; Exposure and outcome at the same time: cross-sectional study.
Ingredients For Planning Sample Size in an Analytic Study Hypothesis: null and alternative; one-sided vs two-sided Statistical test: based on the type of predictor and outcome variables in the hypothesis Effect size (and its variance, if applicable) Power and alpha
What is a Hypothesis? An assumption about the population parameter. Example: I assume the mean SBP of participants is 120 mmHg.
Research Hypothesis A clear statement of what you are studying Simple: one predictor, one outcome Specific: who, what, when, where Stated: in advance
Research Hypothesis In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have a lower 1-year mortality than those randomly assigned to placebo.
The Null Hypothesis There’s nothing going on. Purpose in life: to be rejected in favor of its alternative. In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have the same 1-year mortality as those randomly assigned to placebo.
Null & Alternative Hypotheses $H_0$, the null hypothesis, states the assumption to be tested, e.g. mean SBP of participants = 120 ($H_0: \mu = 120$). $H_1$, the alternative hypothesis, is the opposite of the null hypothesis, e.g. mean SBP of participants ≠ 120 ($H_1: \mu \ne 120$). It may or may not be accepted, and it is the hypothesis that the researcher believes to be true.
Level of Significance, $\alpha$ Defines unlikely values of the sample statistic if the null hypothesis is true; these form the rejection region of the sampling distribution. Typical values are 0.01 and 0.05. Selected by the researcher at the start. Provides the critical value(s) of the test.
Level of Significance, $\alpha$, and the Rejection Region [Figure: sampling distribution with 0 at the centre; the critical value(s) mark off the rejection regions in the tails.]
Factors Increasing Type II Error True value of population parameter: $\beta$ increases when the difference $d$ between the hypothesized parameter and the true value decreases. Significance level $\alpha$: $\beta$ increases when $\alpha$ decreases. Population standard deviation $\sigma$: $\beta$ increases when $\sigma$ increases. Sample size $n$: $\beta$ increases when $n$ decreases.
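These four relationships can be illustrated numerically. The sketch below assumes the simplest possible setting, a two-sided one-sample z-test with known standard deviation; the function name `type_ii_error` and every number in the demo are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def type_ii_error(d, sigma, n, alpha=0.05):
    """Beta for a two-sided one-sample z-test (sigma assumed known).

    d is the true difference between the population mean and the
    hypothesized mean.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(d) * math.sqrt(n) / sigma  # true difference in SE units
    # Probability that the test statistic stays inside the acceptance region
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


# Hypothetical baseline: d = 5, sigma = 10, n = 25, alpha = 0.05
baseline = type_ii_error(d=5, sigma=10, n=25)
print(f"baseline beta        : {baseline:.3f}")
print(f"smaller difference   : {type_ii_error(d=2.5, sigma=10, n=25):.3f}")
print(f"smaller alpha (0.01) : {type_ii_error(d=5, sigma=10, n=25, alpha=0.01):.3f}")
print(f"larger sigma         : {type_ii_error(d=5, sigma=20, n=25):.3f}")
print(f"smaller n            : {type_ii_error(d=5, sigma=10, n=9):.3f}")
```

Each of the four variations should print a beta larger than the baseline, matching the four bullets on the slide.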
p Value Test Probability of obtaining a test statistic as extreme or more extreme ($\le$ or $\ge$) than the actual sample value, given $H_0$ is true. Called the observed level of significance. Used to make the rejection decision: if p value $\ge \alpha$, do not reject $H_0$; if p value $< \alpha$, reject $H_0$.
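The decision rule can be sketched in Python for the SBP hypothesis ($H_0: \mu = 120$). The sample values below (mean 123, SD 15, n = 100) are hypothetical and not from the presentation, and the helper name `two_sided_p_value` is made up; a large-sample z-test is assumed.

```python
import math
from statistics import NormalDist


def two_sided_p_value(sample_mean, null_mean, sd, n):
    """Two-sided p value for a large-sample z-test of a single mean."""
    se = sd / math.sqrt(n)
    z = (sample_mean - null_mean) / se
    # Probability of a statistic at least this extreme in either tail
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05  # chosen by the researcher before the study
# Hypothetical sample: mean SBP 123 mmHg, SD 15 mmHg, n = 100
p_value = two_sided_p_value(sample_mean=123, null_mean=120, sd=15, n=100)
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"p = {p_value:.4f}: {decision}")
```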
What’s This All About? A long time ago, statisticians figured out the probability that a sample of a given size would “find something” even if there were nothing going on in the population.
This means that... After a study, we can determine the likelihood that whatever we found in our sample could have occurred by chance... Even if nothing was going on in the population (i.e., the null hypothesis was true)--a “Type I error” If this is very unlikely (say < 1 in 20) we reject the null hypothesis in favor of the alternative hypothesis; we call the finding statistically significant (P <.05)
Jury Trial Example ($H_0$: Innocent)
Jury trial | Actual situation: Innocent | Actual situation: Guilty
Verdict: Innocent | Correct | Error
Verdict: Guilty | Error | Correct
Hypothesis test | $H_0$ true | $H_0$ false
Decision: Accept $H_0$ | Correct ($1 - \alpha$) | Type II error ($\beta$), false negative
Decision: Reject $H_0$ | Type I error ($\alpha$), false positive | Power ($1 - \beta$)
Two-sided Alternative Hypothesis In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have a different 1-year mortality than those randomly assigned to placebo.
Two One-sided Alternative Hypotheses Side A: In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have a higher 1-year mortality than those randomly assigned to placebo. Side B: In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have a lower 1-year mortality than those randomly assigned to placebo.
If The Null Hypothesis Is True By chance alone, each of the two one-sided alternative hypotheses is... Possible Equally likely Wrong Thus a two-sided alternative hypothesis has twice the likelihood of happening by chance alone
Measurement Scales in Clinical Research
Scale | Characteristic | Example | Descriptive stats | Info content
Categorical (nominal or dichotomous) | No natural order | Sex; blood; vital status | % | Lower
Categorical (ordinal) | Ordered categories | Degree of pain; cancer stage | % + medians | Intermediate
Continuous (interval) | Ranked with specified intervals | Weight; Hb; number of cigarettes; outpatient attendance | The above + means, SD, medians | Higher
Next Ingredient: Statistical Test (Types of Variable) The statistical test determines how the sample size will be calculated. The types of predictor and outcome variable determine which statistical test will be used to analyze the data. Both dichotomous: chi-square test One dichotomous, one “continuous”: t test Both “continuous”: correlation coefficient or t test
Statistical Test (Types of Variable) ALS study Predictor: newmol vs placebo Outcome: % dead Both are dichotomous Chi square test
Next Ingredient: Effect Sizes (dichotomous variables) How big an effect you anticipate seeing Newmol halves mortality newmol = 5%, Placebo = 10%
Penultimate Ingredient: Power The chance of finding something in your sample if it’s really going on in the population (avoiding a Type II error) “Something” = the effect size (or greater) Usually set at 80% or 90% = (1 - beta)
…and the Final Ingredient: Alpha The chance of finding something in your sample if there’s nothing going on in the population.
Alpha Explained The level of statistical significance (ie, the p-value that will be considered significant) The pre-set maximum chance of finding something, if it really isn’t there. Usually set at 0.05. May be one-sided or two-sided.
Sidedness Of Alpha With a two-sided alternative hypothesis, you have two chances of finding something that isn’t really there: One (equal) chance for each side. So a one-sided alpha of 0.05 corresponds to a two-sided alpha of 0.10.
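The correspondence between one-sided and two-sided alpha can be seen directly in the critical values of the standard normal distribution. A minimal sketch, with a hypothetical helper name `critical_z`:

```python
from statistics import NormalDist


def critical_z(alpha, two_sided):
    """Critical z value; a two-sided test splits alpha across both tails."""
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)


one_sided_05 = critical_z(0.05, two_sided=False)
two_sided_10 = critical_z(0.10, two_sided=True)
two_sided_05 = critical_z(0.05, two_sided=True)

print(f"one-sided alpha 0.05: z = {one_sided_05:.3f}")
print(f"two-sided alpha 0.10: z = {two_sided_10:.3f}")
print(f"two-sided alpha 0.05: z = {two_sided_05:.3f}")
```

The first two values coincide (about 1.645), while a two-sided alpha of 0.05 needs the stricter cut-off of about 1.96.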
Tools to Calculate Sample Size Formulae General formulae: these can be complex Quick formulae: for particular power and significance levels and specified tests Special Tables for different tests Altman’s Nomogram Computer Software
SAMPLE SIZE: AN EXAMPLE Null hypothesis: In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have the same 1-year mortality as those randomly assigned to placebo. Two-sided alternative hypothesis Dichotomous predictor and outcome Effect size: mortality of 10% (placebo) vs. 5% (newmol) Power, alpha: 90%, 0.05 (two-sided)
THE SAMPLE SIZE IS… Table 6.B Smaller of P1 and P2 = 0.05; Difference = 0.05; table entries 381 / 473 / 620. For power of 90% and alpha of 0.05 (two-sided), the entry that applies is 620. This is per group
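A calculation of this kind can be sketched in Python. The slide does not say how Table 6.B was computed, so the formula here is an assumption: the usual normal-approximation formula for comparing two proportions, followed by a continuity correction. The function name `n_per_group_two_proportions` is made up for illustration. Under these assumptions the result comes out at roughly 620 per group for this example.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, power=0.90, alpha=0.05, two_sided=True):
    """Approximate sample size per group for comparing two proportions.

    Assumed method: normal approximation plus continuity correction.
    Returns an unrounded value.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    n = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / diff ** 2
    # Continuity correction inflates n slightly
    return n / 4 * (1 + math.sqrt(1 + 4 / (n * diff))) ** 2


# ALS example: placebo 10% vs newmol 5%, power 90%, two-sided alpha 0.05
n = n_per_group_two_proportions(0.10, 0.05, power=0.90, alpha=0.05)
print(f"about {n:.0f} per group")
```

Lowering the power to 0.80 and setting `two_sided=False` gives roughly 381, and 40% vs 20% with 80% power and two-sided alpha gives roughly 91, in line with the other Table 6.B look-ups in this presentation.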
Sample Size Reduction Strategy #1: Statistical Manipulation Use a lower power Use a one-sided alpha Power of 80% One-sided alpha of 0.05
The New Sample Size Is… Table 6.B Smaller of P1 and P2 = 0.05; Difference = 0.05; table entries 381 / 473 / 620. For power of 80% and alpha of 0.05 (one-sided), the entry that applies is 381. This is also per group
SS Reduction Strategy #2: Use A More Common Outcome Change from 1-year mortality to 2-year mortality or loss of independent living Placebo:40% Newmol:20%
The New Sample Size Is… Table 6.B Smaller of P1 and P2 = 0.20; Difference = 0.20; table entries 74 / 91 / 118. For power of 80% and alpha of 0.05 (two-sided), the entry that applies is 91 (per group).
SS Reduction Strategy #3: Use A Continuous Outcome Change “mortality” to “muscle strength” NOTE: Big change in research question and research hypothesis. New null hypothesis: In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have the same grip strength at the end of six months as those treated with placebo. Two-sided alternative hypothesis
Estimate The Mean And Variability Of Grip Strength Patients with untreated ALS have a (mean ± SD) grip strength of 20 ± 10 kg after 6 months of disease Newmol may improve that by 25%
Then Grip strength Placebo:20 kg Newmol:25 kg (25% more) Effect size = 5 kg SD = 10 kg Standardized effect size: E/S = 5/10 = 0.5
The New Sample Size Is... Table 6.A E/S = 0.5, β = 0.20, Alpha (two-sided) = 0.05 N = 64 per group
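The table look-up can be approximated in code. The slide does not state how Table 6.A was built, so this sketch assumes a common approximation for a two-sample t test: the normal-theory formula $2(z_\alpha + z_\beta)^2 / (E/S)^2$ plus a small correction term $z_\alpha^2/4$. The function name `n_per_group_two_means` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(effect, sd, power=0.80, alpha=0.05):
    """Approximate per-group sample size for comparing two means.

    Assumed method: normal-theory formula with a small-sample correction
    (z_alpha^2 / 4) to approximate the t test; two-sided alpha.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    standardized = effect / sd  # E/S
    n = 2 * (z_alpha + z_beta) ** 2 / standardized ** 2 + z_alpha ** 2 / 4
    return math.ceil(n)


# Grip strength example: effect 5 kg, SD 10 kg, so E/S = 0.5
n = n_per_group_two_means(effect=5, sd=10)
print(f"N = {n} per group")
```

Because E/S enters the formula squared, any reduction in the SD shrinks the required sample size quickly.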
SS Reduction Strategy #4: Use A More Precise Outcome Buy a better instrument to measure grip strength Use a well-defined protocol Repeat measurements on two consecutive days Reduce SD from 10 kg to 8 kg
The New Sample Size Is... New E/S = 5 kg/8 kg = 0.625, β = 0.20, Alpha (two-sided) = 0.05 N = about 45 per group This helped quite a bit.
SS Reduction Strategy #5: Use Paired Measurements Most of the variability in grip strength at the end of the study is likely to be due to differences between subjects in grip strength at the beginning of the study. Switch the outcome to change in grip strength from the beginning to the end of the study.
Paired Measurements Each subject contributes a pair of measurements: (before, after) The outcome variable is the difference between that pair for each subject. The SD of the change in a measurement is usually smaller than the SD of the measurement itself SD of change in grip strength is 5 kg New standardized effect size = 5/5 = 1.0
The New Sample Size Is... E/S = 1.0, β = 0.20, Alpha (two-sided) = 0.05 N = 17 per group We now have a potentially do-able study, albeit one that is very different from the original aim.
SS Techniques for Descriptive Studies Such studies (including studies of diagnostic tests) do not have predictor and outcome variables, nor do they compare groups. Therefore the concepts of power and of null and alternative hypotheses do not apply. The investigator calculates descriptive statistics such as means and proportions. Descriptive studies commonly report confidence intervals, a range of values about the sample mean or proportion. The confidence interval is a measure of the precision of a sample estimate, at a chosen confidence level (e.g. 95%, 99%).
One Group SS Techniques for Descriptive Studies Based on precision of estimation: the desired confidence interval has a certain width. Requires: selecting the confidence level (equivalently, the significance level); specifying the desired precision (total width) of the confidence interval [w = 2d]; an estimate of the standard deviation or proportion.
One Group SS- Continuous For a continuous variable, the 95% CI is $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$, so $n = (1.96\,\sigma/d)^2$. Need to specify $d$ (1/2 the width of the CI) Need to estimate $\sigma^2$
Example: One Group SS- Continuous We are interested in estimating the mean age at cancer diagnosis for a certain group of patients. Suppose we would like to estimate the mean age within ± 2.5 years (95% CI of width 5 years). Suppose that we estimate the population’s standard deviation as 12 years.
Example: One Group SS- Continuous We would need a sample size of 89 patients in order to estimate the mean age at diagnosis to within ± 2.5 years
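The figure of 89 follows from requiring the half-width of the 95% CI, $1.96\,\sigma/\sqrt{n}$, to be no larger than $d$. A minimal sketch of that calculation (the function name `n_for_mean` is made up for illustration):

```python
import math
from statistics import NormalDist


def n_for_mean(sd, d, confidence=0.95):
    """Sample size so that the CI for a mean has half-width at most d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sd / d) ** 2)  # round up to a whole patient


# Age at cancer diagnosis: SD estimated as 12 years, precision of ± 2.5 years
n = n_for_mean(sd=12, d=2.5)
print(f"n = {n} patients")
```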
One Group SS - Dichotomous For a dichotomous variable, the 95% CI is $p \pm 1.96\sqrt{p(1-p)/n}$, so $n = 1.96^2\,p(1-p)/d^2$. Need to specify $d$ (1/2 the width of the CI) Need to estimate $p$
Example: One Group SS- Dichotomous We would need a sample size of 683 patients in order to estimate the proportion symptom-free to within ± 3%
Example: Two Group SS- Continuous We would need a sample size of 142 patients in each group in order to detect a 5 mmHg difference in average blood pressure
Example: One Group SS- Dichotomous What sample size is needed to estimate the proportion symptom-free to within ± 5%? What sample size is needed to estimate the proportion symptom-free if p is unknown?
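These questions can be explored with a short sketch based on $n = z^2\,p(1-p)/d^2$. Note the assumptions: the transcript does not state the anticipated proportion symptom-free, so p = 0.80 is an inferred value, chosen only because it (or equivalently 0.20) reproduces the 683 quoted for ± 3%. Using p = 0.50 when p is unknown is the conventional conservative choice, since it maximizes $p(1-p)$. The function name `n_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_proportion(p, d, confidence=0.95):
    """Sample size so that the CI for a proportion has half-width at most d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


# ASSUMPTION: p = 0.80 is inferred, not stated in the presentation
n_3pct = n_for_proportion(p=0.80, d=0.03)
n_5pct = n_for_proportion(p=0.80, d=0.05)
# Conservative choice when p is unknown: p = 0.50
n_unknown_5pct = n_for_proportion(p=0.50, d=0.05)

print(f"within ± 3%, p = 0.80: n = {n_3pct}")
print(f"within ± 5%, p = 0.80: n = {n_5pct}")
print(f"within ± 5%, p unknown (0.50): n = {n_unknown_5pct}")
```

Relaxing the precision from ± 3% to ± 5% cuts the sample size substantially, because d enters the formula squared.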
Other considerations and special issues Dropouts: If the investigator estimates that 20% of her sample will be lost to follow-up, then the sample size should be increased by a factor of 1/(1 - 0.20) = 1.25. Survival analysis will use the proportion of subjects (dichotomous variable) still alive at each point in time, and sample size will be estimated using the chi-squared test. Sample size calculations in clustered samples are more complex and require the assistance of a statistician.
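The dropout adjustment is simple arithmetic, sketched below. The function name `inflate_for_dropout` is made up, and the 620 per group is borrowed from the earlier ALS example purely for illustration. Exact fractions are used so that rounding up is not disturbed by floating-point error.

```python
import math
from fractions import Fraction


def inflate_for_dropout(n_required, dropout_rate):
    """Number to enroll so that n_required remain after expected dropout."""
    retained = 1 - Fraction(str(dropout_rate))  # e.g. 1 - 0.20 = 4/5
    return math.ceil(Fraction(n_required) / retained)


# Illustration: 620 per group needed, 20% expected loss to follow-up
n_enroll = inflate_for_dropout(620, 0.20)
print(f"enroll {n_enroll} per group")
```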
The Bottom Line Sample size estimation is an integral part of study planning and grant writing Almost never the last thing you do More often, one of your first tasks Always consult a statistician, especially if a grant proposal that involves substantial costs is being submitted for funding
SAMPLE SIZE PLANNING: REVIEW OF INGREDIENTS Looking for something in a sample: hypotheses (null and alternative) Will you be able to... Know it’s there in the population if you find it in your sample (avoid a Type I error)? Test of significance, alpha Find it in your sample if it’s there in the population (avoid a Type II error)? Effect size, power
Acknowledgments and Suggested Reading Authors: Stephen B. Hulley, M.D., MPH; Steven R. Cummings, M.D.; Warren S. Browner, M.D., MPH; Deborah Grady, M.D., MPH; Norman Hearst, M.D., MPH; Thomas B. Newman, M.D., MPH