# Sample Size And Power I Jean B. Nachega, MD, PhD Department of Medicine & Centre for Infectious Diseases Stellenbosch University


Sample Size And Power I Jean B. Nachega, MD, PhD Department of Medicine & Centre for Infectious Diseases Stellenbosch University jnachega@sun.ac.za

Sample Size And Power, Warren Browner and Stephen Hulley
- The ingredients for sample size planning, and how to design them
- An example, with strategies for minimizing sample size

Sampling and Inference
- A sample is designed to represent a larger population
- Therefore, findings in the sample allow inferences about events in the population
- Problem: what if the inferences are wrong?
  - Finding something in the sample that isn’t “real” in the population
  - Missing something that is “real”

Inference Use a random sample to learn something about a larger population

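Summary of How Research Works (Hulley S, Cummings S, Newman T et al., 2001)
[Figure: the research question (target population, phenomena of interest) corresponds to the truth in the universe; the study plan (intended sample, intended variables) corresponds to the truth in the study; the actual study (actual subjects, actual measurements) yields the findings in the study. Design and implementation lead from the research question to the study plan to the actual study, and inference runs back in the other direction. Random and systematic error enter at each step.]

Parameter (from entire population) vs. Statistic (estimate from sample)

| Quantity | Parameter | Statistic |
|---|---|---|
| Mean | $\mu$ | $\bar{X}$ |
| Standard deviation | $\sigma$ | $s$ |
| Proportion | $\pi$ | $p$ |

How Do We Generalize?
[Figure: a sample is drawn from the population, and findings in the sample are generalized back to the population.]

Inference
- Two ways to make inference
  - Estimation of parameters
    - Point estimation ($\bar{X}$ or $p$)
    - Interval estimation
  - Hypothesis testing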

Preventing Wrong Inferences
- Difficult when caused by systematic error (bias)
- Easy when caused by random error (chance)
  - Solution: increase sample size
  - Problem: cost, feasibility
  - Goldilocks solution: a sample size that is big enough but not too big

Point Estimate and Interval Estimate
- Population: the mean, $\mu$, is unknown
- Sample: mean $\bar{X} = 50$ (point estimate)
- Interval estimate: “I am 95% confident that $\mu$ is between 40 and 60”

Parameter = Statistic ± Its Error

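Sampling Distribution of $\bar{X}$ or $p$

Standard Error
- Quantitative variable: $SE(\bar{X}) = \dfrac{s}{\sqrt{n}}$
- Qualitative variable: $SE(p) = \sqrt{\dfrac{p(1-p)}{n}}$

95% Confidence Interval for a Mean
[Figure: sampling distribution on the Z-axis, marked at $\bar{X} - 1.96\,SE$, $\bar{X}$, and $\bar{X} + 1.96\,SE$; the central area $1 - \alpha$ covers 95% of samples, with $\alpha/2$ in each tail.]

95% Confidence Interval for a Proportion
[Figure: sampling distribution on the Z-axis, marked at $p - 1.96\,SE$, $p$, and $p + 1.96\,SE$; the central area $1 - \alpha$ covers 95% of samples, with $\alpha/2$ in each tail.]

Interpretation of CI
- Probabilistic: in repeated sampling, $100(1-\alpha)\%$ of all intervals around sample means will, in the long run, include $\mu$
- Practical: we are $100(1-\alpha)\%$ confident that the single computed CI contains $\mu$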

Example (Sample size ≥ 30)
An epidemiologist studied the blood glucose level of a random sample of 100 patients. The mean was 170, with an SD of 10.
- $SE = 10/\sqrt{100} = 10/10 = 1$
- 95% CI: $\mu = \bar{X} \pm Z \times SE = 170 \pm 1.96 \times 1$
- $168.04 \le \mu \le 171.96$
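The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `mean_confidence_interval` is made up for illustration, and it assumes the large-sample (normal) interval used on the slide.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, confidence=0.95):
    """Large-sample confidence interval for a mean: mean +/- z * sd / sqrt(n)."""
    # z is about 1.96 for a 95% interval
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sd / sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se


# Blood glucose example: n = 100, mean = 170, SD = 10
lower, upper = mean_confidence_interval(mean=170, sd=10, n=100)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

With the slide's numbers this gives an interval of roughly 168.04 to 171.96.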

Study design
- Investigator assigns exposures?
  - Yes: experimental study. Random allocation?
    - Yes: randomised controlled trial
    - No: non-randomised controlled trial
  - No: observational study. Comparison group?
    - Yes: analytical study. Direction?
      - Exposure -> outcome: cohort study
      - Exposure <- outcome: case-control study
      - Exposure & outcome at same time: cross-sectional study
    - No: descriptive study

Ingredients For Planning Sample Size in an Analytic Study
- Hypothesis
  - Null and alternative
  - One-sided vs two-sided
- Statistical test
  - Based on the type of predictor and outcome variables in the hypothesis
- Effect size (and its variance, if applicable)
- Power and alpha

What is a Hypothesis?
An assumption about the population parameter, e.g. “I assume the mean SBP of participants is 120 mmHg.”

Research Hypothesis
- A clear statement of what you are studying
- Simple: one predictor, one outcome
- Specific: who, what, when, where
- Stated: in advance

Research Hypothesis
- In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have a lower 1-year mortality than those randomly assigned to placebo.

The Null Hypothesis
- There’s nothing going on.
- Purpose in life: to be rejected in favor of its alternative.
- In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have the same 1-year mortality as those randomly assigned to placebo.

Null & Alternative Hypotheses
- $H_0$, the null hypothesis, states the assumption to be tested, e.g. SBP of participants = 120 ($H_0: \mu = 120$).
- $H_1$, the alternative hypothesis, is the opposite of the null hypothesis, e.g. SBP of participants ≠ 120 ($H_1: \mu \ne 120$). It may or may not be accepted, and it is the hypothesis that the researcher believes to be true.

Level of Significance, $\alpha$
- Defines unlikely values of the sample statistic if the null hypothesis is true; these form the rejection region of the sampling distribution
- Typical values are 0.01 and 0.05
- Selected by the researcher at the start
- Provides the critical value(s) of the test

Level of Significance, $\alpha$, and the Rejection Region
[Figure: sampling distribution centered at 0, with the critical value(s) marking the rejection region(s) of total area $\alpha$.]

Factors Increasing Type II Error
- True value of the population parameter: $\beta$ increases when the difference between the hypothesized parameter and the true value decreases
- Significance level $\alpha$: $\beta$ increases when $\alpha$ decreases
- Population standard deviation $\sigma$: $\beta$ increases when $\sigma$ increases
- Sample size $n$: $\beta$ increases when $n$ decreases
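The four factors on this slide can be seen at work in a small calculation. The sketch below assumes a two-sided one-sample z-test with known standard deviation, which the slide does not specify; the function name `type_ii_error` and the baseline numbers (difference 5, SD 10, n = 30) are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(delta, sigma, n, alpha=0.05):
    """Beta for a two-sided one-sample z-test with known sigma.

    delta is the true difference between the hypothesized and the true mean.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    shift = abs(delta) / (sigma / sqrt(n))      # true difference in SE units
    # Beta = chance the test statistic lands between the two critical values
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Hypothetical baseline, then one factor changed at a time
baseline = type_ii_error(delta=5, sigma=10, n=30)
smaller_difference = type_ii_error(delta=2, sigma=10, n=30)
smaller_alpha = type_ii_error(delta=5, sigma=10, n=30, alpha=0.01)
larger_sd = type_ii_error(delta=5, sigma=15, n=30)
smaller_n = type_ii_error(delta=5, sigma=10, n=15)

for label, beta in [
    ("baseline", baseline),
    ("smaller difference", smaller_difference),
    ("smaller alpha", smaller_alpha),
    ("larger SD", larger_sd),
    ("smaller n", smaller_n),
]:
    print(f"{label:<20} beta = {beta:.3f}")
```

Each change listed on the slide pushes beta above the baseline value, which is the pattern the slide describes.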

p Value Test
- Probability of obtaining a test statistic at least as extreme (≤ or ≥) as the actual sample value, given $H_0$ is true
- Called the observed level of significance
- Used to make the rejection decision
  - If p value ≥ $\alpha$, do not reject $H_0$
  - If p value < $\alpha$, reject $H_0$
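As a rough illustration of this decision rule, the sketch below converts a z statistic into a p value and compares it with alpha. The z value of 2.1 and the function name `z_test_p_value` are hypothetical, and a normally distributed test statistic is assumed.

```python
from statistics import NormalDist


def z_test_p_value(z, two_sided=True):
    """P value for an observed z statistic under H0.

    The one-sided version assumes the observed direction matches
    the direction stated in the alternative hypothesis.
    """
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail if two_sided else upper_tail


alpha = 0.05
p = z_test_p_value(2.1)  # hypothetical observed z statistic
decision = "reject H0" if p < alpha else "do not reject H0"
print(f"p = {p:.4f}: {decision}")
```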

What’s This All About?
- A long time ago, statisticians figured out the probability that a sample of a given size would “find something” even if there were nothing going on in the population.

This means that...
- After a study, we can determine the likelihood that whatever we found in our sample could have occurred by chance...
  - Even if nothing was going on in the population (i.e., the null hypothesis was true): a “Type I error”
  - If this is very unlikely (say < 1 in 20), we reject the null hypothesis in favor of the alternative hypothesis; we call the finding statistically significant (P < .05)

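Jury Trial Example

Jury trial ($H_0$: innocent)

| Verdict | Actual situation: innocent | Actual situation: guilty |
|---|---|---|
| Innocent | Correct | Error |
| Guilty | Error | Correct |

Hypothesis test

| Decision | Actual situation: $H_0$ true | Actual situation: $H_0$ false |
|---|---|---|
| Accept $H_0$ | Correct ($1 - \alpha$) | Type II error ($\beta$), false negative |
| Reject $H_0$ | Type I error ($\alpha$), false positive | Power ($1 - \beta$) |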

Two-sided Alternative Hypothesis
- In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have a different 1-year mortality than those randomly assigned to placebo.

Two One-sided Alternative Hypotheses
- Side A: In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have a higher 1-year mortality than those randomly assigned to placebo.
- Side B: In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have a lower 1-year mortality than those randomly assigned to placebo.

If The Null Hypothesis Is True
- By chance alone, each of the two one-sided alternative hypotheses is...
  - Possible
  - Equally likely
  - Wrong
- Thus a two-sided alternative hypothesis has twice the likelihood of happening by chance alone

Measurement Scales in Clinical Research

| Scale | Characteristic | Example | Descriptive stats | Info content |
|---|---|---|---|---|
| Categorical (nominal or dichotomous) | No natural order | Sex; blood; vital status | % | Lower |
| Categorical (ordinal) | Ordered categories | Degree of pain; cancer stage | % + medians | Intermediate |
| Continuous (interval) | Ranked with specified intervals | Weight; Hb; number of cigarettes; outpatient attendance | The above + means, SD, medians | Higher |

Next Ingredient: Statistical Test (Types of Variable)
- The statistical test determines how the sample size will be calculated
- The types of predictor and outcome variable determine which statistical test will be used to analyze the data
  - Both dichotomous: chi-square
  - One dichotomous, one “continuous”: t test
  - Both “continuous”: correlation coefficient or t test

Statistical Test (Types of Variable)

Statistical Tests Used in Bivariate Analysis

Statistical Test (Types of Variable)
- ALS study
  - Predictor: newmol vs placebo
  - Outcome: % dead
- Both are dichotomous
  - Chi-square test

Next Ingredient: Effect Sizes (dichotomous variables)
- How big an effect you anticipate seeing
- Newmol halves mortality
  - Newmol = 5%, placebo = 10%

Penultimate Ingredient: Power
- The chance of finding something in your sample if it’s really going on in the population (avoiding a Type II error)
  - “Something” = the effect size (or greater)
- Usually set at 80% or 90%
- Power = 1 − beta

…and the Final Ingredient: Alpha
- The chance of finding something in your sample if there’s nothing going on in the population.

Alpha Explained
- The level of statistical significance (i.e., the p-value below which a result will be considered significant)
- The pre-set maximum chance of finding something, if it really isn’t there.
- Usually set at 0.05.
- May be one-sided or two-sided.

Sidedness Of Alpha
- With a two-sided alternative hypothesis, you have two chances of finding something that isn’t really there: one (equal) chance for each side.
- So a one-sided alpha of 0.05 corresponds to a two-sided alpha of 0.10.
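One way to see the correspondence between a one-sided alpha of 0.05 and a two-sided alpha of 0.10 is to compare critical values. The sketch below assumes a normally distributed test statistic; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()

# One-sided test: all of alpha sits in a single tail
z_one_sided_05 = std_normal.inv_cdf(1 - 0.05)

# Two-sided tests: alpha is split equally between the two tails
z_two_sided_10 = std_normal.inv_cdf(1 - 0.10 / 2)
z_two_sided_05 = std_normal.inv_cdf(1 - 0.05 / 2)

print(f"one-sided 0.05: z = {z_one_sided_05:.3f}")
print(f"two-sided 0.10: z = {z_two_sided_10:.3f}")
print(f"two-sided 0.05: z = {z_two_sided_05:.3f}")
```

The first two critical values coincide (about 1.645), while a two-sided alpha of 0.05 demands a larger one (about 1.96) and therefore a larger sample.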

Tools to Calculate Sample Size
- Formulae
  - General formulae: these can be complex
  - Quick formulae: for particular power and significance levels and specified tests
- Special tables for different tests
- Altman’s nomogram
- Computer software

SAMPLE SIZE: AN EXAMPLE
- Null hypothesis: In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have the same 1-year mortality as those randomly assigned to placebo.
- Two-sided alternative hypothesis
- Dichotomous predictor and outcome
- Effect size: 10% mortality reduced to 5%
- Power, alpha: 90%, 0.05 (two-sided)

THE SAMPLE SIZE IS…
- Table 6.B
- Smaller of P1 and P2 = 0.05; power of 90%; alpha of 0.05 (two-sided)
- Difference = 0.05
- Table entries: 381, 473, 620; the entry for these settings is 620
- This is per group

Sample Size Reduction Strategy #1: Statistical Manipulation
- Use a lower power
  - Power of 80%
- Use a one-sided alpha
  - One-sided alpha of 0.05

The New Sample Size Is…
- Table 6.B
- Smaller of P1 and P2 = 0.05; power of 80%; alpha of 0.05 (one-sided)
- Difference = 0.05
- Table entries: 381, 473, 620; the entry for these settings is 381
- This is also per group

SS Reduction Strategy #2: Use A More Common Outcome
- Change from 1-year mortality to 2-year mortality or loss of independent living
- Placebo: 40%
- Newmol: 20%

The New Sample Size Is…
- Table 6.B
- Smaller of P1 and P2 = 0.20; power of 80%; alpha of 0.05 (two-sided)
- Difference = 0.20
- Table entries: 74, 91, 118; the entry for these settings is 91
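The three table lookups so far (5% vs 10% at 90% power with two-sided alpha; the same at 80% power with one-sided alpha; 20% vs 40% at 80% power with two-sided alpha) can be approximately reproduced with a formula. The slides do not say how Table 6.B was computed, so the sketch below rests on an assumption: the normal-approximation formula for comparing two proportions, with a continuity correction. The function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, power=0.80, alpha=0.05, two_sided=True):
    """Return (uncorrected, continuity-corrected) sample size per group."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = std_normal.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    uncorrected = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / diff ** 2
    # Continuity correction (assumed here; it increases the sample size)
    corrected = uncorrected / 4 * (1 + sqrt(1 + 4 / (uncorrected * diff))) ** 2
    return uncorrected, corrected


original = n_per_group_two_proportions(0.05, 0.10, power=0.90)
strategy_1 = n_per_group_two_proportions(0.05, 0.10, power=0.80, two_sided=False)
strategy_2 = n_per_group_two_proportions(0.20, 0.40, power=0.80)

for label, (plain, corrected) in [
    ("original", original),
    ("strategy 1", strategy_1),
    ("strategy 2", strategy_2),
]:
    print(f"{label:<11} uncorrected {plain:6.1f}   corrected {corrected:6.1f}")
```

Under these assumptions the corrected values land within one subject of 620, 381, and 91 per group; the uncorrected formula gives noticeably smaller numbers.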

SS Reduction Strategy #3: Use A Continuous Outcome
- Change “mortality” to “muscle strength”
- NOTE: Big change in research question and research hypothesis.
- New null hypothesis: In patients with early ALS seen at UCT in 2007, those randomly assigned to be treated with newmol will have the same grip strength at the end of six months as those treated with placebo.
- Two-sided alternative hypothesis

Estimate The Mean And Variability Of Grip Strength
- Patients with untreated ALS have a (mean ± SD) grip strength of 20 ± 10 kg after 6 months of disease
- Newmol may improve that by 25%

Then
- Grip strength
  - Placebo: 20 kg
  - Newmol: 25 kg (25% more)
- Effect size = 5 kg; SD = 10 kg
- Standardized effect size: E/S = 5/10 = 0.5

The New Sample Size Is...
- Table 6.A
- E/S = 0.5
- $\beta$ = 0.20, alpha (two-sided) = 0.05
- N = 64 per group

Altman’s Nomogram

SS Reduction Strategy #4: Use A More Precise Outcome
- Buy a better instrument to measure grip strength
- Use a well-defined protocol
- Repeat measurements on two consecutive days
- Reduce SD from 10 kg to 8 kg

The New Sample Size Is...
- New E/S = 5 kg / 8 kg = 0.625
- $\beta$ = 0.20, alpha (two-sided) = 0.05
- N = about 45 per group
- This helped quite a bit.

SS Reduction Strategy #5: Use Paired Measurements
- Most of the variability in grip strength at the end of the study is likely to be due to differences between subjects in grip strength at the beginning of the study.
- Switch the outcome to change in grip strength from the beginning to the end of the study.

Paired Measurements
- Each subject contributes a pair of measurements: (before, after)
- The outcome variable is the difference between that pair for each subject.
- The SD of the change in a measurement is usually smaller than the SD of the measurement
- SD of change in grip strength is 5 kg
- New standardized effect size = 5/5 = 1.0

The New Sample Size Is...
- E/S = 1.0
- $\beta$ = 0.20, alpha (two-sided) = 0.05
- N = 17 per group
- We now have a potentially do-able study, albeit one that is very different from the original aim.
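For the continuous-outcome strategies, a common quick formula is n per group = 2(z_alpha + z_beta)² / (E/S)². The sketch below applies it to the three standardized effect sizes used above. It is a normal approximation and not necessarily how Table 6.A was built; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(standardized_effect, power=0.80, alpha=0.05):
    """Normal-approximation sample size per group for comparing two means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = std_normal.inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 / standardized_effect ** 2


for effect in (0.5, 0.625, 1.0):
    n = n_per_group_two_means(effect)
    print(f"E/S = {effect:<5}  n per group = {n:.1f}  (round up to {ceil(n)})")
```

The approximation gives somewhat smaller numbers than the slides (64, about 45, and 17 per group). The gap presumably arises because the table is based on the t distribution and lists only selected values of E/S, but that is an inference, not something the slides state.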

SS Techniques for Descriptive Studies
- Such studies (including studies of diagnostic tests) do not have predictor and outcome variables, nor do they compare groups
- Therefore the concepts of power and of null and alternative hypotheses do not apply
- The investigator calculates descriptive statistics such as means and proportions.
- Descriptive studies commonly report confidence intervals, a range of values about the sample mean or proportion
- The confidence level (e.g. 95%, 99%) states how confident we are that the interval contains the true value; the width of the interval reflects the precision of the sample estimate

One Group SS Techniques for Descriptive Studies
- Based on precision of estimation: the aim is a confidence interval of a certain width
- Requires:
  - Selecting the confidence level (equivalently, the significance level $\alpha$)
  - Specifying the desired precision (total width) of the confidence interval [w = 2d]
  - An estimate of the standard deviation or proportion

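One Group SS- Continuous
- For a continuous variable, the 95% CI is $\bar{X} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$
- Need to specify $d$ (1/2 the width of the CI)
- Need to estimate $\sigma^2$
- Setting $d = 1.96\,\sigma/\sqrt{n}$ and solving for $n$ gives $n = \dfrac{(1.96)^2\,\sigma^2}{d^2}$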

Example: One Group SS- Continuous
- We are interested in estimating the mean age at cancer diagnosis for a certain group of patients. Suppose we would like to estimate the mean age within ± 2.5 years (95% CI of width 5 years). Suppose that we estimate the population’s standard deviation as 12 years.

Example: One Group SS- Continuous
- We would need a sample size of 89 patients in order to estimate the mean age at diagnosis to within ± 2.5 years
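The figure of 89 patients follows from solving the 95% CI half-width for n, that is n = (z × SD / d)². A minimal sketch, assuming the normal-based interval; the function name is invented for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_for_mean_precision(sd, half_width, confidence=0.95):
    """Sample size so that the CI for a mean is mean +/- half_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up: a fraction of a patient cannot be enrolled
    return ceil((z * sd / half_width) ** 2)


# Age at cancer diagnosis: SD estimated as 12 years, target of +/- 2.5 years
n = n_for_mean_precision(sd=12, half_width=2.5)
print(n)
```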

One Group SS - Dichotomous
- For a dichotomous variable, the 95% CI is $p \pm 1.96\,\sqrt{\dfrac{p(1-p)}{n}}$
- Need to specify $d$ (1/2 the width of the CI)
- Need to estimate $p$
- Setting $d = 1.96\,\sqrt{p(1-p)/n}$ and solving for $n$ gives $n = \dfrac{(1.96)^2\,p(1-p)}{d^2}$

Example: One Group SS- Dichotomous
- We would need a sample size of 683 patients in order to estimate the proportion symptom-free to within ± 3%

Example: Two Group SS- Continuous
- We would need a sample size of 142 patients in each group in order to detect a 5 mmHg difference in average blood pressure

Example: One Group SS- Dichotomous
- What sample size is needed to estimate the proportion symptom-free to within ± 5%?
- What sample size is needed to estimate the proportion symptom-free if p is unknown?
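The two questions above can be explored with n = z² p(1 − p) / d². The transcript does not preserve the value of p used for the 683-patient example; 683 is consistent with p = 0.8 (or equivalently 0.2), so the sketch below uses p = 0.8 as an assumption. When p is unknown, a conventional conservative choice is p = 0.5, which maximizes p(1 − p). The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion_precision(p, half_width, confidence=0.95):
    """Sample size so that the CI for a proportion is p +/- half_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / half_width ** 2)


assumed_p = 0.8  # assumption: back-calculated from the 683-patient example

within_3_percent = n_for_proportion_precision(assumed_p, 0.03)
within_5_percent = n_for_proportion_precision(assumed_p, 0.05)
unknown_p = n_for_proportion_precision(0.5, 0.05)  # worst case for p(1 - p)

print(within_3_percent, within_5_percent, unknown_p)
```

Under these assumptions, relaxing the precision from ± 3% to ± 5% cuts the requirement from 683 to 246, and the worst-case choice p = 0.5 raises it to 385 for ± 5%.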

Other considerations and special issues
- Dropouts: if the investigator estimates that 20% of her sample will be lost to follow-up, then the sample size should be increased by a factor of 1/(1 − 0.20) = 1.25
- Survival analysis uses the proportion of subjects (a dichotomous variable) still alive at each point in time, and the sample size is estimated using the chi-squared test
- Sample size calculations in clustered samples are more complex and require the assistance of a statistician
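The dropout adjustment is a one-line calculation. The sketch below applies it to the 620 per group from the original example; the function name `inflate_for_dropout` is made up for illustration.

```python
from math import ceil


def inflate_for_dropout(n_required, dropout_rate):
    """Number to enroll so that n_required remain after expected dropout."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    # Round first so floating-point noise cannot push an exact result up by one
    return ceil(round(n_required / (1 - dropout_rate), 9))


# 620 per group needed for analysis, 20% expected loss to follow-up
print(inflate_for_dropout(620, 0.20))
```

Enrolling 775 per group leaves 620 after a 20% loss, since 775 × 0.80 = 620.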

The Bottom Line
- Sample size estimation is an integral part of study planning and grant writing
- Almost never the last thing you do
- More often, one of your first tasks
- Always consult a statistician, especially if a grant proposal that involves substantial costs is being submitted for funding

SAMPLE SIZE PLANNING: REVIEW OF INGREDIENTS
- Looking for something in a sample
  - Hypotheses (null and alternative)
- Will you be able to...
  - Know it’s there in the population if you find it in your sample (avoid a Type I error)?
    - Test of significance, alpha
  - Find it in your sample if it’s there in the population (avoid a Type II error)?
    - Effect size, power

Acknowledgments and Suggested Reading
Authors: Stephen B. Hulley, M.D., MPH; Steven R. Cummings, M.D.; Warren S. Browner, M.D., MPH; Deborah Grady, M.D., MPH; Norman Hearst, M.D., MPH; Thomas B. Newman, M.D., MPH
