# How Much Data Do I Need? Power & Sample Size for Student's t Test

Prof. Tom Willemain, 3/2/2014



## Sample Size Calculations

Sample size calculations are complicated with Student's t because the critical value of t is a function of the unknown sample size. R has the function `power.t.test()` to calculate the sample size if you can assume i.i.d. Normal data with equal variances.

Complications:

- Deciding on acceptable Type I (false positive) and Type II (false negative) error rates
- Deciding on the size of a shift in mean that would be of interest
- Estimating the variances in the two groups
- Deciding what to do if variances are unequal
- Deciding what to do if data do not have a Normal distribution
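To see why the dependence of the critical value on the sample size forces an iterative calculation, here is a minimal sketch. It assumes a common textbook approximation, n = 2((t_{1-α,df} + t_{1-β,df})σ/δ)² per group with df = 2n − 2, for a one-sided two-sample test. This formula is an illustration and is not taken from the slides; `power.t.test()` itself solves a more exact version numerically, so the two answers agree only approximately. The parameter values are illustrative.

```r
# Sketch: n appears on both sides, so start from Normal quantiles and iterate.
# Assumed approximation (not from the slides):
#   n = 2 * ((t_{1-alpha, df} + t_{1-beta, df}) * sigma / delta)^2,  df = 2n - 2
delta <- 0.5    # shift in mean of interest
sigma <- 1.5    # common standard deviation
alpha <- 0.05   # Type I error rate (one-sided)
beta  <- 0.05   # Type II error rate

# Starting guess: replace t quantiles by Normal quantiles (no dependence on n)
n <- 2 * ((qnorm(1 - alpha) + qnorm(1 - beta)) * sigma / delta)^2
n_normal <- n

# Fixed-point iteration: recompute the t quantiles with the current df
for (i in 1:50) {
  df <- 2 * n - 2
  n_new <- 2 * ((qt(1 - alpha, df) + qt(1 - beta, df)) * sigma / delta)^2
  converged <- abs(n_new - n) < 1e-8
  n <- n_new
  if (converged) break
}

# Compare with the numerical solution from power.t.test()
n_exact <- power.t.test(delta = delta, sd = sigma, sig.level = alpha,
                        power = 1 - beta, type = "two.sample",
                        alternative = "one.sided")$n

print(c(normal_start = n_normal, t_iterated = n, power_t_test = n_exact))
```

The Normal-quantile start is slightly too small because t quantiles exceed Normal quantiles at any finite df; a few iterations are enough for the value to settle.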

```r
# power.ttest.R
# calculate sample size for required power
# note: assumes iid Normal data w/equal variances

# power.t.test(n = NULL, delta = NULL, sd = 1, sig.level = 0.05,
#              power = NULL,
#              type = c("two.sample", "one.sample", "paired"),
#              alternative = c("two.sided", "one.sided"),
#              strict = FALSE)

# initialize
rm(list = ls())

# enter parameters describing your scenario
delta0 = 0.5   # desired difference in means
sigma0 = 1.5   # estimated std dev of data in each group (assumed equal)
alpha  = 0.05  # type I error probability you can tolerate
beta   = 0.05  # type II error probability you can tolerate
# data.type = "one.sample"
# data.type = "paired"
data.type = "two.sample"
# alternative.type = "two.sided"
alternative.type = "one.sided"

# compute required sample size
n = power.t.test(n = NULL, delta = delta0, sd = sigma0, sig.level = alpha,
                 power = 1 - beta, type = data.type,
                 alternative = alternative.type, strict = FALSE)

# show results
print(n)
```

## Output of power.ttest.R

```
> print(n)

     Two-sample t test power calculation

              n = 195.4794
          delta = 0.5
             sd = 1.5
      sig.level = 0.05
          power = 0.95
    alternative = one.sided

NOTE: n is number in *each* group
```
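Because the reported n is fractional, in practice it is rounded up to a whole number of observations per group. The sketch below repeats the same call, rounds up, and then runs `power.t.test()` in the other direction (n given, power left out) to check the power achieved at the rounded size. The variable names are illustrative.

```r
# Same scenario as above: one-sided two-sample test
res <- power.t.test(delta = 0.5, sd = 1.5, sig.level = 0.05, power = 0.95,
                    type = "two.sample", alternative = "one.sided")

# Round the fractional answer up to a whole number per group
n_per_group <- ceiling(res$n)

# Supply n instead of power to get the power achieved at that size
achieved <- power.t.test(n = n_per_group, delta = 0.5, sd = 1.5,
                         sig.level = 0.05, type = "two.sample",
                         alternative = "one.sided")$power

print(n_per_group)   # 196, since the reported n is 195.4794
print(achieved)      # slightly above the 0.95 target
```

Exactly one of `n`, `delta`, `sd`, `sig.level`, and `power` must be left as `NULL`; that is the quantity the function solves for.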

## Dealing with the Complications

- Deciding on acceptable Type I and Type II error rates
  - Context sensitive; conventional choices set α and β to 0.05 or 0.01
- Deciding on the size of a shift in mean
  - Very context sensitive; 10% improvement?
- Estimating the variances in the two groups
  - Either get pilot samples or make guesstimates
- Deciding what to do if variances are unequal
  - See Monte Carlo code t.power.R on next slide
- Deciding what to do if data do not have a Normal distribution
  - See Monte Carlo code; might also use bootstrap if you have pilot samples
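The bootstrap option in the last item could look something like the sketch below. It is an illustration under stated assumptions, not the author's method: the pilot samples are simulated here (skewed, hypothetical data standing in for real pilot measurements), the function name `boot_power` is made up, and the test is a one-sided Welch t test. Each pilot sample is centered so that the only difference in means is the chosen shift `delta`.

```r
set.seed(42)

# Hypothetical pilot samples; in practice these would be real measurements
pilot_a <- rexp(30, rate = 1 / 1.5)
pilot_b <- rexp(30, rate = 1 / 1.5)

# Estimate power at n per group by resampling the pilot data with replacement
boot_power <- function(n, delta, a, b, alpha = 0.05, nboot = 2000) {
  a0 <- a - mean(a)   # center both samples so the null shape is preserved
  b0 <- b - mean(b)
  rejected <- replicate(nboot, {
    x <- sample(a0, n, replace = TRUE)
    y <- sample(b0, n, replace = TRUE) + delta   # impose the shift of interest
    # t.test() defaults to the Welch (unequal variance) test
    t.test(y, x, alternative = "greater")$p.value < alpha
  })
  mean(rejected)      # fraction of resamples that reject = estimated power
}

p_small <- boot_power(n = 50,  delta = 0.5, a = pilot_a, b = pilot_b)
p_large <- boot_power(n = 400, delta = 0.5, a = pilot_a, b = pilot_b)
print(c(n50 = p_small, n400 = p_large))
```

With small pilot samples the estimate inherits their sampling noise, so it is best read as a rough guide and not a precise power figure.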

## Output of t.power.R

The slide's image is not reproduced in this transcript; its annotations read:

- Here, could substitute some other distributions
- Here, plug in different standard deviations
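Since the t.power.R listing itself is not available here, the following is an independent minimal sketch of a Monte Carlo power estimate, not the author's code. The function name `mc_power`, its arguments, and the use of a one-sided Welch t test are all assumptions made for illustration. The two places mentioned in the annotations correspond to the `rgen` argument (swap in another distribution) and the `sd1`/`sd2` arguments (plug in different standard deviations).

```r
set.seed(1)

# Monte Carlo estimate of power for a one-sided two-sample t test.
# rgen must generate draws with mean 0 and variance 1.
mc_power <- function(n, delta, sd1, sd2, alpha = 0.05, nsim = 2000,
                     rgen = rnorm) {
  rejected <- replicate(nsim, {
    x <- sd1 * rgen(n)            # baseline group, mean 0
    y <- delta + sd2 * rgen(n)    # shifted group, mean delta
    # var.equal = FALSE requests the Welch test, which allows unequal variances
    t.test(y, x, alternative = "greater", var.equal = FALSE)$p.value < alpha
  })
  mean(rejected)                  # fraction of simulated tests that reject
}

# Equal variances, Normal data: should land near the 0.95 target at n = 196
p_equal <- mc_power(n = 196, delta = 0.5, sd1 = 1.5, sd2 = 1.5)

# Unequal variances: doubling one standard deviation lowers the power
p_unequal <- mc_power(n = 196, delta = 0.5, sd1 = 1.5, sd2 = 3.0)

# Non-Normal data: a centered exponential has mean 0 and variance 1
p_skewed <- mc_power(n = 196, delta = 0.5, sd1 = 1.5, sd2 = 1.5,
                     rgen = function(n) rexp(n) - 1)

print(c(equal = p_equal, unequal = p_unequal, skewed = p_skewed))
```

To turn this into a sample size calculation, one could evaluate `mc_power()` over a grid of n values and pick the smallest n whose estimated power meets the target, keeping in mind that each estimate carries simulation error that shrinks as `nsim` grows.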
