# Effect Size and Power

## Presentation transcript

Effect Size and Power

Two things mentioned previously:

- P-values are heavily influenced by sample size (n).
- Statistics Commandment #1: p-values are silent on the strength of the relationship between two variables. Effect size is what tells you about this, and we will discuss it today in more detail.

Don't forget, if you haven't already, to read Cohen's (1992) Power Primer. It's only five pages long, simply worded, and the best article in statistics you'll ever read.

Effect Size and Power

P-values are influenced heavily by n. So heavily influenced, in fact, that with enough people anything is significant.

Ex: Data with two samples, N = 10 (5 per group):
- Group 1 data: 2, 4, 6, 8, 10 (mean = 6, s = 3.16)
- Group 2 data: 3, 5, 7, 9, 11 (mean = 7, s = 3.16)
- t = -.5, p = .63, so we would fail to reject H0
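The slide's numbers can be checked with a short, stdlib-only Python sketch. The group data are reconstructed from the slide's means and standard deviations, and the t formula is the standard equal-variance (pooled) one:

```python
from statistics import mean, stdev
from math import sqrt

def independent_t(g1, g2):
    """Independent-samples t statistic (equal-variance, pooled formula)."""
    n1, n2 = len(g1), len(g2)
    # Pooled variance weights each group's variance by its degrees of freedom.
    sp2 = ((n1 - 1) * stdev(g1) ** 2 + (n2 - 1) * stdev(g2) ** 2) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of the mean difference
    return (mean(g1) - mean(g2)) / se

group1 = [2, 4, 6, 8, 10]   # mean = 6, s = 3.16
group2 = [3, 5, 7, 9, 11]   # mean = 7, s = 3.16
t = independent_t(group1, group2)  # → -0.5: nowhere near significant with N = 10
```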

Effect Size and Power

Take the same data, but replicate it 20 times over (N = 200):
- Group 1 mean still = 6, s still ≈ 3.16 (2, 4, 6, 8, 10, 2, 4, 6, 8, 10, etc.)
- Group 2 mean still = 7, s still ≈ 3.16 (3, 5, 7, 9, 11, etc.)
- But now t = -2.46, p = .02, so we would reject H0
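Replicating the same ten scores reproduces the jump in t. (The exact value below differs slightly from the slide's -2.46, since it depends on how s is rounded when recomputed on the replicated data, but the conclusion is identical.)

```python
from statistics import mean, stdev
from math import sqrt

# The same toy data as before, with each group copied 20 times (n = 100 per group).
group1 = [2, 4, 6, 8, 10] * 20
group2 = [3, 5, 7, 9, 11] * 20

n = len(group1)
se = sqrt(stdev(group1) ** 2 / n + stdev(group2) ** 2 / n)
t = (mean(group1) - mean(group2)) / se
# t ≈ -2.49: the identical 1-point mean difference is now "significant",
# purely because n grew.
```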

Effect Size and Power

As I said before, with enough n, anything is significant. Because p-values don't say anything about the size of your effect, you can have two groups that are almost identical (like in our example) and still have statistics that call the difference significant. A p-value speaks only to how unlikely your result would be if the null hypothesis were true, and hence to how likely the result is to hold up in another sample; results from big samples are stable, as we'd expect, but that stability says nothing about the size of the effect.

Effect Size and Power

Therefore, we need something to report in addition to p-values, something that is less influenced by n and can say something about the size of our IV's effect. In the previous example, we had a low p-value, but our IV had little effect, because both of our groups (with and without it) had almost the same mean score.

Jacob Cohen to the rescue! Cohen and others have been pointing out this flaw in relying exclusively on p-based statistics for decades, and psychology and medical research are only beginning to catch on; most research still reports only p-values.

Effect Size and Power

Cohen (and others) championed the use of effect size statistics, which provide us with this information and are not influenced by sample size.

Effect Size: the strength of the effect that our IV had on our DV.

There is no one formula for effect size. Depending on your data, there are many different formulas and many different statistics (see the Cohen article), but they all take the general form of the magnitude of the effect divided by the variability in the data.

Effect Size and Power

Ex. The effect size estimate for the independent-samples t-test is:

d̂ = (mean1 − mean2) / s_pooled

This looks a lot like our formula for z, and is interpreted similarly: d̂ is the number of standard deviations mean1 falls from mean2, just like z was interpreted as the number of standard deviations our score fell from the mean.
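Applied to the toy groups from the earlier slides, this formula takes only a few lines of stdlib Python:

```python
from statistics import mean, stdev
from math import sqrt

def cohens_d(g1, g2):
    """Cohen's d: the mean difference in units of the pooled standard deviation."""
    n1, n2 = len(g1), len(g2)
    s_pooled = sqrt(((n1 - 1) * stdev(g1) ** 2 + (n2 - 1) * stdev(g2) ** 2)
                    / (n1 + n2 - 2))
    return (mean(g1) - mean(g2)) / s_pooled

# A 1-point mean difference against s = 3.16:
d = cohens_d([2, 4, 6, 8, 10], [3, 5, 7, 9, 11])
# d ≈ -0.32. Unlike the p-value, d estimates the same quantity
# whatever the sample size: it does not drift toward "significance" as n grows.
```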

Effect Size and Power

Interpreting Effect Size: how do we know when our effect size is large?

1. Prior Research: if previous research investigating an educational intervention for low-income kids only increased their grades in school by .5 standard deviations, and yours does so by 1 s.d., you can say this is a large effect (twice as large, to be exact)
2. Theoretical Prediction: if we're developing a treatment for Borderline Personality Disorder, the theory behind this disorder says that it is stable across time and therefore difficult to treat, so we may only look for a medium effect size before we declare success

Effect Size and Power

Interpreting Effect Size: how do we know when our effect size is large?

3. Practical Considerations: if our treatment has the potential to benefit a lot of people inexpensively, even if it only helps a little (i.e., a small effect), this may still matter. For example, the average effect size for using aspirin to treat heart disease is small, but since aspirin is inexpensive and easily implemented, and can therefore help many people (even if only a little), this is an important finding.

Fun Fact: the GRE predicts GPA in graduate school in psychology at an effect size of only r = .15 (which is small), but it is still used because there are no better standardized tests available.

Effect Size and Power

Interpreting Effect Size: how do we know when our effect size is large?

4. Tradition/Convention: when your research is novel and exploratory in nature (i.e., there is little prior research or theory to guide your expectations), we need an alternative to the methods above. Cohen devised standard conventions for large, medium, and small effects for the various effect size statistics (see the Cohen article). However, what is large for one effect size statistic IS NOT NECESSARILY large for another. Ex. r = .5 corresponds to a large effect, but d = .5 only corresponds to a medium effect.
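Cohen's (1992) conventional cutoffs for d and r can be captured in a small lookup, which makes the mismatch between the two scales concrete:

```python
# Cohen's (1992) conventional thresholds for small, medium, and large effects.
CONVENTIONS = {
    "d": {"small": 0.2, "medium": 0.5, "large": 0.8},
    "r": {"small": 0.1, "medium": 0.3, "large": 0.5},
}

def label_effect(statistic, value):
    """Return the largest conventional label the (absolute) value reaches."""
    label = "below small"
    for name, cutoff in CONVENTIONS[statistic].items():
        if abs(value) >= cutoff:
            label = name
    return label

# The same number means different things on different scales:
# label_effect("r", 0.5) → "large", but label_effect("d", 0.5) → "medium".
```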

Effect Size and Power

Take-Home Messages:

1. Interpreting effect size statistics requires detailed knowledge about your experiment. Without any knowledge of how an effect size statistic was obtained, if someone asks "Is r = .25 a large effect?", your answer should be "It depends…"
2. When reporting effect size, you CANNOT say "My effect size was .05, and so was large", because different effect size statistics have different conventions for what counts as small or large. Even David Barlow, a world-renowned expert on the treatment of anxiety disorders, made this mistake in his book The Clinical Handbook of Psychological Disorders.

Effect Size and Power

Just as with too large a sample anything is significant, with too small a sample nothing is significant. This refers to the probability of a Type II Error (β): incorrectly failing to reject H0 when H1 is actually true (i.e., missing a real effect). How, then, do we determine a sample size that is neither too large nor too small?

Effect Size and Power

How do we maximize power?

We try to maximize power, which is the complement of a Type II Error (power = 1 − β). Type II Error = failing to reject H0 when it is false (missing a real effect); Power = correctly rejecting H0 when it is false (detecting a real effect). How do we maximize power?

1. Increase Type I Error (α): this is problematic for obvious reasons; we don't want to trade one type of error for another if we can help it

Effect Size and Power

How do we maximize power?

2. Increase Effect Size: we accomplish this by trying to make our IV as potent as possible, or by choosing a weaker control group. For example, comparing our treatment to an alternative treatment will yield a smaller effect size than comparing it to no treatment at all.
3. Increase n or decrease s: remember that in our statistical tests we are dividing by the standard error (s/√n). Decreasing s makes this number smaller, as does increasing n, and dividing by a smaller number gives us a larger value of z or t, which increases our chance of rejecting H0.
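The leverage that n and s exert on the test statistic falls straight out of the standard error, s/√n. A quick sketch with hypothetical numbers (a fixed 1-point mean difference, s chosen to match the earlier toy example):

```python
from math import sqrt

def t_like(mean_diff, s, n):
    """A z/t-style statistic: effect divided by the standard error s / sqrt(n)."""
    return mean_diff / (s / sqrt(n))

# Hold the effect fixed (mean difference = 1, s = 3.16); only n changes.
small_n = t_like(1, 3.16, 10)    # ≈ 1.00
large_n = t_like(1, 3.16, 100)   # ≈ 3.16: tenfold n, about triple the statistic

# Halving s doubles the statistic, the same payoff as quadrupling n.
low_s = t_like(1, 1.58, 10)      # ≈ 2.00
```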

Effect Size and Power

What is good power?

Statistical convention says that power = .8 is a good value that balances Type I and Type II Error. Before we conduct our experiment, i.e., a priori, we need to do what is called a Power Analysis, which tells us what sample size will give us the power we need.

You can download a program called G*Power from the internet that does these calculations for you. You type in the kind of test you're doing (remember how tests can be more or less "powerful"), your alpha, the power you want, and the effect size you expect, and it gives you the sample size you'd need. Other programs, like Power and Precision, also do this, but G*Power is free. Find it at:
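An a priori power analysis for an independent-samples t-test can also be approximated in a few lines of stdlib Python using the normal approximation n ≈ 2(z_α/2 + z_power)² / d² per group. This is a sketch, not a replacement for G*Power, which uses the exact noncentral t distribution and returns slightly larger answers:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-tailed independent-samples t-test."""
    z = NormalDist()                     # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for two-tailed alpha
    z_power = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

# Medium effect (d = .5), alpha = .05, power = .80:
n = n_per_group(0.5)  # → 63 per group (the exact t-based answer is 64)
```

Note how sharply the required n grows as the expected effect shrinks: a small effect (d = .2) needs roughly 393 per group under the same settings, while a large one (d = .8) needs about 25.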

Effect Size and Power

You can also do the calculations by hand (see the textbook). However, understanding the concepts of effect size and power is more important than knowing how to calculate them by hand, and since I don't want to overwhelm you, you won't be tested on these calculations. (Sections in the text)