Effect Size and Statistical Power Analysis in Behavioral and Educational Research Effect size 1 (P. Onghena) 09.00-10.30 a.m. Effect size 2 (W. Van den.


Effect Size and Statistical Power Analysis in Behavioral and Educational Research
Effect size 1 (P. Onghena) a.m.
Effect size 2 (W. Van den Noortgate) a.m.
Power 1 (I. Van Mechelen) p.m.
Power 2 (P. Onghena) (A-N) / (O-Z)

SIGNIFICANCE TESTING CRISIS
Carver, R. P. (1993). The case against statistical significance testing.
Cohen, J. (1994). The earth is round (p < .05).
Falk, R., & Greenbaum, C. W. (1995). Significance tests die hard: The amazing persistence of a probabilistic misconception.
Hunter, J. E. (1997). Needed: A ban on the significance test.

CHILDHOOD TRAUMATA
Furious parental conflicts
- Karl Pearson versus Ronald Fisher
- Ronald Fisher versus Jerzy Neyman (Egon Pearson); see Box (1978), Gigerenzer et al. (1990), Oakes (1986)
Morrison, D. R., & Henkel, R. E. (Eds.). (1970). The significance test controversy: A reader.

POSSIBILITY FOR GROWTH
- APA Task Force on Statistical Inference
- 1999 American Psychologist article: Wilkinson & the Task Force
- 2001 Publication Manual (5th ed.)
- Editorial boards of flagship journals: Journal of Consulting & Clinical Psychology, Journal of Counseling and Development, Exceptional Children, Journal of Learning Disabilities, ...

GUIDELINES Power and sample size. Provide information on sample size and the process that led to sample size decisions. Document the effect sizes, sampling and measurement assumptions, as well as analytic procedures used in power calculations.

Because power computations are most meaningful when done before data are collected and examined, it is important to show how effect-size estimates have been derived from previous research and theory in order to dispel suspicions that they might have been taken from data used in the study or, even worse, constructed to justify a particular sample size. Once the study is analyzed, confidence intervals replace calculated power in describing results.
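As an illustration of the a priori sample-size reasoning described in this guideline, here is a minimal sketch in Python. It is not part of the original slides. It assumes a two-sided comparison of two independent means with equal group sizes and uses the normal approximation, so it can come out slightly below the n given by exact t-based power tables. The function name n_per_group and the default alpha and power values are illustrative choices.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group needed to detect a standardized mean
    difference d (two-sided test, equal group sizes, normal approximation)."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value of the two-sided test
    z_power = z.inv_cdf(power)            # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Smaller assumed effects call for much larger samples.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d:.1f}: about {n_per_group(d)} participants per group")
```

The assumed d is the input that has to be justified from previous research and theory before the data are collected.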

GUIDELINES Hypothesis tests. It is hard to imagine a situation in which a dichotomous accept-reject decision is better than reporting an actual p value or, better still, a confidence interval. Never use the unfortunate expression "accept the null hypothesis." Always provide some effect-size estimate when reporting a p value.

GUIDELINES Effect sizes. Always present effect sizes for primary outcomes. If the units of measurement are meaningful on a practical level (e.g., number of cigarettes smoked per day), then we usually prefer an unstandardized measure (regression coefficient or mean difference) to a standardized measure (r or d). It helps to add brief comments that place these effect sizes in a practical and theoretical context.

For a simple, general purpose display of the practical meaning of an effect size, see Rosenthal and Rubin (1982). Consult Rosenthal and Rubin (1994) for information on the use of “counternull intervals” for effect sizes, as alternatives to confidence intervals.

GUIDELINES Interval estimates. Interval estimates should be given for any effect sizes involving principal outcomes. Provide intervals for correlations and other coefficients of association or variation whenever possible.
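One common way to obtain an interval for a correlation is the Fisher z transformation. The sketch below is an illustration only and does not come from the slides. It assumes a Pearson r from a bivariate normal sample, and the values r = .30 and n = 50 are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def r_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = atanh(r)                          # Fisher z transform of r
    se = 1 / sqrt(n - 3)                  # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Build the interval on the z scale, then transform back to the r scale.
    return tanh(z - crit * se), tanh(z + crit * se)


low, high = r_confidence_interval(0.30, 50)   # hypothetical r and n
print(f"r = .30, n = 50: 95% CI from {low:.2f} to {high:.2f}")
```

Because the interval is built on the z scale, it is not symmetric around r after back-transformation.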

EFFECT SIZE: IMPORTANCE
- For power analysis (Cohen, 1969)
- For meta-analysis (Glass, 1976)
- For descriptive statistics
Test of Significance = Size of Effect × Size of Study (Rosenthal, 1991)
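The relation "Test of Significance = Size of Effect × Size of Study" can be made concrete for one special case. The sketch below is an illustration, not taken from the slides. It assumes two independent groups of equal size n and a standardized mean difference whose pooled standard deviation is computed with the degrees of freedom, in which case t = effect × sqrt(n / 2). The function name is hypothetical.

```python
from math import sqrt


def t_from_effect(effect, n_per_group):
    """t statistic for two equal-sized independent groups:
    (size of effect) x (size of study), with study size sqrt(n / 2)."""
    return effect * sqrt(n_per_group / 2)


# The same effect size gives a larger t as the study grows.
for n in (5, 20, 80):
    print(f"effect = 0.50, n per group = {n}: t = {t_from_effect(0.5, n):.2f}")
```

Quadrupling the group size doubles t while the effect size itself stays the same, which is why a p value alone says little about the magnitude of an effect.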

EFFECT SIZE: WHAT THE HELL…? Cohen (1969): “By the above route, it can now readily be made clear that when the null hypothesis is false, it is false to some degree, i.e., the effect size (ES) is some specific nonzero value in the population.” (p. 10)

EFFECT SIZE: WHAT THE HELL…?
Use of the tables for significance testing
Cohen (1969): “Accordingly, we refine our ES index, d, so that its elements are sample results, rather than population parameters, and call it d_s.” (p. 64)

EFFECT SIZE: WHAT THE HELL…?

Glass (1976): uses d_s in meta-analysis but uses only the S of the control group in the denominator.
Hedges (1981), Hedges and Olkin (1985): d_s is called g (with reference to Gene Glass) → Hedges’s g
Hedges (1981), Hedges and Olkin (1985): confusion: an approximately unbiased estimator called... d!?

EFFECT SIZE: SUMMARY
COMPARISON OF TWO MEANS
- Cohen’s d: population value (if you treat the sample as your population, use the sample size in the denominator of the variance)
- Hedges’s g: sample estimator (use the degrees of freedom in the denominator of the variance)
- Hedges’s unbiased estimator is rarely used outside meta-analytic contexts
- point biserial correlation coefficient (Rosenthal, 1991)

EFFECT SIZE: EXAMPLE
         Experimental   Control
Sum      30             15
Mean     6              3
S (σ)    1 (0.894)

EFFECT SIZE: EXAMPLE
Cohen’s d = (6 - 3) / .894 = 3.35
Hedges’s g = (6 - 3) / 1 = 3
Point biserial correlation coefficient: r = .86
All kinds of transformations possible: t ↔ d ↔ g ↔ r
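The three values in this example can be reproduced from the summary statistics. The sketch below is an illustration rather than part of the slides. It assumes n = 5 per group (inferred from Sum / Mean: 30 / 6 and 15 / 3) and that S = 1 is the pooled standard deviation for both groups. All variable names are illustrative.

```python
from math import sqrt

n = 5                          # per group, inferred from Sum / Mean (assumption)
mean_exp, mean_ctrl = 6, 3
s = 1.0                        # pooled S, variance divided by degrees of freedom
sigma = s * sqrt((n - 1) / n)  # same spread with n in the denominator: 0.894

d = (mean_exp - mean_ctrl) / sigma   # Cohen's d (population formula)
g = (mean_exp - mean_ctrl) / s       # Hedges's g (sample estimator)

# Transformations: g -> t -> r for two equal-sized independent groups.
df = 2 * n - 2
t = g * sqrt(n / 2)
r = t / sqrt(t**2 + df)              # point biserial correlation

print(f"d = {d:.2f}, g = {g:.2f}, t = {t:.2f}, r = {r:.2f}")
```

With only five observations per group the choice of denominator matters a good deal (3.35 versus 3); with large samples d and g are nearly identical.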

COUNTERNULL VALUE OF AN ES
Tackle the misconceptions
- that failure to reject the null hypothesis ⇒ ES = 0
- that finding a statistically significant p value implies an ES of important magnitude
The counternull value is the nonnull magnitude of ES that is supported by exactly the same amount of evidence as is the null value of the ES.
If the counternull value were taken as H_0, then the resulting p value would be the same as the obtained p for the actual H_0.

COUNTERNULL VALUE OF AN ES
For symmetric reference distributions: ES_counternull = 2 ES_obtained - ES_null
For asymmetric reference distributions
- transform the ES so that it has a symmetric reference distribution
- calculate the counternull on the symmetric scale
- transform back to obtain the counternull on the original scale
Example of its use: RRR (2000)
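Both cases can be sketched in a few lines. This is an illustration, not taken from the slides. For the asymmetric case it assumes the ES is a correlation r and uses the Fisher z transformation as the symmetric scale. The function names and the example values are hypothetical.

```python
from math import atanh, tanh


def counternull_symmetric(es_obtained, es_null=0.0):
    """Counternull for an ES with a symmetric reference distribution."""
    return 2 * es_obtained - es_null


def counternull_r(r_obtained, r_null=0.0):
    """Counternull for a correlation: transform to Fisher's z,
    apply the symmetric rule, then transform back to r."""
    z_counternull = counternull_symmetric(atanh(r_obtained), atanh(r_null))
    return tanh(z_counternull)


print(f"d = 0.50 -> counternull d = {counternull_symmetric(0.50):.2f}")
print(f"r = 0.30 -> counternull r = {counternull_r(0.30):.2f}")
```

A nonsignificant d of 0.50 is therefore as compatible with a population d of 1.00 as with a population d of 0, which addresses the first misconception directly.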

INTERPRETING EFFECT SIZES
Cohen’s heuristic values
- small: d = 0.20, the size of the difference in height between 15- and 16-year-old girls
- medium: d = 0.50, visible to the naked eye; 14- and 18-year-old girls
- large: d = 0.80, grossly perceptible; 13- and 18-year-old girls

INTERPRETING EFFECT SIZES
Comparison with other measures
          d       r       r²
small     0.20    .10     .01
medium    0.50    .243    .059
large     0.80    .371    .138
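The r and r² values in this comparison follow from the usual conversion r = d / sqrt(d² + 4), which assumes two groups of equal size. The sketch below is an illustration and the function name is hypothetical.

```python
from math import sqrt


def d_to_r(d):
    """Convert a standardized mean difference to a point biserial r,
    assuming two groups of equal size."""
    return d / sqrt(d**2 + 4)


for label, d in (("small", 0.20), ("medium", 0.50), ("large", 0.80)):
    r = d_to_r(d)
    print(f"{label:<7} d = {d:.2f}  r = {r:.3f}  r2 = {r**2:.3f}")
```

Even a "large" d of 0.80 corresponds to less than 14% of variance explained, which is one reason r² can make effects look smaller than they are.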

BINOMIAL EFFECT SIZE DISPLAY
r = .32
                 Treatment outcome (%)
Condition        Improved   Not improved   Totals
Psychotherapy    66         34             100
Control          34         66             100
Totals           100        100            200

BINOMIAL EFFECT SIZE DISPLAY
What is the effect of implementing a certain treatment on the success rate?
Psychotherapy success rate: .50 + r/2 = .66
Control success rate: .50 - r/2 = .34
Notice: .66 - .34 = .32
“Standardized” percentages, chosen so that all margins are equal
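The display is fully determined by r, so it can be generated mechanically. The sketch below is an illustration, not part of the slides. The function name besd and the row labels are illustrative.

```python
def besd(r):
    """Success rates of the binomial effect size display for a correlation r.
    Returns (treatment success rate, control success rate)."""
    return 0.50 + r / 2, 0.50 - r / 2


treatment, control = besd(0.32)

# Print the 2 x 2 display as percentages; every margin equals 100.
print(f"{'Condition':<15}{'Improved':>10}{'Not improved':>14}")
print(f"{'Psychotherapy':<15}{treatment * 100:>10.0f}{(1 - treatment) * 100:>14.0f}")
print(f"{'Control':<15}{control * 100:>10.0f}{(1 - control) * 100:>14.0f}")
```

The difference between the two success rates always equals r, which is what makes the display a direct reading of the correlation.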

ASPIRIN’S EFFECT ON HEART ATTACK
Condition   Heart attack   No heart attack   Total
Aspirin
Placebo
Totals

ASPIRIN’S EFFECT ON HEART ATTACK: BESD
Condition   Heart attack   No heart attack   Total
Aspirin
Placebo
Totals

SMALL EFFECTS MAY BE IMPRESSIVE and vice versa (Prentice & Miller, 1992)
- consider the amount of variation in the independent variable
- consider the importance / the assumed stability of the dependent variable

WHAT EFFECT SIZE HAS PRACTICAL SIGNIFICANCE?
Assess practical significance in close relation to the particular problems, populations, and measures relevant to the treatment under investigation.
Example: community mental health study, inpatient versus outpatient therapy
Example: effects of school characteristics on reading achievement, fifth grade pupils versus sixth grade pupils