Geoff Cumming: LAM, Paris 2 (Friday 11 May, 2012) Workshop: The New Statistics in Practice In this workshop I will discuss how to use the new statistics.

1 Geoff Cumming: LAM, Paris 2 (Friday 11 May, 2012) Workshop: The New Statistics in Practice In this workshop I will discuss how to use the new statistics in practice. Choice of topics will be responsive to the interests of people attending. I could consider various measures, including correlations, proportions, and the standardized effect size Cohen’s d. I could consider a range of simple experimental designs. I will also discuss meta-analysis. ESCI will serve to illustrate many of the ideas, and calculate confidence intervals in the different situations. I will consider statistical power, but will emphasize the advantages of an alternative approach to planning experiments: Precision for planning. This approach calculates the N required for our planned experiment to be likely to give a confidence interval that is not greater than some specified target length. There will be ample time for discussion, and for considering data and situations that are of particular interest to participants.

2 2 The New Statistics in Practice Geoff Cumming Statistical Cognition Laboratory, School of Psychological Science, La Trobe University, Melbourne, Australia LAM, Paris, Talk 2 (Workshop), 11 May 2012 THANKS TO: Claudia Fritz, and: Bruce Thompson, Sue Finch, Robert Maillardet, Ben Ong, Ross Day, Mary Omodei, Jim McLennan, Sheila Crewther, David Crewther, Melanie Murphy, Cathy Faulkner, Pav Kalinowski, Jerry Lai, Debra Hansen, Mary Castellani, Mark Halloran, Kavi Jayasinghe, Mitra Jazayeri, Matthew Page, Leslie Schachte, Anna Snell, Andrew Speirs-Bridge, Eva van der Brugge, Elizabeth Silver, Jacenta Abbott, Sarah Rostron, Amy Antcliffe, Lisa L. Harlow, Dennis Doverspike, Alan Reifman, Joseph S. Rossi, Frank L. Schmidt, Meng-Jia Wu, Fiona Fidler, Neil Thomason, Claire Layman, Gideon Polya, Debra Riegert, Andrea Zekus, Mimi Williams, Lindy Cumming © G. Cumming 2012

3 3 Lucky-Noluck, RCTs for new vs old treatment Lucky (2009) found the new showed a statistically significant advantage over the old: M (difference) = 3.61, SD = 6.97, t(42) = 2.43, p = .02. Noluck (2009) found no statistically significant difference between the two: M (difference) = 2.23, SD = 7.59, t(34) = 1.25, p = .22.  Conclusion, from NHST? (Different, equivocal, or similar?)  Conclusion, from 95% CIs? Chapter 1 [Figure: Lucky (Total N = 44), Noluck (Total N = 36)]

4 4 Combination by meta-analysis (MA) of the Lucky and Noluck results. The null hypothesis of no difference was rejected, p = .008. What is your conclusion? Is the new treatment effective? Chapter 1 [Figure: Lucky (Total N = 44), Noluck (Total N = 36), MA (Total N = 80)]

5 5 Three formats, all based on the same data: 1. Null hypothesis significance testing (NHST) 2. Confidence intervals (CI) 3. Meta-analysis (MA), to combine results of two studies. The CI and MA formats indicate that Similar is the best interpretation. (A comparison of the two studies gives p = .55, so no sign of conflict between them!) Formats matter! NHST can mislead. CIs can give better understanding and conclusions. … just some more messing with your (NHST?) mind… Chapter 1 [Figure: Lucky (Total N = 44), Noluck (Total N = 36), MA (Total N = 80)]
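The two p values quoted on these slides (.008 for the combined result, .55 for the comparison of the two studies) can be approximately reproduced from the reported means and t values alone. The sketch below is an illustration, not ESCI's own calculation: it recovers each standard error as M / t, combines the studies with fixed-effect inverse-variance weights, and uses a normal (z) approximation in place of the t distribution. Those three choices are assumptions made here to keep the example within the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal, used as a large-sample approximation


def se_from_t(mean_diff, t_value):
    """Recover the standard error from a reported mean difference and t."""
    return mean_diff / t_value


def two_tailed_p(z):
    """Two-tailed p value for a z statistic."""
    return 2 * (1 - Z.cdf(abs(z)))


# Values reported on the Lucky and Noluck slide
lucky_m, lucky_se = 3.61, se_from_t(3.61, 2.43)
noluck_m, noluck_se = 2.23, se_from_t(2.23, 1.25)

# Fixed-effect meta-analysis: weight each study by 1 / SE^2
w_lucky, w_noluck = 1 / lucky_se**2, 1 / noluck_se**2
ma_mean = (w_lucky * lucky_m + w_noluck * noluck_m) / (w_lucky + w_noluck)
ma_se = sqrt(1 / (w_lucky + w_noluck))
ma_moe = Z.inv_cdf(0.975) * ma_se
p_ma = two_tailed_p(ma_mean / ma_se)

# Direct comparison of the two studies (difference between their means)
diff = lucky_m - noluck_m
diff_se = sqrt(lucky_se**2 + noluck_se**2)
p_diff = two_tailed_p(diff / diff_se)

print(f"MA: mean = {ma_mean:.2f}, "
      f"95% CI [{ma_mean - ma_moe:.2f}, {ma_mean + ma_moe:.2f}], p = {p_ma:.3f}")
print(f"Lucky vs Noluck: difference = {diff:.2f}, p = {p_diff:.2f}")
```

The same data therefore support "significant" (Lucky), "not significant" (Noluck), and "clearly significant" (MA), while the direct comparison shows no sign of conflict between the studies.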

6 6 Lucky-Noluck evidence (statistical cognition)  Authors of articles in psychology and medical journals.  Ask them to rate: “Results of the two are broadly consistent, or similar”  Ask for comments, classify these as ‘mention NHST’ or no such mention.  Respondents who saw the CI figure: Conclude:  Even if see CIs, often think in terms of NHST!  Better interpretation if avoid NHST, and think in terms of intervals! Chapter 1 Coulson, M., Healey, M., Fidler, F., & Cumming, G. (2010). Confidence intervals permit, but do not guarantee, better inference than statistical significance testing. Frontiers in Quantitative Psychology and Measurement, 1:26, 1-9. tinyurl.com/cisbetter [Figure: Lucky (Total N = 44), Noluck (Total N = 36)]

7 7 Time for a crusade?!

8 8 The Boots anti-ageing stampede!  April 2009, Queues for ‘No. 7 Protect & Perfect Intense Beauty Serum’  Media reports: “significant clinical improvement in facial wrinkles…”  J. Dermatology, online: A cosmetic ‘anti-ageing’ product improves photoaged skin: A double-blind, randomized controlled trial  “…statistically significant improvement in facial wrinkles as compared to baseline assessment (p =.013), whereas vehicle-treated skin was not significantly improved (p =.11)”  A highly critical paper in Significance, then a revised article:  “non-significant trend towards clinical improvement… (p =.10)…” Watson, R. E. B., et al. (2009). British Journal of Dermatology, 161, Chapter 2

9 9 Lucky-Noluck is everywhere! “…incorrect procedure… in which researchers conclude that effects differ when one effect is significant (p < .05) but the other is not (p > .05). We reviewed 513 … articles in Science, Nature, Nature Neuroscience, Neuron and The Journal of Neuroscience and found that 78 used the correct procedure and 79 used the incorrect procedure.” Nieuwenhuis, S., Forstmann, B. U., & Wagenmakers, E-J. (2011). Erroneous analyses of interactions in neuroscience: a problem of significance. Nature Neuroscience, 14,

10 10 Effect size Effect size, the amount of something of interest  Many ES measures are very familiar An effect size (ES) can be:  A mean, or difference between means  A percentage, or percentage change  A correlation (e.g., Pearson r)  Proportion of variance (R², η², ω² …)  A standardised measure (Cohen’s d, Hedges’ g…)  A regression slope (b or β)  Many other things… (but NOT a p value!) Chapter 2

11 11 Types of ESs  ES in original units (e.g., mean, mean difference)  Standardised measure (e.g., Cohen’s d), which can help future MA  Units-free measure (e.g., Pearson r, R²) “Effect sizes may be expressed in the original units (e.g., the mean number of questions answered correctly; kg/month for a regression slope) and are often most easily understood when reported in original units. It can often be valuable to report an effect size not only in original units but also in some standardized or units-free unit (e.g., as a Cohen’s d value) or a standardized regression weight. Multiple degree-of-freedom effect-size indicators are often less useful than effect-size indicators that decompose multiple degree-of-freedom tests into meaningful one degree-of-freedom effects.” (Publication Manual, p. 34) Chapter 2

12 12 APA Publication Manual, 6th ed: Reporting CIs From p. 117: “was also statistically significant… t(177) = 3.51, p < .001, d = 0.65, 95% CI [0.35, 0.95].” “R² = .25, ΔR² = .04, F(1, 143) = 7.63, p = .006, 95% CI [.13, .37].”  No need to repeat “95% CI” within the same paragraph, if meaning clear: “… 95% CIs [5.62, 8.31], [-2.43, 4.31], and [-4.29, -3.11], respectively.”  Don’t repeat the units when stating the CI: “M = 30.5 cm, 99% CI [18.0, 43.0]” Strangely, no other discipline seems to have a well-recognised format for CI reporting! Even medicine, which has used CIs for 30 years! My suggestion:  Always use 95%, unless excellent reasons for some other %. Chapters 1, 2

13 13 Manual: Reporting CIs in tables and figures Tables, pp : “When a table includes point estimates, for example, means, correlations, or regression slopes, it should also, where possible, include confidence intervals.” In a table use either (see examples, pp ): - a column of […, …] values, or - separate columns for the lower limit (LL), and upper limit (UL) values. Figures, pp : “Figures can be used to illustrate the results… with error bars representing precision of the… estimates”. “If your graph includes error bars, explain whether they represent standard deviations, standard errors, confidence limits, or ranges.” Sadly, the only example error bars are SE bars. Cumming, G., Fidler, F., & Vaux, D. L. (2007). Error bars in experimental biology. Journal of Cell Biology, 177, tinyurl.com/errorbars101 Chapters 3, 4

14 14 CIs and replication: Statistical cognition  Click to indicate 10 ‘plausible’ replication means  Most researchers do reasonably well  BUT they underestimate variability (most think a 95% CI captures 95% of future means)  In fact, on average, captures 83%  A CI tells us what’s likely to happen next time, or what might have been!  A CI is much more informative than a p value Cumming, G., Williams, J., & Fidler, F. (2004). Replication, and researchers’ understanding of confidence intervals and standard error bars. Understanding Statistics, 3, Chapter 5 95% CI Where will the next mean fall? An internet experiment.
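The 83% figure can be checked with a short calculation. The sketch below assumes the simplest case: a known population SD, the same N in the original study and the replication, and normally distributed sample means. Under those assumptions the gap between two independent sample means has an SD of √2 × SE, so a replication mean lands inside the original 95% CI whenever that gap is within 1.96 SE.

```python
from math import sqrt
from statistics import NormalDist


def capture_probability(confidence=0.95):
    """Chance that a replication mean falls inside the original CI.

    Assumes known sigma, equal N in both studies, and normal sampling
    distributions (simplifying assumptions for this sketch).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(0.5 + confidence / 2)  # 1.96 for a 95% CI
    # Difference of two independent means has SD = sqrt(2) * SE
    return 2 * z.cdf(z_crit / sqrt(2)) - 1


print(f"95% CI captures about {capture_probability():.0%} of replication means")
```

So a 95% CI is, on average, roughly an 83% prediction interval for the next mean, which is why intuitions of "95%" underestimate the variability.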

15 15 Some topics:  Compare two conditions—independent, or paired data  Randomised control trial (RCT)  CI on correlation, r  CI on proportion, P  Cohen’s d, and CI on d  Statistical power  Precision for planning  Meta-analysis Chapters The New Statistics: How?

16 16 CIs on ESs  All our CIs so far have been on means, and have been symmetric (upper arm = lower arm)  But we also need to have CIs for other ESs. Consider:  Proportion  Correlation (Pearson r)  Cohen’s d  …CIs on these ESs are, in general, not symmetric, and sometimes can be tricky to calculate Chapter 14

17 17 CI on Proportion, P  Proportions lie within [0, 1], for example:  Proportion of patients who, after therapy, no longer meet DSM criteria for the initial diagnosis  Proportion of responses that were errors (may be very low)  Limits at 0 and 1 mean we expect CIs to be asymmetric  Excellent approx CIs: Altman, D. G., Machin, D., Bryant, T. N., & Gardner, M. J. (2000). Statistics with confidence: Confidence intervals and statistical guidelines (2nd ed.). London: British Medical Journal Books. Finch, S., & Cumming, G. (2009). Putting research in context: Understanding confidence intervals from one or more studies. Journal of Pediatric Psychology, 34, Proportions and Diff proportions pages of ESCI Effect sizes Chapter 14

18 18 Example use of CIs on proportion, P Difference between two proportions (instead of χ²) ES = 17/20 – 11/20 = .30, [.02, .53] Diff proportions page of ESCI Effect sizes Finch, S., & Cumming, G. (2009). Putting research in context: Understanding confidence intervals from one or more studies. Journal of Pediatric Psychology, 34, Chapter 14
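One way to obtain an asymmetric interval like the one on this slide is Newcombe's hybrid score method, which builds the CI for a difference from the Wilson score interval of each proportion. This is offered as a sketch under the assumption that it is the kind of approximate method recommended in Altman et al. (2000); the function names are illustrative and the slides do not state which formula ESCI uses. For 17/20 vs 11/20 it gives limits that round to the slide's [.02, .53].

```python
from math import sqrt
from statistics import NormalDist


def wilson_ci(x, n, confidence=0.95):
    """Wilson score interval for a single proportion x / n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = x / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def diff_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Newcombe hybrid score interval for p1 - p2 (independent groups)."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_ci(x1, n1, confidence)
    l2, u2 = wilson_ci(x2, n2, confidence)
    diff = p1 - p2
    lower = diff - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper


diff, lower, upper = diff_proportions_ci(17, 20, 11, 20)
print(f"ES = {diff:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Note that the lower arm is longer than the upper arm: the interval is not symmetric about .30, as expected for proportions near the upper bound of 1.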

19 19 CI on correlation, r  Use Fisher’s r to z transformation  CIs are asymmetric, especially for r near -1 or 1  CIs are shorter when near -1 or 1  CIs may seem surprisingly wide, unless N is large r to z and Two correlations pages of ESCI chapters 14-15 Correlations and Diff correlations pages of ESCI Effect sizes Finch, S., & Cumming, G. (2009). Putting research in context: Understanding confidence intervals from one or more studies. Journal of Pediatric Psychology, 34, Chapter 14
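The Fisher r to z approach can be sketched in a few lines: transform r with atanh, build a symmetric interval on the z scale using SE = 1/√(N − 3), then transform the limits back with tanh. The values of r and N below are hypothetical, chosen only to show the asymmetry and the shortening of the CI as r approaches 1.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_ci(r, n, confidence=0.95):
    """Approximate CI for Pearson r via Fisher's r to z transformation."""
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    z_r = atanh(r)            # Fisher's z
    se = 1 / sqrt(n - 3)      # SE on the z scale
    return tanh(z_r - z_crit * se), tanh(z_r + z_crit * se)


# Hypothetical examples, all with N = 30
for r in (0.1, 0.6, 0.9):
    lo, hi = correlation_ci(r, n=30)
    print(f"r = {r:.1f}: 95% CI [{lo:.2f}, {hi:.2f}], "
          f"lower arm = {r - lo:.2f}, upper arm = {hi - r:.2f}")
```

Even with N = 30 the intervals are wide, which matches the slide's warning that CIs on r may seem surprisingly wide unless N is large.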

20 20 Cohen’s d (A standardised effect size) Cohen’s d is number of SDs by which two conditions differ (a z score) d picture page of ESCI chapters 10-13  Lots of overlap of the populations!  For d = 0.5 (a medium effect), 69% of E points higher than C mean!  Cohen chose medium = 0.5 as, roughly, a typical, noticeable amount that is of interest in behavioural and social science  Cohen’s small, medium, large: 0.2, 0.5, 0.8, but arbitrary!  Cohen’s d is the ES (in original units), divided by a suitable SD  Our sample d is a point estimate of the population δ Chapter 11
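The "69% of E points higher than C mean" statement follows directly from the normal distribution: if the Experimental population is shifted by d SDs relative to Control, the proportion of E above the C mean is Φ(d). A minimal sketch, assuming two normal populations with equal SDs:

```python
from statistics import NormalDist


def proportion_above_control_mean(d):
    """Proportion of the E population lying above the C population mean.

    Assumes both populations are normal with the same SD.
    """
    return NormalDist().cdf(d)


# Cohen's small, medium, large reference values
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: {proportion_above_control_mean(d):.0%} of E above C mean")
```

The same function applied to d = 0.68, the overall psychotherapy effect quoted later from Glass (1976), gives about 75%.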

21 21 Calculating Cohen’s d, for 2 independent groups Option 1 Use some known or assumed population SD, σ If σ = 4.0, d = 2.00/4.0 = 0.500 Option 2 Use the SD of the Control group s = 3.964, d = 2.00/3.964 = 0.505 Option 3 (most commonly used) Use the pooled within-group SD (as for the t-test) s = 4.209, d = 2.00/4.209 = 0.475 [Hedges’ g… (!)] Prefer Option 2 if one condition is a ‘base’ or ‘reference’ condition; and Option 3 if not, especially if sample sizes are small. Cumming, G., & Finch, S. (2001). A primer on the understanding, use, and calculation of confidence intervals that are based on central and noncentral distributions. Educational and Psychological Measurement, 61, 530–572. Chapter 11
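The three options differ only in the standardiser used as the denominator. The sketch below recomputes d for each option from the slide's mean difference of 2.00, and shows how a pooled within-group SD is formed from two group SDs. The group SDs and sample sizes passed to `pooled_sd` are hypothetical, since the slide gives only the pooled value.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled within-group SD for two independent groups (as in the t test)."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean_diff, standardiser):
    """Cohen's d: ES in original units divided by a chosen SD."""
    return mean_diff / standardiser


mean_diff = 2.00                          # from the slide
d_option1 = cohens_d(mean_diff, 4.0)      # assumed population SD, sigma
d_option2 = cohens_d(mean_diff, 3.964)    # SD of the Control group
d_option3 = cohens_d(mean_diff, 4.209)    # pooled within-group SD

print(f"Option 1: d = {d_option1:.3f}")
print(f"Option 2: d = {d_option2:.3f}")
print(f"Option 3: d = {d_option3:.3f}")

# Hypothetical illustration of pooling: group SDs 3.5 and 4.5, n = 12 and 15
print(f"Pooled SD = {pooled_sd(3.5, 12, 4.5, 15):.3f}")
```

Because the choice of standardiser changes d, a report of d should always say which SD was used.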

22 22 CIs for Cohen’s d, for 2 independent groups Option 1: Easy! Just find CI for the diff between means, then divide by σ Options 2 and 3: Tricky! Need noncentral t distribution! (Chapter 10 and fairytale “How the noncentral t distribution got its hump” tinyurl/noncentralt) For example, option 3: Both numerator and denominator have sampling variability, so distribution of d is weird. Noncentral t ! The rubber ruler (!): the SD as an elastic unit of measurement. d heap and CI for d pages of ESCI chapters 10-13 Or, for an excellent approximate method of calculating CIs for d : Cumming, G., & Fidler, F. (2009). Confidence intervals: Better answers to better questions. Zeitschrift für Psychologie / Journal of Psychology, 217, Chapter 11

23 23 Unbiased estimate of δ is d_unb Unfortunately d overestimates δ. To get the unbiased estimate of δ, multiply d by the adjustment factor to get d_unb. Routinely use d_unb (sometimes called Hedges’ g, but terminology a mess!) Data two and Data paired pages of ESCI chapters 5-6 Chapter 11 [Table: degrees of freedom (df), adjustment factor, percent bias of d; values not recovered in this transcript]
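The adjustment factor itself did not survive in this transcript. As a sketch of what such a factor looks like, the code below uses the standard small-sample correction from the meta-analysis literature (Hedges): an exact form based on the gamma function, and the common approximation 1 − 3/(4df − 1). Whether the slide showed the exact or the approximate form is an assumption; the df values in the loop are illustrative only.

```python
from math import exp, lgamma, sqrt


def adjustment_exact(df):
    """Exact small-sample adjustment factor (requires df > 1)."""
    return exp(lgamma(df / 2) - lgamma((df - 1) / 2)) / sqrt(df / 2)


def adjustment_approx(df):
    """Widely used approximation to the adjustment factor."""
    return 1 - 3 / (4 * df - 1)


def d_unbiased(d, df):
    """d_unb = adjustment factor * d."""
    return adjustment_exact(df) * d


# Illustrative df values: the bias of d shrinks as df grows
for df in (5, 10, 20, 50, 100):
    factor = adjustment_exact(df)
    bias_pct = (1 / factor - 1) * 100
    print(f"df = {df:3d}: factor = {factor:.4f} "
          f"(approx {adjustment_approx(df):.4f}), bias of d = {bias_pct:.1f}%")
```

For two independent groups df = N1 + N2 − 2, so the correction matters mainly for small studies.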

24 24 Power, and precision  Consider the “sensitivity”, or “informativeness” of our experiment—the power, or precision  NHST world: Statistical power  …the chance we’ll find something, if it is there  Estimation world: The MOE (half the width of a CI; the length of one arm) is a measure of precision  How large an N should we use, to get MOE no longer than XX? Chapters 12, 13

25 25 Statistical power: I’m ambivalent! A Type 1 error is rejecting H0, when it is true (Prob = α) A Type 2 error is failing to reject H0, when there is a true effect (Prob = β) Power = 1 – β = Prob(reject H0 IF H0 false) Power is the chance we’ll find an effect, if there is an effect (High power is good!) At right: Single sample, N = 18, α = .05, δ = .5, power = .52 Power picture page of ESCI chapters To calculate power, we need the noncentral t distribution, unless σ is known Cumming, G., & Finch, S. (2001). A primer on the understanding, use, and calculation of confidence intervals that are based on central and noncentral distributions. Educational and Psychological Measurement, 61, 530–572. Chapter 12

26 26 Power recommendations  APA Manual: “Take seriously the statistical power considerations associated with the tests of hypotheses. … routinely provide evidence that the study has sufficient power to detect effects of substantive interest…” (p. 30)  BUT power values are very rarely reported in psychology journals Chapter 12

27 27 Statistical power Power depends on:  N, the sample size (larger N, higher power)  An EXACT target ES, the size of effect we’re looking for (larger effect, higher power)  Therefore, to calculate power, need to state the ES. “Our experiment had power of .8 to find difference of 5.0 units on the anxiety scale.” (Use expertise in the field to choose ES.)  Or: “…to find a medium-sized effect (δ = 0.5).”  Other things, notably α, 1 or 2 tails, and the experimental design Chapter 12

28 28 Statistical power: Some values Power two and Power paired pages of ESCI chapters 10-13  Two independent groups, α = .05:  For δ = 0.5 (medium effect), power = .5 if N = 32 for each group  For power = .8, δ = 0.5, need N = 64 (!!) and N = 95 with α = .01  Scope for fudging! (Grant applications, ethics proposals…)  E.g. two independent groups, α = .05, N = 70, then:  For δ = 0.3, 0.4, 0.5 we get power = .42, .65, .84 Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New York: Academic Press. Software: G*Power tinyurl.com/gpower3 Chapter 12
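These power values can be roughly reproduced with a normal approximation, which treats the population SD as known and so avoids the noncentral t distribution. This is an approximation chosen here for simplicity, not the method ESCI or G*Power uses, and it runs slightly high for small N. The function name and the reading of N as the size of each group are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(delta, n_per_group, alpha=0.05):
    """Approximate two-tailed power for two independent groups.

    Normal (known-sigma) approximation: the test statistic is shifted by
    delta * sqrt(n / 2) when the true standardised effect is delta.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


print(f"delta 0.5, N 32: {approx_power_two_groups(0.5, 32):.2f}")
print(f"delta 0.5, N 64: {approx_power_two_groups(0.5, 64):.2f}")
print(f"delta 0.5, N 95, alpha .01: {approx_power_two_groups(0.5, 95, 0.01):.2f}")
for delta in (0.3, 0.4, 0.5):
    print(f"delta {delta}, N 70: {approx_power_two_groups(delta, 70):.2f}")
```

The last loop illustrates the "scope for fudging": with N fixed at 70, nudging the target δ from 0.3 to 0.5 roughly doubles the claimed power.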

29 29 Statistical power in psychology: Often so low!  Cohen (1962): In published psychology research, the median power to find a medium-sized effect is about .5.  Maxwell (2004): It was still about .5.  Our journals (and file drawers) are crammed with Type 2 errors: Results that are statistically nonsignificant (ns) even though there is a real effect! “One can only speculate on the number of potentially fruitful lines of investigation which have been abandoned because Type 2 errors were made…” Cohen (1962) Maxwell, S. E. (2004). The persistence of underpowered studies in psychological research: Causes, consequences, and remedies. Psychological Methods, 9, Chapter 12

30 30 Post hoc power: A bad idea!  Calculated after data are obtained. Use obtained d as target δ  Obtain mean difference of 2.6 anxiety units, maybe power = .35  If we’d found a difference of 7.2 units, post hoc power might be .83  Replicate, and see ‘dance of post hoc power’! Mad! Simulate two page of ESCI chapters 5-6  Devastatingly criticised as not telling us what we want to know (chance we’ll find an effect of a size chosen to be meaningful)  Merely reflects the outcome of our study. Tells us nothing new.  SPSS, etc, gives post hoc power in its printouts. (NAUGHTY!) Never use this value! (A cop out by the software publishers!) Hoenig, J. M., & Heisey, D. M. (2001). The abuse of power: The pervasive fallacy of power calculations for data analysis. The American Statistician, 55, Chapter 12

32 32 Precision  Power has meaning only in the context of NHST  When using estimation, the corresponding concept is precision, as indexed by MOE  Large MOE, low precision  Small MOE, high precision  APA Manual: “…use calculations based on a chosen target precision (confidence interval width) to determine sample sizes.” (p. 31) Chapter 13 MOE

33 33 Precision for planning (AIPE, accuracy in parameter estimation)  Calculate what N is required to give:  expected MOE no more than f × σ (so f is like d, a number of SDs)  OR to have a 99% chance MOE is no more than f × σ  ‘assurance’ = 99%, expressed as γ = 99 Three Precision pages of ESCI chapters 5-6  Not yet widely used, but highly recommended (No need for H0!)  For example, f = 0.4, two independent groups, need N = 50  And for γ = 99, need N = 65  Such large N, even with such large f! Chapter 13
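A first-pass version of the precision-for-planning calculation can be written with a normal approximation. For two independent groups of size N and a known population SD, the MOE of the difference is z × SD × √(2/N); requiring that to be no more than f SDs gives N ≥ 2(z/f)². This sketch ignores the t distribution and the assurance calculation, both of which push N upward, so it lands just under the slide's N = 50 and says nothing about the N = 65 figure.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_for_moe(f, confidence=0.95):
    """Smallest N per group so that MOE <= f population SDs.

    Known-sigma (z) approximation for the difference between two
    independent group means; exact t-based planning needs a slightly larger N.
    """
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(2 * (z_crit / f) ** 2)


for f in (0.8, 0.4, 0.2):
    print(f"target MOE = {f} SD: N per group is about {n_per_group_for_moe(f)}")
```

Halving the target MOE roughly quadruples the required N, the same square-root relationship that makes large samples so expensive.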

34 34 Low power, poor precision!? What can we do? Informativeness—my general term for quality, size, sensitivity To increase informativeness (also precision & power):  Choose experimental design to minimise error  Improve the measures, maybe measure twice and average  Target large effect sizes: Six therapy sessions, not two  Use large N (Phew!)—tho’ to halve SE, need to multiply N by 4!  Use Meta-analysis (combine results over experiments)  Yay! …very soon now… An essential step in research planning, worth great effort! Brainstorm! Chapter 12

35 35 Single experiments—So many problems!  Dance of the p values—so wide!  Power often so low!  CIs often so wide, precision low!  CIs report accurately the uncertainty in data.  But don’t shoot the messenger—it’s a message we need to hear The solutions:  Increase informativeness of individual studies  Combine results over studies—Meta-analysis Chapter 7

36 36 The New Statistics: How? Estimation: The six-step plan 1. Use estimation thinking. State estimation questions as: “How much…?”, “To what extent…?”, “How many…?” Key to a more quantitative discipline? 2. Identify the ESs that best answer the questions 3. From the data, calculate point and interval estimates (CIs) for those ESs 4. Make a picture, including CIs 5. Interpret 6. Use meta-analytic thinking at every stage Cumming, G., & Fidler, F. (2009). Confidence intervals: Better answers to better questions. Zeitschrift für Psychologie / Journal of Psychology, 217, Chapters 1, 2, 15

37 37 Meta-analysis: Does psychotherapy work? Gene Glass (1976), presidential address to AERA Combine 375 studies, find overall average d = 0.68 (medium+)  On average, 75% of patients, after therapy, are above the mean of untreated patients  A person initially at the mean, on average moves to the 75th percentile. Chapter 7 Hunt: The gripping MA story: How it saved social and behavioural research funding. And guides choice of medical treatments. And is our best chance for saving the world! Hunt, M. (1997). How science takes stock. The story of meta-analysis. New York: Sage

38 38 The meta-analysis picture  The forest plot  CIs make this picture possible; p values are irrelevant ESCI Meta-Analysis  First year students easily grasp the basics  Meta-analysis should appear in the intro stats course!  Effect sizes used in meta-analysis: Means, Cohen’s d, r, others… Cooper, H. M. (2009). Research synthesis and meta-analysis: A step-by-step approach (4th ed.). Thousand Oaks, CA: Sage. Cumming, G. (2006b). Meta-analysis: Pictures that explain how experimental findings can be integrated. 7th International Conference on Teaching Statistics. Brazil, July. tinyurl.com/teachma Chapter 7

39 39 Meta-analysis: Small or large SMALL  Combine 2 or 3 results, or studies LARGE e.g. Cooper’s seven steps 1. Formulate the questions, and scope of the systematic review 2. Search and obtain literature, contact researchers, find grey literature  Establish selection criteria, read and select studies 3. Code studies, enter ES estimates and coding of study features 4. Choose what to include, and design the analyses 5. Analyse the data. Prefer random effects model. 6. Interpret; draw empirical, theoretical, and applied conclusions. 7. Prepare critical discussion, present the review 8. Receive $1,000,000 and gold medal. Retire early. (Joke, alas.) Chapter 9

40 40 Health sciences: The Cochrane Collaboration  Systematic reviews: meta-analytic summaries of research  Freely available over the internet  Publicly available if your country subscribes  2,000+ reviews, aiming for 10,000+  28,000+ people in 100+ countries  Aim to update every two years (!)  Includes some psychology  Will psychology join, or should it do its own thing?  Campbell collaboration (social sciences, some psychology) Chapter 9

44 44 Models for meta-analysis Fixed effect (FE) model  Assume every study estimates the same population ES: μ  Assumes studies homogeneous: Only vary because of sampling variability Random effects (RE) model  Assumes Study i estimates μi, randomly chosen from N(μ, τ²)  Measures of heterogeneity: Q, τ², I². Study-to-study variation τ, in excess of that expected from sampling variability (cf. dance of the means)  Always (virtually always) choose random effects model  RE and FE weight studies differently, and usually give different results  If heterogeneity low, RE gives same result as FE Chapter 8
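The relationship between the two models can be seen in a compact implementation. The sketch below uses the DerSimonian-Laird estimator of the between-study variance, one common choice and an assumption here, since the slides do not say which estimator ESCI or CMA uses. Both data sets are hypothetical and exist only to show the two cases: when Q is no larger than its degrees of freedom the variance estimate is zero and RE reduces to FE.

```python
from math import sqrt


def meta_analysis(effects, variances):
    """Fixed effect and DerSimonian-Laird random effects meta-analysis."""
    w = [1 / v for v in variances]
    fe = sum(wi * y for wi, y in zip(w, effects)) / sum(w)

    # Heterogeneity: Q, tau^2 (between-study variance), I^2 (percent)
    q = sum(wi * (y - fe) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random effects weights add tau^2 to every study's variance
    w_re = [1 / (v + tau2) for v in variances]
    re = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)

    return {"FE": fe, "SE_FE": sqrt(1 / sum(w)),
            "RE": re, "SE_RE": sqrt(1 / sum(w_re)),
            "Q": q, "tau2": tau2, "I2": i2}


# Hypothetical effect sizes and sampling variances for three studies
homogeneous = meta_analysis([0.50, 0.52, 0.48], [0.04, 0.02, 0.05])
heterogeneous = meta_analysis([0.10, 0.50, 0.90], [0.01, 0.04, 0.02])

for label, result in (("homogeneous", homogeneous),
                      ("heterogeneous", heterogeneous)):
    print(label, {k: round(v, 3) for k, v in result.items()})
```

With heterogeneous studies the RE weights are more nearly equal across studies and the RE standard error is larger, so the RE confidence interval is wider than the FE one.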

45 45 But there’s more: Moderator analysis  If heterogeneity, look for moderators that may account for it  Simplest: Dichotomous moderator? (e.g., gender) Subgroups page of ESCI Meta-analysis  Identify moderator, even if no study manipulated that variable!  Meta-analysis can give empirical summaries, but also:  theoretical progress, and  research guidance. Gold!  Example: Peter Wilson, clumsy children, meta-analysis of 50 studies  Identify performance on complex visuospatial tasks as moderator  Conduct empirical study on this moderator Chapter 9

46 46 Continuous moderator? Meta-regression  Fletcher & Kerr (2010): Does RTG fade with length of relationship?  Meta-regression of ES values (RTG score) against years, 13 studies  Correlation, not causality. Alternative interpretations? Chapter 9

47 47 MA in the Publication Manual  Many mentions, esp. pp , 183.  Mainstreaming meta-analysis!  MARS (Meta-Analysis Reporting Standards)  pp  A further big advantage of the sixth edition Cooper, H. (2010). Reporting research in psychology: How to meet Journal Article Reporting Standards (APA Style). Washington, DC: APA Books. Chapter 9

48 48 CMA: Software for meta-analysis  Comprehensive Meta Analysis  Enter ES, and its variance, for each study—in 100+ formats!  CMA calculates weighted combined ES, using FE or RE model  Assess heterogeneity of studies  Explore moderators (ANOVA, or meta-regression)  Forest plot  (Another software option: RevMan, from Cochrane website) Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. New York: Wiley. Chapters 8, 9

49 49 Assessing possible publication bias  Funnel plot: graph of SE (or variance) of the ES of a study (Vertical axis, high values at top) against ES (Horiz axis).  Do small studies (near bottom) have large ES? If so, small studies obtaining small ES may be missing, not published. (In the file drawer!)  In this example: Yes! Chapter 9 Subgroups page of ESCI Meta-analysis [Figure: funnel plot; labels: ES, SE, No difference]

50 50 Meta-analytic thinking 1. Think of past literature in meta-analytic terms 2. Think of our study as the next step in that progressively cumulating meta-analysis 3. Report results so inclusion in future meta-analysis is easy  Report all effect sizes (whether ns or not), in the best way  Manual: “…be sure to include small effect sizes (or statistically nonsignificant findings)…” p. 32. Nothing in the file drawer! Cumming, G., & Finch, S. (2001). A primer on the understanding, use, and calculation of confidence intervals that are based on central and noncentral distributions. Educational and Psychological Measurement, 61, 530–572. Chapters 1, 7, 9

51 51 The New Statistics: Actually doing it! The editor says to remove CIs and just give p values! What do we DO?!  Be strong! The reasons for TNS are compelling, and TNS is the way of the future. It’s worth persisting!  Manual: “Wherever possible, base discussion and interpretation of results on point and interval estimates” (p. 34). That’s a great imprimatur!  The evidence should decide: Cite statistical cognition research.  Numerous scholars have criticised NHST, hardly anyone has replied!  I add in p values if I must, but I won’t remove CIs (or ESs). Chapter 15

52 52 The New Statistics: How? Estimation: The six-step plan 1. Use estimation thinking. State estimation questions as: “How much…?”, “To what extent…?”, “How many…?” Key to a more quantitative discipline? 2. Identify the ESs that best answer the questions 3. From the data, calculate point and interval estimates (CIs) for those ESs 4. Make a picture, including CIs 5. Interpret 6. Use meta-analytic thinking at every stage Cumming, G., & Fidler, F. (2009). Confidence intervals: Better answers to better questions. Zeitschrift für Psychologie / Journal of Psychology, 217, Chapters 1, 2, 15

53 53 Queries or comments to: Geoff’s brief radio talk: tinyurl.com/geofftalk Geoff’s short magazine article: tiny.cc/GeoffConversation Preface, contents & sample chapter: tinyurl.com/tnschapter7 Dance of the p values: tinyurl.com/danceptrial2 Book info, and ESCI: Hug a confidence interval today!

