Lecture 3: Null Hypothesis Significance Testing Continued Laura McAvinue School of Psychology Trinity College Dublin.

Presentation transcript:


Previous Lectures
Inferential Statistics
– Sample → Population
Null Hypothesis Significance Testing
– Proceeds in a series of steps
– Allows us to assess the statistical significance of our results
– To reject or fail to reject H0 on the basis of the p value

Previous Lectures
Misleading nature of statistical significance
– Results can be labelled as
  ‘Statistically significant’
  ‘Not statistically significant’
– People interpret results in a cut and dried fashion
  ‘Statistically significant result means there is a true effect in the population’
  ‘Non-significant result means there is no true effect’

Previous Lectures
NHST is not so straightforward
Statistical significance is affected by
– One or two tailed test
– Significance level / α / probability of Type I error
– Power (1 − β) / probability of Type II error (β)
– Sample size
These factors must be considered
– Research evaluation
– Research planning

Research Evaluation
A result is statistically significant
– Implies a true effect exists in the population
– But is this effect clinically significant?
  How big is the effect?
  Real world relevance?
Recall that a large enough sample size will make even a small effect statistically significant

Research Evaluation
A result is not statistically significant
– Implies a true effect does not exist in the population
– Power
  Did the study have enough power to identify an effect as statistically significant if a true effect existed?

Research Planning
Power
– Require enough power to obtain statistically significant results if a true effect exists
Sample Size
– Obtain an adequate sample size

Effect Size
NHST
– Enables us to say whether or not a true effect exists in the population
Effect Size
– Provides an estimate of the size of this true effect
– A measure of the degree to which H0 is false
– A measure of the discrepancy between H0 and H1

[Figure: distributions under H0 and H1 with means μ0 and μ1. Small ES: μ0 − μ1 is small. Large ES: μ0 − μ1 is large.]

Effect Size
There is a different effect size measure for each statistical test
The difference between two independent group means
– Cohen’s d
– $d = (\mu_1 - \mu_0) / \sigma$
– Standardised difference
– Express the difference between the means in terms of the standard deviation

Effect Size
To calculate Cohen’s d for a study in which you compared two groups
$d = (\text{Mean}_{\text{treat}} - \text{Mean}_{\text{control}}) / \text{SD}_{\text{control}}$
For example, I compared the effects of an exercise regime and a control regime on physical fitness (rated /20) in two groups and obtained the following results…

Effect Size
Mean rating in exercise group was 17 (SD = 10)
Mean rating in control group was 11 (SD = 10)
Cohen’s d was $(17 - 11) / 10 = .6$
The exercise group had a mean rating .6 SDs higher than the control group
You can use Cohen’s d to compare studies that have used different measures
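The calculation on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `cohens_d` and its argument names are assumptions made for this example, and it follows the slide's convention of dividing by the control group's SD.

```python
def cohens_d(mean_treat, mean_control, sd_control):
    """Standardised mean difference, using the control group's SD as the unit.

    Hypothetical helper for illustration; the names are not from the lecture.
    """
    if sd_control <= 0:
        raise ValueError("sd_control must be positive")
    return (mean_treat - mean_control) / sd_control


# Exercise example from the slide: means of 17 and 11, SD of 10
d_exercise = cohens_d(17, 11, 10)
print(f"Cohen's d = {d_exercise:.1f}")
```

Because d is expressed in SD units, the same function can be applied to studies that used different rating scales, which is what makes the comparison across studies possible.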

Comparing Studies
Four studies examined the effect of cognitive behavioural therapy on self-esteem, but each study used a different scale to assess self-esteem.
Calculate the effect size for each of the following studies. Which study found the greatest effect?
Study | Treatment Group Mean | Control Group Mean | Mean Difference | SD | d
A B C12939 D


What is a big Effect Size?
Cohen’s (1992) rules of thumb
For independent t-tests comparing two means…
 | Small | Medium | Large
Cohen’s d | .2 | .5 | .8
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1).

Research Evaluation
A statistically significant result
– Is it clinically significant?
– Real world relevance?
– Effect Size
A non-significant result
– No true effect?
– Lack of power?

Calculating Power
Recall that power is determined by a number of factors
To calculate the power of an experiment you need to know
– One or two-tailed test
– Significance level α
– Sample size
– Effect size
You calculate the power of an experiment to identify a certain effect size as statistically significant, using a one/two-tailed test with a certain α level and a certain sample size

Example: The effects of therapy on depression
 | Analysis 1 | Analysis 2
Size of sample | 20 | 200
Therapy mean score | 5.5
Therapy standard deviation |
Control mean score | 6.3
Control standard deviation |
Mean difference | -.8
T statistic |
Df | 18 | 198
P-value |

 | Study 1 | Study 2
Test | Independent samples T-test
One or two-tailed | Two-tailed
Significance Level | .05
Size of each group | 10 | 100
Effect Size | 5.5 – –
Power | % chance of finding an ES of .3 as statistically significant at p < .05 using two-tailed test | 56% chance of finding an ES of .3 as statistically significant at p < .05 using two-tailed test
The difference in power for these two studies was due to sample size
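To see how sample size alone changes power, the comparison above can be approximated in Python. This is a rough sketch under stated assumptions: it uses a normal approximation to the two-group t-test rather than the exact noncentral t calculation a package such as gpower3 would perform, and the function name `approx_power_two_groups` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(d, n_per_group, alpha=0.05, two_tailed=True):
    """Approximate power of an independent-samples comparison of two means.

    Normal approximation (an assumption of this sketch, not the exact
    t-based calculation): the test statistic is treated as normal with
    mean d * sqrt(n / 2) when the true effect size is d.
    """
    std_normal = NormalDist()
    delta = d * sqrt(n_per_group / 2)  # expected value of the test statistic
    if two_tailed:
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        # Probability of landing in either rejection region
        return std_normal.cdf(delta - z_crit) + std_normal.cdf(-delta - z_crit)
    z_crit = std_normal.inv_cdf(1 - alpha)
    return std_normal.cdf(delta - z_crit)


# Same effect size (.3) and alpha (.05, two-tailed), different group sizes
power_n10 = approx_power_two_groups(0.3, 10)
power_n100 = approx_power_two_groups(0.3, 100)
print(f"n = 10 per group:  power is about {power_n10:.2f}")
print(f"n = 100 per group: power is about {power_n100:.2f}")
```

With 100 people per group the approximation gives a power of roughly .56, in line with the 56% reported for the larger study; with 10 per group the power is far lower.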

Power
Computer programmes can calculate power
– Free download of gpower3 package
Research planning
– Rather than computing power post hoc, it is best to plan to have adequate power to obtain statistically significant results if H0 is false and a true effect exists
– Convention
  Aim for power of .8
  80% chance of obtaining significant results if H0 is false
  .2 probability of Type II error
  1 : 4 ratio of Type I (.05) to Type II (.2) errors

Power & Sample Size
Main avenue for increasing power
– Increase sample size
Common question
– How big a sample do I need?
Answer depends on
– The power you want to have
– Significance level you set
– Effect size you expect to obtain
– Statistical test you are running
– One or two tailed prediction

Power & Sample Size
The Real Question
– “What sample size do I need to have power of ____ to detect an ES of ____ as being statistically significant at ____ level, when doing a ____ statistical test and making a ____-tailed prediction?”
Most of the gaps are easy to complete
– Power = .8
– α = .05
– Test = depends on experimental design
– Prediction = depends on theory
– ES = ? Need to estimate effect size

Estimate Effect Size
Pilot Study
– Do analysis on small group to give idea of results
Previous Research
– Calculate ES in previously published studies
Theory
– Based on theory or understanding of research area, estimate the ES or the smallest ES that would be of interest
Cohen’s Standards
– Would you like to detect a small, medium or large effect?
– Difference between two groups: Small (.2), Medium (.5), Large (.8)

Power & Sample Size
Once you have decided on the following
– Statistical test, prediction, Power, α and ES
You can calculate necessary sample size in two ways
– Computer package, such as gpower3
– Cohen’s tables
Let’s try an example
– Turn to the handout showing Cohen’s table of required sample size (note that this table refers to two-tailed predictions)

Calculating Required Sample Size
I would like to investigate the difference between clinically anxious and normal people in relation to performance on an attention task
“How many people do I need in each group to have power of .8 to detect a large ES as being statistically significant at .05 level, when doing an independent samples t-test and making a two-tailed prediction?”

Cohen’s Table
N for Small, Medium, and Large ES at power = 0.80 for α = .01, .05 and .10
We need 26 people in each group to have a power of 0.80 to detect a large ES as statistically significant at the 0.05 level
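As a cross-check on the table lookup, the required group size can also be estimated in code. The sketch below is an approximation under stated assumptions: it uses the standard normal-approximation formula n = 2((z_alpha + z_power) / d)^2 rather than Cohen's exact t-based values, and `approx_n_per_group` is a hypothetical name chosen for this example.

```python
from math import ceil
from statistics import NormalDist


def approx_n_per_group(d, power=0.80, alpha=0.05, two_tailed=True):
    """Approximate people needed per group to detect effect size d.

    Normal-approximation formula (an assumption of this sketch); exact
    t-based tables give slightly larger values for small samples.
    """
    std_normal = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = std_normal.inv_cdf(1 - tail_alpha)  # critical value for the test
    z_power = std_normal.inv_cdf(power)           # quantile for desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Large effect (.8), power .8, alpha .05, two-tailed
n_large = approx_n_per_group(0.8)
print(f"Approximate n per group: {n_large}")
```

For a large ES this approximation gives 25 per group, one fewer than the 26 in Cohen's table, because the normal approximation ignores the extra uncertainty of the t distribution. Passing a smaller `alpha` or a smaller `d` to the same function shows the required sample size growing.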

Some more practice!
– For a two group independent t-test, how many people do I need in each group to detect…
  Large ES as statistically significant at .10 level _________
  Large ES as statistically significant at .05 level _________
  Large ES as statistically significant at .01 level _________
  Medium ES as statistically significant at .01 level _________
  Small ES as statistically significant at .01 level _________
– The smaller the alpha level, the _______________ the sample size required to detect a given difference as being statistically significant
– The smaller the ES, the _______________ the sample size required to detect a given difference as being statistically significant

Summary
Factors affecting Statistical Significance
Research Evaluation
– Effect size
– Power Calculations
Research Planning
– Sample Size Calculations