Lecture 2: Null Hypothesis Significance Testing Continued
Laura McAvinue, School of Psychology, Trinity College Dublin

Null Hypothesis Significance Testing
Previous lecture: the steps of NHST
– Specify the alternative/research hypothesis
– Set up the null hypothesis
– Collect data
– Run the appropriate statistical test
– Obtain the test statistic and its associated p value
– Decide whether to reject or fail to reject the null hypothesis on the basis of the p value

Null Hypothesis Significance Testing
Decision to reject or fail to reject H0
– P value: the probability of obtaining the observed results if H0 is true
– By convention, we use the significance level of p < .05
– If p < .05, we conclude that it is highly unlikely that we would obtain these results by chance, so we reject H0
– Caveat! The existence of a significance level does not mean that there is a simple ‘yes’ or ‘no’ answer to your research question

Null Hypothesis Significance Testing
If you obtain results that are not statistically significant (p > .05), this does not necessarily mean that the relationship you are interested in does not exist.
Several factors affect whether your results come out as statistically significant:
– One and two-tailed tests
– Type I and Type II errors
– Power

One and Two-tailed Tests
One-tailed / directional test
– Run this when you have a prediction about the direction of the results
Two-tailed / non-directional test
– Run this when you don’t have a prediction about the direction of the results

Recall the previous example…
Research question
– Do anxiety levels of students differ from anxiety levels of young people in general?
Prediction
– Due to the pressure of exams and essays, students are more stressed than young people in general
Method
– You know that the mean score of the general young population on the anxiety measure is 50
– You predict that your sample will have a mean > 50
– Run a one-tailed one-sample t test at the p < .05 level

One-tailed Test
Compare the mean of your sample to the sampling distribution of the mean under H0 (population mean = 50).
Decide to reject H0 if your sample mean falls into the highest 5% of the sampling distribution.
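As a rough sketch of this step in software, the one-tailed one-sample t test could be run as below. The anxiety scores are simulated purely for illustration (they are not the lecture's data); only the population mean of 50 comes from the example above.

```python
# Hedged sketch: one-tailed one-sample t test of student anxiety against
# the known population mean of 50. The sample is simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
anxiety = rng.normal(loc=54, scale=10, size=25)  # assumed sample of 25 students

# alternative="greater" makes this a one-tailed (directional) test
t_stat, p_value = stats.ttest_1samp(anxiety, popmean=50, alternative="greater")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")
# Reject H0 at the .05 level only if the sample mean lands in the top 5%
# of the sampling distribution, i.e. only if p < .05.
```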

Dilemma
But what if your prediction is wrong?
– Perhaps students are less stressed than the general young population (their own bosses, summers off, no mortgages)
– With the previous one-tailed test, you could only reject H0 if you got an extremely high sample mean
– What if you get an extremely low sample mean?
Run a two-tailed test
– Hedge your bets
– Reject H0 if you obtain a sample mean at either extreme of the distribution, very high or very low

Two-tailed Test
You will reject H0 when the sample mean falls in the highest 2.5% or the lowest 2.5% of the distribution.
Note that it’s not the highest 5% and the lowest 5%, as then you’d be operating at the p = .1 level, rejecting H0 for 10% of the distribution.
So we gain the ability to reject H0 for extreme values at either end, but the values must be more extreme.
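To see why two-tailed values must be more extreme, compare the critical t values for the two kinds of test. The degrees of freedom of 24 are an assumption, matching the illustrative sample of 25 above.

```python
from scipy import stats

df = 24  # assumed degrees of freedom (n = 25 in the illustrative sample)
print(stats.t.ppf(0.95, df))             # one-tailed cut-off: top 5% only (about 1.71)
print(stats.t.ppf([0.025, 0.975], df))   # two-tailed cut-offs: 2.5% in each tail (about -2.06 and +2.06)
```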

Errors in NHST
Howell (2008, p. 157): “Whenever we reach a decision with a statistical test, there is always a chance that our decision is the wrong one”
The misleading nature of NHST
– Because there is a significance level (p = .05), people interpret NHST as a definitive exercise
– Results are statistically significant or not
– We reject H0 or we don’t
– H0 is wrong or right

Errors in NHST
Remember we are dealing with probabilities
– We make our decision on the basis of the likelihood of obtaining the results if H0 is true
– There is always the chance that we are making an error
Two kinds of error
– We reject H0 when it is true (Type I error): we say there’s a significant difference when there’s not
– We accept H0 when it is false (Type II error): we say there is no significant difference when there is

Type I Error
Our anxiety example
– Predict that students will have a greater anxiety score than young people in general
– Test the H0 that students’ anxiety levels do not differ from those of young people
– One-tailed one-sample t-test at p < .05
– Compare the sample mean with the sampling distribution of the mean under H0

Type I Error
Decide to reject H0 if your sample mean falls in the top 5% of the distribution.
But! This 5%, even though at the extreme end, still belongs to the distribution.
If your sample mean falls within this top 5%, there is still a chance that your sample came from the H0 population.

Type I Error
For example, if p = .04, there is a very small chance that your sample mean came from that population
– But it is still a chance: you could be rejecting H0 when it is in fact true
Researchers are willing to accept this small risk (5%) of making a Type I error, of rejecting H0 when it is in fact true.
The probability of making a Type I error = alpha (α) = the significance level that you chose (.05, .01)
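A quick way to see that α really is the Type I error rate is to simulate many studies in which H0 is true: roughly 5% of them still come out “significant”. This is a sketch with assumed population values (mean 50, SD 10), not the lecture's data.

```python
# Sketch: when H0 is true, roughly alpha (5%) of tests reject it anyway.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n_sims, false_rejections = 0.05, 10_000, 0
for _ in range(n_sims):
    sample = rng.normal(loc=50, scale=10, size=25)  # H0 is true: the mean really is 50
    _, p = stats.ttest_1samp(sample, popmean=50, alternative="greater")
    if p < alpha:
        false_rejections += 1

print(false_rejections / n_sims)  # close to 0.05 = the Type I error rate
```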

Type II Error
So why not set a very low significance level to minimise your risk of making a Type I error?
– Set p < .01 rather than p < .05
As you decrease the probability of making a Type I error, you increase the probability of making a Type II error.
Type II error
– Fail to reject H0 when it is false
– Fail to detect a significant relationship in your data when a true relationship exists

For argument’s sake, imagine that H1 is correct.
[Figure: the sampling distribution under H0 and the sampling distribution under H1, with the critical value marked]
– Reject H0 if the sample mean falls to the right of the critical value (red region): correct decision
– Accept H0 if the sample mean falls to the left of the critical value: Type II error
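For a simple z test, the Type II error rate can be worked out directly from the two sampling distributions, as in the hedged sketch below. The numbers (μ0 = 50, μ1 = 55, σ = 10, n = 25) are illustrative assumptions, not values from the lecture.

```python
# Sketch: Type II error (beta) for a one-tailed z test with assumed values.
import math
from scipy import stats

mu0, mu1, sigma, n, alpha = 50, 55, 10, 25, 0.05
se = sigma / math.sqrt(n)                        # standard error of the mean
crit = mu0 + stats.norm.ppf(1 - alpha) * se      # critical sample mean under H0
beta = stats.norm.cdf(crit, loc=mu1, scale=se)   # P(fall left of the cut-off | H1 true)
print(f"beta = {beta:.2f}, power = {1 - beta:.2f}")
```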

Four Outcomes of Decision Making

                       True State of Nature
Decision               H0 is True            H0 is False
Accept H0              Correct decision      Type II error
Reject H0              Type I error          Correct decision

Power
You should minimise both Type I and Type II errors
– In reality, people are often very careful about Type I error (i.e. strict about α) but ignore Type II error altogether
If you ignore Type II error, your experiment could be doomed before it begins
– Even if a true effect exists (i.e. H1 is correct), if β is high, the results may not show a statistically significant effect
How do you reduce the probability of a Type II error?
– Increase the power of the experiment

Power
– The probability of correctly rejecting a false H0
– A measure of the ability of your experiment to detect a significant effect when one truly exists
– Power = 1 - β

How do we increase the power of our experiment?
Factors affecting power (illustrated in the sketch below)
– The significance level (α)
– One-tailed v two-tailed test
– The true difference between H0 and H1 (μ0 - μ1)
– Sample size (n)
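A hedged sketch of how each of these factors moves power, using statsmodels' power calculator for an independent-samples t test. The effect size of d = 0.5 and the group size of 30 are assumed, illustrative values.

```python
# Sketch: how alpha, the number of tails, the true effect size and n each change power.
from statsmodels.stats.power import TTestIndPower

calc = TTestIndPower()

print(calc.power(effect_size=0.5, nobs1=30, alpha=0.05))   # baseline, two-tailed
print(calc.power(effect_size=0.5, nobs1=30, alpha=0.01))   # stricter alpha -> less power
print(calc.power(effect_size=0.5, nobs1=30, alpha=0.05,
                 alternative="larger"))                    # one-tailed -> more power
print(calc.power(effect_size=0.8, nobs1=30, alpha=0.05))   # bigger true difference -> more power
print(calc.power(effect_size=0.5, nobs1=100, alpha=0.05))  # bigger sample -> more power
```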

The Influence of  on Power Reduce the significance level (  )… –Reduce the probability of making a Type I error Rejecting the H o when it is true –Increase the probability of making a Type II error Accepting the H o when it is false –Reduce the power of the experiment to detect a true effect as statistically significant

Reduce  and reduce power

Increase  and increase power But! You increase the probability of a Type I error!

The Influence of One v Two-tailed Tests on Power
We lose power with a two-tailed test
– The significance level (α) is split across the two tails of the distribution
– Values must be more extreme to be statistically significant

The Influence of the True Difference between H0 and H1
The bigger the difference between μ0 and μ1, the easier it is to detect.

The Influence of Sample Size on Power
The bigger the sample size, the more power you have.
– A big sample provides a better estimate of the population mean
– With bigger sample sizes, the sampling distribution of the mean clusters more tightly around the population mean
– The standard deviation of the sampling distribution, known as the standard error of the mean, is reduced
– There is less overlap between the sampling distributions under H0 and H1
– The power to detect a significant difference increases
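This is the mechanism in miniature: the standard error of the mean is σ/√n, so it shrinks as n grows. The σ = 10 below is an assumed value for illustration.

```python
# Sketch: the standard error of the mean shrinks as sample size grows,
# so the sampling distributions under H0 and H1 overlap less and less.
import math

sigma = 10  # assumed population standard deviation
for n in (10, 50, 200, 1000):
    print(n, round(sigma / math.sqrt(n), 2))
```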

The Influence of Sample Size on Power

Sample Size Exercise
Open the following dataset
– Software / Kevin Thomas / Power dataset (revised)
– It explores the effects of therapy on depression
Perform two independent-samples t-tests
– Analyse / Compare means / Independent Samples t test
– Group represents Therapy v Control
– Score represents post-treatment depression
– 1. Group1 & Score1
– 2. Group2 & Score2

Complete the following table

                               Analysis 1    Analysis 2
Size of sample
Therapy mean score
Therapy standard deviation
Control mean score
Control standard deviation
Mean difference
t statistic
df
p-value

What explains these results?

                               Analysis 1    Analysis 2
Size of sample                 20            200
Therapy mean score                     5.5
Therapy standard deviation
Control mean score                     6.3
Control standard deviation
Mean difference                        -.8
t statistic
df                             18            198
p-value
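The pattern can be reproduced in outline with scipy's summary-statistics t test. The group means (5.5 and 6.3) and sample sizes come from the table above; the common standard deviation of 1.5 is an assumption, since the exercise's actual SDs are not shown here.

```python
# Sketch: the same mean difference of -0.8 is non-significant with 10 per
# group but clearly significant with 100 per group. SD = 1.5 is assumed.
from scipy.stats import ttest_ind_from_stats

for n_per_group in (10, 100):  # total n = 20 and 200
    t, p = ttest_ind_from_stats(mean1=5.5, std1=1.5, nobs1=n_per_group,
                                mean2=6.3, std2=1.5, nobs2=n_per_group)
    print(f"n per group = {n_per_group}: t = {t:.2f}, p = {p:.4f}")
```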

So, how do I increase the power of my study?
– You can’t manipulate the true difference between H0 and H1
– You could increase your significance level (α), but then you would increase the risk of a Type I error
– If you have a strong prediction about the direction of the results, you should run a one-tailed test
– The factor that is most under your control is sample size: increase it!
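A common planning step, sketched below, is to turn this around and ask how many participants are needed for a given level of power. The assumed inputs (Cohen's d = 0.5, two-tailed α = .05, 80% power) are illustrative, not values from the lecture.

```python
# Sketch: solve for the per-group sample size that gives 80% power.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                          power=0.80, alternative="two-sided")
print(round(n_per_group))  # roughly 64 participants per group
```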