Presentation on theme: "Chapter 8 Introduction to Hypothesis Testing"— Presentation transcript:
1 Chapter 8 Introduction to Hypothesis Testing PowerPoint Lecture Slides Essentials of Statistics for the Behavioral Sciences Eighth Edition by Frederick J. Gravetter and Larry B. Wallnau
2 Chapter 8 Learning Outcomes 1Understand logic of hypothesis testing2State hypotheses and locate critical region(s)3Conduct z-test and make decision4Define and differentiate Type I and Type II errors5Understand effect size and compute Cohen’s d6Make directional hypotheses and conduct one-tailed test
3 Tools You Will Need z-Scores (Chapter 5) Distribution of sample means (Chapter 7)Expected valueStandard errorProbability and sample means
4 8.1 Hypothesis Testing Logic Hypothesis testing is one of the most commonly used inferential proceduresDefinition: a statistical method that uses sample data to evaluate the validity of a hypothesis about a population parameter
5 Logic of Hypothesis Test State hypothesis about a populationPredict the expected characteristics of the sample based on the hypothesisObtain a random sample from the populationCompare the obtained sample data with the prediction made from the hypothesisIf consistent, hypothesis is reasonableIf discrepant, hypothesis is rejected
6 Figure 8.1 Basic Experimental Design FIGURE 8.1 The basic experimental situation for hypothesis testing. It is assumed that the parameter μ is known for the population before treatment. The purpose of the experiment is to determine whether the treatment has an effect on the population mean.
7 Figure 8.2 Unknown Population in Basic Experimental Design FIGURE 8.2 From the point of view of the hypothesis test, the entire population receives the treatment and then a sample is selected from the treated population. In the actual research study, a sample is selected from the original population and the treatment is administered to the sample. From either perspective, the result is a treated sample that represents the treated population.
8 Four Steps in Hypothesis Testing Step 1: State the hypotheses Step 2: Set the criteria for a decision Step 3: Collect data; compute sample statistics Step 4: Make a decision
9 Step 1: State Hypotheses Null hypothesis (H0) states that, in the general population, there is no change, no difference, or is no relationshipAlternative hypothesis (H1) states that there is a change, a difference, or there is a relationship in the general population
10 Step 2: Set the Decision Criterion Distribution of sample outcomes is dividedThose likely if H0 is trueThose “very unlikely” if H0 is trueAlpha level, or significance level, is a probability value used to define “very unlikely” outcomesCritical region(s) consist of the extreme sample outcomes that are “very unlikely”Boundaries of critical region(s) are determined by the probability set by the alpha level
11 Figure 8.3 Note “Unlikely” Parts of Distribution of Sample Means FIGURE 8.3 The set of potential samples is divided into those that are likely to be obtained and those that are very unlikely to be obtained if the null hypothesis is true.
12 Figure 8.4 Critical region(s) for α = .05 FIGURE 8.4 The critical region (very unlikely outcomes) for alpha = .05
13 Learning CheckA sports coach is investigating the impact of a new training method. In words, what would the null hypothesis say?AThe new training program produces different results from the existing oneBThe new training program produces results about like the existing oneCThe new training program produces better results than the existing oneDThere is no way to predict the results of the new training program
14 Learning Check - Answer A sports coach is investigating the impact of a new training method. In words, what would the null hypothesis say?AThe new training program produces different results from the existing oneBThe new training program produces results about like the existing oneCThe new training program produces better results than the existing oneDThere is no way to predict the results of the new training program
15 Learning CheckDecide if each of the following statements is True or False.T/FIf the alpha level is decreased, the size of the critical region decreasesThe critical region defines unlikely values if the null hypothesis is true
16 Learning Check - Answers TrueAlpha is the proportion of the area in the critical region(s)This is the definition of “unlikely”
17 Step 3: Collect Data (and…) Data always collected after hypotheses statedData always collected after establishing decision criteriaThis sequence assures objectivity
18 Step 3: (continued)… Compute Sample Statistics Compute a sample statistic (z-score) to show the exact position of the sampleIn words, z is the difference between the observed sample mean and the hypothesized population mean divided by the standard error of the mean
19 Step 4: Make a decisionIf sample statistic (z) is located in the critical region, the null hypothesis is rejectedIf the sample statistic (z) is not located in the critical region, the researcher fails to reject the null hypothesis
20 Jury Trial: Hypothesis Testing Analogy Trial begins with the null hypothesis “not guilty” (defendant’s innocent plea)Police and prosecutor gather evidence (data) relevant to the validity of the innocent pleaWith sufficient evidence against, jury rejects null hypothesis innocence claim to conclude “guilty”With insufficient evidence against, jury fails to convict, i.e., fails to reject the “not guilty” claim (but does not conclude defendant is innocent)
21 Learning CheckDecide if each of the following statements is True or False.T/FWhen the z-score is quite extreme, it shows the null hypothesis is trueA decision to retain the null hypothesis means you proved that the treatment has no effect
22 Learning Check - Answer FalseAn extreme z-score is in the critical region—very unlikely if H0 is trueFailing to reject H0 does not prove it true; there is just not enough evidence to reject it
23 8.2 Uncertainty and Errors in Hypothesis Testing Hypothesis testing is an inferential processUses limited information from a sample to make a statistical decision, and then from it a general conclusionSample data used to make the statistical decision allows us to make an inference and draw a conclusion about a populationErrors are possible
24 Type I ErrorsResearcher rejects a null hypothesis that is actually trueResearcher concludes that a treatment has an effect when it has noneAlpha level is the probability that a test will lead to a Type I error
25 Type II ErrorsResearcher fails to reject a null hypothesis that is really falseResearcher has failed to detect a real treatment effectType II error probability is not easily identified
26 Table 8.1 Type I error (α) Type II error (β) Actual Situation No Effect =H0 TrueEffect Exists =H0 FalseResearcher’s DecisionReject H0Type I error (α)Decision correctFail to reject H0Type II error (β)
27 Figure 8.5 Location of Critical Region Boundaries FIGURE 8.5 The locations of the critical region boundaries for three different levels of significance: α = .05, α = .01, and α = .001.
28 Learning CheckDecide if each of the following statements is True or False.T/FA Type I error is like convicting an innocent person in a jury trialA Type II error is like convicting a guilty person in a jury trial
29 Learning Check - Answer TrueInnocence is the “null hypothesis” for a jury trial; conviction is like rejecting that hypothesisFalseConvicting a guilty person is not an error; but acquitting a guilty person would be like Type II error
30 8.3 Hypothesis Testing Summary Step 1: State hypotheses and select alpha levelStep 2: Locate the critical regionStep 3: Collect data; compute the test statisticStep 4: Make a probability-based decision about H0: Reject H0 if the test statistic is unlikely when H0 is true—called a “significant” or “statistically significant” result
31 In the LiteratureA result is significant or statistically significant if it is very unlikely to occur when the null hypothesis is true; conclusion: reject H0In APA formatReport that you found a significant effectReport value of test statisticReport the p-value of your test statistic
32 Figure 8.6 Critical Region for Standard Test FIGURE 8.6 Sample means that fall in the critical region (shaded areas) have a probability less than alpha (p < α). In this case, H0 should be rejected. Sample means that do fall in the critical region have a probability greater than alpha (p > α).
33 8.3 Assumptions for Hypothesis Tests with z-Scores Random samplingIndependent ObservationValue of σ is not changed by the treatmentNormally distributed sampling distribution
34 Factors that Influence the Outcome of a Hypothesis Test Size of difference between sample mean and original population meanLarger discrepancies larger z-scoresVariability of the scoresMore variability larger standard errorNumber of scores in the sampleLarger n smaller standard error
35 Learning Check σ = 5 and n = 25 σ = 5 and n = 50 σ = 10 and n = 25 A researcher uses a hypothesis test to evaluate H0: µ = 80. Which combination of factors is most likely to result in rejecting the null hypothesis?Aσ = 5 and n = 25Bσ = 5 and n = 50Cσ = 10 and n = 25Dσ = 10 and n = 50
36 Learning Check - Answer A researcher uses a hypothesis test to evaluate H0: µ = 80. Which combination of factors is most likely to result in rejecting the null hypothesis?Aσ = 5 and n = 25Bσ = 5 and n = 50Cσ = 10 and n = 25Dσ = 10 and n = 50
37 Learning CheckDecide if each of the following statements is True or False.T/FAn effect that exists is more likely to be detected if n is largeAn effect that exists is less likely to be detected if σ is large
38 Learning Check - Answers TrueA larger sample produces a smaller standard error and larger zA larger standard deviation increases the standard error and produces a smaller z
39 8.4 Directional Hypothesis Tests The standard hypothesis testing procedure is called a two-tailed (non-directional) test because the critical region involves both tails to determine if the treatment increases or decreases the target behaviorHowever, sometimes the researcher has a specific prediction about the direction of the treatment
40 8.4 Directional Hypothesis Tests (Continued) When a specific direction of the treatment effect can be predicted, it can be incorporated into the hypothesesIn a directional (one-tailed) hypothesis test, the researcher specifies either an increase or a decrease in the population mean as a consequence of the treatment
41 Figure 8.7 Example 8.3 Critical Region (Directional) FIGURE 8.7 Critical region for Example 8.3.
42 One-tailed and Two-tailed Tests Compared One-tailed test allows rejecting H0 with relatively small difference provided the difference is in the predicted directionTwo-tailed test requires relatively large difference regardless of the direction of the differenceIn general two-tailed tests should be used unless there is a strong justification for a directional prediction
43 Learning CheckA researcher is predicting that a treatment will decrease scores. If this treatment is evaluated using a directional hypothesis test, then the critical region for the test.Awould be entirely in the right-hand tail of the distributionBwould be entirely in the left-hand tail of the distributionCwould be divided equally between the two tails of the distributionDcannot answer without knowing the value of the alpha level
44 Learning Check - Answer A researcher is predicting that a treatment will decrease scores. If this treatment is evaluated using a directional hypothesis test, then the critical region for the test.Awould be entirely in the right-hand tail of the distributionBwould be entirely in the left-hand tail of the distributionCwould be divided equally between the two tails of the distributionDcannot answer without knowing the value of the alpha level
45 8.5 Hypothesis Testing Concerns: Measuring Effect Size Although commonly used, some researchers are concerned about hypothesis testingFocus of test is data, not hypothesisSignificant effects are not always substantialEffect size measures the absolute magnitude of a treatment effect, independent of sample sizeCohen’s d measures effect size simply and directly in a standardized way
46 Cohen’s d : Measure of Effect Size Magnitude of dEvaluation of Effect Sized = 0.2Small effectd = 0.5Medium effectd = 0.8Large effect
47 Figure 8.8 When is a 15-point Difference a “Large” Effect? FIGURE 8.8 The appearance of a 15-point treatment effect in two different situations. In part (a), the standard deviation is σ = 100 and the 15-point effect is relatively small. In part (b), the standard deviation is σ = 15 and the 15-point effect is relatively large. Cohen’s d uses the standard deviation to help measure effect size.
48 Learning CheckDecide if each of the following statements is True or False.T/FIncreasing the sample size will also increase the effect sizeLarger differences between the sample and population mean increase effect size
49 Learning Check -Answers FalseSample size does not affect Cohen’s dTrueThe mean difference is in the numerator of Cohen’s d
50 8.6 Statistical PowerThe power of a test is the probability that the test will correctly reject a false null hypothesisIt will detect a treatment effect if one existsPower = 1 – β [where β = probability of a Type II error]Power usually estimated before starting studyRequires several assumptions about factors that influence power
51 Figure 8.9 Measuring Statistical Power FIGURE 8.9 A demonstration of measuring power for a hypothesis test. The left-hand side shows the distribution of sample means that would occur if the null hypothesis is true. The critical region is defined for this distribution. The right-hand side shows the distribution of sample means that would be obtained if there were an 8-point treatment effect. Notice that, if there is an 8-point effect, essentially all of the sample means would be in the critical region. Thus, the probability of rejecting H0 (the power of the test) would be nearly 100% for an 8-point treatment effect.
52 Influences on Power Increased Power Decreased Power As effect size increases, power also increasesLarger sample sizes produce greater powerUsing a one-tailed (directional) test increases power (relative to a two-tailed test)Decreased PowerReducing the alpha level (making the test more stringent) reduces powerUsing two-tailed (non-directional) test decreases power (relative to a one-tailed test)
53 Figure 8.10 Sample Size Affects Power FIGURE A demonstration of how sample size affects the power of a hypothesis test. As in Figure 8.9, the left-hand side shows the distribution of sample means if the null hypothesis were true. The critical region is defined for this distribution. The right-hand side shows the distribution of sample means that would be obtained if there were an 8-point treatment effect. Notice that reducing the sample size to n = 4 has reduced the power of the test to less than 50% compared to a power of nearly 100% with a sample of n = 25 in Figure 8.9.
54 Learning CheckThe power of a statistical test is the probability of _____Arejecting a true null hypothesisBsupporting true null hypothesisCrejecting a false null hypothesisDsupporting a false null hypothesis
55 Learning Check - Answer The power of a statistical test is the probability of _____Arejecting a true null hypothesisBsupporting true null hypothesisCrejecting a false null hypothesisDsupporting a false null hypothesis
56 Learning CheckDecide if each of the following statements is True or False.T/FCohen’s d is used because alone, a hypothesis test does not measure the size of the treatment effectLowering the alpha level from .05 to .01 will increase the power of a statistical test
57 Answer Differences might be significant but not of substantial size TrueDifferences might be significant but not of substantial sizeFalseIt is less likely that H0 will be rejected with a small alpha