Analysis and Interpretation Inferential Statistics ANOVA


EDRS6208 Analysis and Interpretation Inferential Statistics ANOVA Madgerie Jameson, UWI SOE

OUTLINE Definition of Analysis of Variance; Logic of ANOVA (the theory behind ANOVA); The F test; One-way ANOVA; Post hoc tests; Interpreting the results

Analysis of Variance Suppose the Ministry of Education decides to test three different methods of teaching Mathematics. After teachers have implemented the different methods for a term, the testing and measurement unit wants to know whether the mean scores of students taught with the three methods are the same. Questions: What data would they require? How will they test for this equality of means?

Analysis of Variance Definition The Analysis of Variance (ANOVA) is a statistical model used to analyse situations in which we want to compare more than two conditions. It is used to test the null hypothesis that the means of three or more populations are equal.

Recall the opening example The ministry developed three different methods to teach Mathematics. They want to determine whether the three methods produce different mean scores. So we test the null hypothesis H0: µ1 = µ2 = µ3 (all three population means are equal) against H1: Not all three population means are equal.

You Ask Is there an overall average difference? Is this difference statistically significant? If so, is the size of the difference managerially significant? [Figure: the three methods M1, M2, M3]

You can test the three hypotheses H0: µ1 = µ2, H0: µ1 = µ3 and H0: µ2 = µ3 separately (using t tests). If you reject even one of the three hypotheses, then you must reject the null hypothesis H0: µ1 = µ2 = µ3. However, combining the Type I error probabilities for the three tests gives a large Type I error probability for the test of H0: µ1 = µ2 = µ3. The procedure that can test the equality of three means in one test is the ANOVA.
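The growth of the combined Type I error probability can be illustrated with a short calculation. This is a minimal sketch, not part of the original slides: it assumes each t test is run at α = .05 and, as a simplification, that the three tests are independent (pairwise tests on shared samples are not strictly independent, so the figure is only indicative). The function name is hypothetical.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one Type I error across independent tests.

    Assumption: the tests are independent, which is only an approximation
    for pairwise t tests that share samples.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


# Three pairwise t tests (M1 vs M2, M1 vs M3, M2 vs M3), each at alpha = .05.
rate = familywise_error_rate(alpha=0.05, num_tests=3)
print(f"Combined Type I error probability: {rate:.4f}")
```

Under these assumptions the combined probability is about .14, nearly three times the nominal .05, which is the motivation for testing all three means in a single ANOVA.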

One Way ANOVA A procedure to make tests by comparing the means of several populations. In one-way ANOVA, we analyse only one factor or variable. Example: testing the equality of the mean Mathematics scores of students who are taught using the three different methods. Only one factor is considered: the effect of the different teaching methods.

Assumptions of One-Way ANOVA The following assumptions must hold true to use one-way ANOVA. The populations from which the samples are drawn are (approximately) normally distributed. The populations from which the samples are drawn have the same variance (or standard deviation). The samples drawn from different populations are random and independent. Prem Mann, Introductory Statistics, 7/E Copyright © 2010 John Wiley & Sons. All right reserved

Using the example of the three teaching methods we must assume: The scores of all the students taught by each method are (approximately) normally distributed. The means of the three distributions of scores for the three teaching methods may or may not be the same, but all three distributions have the same variance. When we take samples for an ANOVA test, these samples are drawn independently and randomly from the three different populations.

The variance between samples (mean square between samples, MSB) ANOVA is applied by calculating two estimates of the variance of the population distributions. The first is the variance between samples (MSB). It gives an estimate of the variance based on the variation among the means of samples taken from the different populations, e.g. the three teaching methods. MSB is based on the values of the mean scores of the three samples of students taught by the three different methods. If the means of all populations under consideration are equal, the means of the respective samples will still differ, but the variation among them is expected to be small. However, if the means of the populations under consideration are not all equal, the variation among the means of the respective samples is expected to be large, and consequently the value of MSB is expected to be large.

The variance within samples (mean square within samples, MSW) It gives an estimate of the variance based on the variation within the data of the different samples. MSW is based on the scores of the individual students included in the three samples taken from the three populations. The concept of MSW is similar to the concept of the pooled standard deviation, sp.

Note The one-way ANOVA test is always right-tailed with the rejection region in the right tail of the F distribution curve.

THE F DISTRIBUTION Definition The F distribution is a continuous curve skewed to the right. The F distribution has two numbers of degrees of freedom: df for the numerator and df for the denominator. The units of an F distribution, denoted F, are nonnegative.

For an F distribution, degrees of freedom for the numerator and degrees of freedom for the denominator are usually written as follows: df = (numerator df, denominator df). For example, df = (8, 14) means 8 degrees of freedom for the numerator and 14 degrees of freedom for the denominator.

Three F distribution curves. The figure shows three F distribution curves for three sets of degrees of freedom for the numerator and denominator. The first number gives the degrees of freedom associated with the numerator, and the second number gives the degrees of freedom associated with the denominator. Notice that as the degrees of freedom increase, the peak of the curve moves to the right; that is, the skewness decreases.
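The movement of the peak can be checked numerically. The sketch below uses the standard closed-form location of the peak (mode) of the F density, which applies when the numerator df exceeds 2. The three df pairs are hypothetical choices for illustration; the pairs used in the original figure did not survive in this transcript.

```python
def f_mode(df_num: int, df_den: int) -> float:
    """Location of the peak of the F density.

    For df_num > 2 the peak is at ((df_num - 2) / df_num) * (df_den / (df_den + 2)).
    For df_num <= 2 the density is highest at F = 0, so 0.0 is returned.
    """
    if df_num <= 2:
        return 0.0
    return ((df_num - 2) / df_num) * (df_den / (df_den + 2))


# Hypothetical df pairs, chosen only to show the peak moving to the right.
for df_num, df_den in [(3, 5), (7, 6), (12, 40)]:
    print(f"df = ({df_num}, {df_den}): peak at F = {f_mode(df_num, df_den):.3f}")
```

The peak stays below 1 for every pair of df but moves toward 1 as both df grow.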

Exercise Find the F value for 8 degrees of freedom for the numerator, 14 degrees of freedom for the denominator, and .05 area in the right tail of the F distribution curve.

Obtaining the F Value using the statistical table

The critical value of F for 8 df for the numerator, 14 df for the denominator, and .05 area in the right tail.
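As a cross-check on the table lookup, the critical value can also be approximated numerically. The following is a rough sketch using only the Python standard library, not the method used in the slides: it integrates the equivalent beta density with Simpson's rule and then searches for the F value that leaves the requested area in the right tail. The function names, the step count and the restriction to numerator df of at least 2 are assumptions of this sketch.

```python
import math


def f_cdf(x: float, df_num: int, df_den: int, steps: int = 4000) -> float:
    """Approximate P(F <= x) by Simpson's rule on the equivalent beta density."""
    if df_num < 2:
        raise ValueError("this sketch only handles df_num >= 2")
    if x <= 0:
        return 0.0
    a, b = df_num / 2.0, df_den / 2.0
    upper = df_num * x / (df_num * x + df_den)  # maps F onto the interval (0, 1)
    norm = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

    def density(t: float) -> float:
        return norm * t ** (a - 1) * (1 - t) ** (b - 1)

    h = upper / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


def f_critical(alpha: float, df_num: int, df_den: int) -> float:
    """F value with area alpha to its right, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1 - f_cdf(hi, df_num, df_den) > alpha:  # widen until the tail is small enough
        hi *= 2
    for _ in range(40):
        mid = (lo + hi) / 2
        if 1 - f_cdf(mid, df_num, df_den) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(f"F for df = (8, 14), alpha = .05: {f_critical(0.05, 8, 14):.2f}")
```

The same function can be called with df = (2, 12) and α = .01, or df = (3, 18) and α = .05, to compare against the critical values 6.93 and 3.16 used in the two worked examples.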

Calculating the Value of the Test Statistic Test Statistic F for a One-Way ANOVA Test The value of the test statistic F for an ANOVA test is calculated as F = MSB / MSW, that is, the variance between samples divided by the variance within samples.

Example Fifteen form one students were randomly assigned to three groups to experiment with three different methods of teaching Mathematics. At the end of the term, the same test was given to all 15 students. The table gives the scores of students in the three groups.

Calculate the value of the test statistic F. Assume that all the required assumptions for ANOVA hold true.

Solution Let
x = the score of a student
k = the number of different samples (or treatments)
ni = the size of sample i
Ti = the sum of the values in sample i
n = the number of values in all samples = n1 + n2 + n3 + . . .
Σx = the sum of the values in all samples = T1 + T2 + T3 + . . .
Σx² = the sum of the squares of the values in all samples

Calculate MSB and MSW To calculate MSB and MSW, we first compute the between-samples sum of squares, denoted by SSB, and the within-samples sum of squares, denoted by SSW. The sum of SSB and SSW is called the total sum of squares and is denoted by SST; that is, SST = SSB + SSW. The values of SSB and SSW are calculated using the following formulas.

Between- and Within-Samples Sums of Squares The between-samples sum of squares, denoted by SSB, is calculated as SSB = (T1²/n1 + T2²/n2 + T3²/n3 + . . .) – (Σx)²/n

Between- and Within-Samples Sums of Squares The within-samples sum of squares, denoted by SSW, is calculated as SSW = Σx² – (T1²/n1 + T2²/n2 + T3²/n3 + . . .)
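The shortcut formulas for SSB and SSW can be related to the deviations they summarise. The block below is an added illustration rather than part of the original slides. The symbols \bar{x}_i (mean of sample i), \bar{x} (mean of all n values) and x_{ij} (the j-th value in sample i) are assumptions introduced here; T_i, n_i, k, n, Σx and Σx² are as defined in the Solution slide.

```latex
\bar{x}_i = \frac{T_i}{n_i}, \qquad \bar{x} = \frac{\sum x}{n}

\mathrm{SSB} = \sum_{i=1}^{k} n_i \, (\bar{x}_i - \bar{x})^2
             = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - \frac{\left(\sum x\right)^2}{n}

\mathrm{SSW} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2
             = \sum x^2 - \sum_{i=1}^{k} \frac{T_i^2}{n_i}
```

Read this way, SSB measures how far the sample means lie from the overall mean, and SSW measures how far individual values lie from their own sample mean.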

Let us return to the example

Σx = T1 + T2 + T3 = 324 + 369 + 388 = 1081; n = n1 + n2 + n3 = 5 + 5 + 5 = 15; Σx² = 80,709

Substitute all the values in the formulas for SSB, SSW and SST: SSB = (324²/5 + 369²/5 + 388²/5) – (1081)²/15 = 432.1333; SSW = 80,709 – (324²/5 + 369²/5 + 388²/5) = 2372.8000; SST = 432.1333 + 2372.8000 = 2804.9333

Calculating the Values of MSB and MSW MSB and MSW are calculated as MSB = SSB / (k – 1) and MSW = SSW / (n – k), where k – 1 and n – k are, respectively, the df for the numerator and the df for the denominator for the F distribution. Remember, k is the number of different samples.


Draw the ANOVA Table

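ANOVA Table for the Example

Source of variation   Degrees of freedom   Sum of squares   Mean square   Value of the test statistic
Between               2                    432.1333         216.0667      F = 216.0667/197.7333 = 1.09
Within                12                   2372.8000        197.7333
Total                 14                   2804.9333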
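The entries of the ANOVA table for this example can be reproduced from the summary figures given earlier (T1 = 324, T2 = 369, T3 = 388, five students per group, Σx² = 80,709). The sketch below applies the SSB, SSW, MSB and MSW formulas as described in the slides; the function name and the dictionary layout are illustrative choices, not something taken from the source.

```python
def anova_from_summaries(totals, sizes, sum_of_squares):
    """One-way ANOVA quantities from sample totals T_i, sample sizes n_i and sum of x^2."""
    k = len(totals)                      # number of samples (treatments)
    n = sum(sizes)                       # number of values in all samples
    grand_total = sum(totals)            # sum of x over all samples
    group_term = sum(t * t / m for t, m in zip(totals, sizes))  # sum of T_i^2 / n_i

    ssb = group_term - grand_total ** 2 / n
    ssw = sum_of_squares - group_term
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {
        "df_between": k - 1,
        "df_within": n - k,
        "SSB": ssb,
        "SSW": ssw,
        "SST": ssb + ssw,
        "MSB": msb,
        "MSW": msw,
        "F": msb / msw,
    }


# Summary figures from the teaching-methods example.
result = anova_from_summaries(totals=[324, 369, 388], sizes=[5, 5, 5], sum_of_squares=80709)
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

Working from totals rather than raw scores mirrors the hand calculation in the slides and needs only the figures that are stated there.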

Back to the question Recall the scores of the 15 form one students who were randomly assigned to three groups in order to experiment with three different methods of teaching Mathematics. At the 1% significance level, can we reject the null hypothesis that the mean Mathematics score of all form one students taught by each of these three methods is the same? Assume that all the assumptions required to apply the one-way ANOVA procedure hold true.

Solution Step 1: H0: μ1 = μ2 = μ3 (The mean scores of the three groups are all equal) H1: Not all three means are equal Step 2: Because we are comparing the means for three normally distributed populations, we use the F distribution to make this test.

Step 3: α = .01. A one-way ANOVA test is always right-tailed, so the area in the right tail is .01. df for the numerator = k – 1 = 3 – 1 = 2; df for the denominator = n – k = 15 – 3 = 12. The required value of F is 6.93.

Critical value of F for df = (2, 12) and α = .01.

Steps 4 & 5: The value of the test statistic is F = 1.09. It is less than the critical value of F = 6.93, so it falls in the nonrejection region. Hence, we fail to reject the null hypothesis: the samples do not provide sufficient evidence that the mean scores for the three teaching methods differ.
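The decision step can be written out as a short check. When the numerator has exactly 2 degrees of freedom, as in this example, the right-tail area of the F distribution has a simple closed form, so the critical value can be computed directly. This shortcut is an addition for illustration and holds only for numerator df = 2; the function name is hypothetical.

```python
def f_critical_df_num_2(alpha: float, df_den: int) -> float:
    """Right-tail critical value of F with numerator df = 2.

    Uses P(F > x) = (1 + 2x / df_den) ** (-df_den / 2), solved for x.
    Valid only when the numerator has exactly 2 degrees of freedom.
    """
    return (df_den / 2.0) * (alpha ** (-2.0 / df_den) - 1.0)


critical = f_critical_df_num_2(alpha=0.01, df_den=12)
f_observed = 1.09  # test statistic from the ANOVA table for the example

# Right-tailed test: reject H0 only if the observed F exceeds the critical value.
decision = "reject H0" if f_observed > critical else "fail to reject H0"
print(f"critical F = {critical:.2f}, observed F = {f_observed:.2f}: {decision}")
```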

Example 2 From time to time, unknown to its employees, the research department at Post Bank observes various employees for their work productivity. Recently this department wanted to check whether the four tellers at a branch of this bank serve, on average, the same number of customers per hour. The research manager observed each of the four tellers for a certain number of hours. The following table gives the number of customers served by the four tellers during each of the observed hours.

Result
Teller A: 19, 21, 26, 24, 18
Teller B: 14, 16, 14, 13, 17, 13
Teller C: 11, 14, 21, 13, 16, 18
Teller D: 24, 19, 21, 26, 20

Question At the 5% significance level, test the null hypothesis that the mean number of customers served per hour by each of these four tellers is the same. Assume that all the assumptions required to apply the one-way ANOVA procedure hold true.

Solution Step 1: H0: μ1 = μ2 = μ3 = μ4 (The mean number of customers served per hour by each of the four tellers is the same) H1: Not all four population means are equal

Step 2: Because we are testing for the equality of four means for four normally distributed populations, we use the F distribution to make the test.

Step 3: α = .05. A one-way ANOVA test is always right-tailed, so the area in the right tail is .05. df for the numerator = k – 1 = 4 – 1 = 3; df for the denominator = n – k = 22 – 4 = 18.

Critical value of F for df = (3, 18) and α = .05.


Step 4: Σx = T1 + T2 + T3 + T4 = 108 + 87 + 93 + 110 = 398; n = n1 + n2 + n3 + n4 = 5 + 6 + 6 + 5 = 22; Σx² = (19)² + (21)² + (26)² + (24)² + (18)² + (14)² + (16)² + (14)² + (13)² + (17)² + (13)² + (11)² + (14)² + (21)² + (13)² + (16)² + (18)² + (24)² + (19)² + (21)² + (26)² + (20)² = 7614

Substitute all the values in the formulas for SSB and SSW: SSB = (108²/5 + 87²/6 + 93²/6 + 110²/5) – (398)²/22 = 255.6182; SSW = 7614 – (108²/5 + 87²/6 + 93²/6 + 110²/5) = 158.2000


ANOVA Table

Source of variation   Degrees of freedom   Sum of squares   Mean square   Value of the test statistic
Between               3                    255.6182         85.2061       F = 85.2061/8.7889 = 9.69
Within                18                   158.2000         8.7889
Total                 21                   413.8182

Step 5: The value of the test statistic is F = 9.69. It is greater than the critical value of F = 3.16, so it falls in the rejection region. Consequently, we reject the null hypothesis and conclude that the mean number of customers served per hour by each of the four tellers is not the same.
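The whole test for Example 2 can be rerun from the raw observations. In the sketch below the 22 values are taken from the Σx² expansion in Step 4. Splitting them among tellers A to D is an assumption based on their order and on the sample sizes 5, 6, 6 and 5, though it does reproduce the stated totals 108, 87, 93 and 110. The critical value 3.16 is the one quoted in Step 5. The function name is illustrative.

```python
from statistics import mean

# Customers served per observed hour (grouping by teller is inferred, see above).
customers = {
    "A": [19, 21, 26, 24, 18],
    "B": [14, 16, 14, 13, 17, 13],
    "C": [11, 14, 21, 13, 16, 18],
    "D": [24, 19, 21, 26, 20],
}


def one_way_anova(groups):
    """Return (SSB, SSW, MSB, MSW, F) for a list of samples."""
    all_values = [x for group in groups for x in group]
    n, k = len(all_values), len(groups)
    grand_mean = mean(all_values)
    # Between-samples variation: sample means around the grand mean.
    ssb = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    # Within-samples variation: observations around their own sample mean.
    ssw = sum((x - mean(group)) ** 2 for group in groups for x in group)
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return ssb, ssw, msb, msw, msb / msw


ssb, ssw, msb, msw, f_stat = one_way_anova(list(customers.values()))
critical_value = 3.16  # from the slides: df = (3, 18), alpha = .05
print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}, MSB = {msb:.4f}, MSW = {msw:.4f}")
print(f"F = {f_stat:.2f}; reject H0: {f_stat > critical_value}")
```

This version uses the deviation form of SSB and SSW rather than the shortcut formulas, so agreement with the hand calculation is a useful consistency check.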

Significance of mean effect When the F test shows a significant difference, a post hoc test is performed. “Post hoc” is Latin for “after this”; the longer phrase “post hoc, ergo propter hoc” translates to “after this, therefore because of this.” The post hoc test consists of pairwise comparisons that are designed to compare all the different combinations of the treatment groups. It takes every pair of groups and performs a t test on each pair.

Post hoc results in SPSS SPSS was used to perform a post hoc test on the results of the previous example. The F test revealed a difference among the four groups. The results of the post hoc test are as follows.

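Multiple comparisons

(I) Teller   (J) Teller   Mean Difference (I-J)   95% CI Lower Bound   95% CI Upper Bound
Teller A     Teller B      7.100*                   2.03                 12.17
Teller A     Teller C      6.100*                   1.03                 11.17
Teller A     Teller D      -.400                   -5.70                  4.90
Teller B     Teller A     -7.100*                 -12.17                 -2.03
Teller B     Teller C     -1.000                   -5.84                  3.84
Teller B     Teller D     -7.500*                 -12.57                 -2.43
Teller C     Teller A     -6.100*                 -11.17                 -1.03
Teller C     Teller B      1.000                   -3.84                  5.84
Teller C     Teller D     -6.500*                 -11.57                 -1.43
Teller D     Teller A       .400                   -4.90                  5.79
Teller D     Teller B      7.500*                   2.43                 12.57
Teller D     Teller C      6.500*                   1.43                 11.57

Std. Error and Sig. entries (incomplete), by block: Teller A: Std. Error 1.795, 1.895; Sig. .005, .015, .995. Teller B: Std. Error 1.712; Sig. .936, .003. Teller C: Std. Error 1.995; Sig. .010. Teller D: Std. Error 1.875; Sig. .996.
* Mean difference is significant at the .05 level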
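The mean differences in the post hoc output can be recomputed from the teller data. The sketch below also computes a standard error for each pair as the square root of MSW × (1/ni + 1/nj). That formula is an assumption about how the reported standard errors were obtained, although it does reproduce the values 1.795, 1.712 and 1.875 that appear in the output. The grouping of observations by teller is inferred from Step 4, and the function name is hypothetical. The Sig. values are not reproduced, since they require the studentized range distribution.

```python
import math
from itertools import combinations

# Customers served per observed hour (grouping by teller inferred from Step 4).
customers = {
    "A": [19, 21, 26, 24, 18],
    "B": [14, 16, 14, 13, 17, 13],
    "C": [11, 14, 21, 13, 16, 18],
    "D": [24, 19, 21, 26, 20],
}

means = {teller: sum(values) / len(values) for teller, values in customers.items()}
n = sum(len(values) for values in customers.values())
k = len(customers)
ssw = sum((x - means[teller]) ** 2 for teller, values in customers.items() for x in values)
msw = ssw / (n - k)  # within-samples variance from the one-way ANOVA


def compare(first: str, second: str):
    """Mean difference (first minus second) and its assumed standard error."""
    difference = means[first] - means[second]
    std_error = math.sqrt(msw * (1 / len(customers[first]) + 1 / len(customers[second])))
    return difference, std_error


for first, second in combinations(customers, 2):
    difference, std_error = compare(first, second)
    print(f"Teller {first} - Teller {second}: difference = {difference:.3f}, SE = {std_error:.3f}")
```

Only the six distinct pairs are printed; reversing a pair changes the sign of the difference and leaves the standard error unchanged.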

Tukey HSD This test displays subsets of groups that have statistically similar means. Here the Tukey test creates two subsets of tellers.

Teller   N   Subset 1   Subset 2
B        6   14.50
C        6   15.50
A        5              21.60
D        5              22.00
Sig.         .943       .996

In Class exercise