Analysis of Variance ST 511

Introduction
- Analysis of variance compares two or more populations of quantitative data.
- Specifically, we are interested in determining whether differences exist between the population means.
- The procedure works by analyzing the sample variance.

§1 One Way Analysis of Variance
- The analysis of variance is a procedure that tests to determine whether differences exist between two or more population means.
- To do this, the technique analyzes the sample variances.

One Way Analysis of Variance: Example
- A magazine publisher wants to compare three different styles of covers for a magazine that will be offered for sale at supermarket checkout lines. She assigns 60 stores at random to the three cover styles and records the number of magazines sold in a one-week period.

One Way Analysis of Variance: Example
- How do five bookstores in the same city differ in the demographics of their customers? A market researcher asks 50 customers of each store to respond to a questionnaire. One variable of interest is the customer's age.

Idea Behind ANOVA – two types of variability:
1. Within-group variability
2. Between-group variability

[Dot plots of Treatments 1, 2, and 3 under two scenarios: small and large within-sample variability]
- A small variability within the samples makes it easier to draw a conclusion about the population means.
- When the sample means are the same but the within-sample variability is larger, it is harder to draw a conclusion about the population means.
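The contrast between the two scenarios can be reproduced with a small simulation; this is a minimal sketch assuming NumPy and SciPy are available, with arbitrary illustrative group means and spreads (not the data from the example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pop_means = [100, 110, 105]   # same underlying population means in both scenarios (illustrative)
n = 20                        # observations per group

for sigma, label in [(5, "small within-sample variability"),
                     (25, "large within-sample variability")]:
    groups = [rng.normal(mu, sigma, n) for mu in pop_means]
    f_stat, p_val = stats.f_oneway(*groups)   # one-way ANOVA F test on the three samples
    print(f"{label}: F = {f_stat:.2f}, p = {p_val:.4f}")
```

With the same spacing between the group means, the larger within-sample spread typically produces a much smaller F statistic and a larger p-value.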

Idea behind ANOVA: recall the two-sample t-statistic
- Difference between two means, pooled variances, sample sizes both equal to n.
- The numerator of t² measures the variation between the groups in terms of the difference between their sample means.
- The denominator measures the variation within groups by the pooled estimator of the common variance.
- If the within-group variation is small, the same variation between groups produces a larger statistic and a more significant result.
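The t-statistic referred to here can be written out explicitly; a sketch of the standard pooled-variance form with both samples of size n:

```latex
t \;=\; \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\tfrac{1}{n} + \tfrac{1}{n}\right)}},
\qquad
t^2 \;=\; \frac{n(\bar{x}_1 - \bar{x}_2)^2 / 2}{s_p^2}.
```

In the squared form, the numerator n(x̄1 − x̄2)²/2 is the between-group variation and the denominator s_p² is the within-group variation; this is the ratio that ANOVA generalizes to more than two groups.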

One Way Analysis of Variance: Example
Example 1
- An apple juice manufacturer is planning to develop a new product: a liquid concentrate.
- The marketing manager has to decide how to market the new product.
- Three strategies are considered:
  - Emphasize the convenience of using the product.
  - Emphasize the quality of the product.
  - Emphasize the product's low price.

One Way Analysis of Variance
Example 1 – continued
- An experiment was conducted as follows:
  - In three cities an advertising campaign was launched.
  - In each city only one of the three characteristics (convenience, quality, or price) was emphasized.
  - The weekly sales were recorded for twenty weeks following the beginning of the campaigns.

One Way Analysis of Variance
[Table of weekly sales recorded in each of the three cities over the 20 weeks]

One Way Analysis of Variance
Solution
- The data are quantitative.
- The problem objective is to compare sales in three cities.
- We hypothesize that the three population means are equal.

Solution: Defining the Hypotheses
H0: μ1 = μ2 = μ3
H1: At least two means differ
To build the statistic needed to test the hypotheses, use the following notation.

Notation
Independent samples are drawn from k populations (treatment groups).

  Sample 1: x_11, x_21, ..., x_{n1,1}   (sample size n1, sample mean x̄1)
  Sample 2: x_12, x_22, ..., x_{n2,2}   (sample size n2, sample mean x̄2)
  ...
  Sample k: x_1k, x_2k, ..., x_{nk,k}   (sample size nk, sample mean x̄k)

Here x_11 is the first observation in the first sample, x_22 is the second observation in the second sample, and so on. X is the "response variable"; the variable's values are called "responses".

Terminology
In the context of this problem:
- Response variable – weekly sales.
- Responses – the actual sale values.
- Experimental unit – the weeks in the three cities when we record sales figures.
- Factor – the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.
- Factor levels – the population (treatment) names. In this problem the factor levels are the three marketing strategies: 1) convenience, 2) quality, 3) price.

The rationale of the test statistic
Two types of variability are employed when testing for the equality of the population means:
1. Within-sample variability
2. Between-sample variability

The rationale behind the test statistic – I
- If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, close to the grand mean).
- If the alternative hypothesis is true, at least some of the sample means would differ.
- Thus, we measure the variability between sample means.
H0: μ1 = μ2 = μ3
H1: At least two means differ

Variability between sample means
- The variability between the sample means is measured as the sum of squared distances between each mean and the grand mean.
- This sum is called the Sum of Squares for Groups (SSG).
- In our example, treatments are represented by the different advertising strategies.

Sum of squares for treatment groups (SSG)

  SSG = Σ_{j=1..k} n_j (x̄_j − x̄)²

where there are k treatments, n_j is the size of sample j, x̄_j is the mean of sample j, and x̄ is the grand mean.

Note: When the sample means are close to one another, their distances from the grand mean are small, leading to a small SSG. Thus, a large SSG indicates large variation between the sample means, which supports H1.
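A minimal sketch of this computation in code (NumPy assumed; the three samples below are hypothetical values used only to illustrate the formula):

```python
import numpy as np

# Three hypothetical treatment samples (illustrative values only)
groups = [np.array([10.0, 12.0, 11.0, 13.0]),
          np.array([14.0, 15.0, 13.0, 16.0]),
          np.array([ 9.0, 11.0, 10.0, 12.0])]

grand_mean = np.concatenate(groups).mean()                         # mean of all observations
ssg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # between-group sum of squares
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)             # within-group sum of squares
print(f"SSG = {ssg:.2f}, SSE = {sse:.2f}")
```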

Sum of squares for treatment groups (SSG)
Solution – continued
- The grand mean x̄ is calculated as the mean of all 60 observations (equivalently, the weighted average of the three sample means).
- Calculate SSG = 20(x̄1 − x̄)² + 20(x̄2 − x̄)² + 20(x̄3 − x̄)² = 57,…

Sum of squares for treatment groups (SSG)
Is this SSG value large enough to reject H0 in favor of H1?

The rationale behind the test statistic – II
- Large variability within the samples weakens the "ability" of the sample means to represent their corresponding population means.
- Therefore, even though the sample means may markedly differ from one another, SSG must be judged relative to the "within samples variability".

Within samples variability
- The variability within samples is measured by adding up all the squared distances between observations and their sample means.
- This sum is called the Sum of Squares for Error (SSE).
- In our example, this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).

Sum of squares for errors (SSE)
Solution – continued
  SSE = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3²
      = (20 − 1)(10,775) + (20 − 1)(7,…) + (20 − 1)(8,…)
      = 506,983.50

Sum of squares for errors (SSE)
Is SSG large enough relative to SSE = 506,983.50 to reject the null hypothesis that all the means are equal?

The mean sum of squares
To perform the test we need to calculate the mean squares as follows:
- Mean Square for treatment Groups: MSG = SSG / (k − 1)
- Mean Square for Error: MSE = SSE / (n − k)
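For this example the error mean square can already be computed from the SSE value above; a quick check in plain Python (k = 3 strategies, n = 60 weeks in total):

```python
sse = 506_983.50        # sum of squares for error from the slides
n, k = 60, 3            # 60 observations in total, 3 treatment groups
mse = sse / (n - k)     # mean square for error = SSE / (n - k)
s = mse ** 0.5          # pooled standard deviation, sqrt(MSE)
print(f"MSE = {mse:.2f}, s = {s:.2f}")   # MSE is roughly 8,894, so s is roughly 94.3
```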

Calculation of the test statistic

  F = MSG / MSE

with the following degrees of freedom: ν1 = k − 1 and ν2 = n − k.

Required conditions:
1. The populations tested are normally distributed.
2. The variances of all the populations tested are equal.

The F test rejection region
And finally the hypothesis test:
H0: μ1 = μ2 = … = μk
H1: At least two means differ
Test statistic: F = MSG / MSE
Rejection region: F > F_{α, k−1, n−k}

The F test
H0: μ1 = μ2 = μ3
H1: At least two means differ
Test statistic: F = MSG / MSE = 3.23
Since 3.23 > 3.15 (the α = .05 critical value), there is sufficient evidence to reject H0 in favor of H1 and to conclude that at least one of the mean sales figures differs from the others.

The F test p-value
- Use Excel to find the p-value: =F.DIST.RT(3.23, 2, 57) = .0469
- p-value = P(F > 3.23) = .0469
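The same upper-tail probability can be checked outside Excel; a minimal sketch with SciPy (values taken from the slide):

```python
from scipy.stats import f

f_stat, df1, df2 = 3.23, 2, 57     # observed F statistic and its degrees of freedom (k-1, n-k)
p_value = f.sf(f_stat, df1, df2)   # upper-tail probability P(F > 3.23)
print(f"p-value = {p_value:.4f}")  # approximately .0469, in line with F.DIST.RT(3.23, 2, 57)
```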

Excel single factor ANOVA
[Excel "Anova: Single Factor" output for the three cities]
SS(Total) = SSG + SSE
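The same one-factor table can be produced in Python; a sketch assuming pandas and statsmodels are installed, with synthetic sales figures standing in for the real 3-city, 20-week data:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(7)
df = pd.DataFrame({
    "strategy": np.repeat(["convenience", "quality", "price"], 20),  # factor levels
    "sales": rng.normal(600, 90, 60).round(1),                       # synthetic weekly sales
})

model = ols("sales ~ C(strategy)", data=df).fit()
print(sm.stats.anova_lm(model, typ=1))   # table with sum_sq, df, F and PR(>F) for factor and residual
```

The sum_sq entries correspond to SSG (the factor row) and SSE (the residual row), and they add up to SS(Total).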

Multiple Comparisons
- When the null hypothesis is rejected, it may be desirable to find which mean(s) differ and how the means rank.
- Two statistical inference procedures geared at doing this are presented:
  - "regular" confidence interval calculations
  - the Bonferroni adjustment

Multiple Comparisons
- Two means are considered different if the confidence interval for the difference between the corresponding sample means does not contain 0. In this case the larger sample mean is believed to be associated with a larger population mean.
- How do we calculate the confidence intervals?

"Regular" Method
- This method builds on the equal-variances confidence interval for the difference between two means.
- The CI is improved by using MSE rather than s_p² (we use ALL the data to estimate the common variance instead of only the data from two samples).
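The interval being described has the standard form below; this is a sketch of the usual equal-variances interval with MSE (on n − k degrees of freedom) in place of s_p²:

```latex
(\bar{x}_i - \bar{x}_j) \;\pm\; t_{\alpha/2,\,n-k}\,
\sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}
```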

Experiment-wise Type I error rate (the effective Type I error)
- The preceding "regular" method may result in an increased probability of committing a Type I error.
- The experiment-wise Type I error rate is the probability of committing at least one Type I error when each comparison is made at significance level α. It is calculated by
    experiment-wise Type I error rate = 1 − (1 − α)^g
  where g is the number of pairwise comparisons (i.e., g = kC2 = k(k − 1)/2).
- For example, if α = .05 and k = 4, then g = 6 and the experiment-wise Type I error rate = 1 − .735 = .265.
- The Bonferroni adjustment determines the required Type I error probability per pairwise comparison (α*) needed to secure a pre-determined overall α.
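A one-line check of that arithmetic (plain Python):

```python
alpha, k = 0.05, 4
g = k * (k - 1) // 2              # number of pairwise comparisons: 6
print(1 - (1 - alpha) ** g)       # experiment-wise Type I error rate, about 0.265
```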

Bonferroni Adjustment
The procedure:
- Compute the number of pairwise comparisons g = k(k − 1)/2, where k is the number of populations.
- Set α* = α/g, where α is the probability of making at least one Type I error (the experiment-wise Type I error).
- Calculate the CI for μi − μj as in the "regular" method above, but with t_{α*/2, n−k} in place of t_{α/2, n−k}.

Bonferroni Method
Example – continued
- Rank the effectiveness of the marketing strategies (based on mean weekly sales), using the Bonferroni adjustment method.
Solution
- The sample mean sales were x̄1 = …, x̄2 = 653.0, x̄3 = ….
- We calculate g = k(k − 1)/2 = 3(2)/2 = 3.
- We set α* = .05/3 = .0167, so the critical value is t_{.0167/2, 60−3} (obtained from Excel).
- Note that s = √MSE = 94.31.
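These quantities can also be computed without Excel; a sketch with SciPy, reusing MSE = SSE/(n − k) from the earlier slides (the printed critical value and half-width are computed here, not quoted from the slide):

```python
from scipy.stats import t

k, n_per_group, n_total = 3, 20, 60
g = k * (k - 1) // 2                      # 3 pairwise comparisons
alpha_star = 0.05 / g                     # Bonferroni-adjusted per-comparison level (about .0167)
mse = 506_983.50 / (n_total - k)          # about 8,894, so s = sqrt(MSE) is about 94.3
t_crit = t.ppf(1 - alpha_star / 2, n_total - k)          # t_{alpha*/2, 57}
half_width = t_crit * (mse * (2 / n_per_group)) ** 0.5   # t * sqrt(MSE(1/20 + 1/20))
print(f"t critical = {t_crit:.3f}, CI half-width = {half_width:.1f}")
```

A pair of strategies is declared different when the interval (x̄i − x̄j) ± half-width excludes 0.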

Bonferroni Method: The Three Confidence Intervals
[The three Bonferroni confidence intervals, one for each pairwise difference μi − μj]
There is a significant difference between μ1 and μ2.

Bonferroni Method: Conclusions Resulting from Confidence Intervals
Do we have evidence to distinguish two means?
- Group 1, Convenience: sample mean …
- Group 2, Quality: sample mean 653
- Group 3, Price: sample mean …
List the group numbers in increasing order of their sample means; connecting overhead lines indicate no significant difference:
  1   3   2
