1
ANOVA: Single Factor Models

2
ANOVA
ANOVA (ANalysis Of VAriance) is a natural extension of the two-population comparison of means, used to compare the means of more than 2 populations.
Basic Question: Even if the true means of the populations were equal (e.g. μ1 = μ2 = μ3 = μ4 for four populations), we cannot expect the sample means (x̄1, x̄2, x̄3, x̄4) to be equal. So when we get different values for the x̄’s:
–How much is due to randomness?
–How much is due to the fact that we are sampling from different populations with possibly different μj’s?
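The basic question can be illustrated with a small simulation. This is only a sketch: the common mean (70), standard deviation (10), sample sizes, and the helper name `simulate_sample_means` are hypothetical choices, not values from the slides. All four samples are drawn from the same normal population, yet their sample means still differ.

```python
import random
from statistics import mean

def simulate_sample_means(true_mean, sd, sizes, seed=0):
    """Draw one normal sample per size, all from the SAME population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        mean([rng.gauss(true_mean, sd) for _ in range(n)])
        for n in sizes
    ]

# Hypothetical settings: identical true mean for every sample.
sample_means = simulate_sample_means(true_mean=70, sd=10, sizes=[7, 8, 6, 5])
print(sample_means)  # four different values, caused by randomness alone
```

Any differences printed here come purely from sampling variability, which is the baseline ANOVA compares against.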

3
ANOVA TERMINOLOGY
Response Variable (y)
–What we are measuring
Experimental Units
–The individual units that we will measure
Factors
–Independent variables whose values can change to affect the outcome of the response variable, y
Levels of Factors
–Values of the factors
Treatments
–The combination of the levels of the factors applied to an experimental unit

4
Example
We want to know how combinations of different amounts of water (1 ac-ft, 3 ac-ft, 5 ac-ft) and different fertilizers (A, B, C) affect crop yields.
Response variable
–Crop yield (bushels/acre)
Experimental unit
–Each acre that receives a treatment
Factors (2)
–Water and fertilizer
Levels (3 for Water; 3 for Fertilizer)
–Water: 1, 3, 5; Fertilizer: A, B, C
Treatments (9 = 3x3)
–1A, 3A, 5A, 1B, 3B, 5B, 1C, 3C, 5C
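As a quick check on the count, the 9 treatments are simply every pairing of a water level with a fertilizer. A minimal sketch (the variable names are illustrative):

```python
from itertools import product

water_levels = [1, 3, 5]           # acre-feet
fertilizers = ["A", "B", "C"]

# Fertilizer varies slowest so the order matches the slide: 1A, 3A, 5A, 1B, ...
treatments = [f"{w}{f}" for f, w in product(fertilizers, water_levels)]
print(len(treatments), treatments)
```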

5
Single Factor ANOVA: Basic Assumptions
If we focus on only one factor (e.g. fertilizer type in the previous example), this is called single factor ANOVA.
–In this case, levels and treatments are the same thing since there are no combinations between factors.
Assumptions for Single Factor ANOVA
1. Each population in the comparison has a normal distribution.
2. The standard deviations of the populations (although unknown) are assumed to be equal (i.e. σ1 = σ2 = ... = σk for k populations).
3. Sampling is:
–Random
–Independent

6
Example
The university would like to know if the delivery mode of the introductory statistics class affects performance in the class, as measured by scores on the final exam. The class is given in four different formats:
–Lecture
–Text Reading
–Videotape
–Internet
The final exam scores from random samples of students from each of the four teaching formats were recorded.

7
Samples

8
Summary
–There is a single factor under observation: teaching format.
–There are k = 4 different treatments (or levels of teaching format).
–The numbers of observations (experimental units) are n1 = 7, n2 = 8, n3 = 6, n4 = 5.
–The total number of observations is n = 26.

9
Why aren’t all the x̄’s the same?
–There is variability due to the different treatments: Between Treatment Variability (Treatment)
–There is variability due to randomness within each treatment: Within Treatment Variability (Error)
BASIC CONCEPT
If the average Between Treatment Variability is “large” compared to the average Within Treatment Variability, we can reasonably conclude that there really are differences among the population means (i.e. at least one μj differs from the others).

10
Basic Questions
Given this basic concept, the natural questions are:
–What is “variability” due to treatment and due to error, and how are they measured?
–What is “average variability” due to treatment and due to error, and how are they measured?
–What is “large”? How much larger than the observed average variability due to error does the observed average variability due to treatment have to be before we are convinced that there are differences in the true population means (the μ’s)?

11
How Is “Total” Variability Measured?
Variability is defined as the Sum of Squared Deviations (from the grand mean). So,
SST (Total Sum of Squares)
–Sum of Squared Deviations of all observations from the grand mean
SSTr (Between Treatment Sum of Squares)
–Sum of Squared Deviations Due to Different Treatments
SSE (Within Treatment Sum of Squares)
–Sum of Squared Deviations Due to Error
SST = SSTr + SSE
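The identity SST = SSTr + SSE can be checked numerically. The sketch below uses the standard single factor definitions of the three sums of squares. The scores are hypothetical (they are not the data from the slides) and the function name `sums_of_squares` is illustrative.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (SST, SSTr, SSE) for a list of samples, one list per treatment."""
    all_obs = [x for g in groups for x in g]
    grand_mean = mean(all_obs)
    # Total: every observation against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_obs)
    # Between treatments: each treatment mean against the grand mean,
    # weighted by the number of observations in that treatment.
    sstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within treatments: each observation against its own treatment mean.
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return sst, sstr, sse

# Hypothetical scores, NOT the data from the slides.
groups = [[80, 70, 75], [60, 65, 70, 65], [90, 85]]
sst, sstr, sse = sums_of_squares(groups)
print(sst, sstr, sse)
```

Because the decomposition holds exactly, SSE is usually obtained as SST − SSTr rather than computed directly.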

12
How is “Average” Variability Measured?
“Average” Variability is measured in Mean Square Values (MSTr and MSE)
–Found by dividing SSTr and SSE by their respective degrees of freedom
ANOVA TABLE
Variability                SS     DF                          Mean Square (MS)
Between Tr. (Treatment)    SSTr   k-1  (# treatments - 1)     MSTr = SSTr/DFTr
Within Tr. (Error)         SSE    n-k  (DFT - DFTr)           MSE = SSE/DFE
TOTAL                      SST    n-1  (# observations - 1)
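The degrees of freedom and mean squares in the ANOVA table can be sketched as a small helper. The values k = 4 and n = 26 come from the teaching-format example. The two sums of squares are placeholders chosen for easy arithmetic, not results from the slides, and `mean_squares` is an illustrative name.

```python
def mean_squares(sstr, sse, k, n):
    """Return degrees of freedom and mean squares for a single factor ANOVA."""
    df_tr = k - 1   # number of treatments - 1
    df_e = n - k    # equals DFT - DFTr = (n - 1) - (k - 1)
    df_t = n - 1    # number of observations - 1
    return {
        "DFTr": df_tr, "DFE": df_e, "DFT": df_t,
        "MSTr": sstr / df_tr, "MSE": sse / df_e,
    }

# k and n are from the example; sstr and sse are placeholder values.
table = mean_squares(sstr=300.0, sse=2200.0, k=4, n=26)
print(table)
```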

13
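Formula for Calculating SST
Calculating SST
SST = Σj Σi (xij − x̄)², where xij is observation i in treatment j and x̄ is the grand mean of all n observations.
This is just like the numerator of the variance, assuming all (26) entries come from one population.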

14
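Formula for Calculating SSTr
Calculating SSTr (Between Treatment Variability)
SSTr = Σj nj (x̄j − x̄)², where x̄j is the mean of treatment j, nj is its number of observations, and x̄ is the grand mean.
Replace all entries within each treatment by the treatment mean; now all the variability is between (not within) treatments.
Values shown on the slide: 76, 75, 65, 74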
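A rough worked version of this step, under a stated assumption: the four values 76, 75, 65, 74 on this slide are taken to be the treatment means, listed in the same order as the sample sizes n1 to n4 from the Summary slide. The transcript does not confirm that pairing, and the slide values may be rounded, so the result is illustrative only.

```python
sizes = [7, 8, 6, 5]        # n1..n4 from the Summary slide
means = [76, 75, 65, 74]    # ASSUMED treatment means, in the same order

n = sum(sizes)
# Grand mean is the size-weighted average of the treatment means.
grand_mean = sum(nj * m for nj, m in zip(sizes, means)) / n
# Between treatment sum of squares: every entry replaced by its treatment mean.
sstr = sum(nj * (m - grand_mean) ** 2 for nj, m in zip(sizes, means))
print(round(grand_mean, 2), round(sstr, 2))
```

Under this assumption SSTr comes out near 482.6 on k − 1 = 3 degrees of freedom.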

15
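Formula for Calculating SSE
Calculating SSE (Within Treatment Variability)
SSE = SST − SSTr = Σj Σi (xij − x̄j)², where xij is observation i in treatment j and x̄j is the mean of treatment j.
In practice, SSE is found as the difference between SST and SSTr.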

16
Can We Conclude a Difference Among the 4 Teaching Formats?
We conclude that at least one population mean differs from the others if the average between treatment variability is large compared to the average within treatment variability, that is, if MSTr/MSE is “large”.
The ratio of the two measures of variability for these normally distributed random variables has an F distribution, and the F-statistic (= MSTr/MSE) is compared to a critical F-value from an F distribution with:
–Numerator degrees of freedom = DFTr
–Denominator degrees of freedom = DFE
If the ratio of MSTr to MSE (the F-statistic) exceeds the critical F-value, we can conclude that at least one population mean differs from the others.

17
Can We Conclude Different Teaching Formats Affect Final Exam Scores?
The F-test
H0: μ1 = μ2 = μ3 = μ4
HA: At least one μj differs from the others
Select α = .05.
Reject H0 (Accept HA) if: F = MSTr/MSE > F.05,DFTr,DFE = F.05,3,22
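The decision rule can be sketched as follows. With k = 4 and n = 26, DFTr = 3 and DFE = 22, and the critical value F.05,3,22 is about 3.05 in a standard F table. The mean squares passed in below are placeholders, not the values from the hand calculation, and `f_test` is an illustrative name.

```python
def f_test(mstr, mse, f_critical):
    """Return the F-statistic and whether H0 (all means equal) is rejected."""
    f_stat = mstr / mse
    return f_stat, f_stat > f_critical

# Critical value for alpha = .05 with 3 and 22 degrees of freedom,
# taken from an F table (about 3.05). Mean squares are placeholders.
f_stat, reject = f_test(mstr=150.0, mse=120.0, f_critical=3.05)
print(f_stat, reject)
```

With these placeholder inputs the F-statistic falls below the critical value, so H0 would not be rejected.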

18
Hand Calculations for the F-test
Cannot conclude there is a difference among the μj’s

19
Excel Approach

20
EXCEL OUTPUT
p-value = .365975 > .05
Cannot conclude that there are differences among the means

21
REVIEW
ANOVA Situation and Terminology
–Response variable, Experimental Units, Factors, Levels, Treatments, Error
Basic Concept
–If the “average variability” between treatments is “a lot” greater than the “average variability” due to error, conclude that at least one mean differs from the others.
Single Factor Analysis
–By Hand
–By Excel
