Inferential Statistics
Analysis of Variance – ANOVA
Faculty of Information Technology, King Mongkut’s University of Technology North Bangkok
Content
Estimation
Hypothesis testing
  Forming hypothesis
  Testing population means
  Testing population variances
  Testing categorical data / proportion
Hypothesis about many population means
  One-way ANOVA
  Two-way ANOVA
Analysis of Variance (ANOVA)
Tests whether any of several group means differ from one another
One-way ANOVA: 1 independent variable with 3 or more groups
  The dependent variable is assumed to be on an interval or ratio scale
  Also used with ordinal scale data
  Can describe the effect of the independent variable on the dependent variable
Two-way ANOVA: two independent variables, one dependent variable
  Can describe the interaction between the two independent variables
MANOVA: two or more dependent variables
One-way ANOVA
Tests whether the mean of the dependent variable differs between the groups defined by one categorical independent variable with 3 or more groups
  Occupation: Student, Lecturer, Doctor (1 variable, 3 groups)
  Salary: dependent variable
Assumptions
  The dependent variable is interval or ratio (continuous)
  The dependent variable is approximately normally distributed for each category of the independent variable
  The variances of the independent groups are equal (homogeneity of variances)
  Independence of cases
One-way ANOVA Concept
Total variation = between-group variation + within-group variation (as sums of squares: SST = SSB + SSW)
Between-Group Variance
  Describes how the group means differ from one another, which is the effect of the independent variable on the variable of interest
Within-Group Variance
  Describes how the observations within each group vary around their own group mean, which is the effect of other factors, called Error
H0: μ1 = μ2 = μ3 = … = μk (k: number of groups)
H1: not all means are equal (at least one pair is different)
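The decomposition on this slide can be written out as sums of squares. The notation here is an assumption chosen for illustration and does not come from the slides: x_{ij} is observation j in group i, n_i is the size of group i, \bar{x}_i is the mean of group i, and \bar{x} is the grand mean of all observations.

```latex
\begin{align*}
SSB &= \sum_{i=1}^{k} n_i \,(\bar{x}_i - \bar{x})^2 \\
SSW &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \\
SST &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = SSB + SSW
\end{align*}
```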
One-way ANOVA Table
Source of Variance | Degrees of Freedom (df) | Sum of Squares (SS) | Mean Square (MS) | F-ratio
Between Groups (Treatment) | k-1 | SSB | MSB = SSB/(k-1) | F = MSB/MSW
Within Groups (Error) | n-k | SSW | MSW = SSW/(n-k) |
Total | n-1 | SST | |
SST = SSB + SSW
k: number of groups
n: number of samples (total observations across all groups)
df: degrees of freedom
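A minimal sketch of how the entries of the one-way ANOVA table could be computed by hand, using only the Python standard library. The function name `one_way_anova` and the toy data are hypothetical and are not taken from the slides. The sketch stops at the F-ratio; obtaining the "Sig." value would additionally need the F distribution, which a library routine such as `scipy.stats.f_oneway` provides.

```python
from statistics import mean


def one_way_anova(groups):
    """Compute one-way ANOVA table entries for a list of groups (one list of numbers per group)."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [mean(g) for g in groups]

    # Between-group sum of squares: group means around the grand mean
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: observations around their own group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Total sum of squares, computed directly so that SST = SSB + SSW can be checked
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {
        "df_between": k - 1, "df_within": n - k, "df_total": n - 1,
        "SSB": ssb, "SSW": ssw, "SST": sst,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


# Hypothetical toy data: three groups of three observations each
toy_groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
table = one_way_anova(toy_groups)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

A large F means the between-group mean square is large relative to the within-group mean square, which is the evidence against H0.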
One-way ANOVA: SPSS
Analyze -> Compare Means -> One-way ANOVA
Option -> Tick…
  Homogeneity of variance test
  Descriptive (optional)
  Welch
Post Hoc: used when the result is significant (at least one of the means is different) to find the group with the different mean
https://statistics.laerd.com/spss-tutorials/one-way-anova-using-spss-statistics.php
http://academic.udayton.edu/gregelvers/psy216/spss/1wayanova.htm
Example
Determine whether the mean total score differs among the 5 sections
H0: μ1 = μ2 = μ3 = μ4 = μ5
H1: at least one pair of means is different
Result: Descriptives and Variances
Check the Levene test: “Sig.” >= 0.05, so equal variances across all groups can be assumed
If “Sig.” < 0.05, refer to the Robust Tests of Equality of Means table (Welch) instead of the ANOVA table
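To make the homogeneity check more concrete, here is a minimal sketch of the Levene statistic in its mean-centred form: a one-way ANOVA F-ratio computed on the absolute deviations of each observation from its own group mean. The function name `levene_statistic` and the toy data are hypothetical, and the slides do not state which variant SPSS reports, so treat the choice of centring as an assumption. Only the statistic is computed; the "Sig." value would need the F distribution (for example via `scipy.stats.levene`).

```python
from statistics import mean


def levene_statistic(groups):
    """Mean-centred Levene statistic: F-ratio of the absolute deviations from each group mean."""
    # Replace every observation by its absolute deviation from its own group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(z)
    n = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n
    z_means = [mean(g) for g in z]

    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(z, z_means))
    within = sum((v - m) ** 2 for g, m in zip(z, z_means) for v in g)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: the first set has equal spread in every group,
# the second has one group with a much larger spread.
equal_spread = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
unequal_spread = [[4, 5, 6], [2, 7, 12], [9, 10, 11]]

w_equal = levene_statistic(equal_spread)
w_unequal = levene_statistic(unequal_spread)
print(w_equal, w_unequal)
```

The statistic grows as the spreads of the groups become more dissimilar, which is what pushes "Sig." below 0.05.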
Result: ANOVA Table
Sig. = 0.013 < α, thus at least one group has a different mean
Use post hoc tests to find the pairs with different means
Result: Post Hoc Tests
The pairs with Sig. < α have different means:
  Section 1 and 4
  Section 2 and 4
  Section 2 and 5
  Section 3 and 4
  Section 4 and 5
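A post hoc test compares every pair of groups, so the number of comparisons grows quickly with the number of groups. The slides do not say which post hoc procedure was used; the sketch below uses a Bonferroni adjustment purely as an illustration of why a correction is needed, and α = 0.05 is an assumed significance level.

```python
from itertools import combinations

sections = [1, 2, 3, 4, 5]   # the 5 sections from the example
alpha = 0.05                 # assumed overall significance level

# Every unordered pair of sections is one pairwise comparison
pairs = list(combinations(sections, 2))

# Bonferroni: split the overall alpha evenly across all comparisons
bonferroni_alpha = alpha / len(pairs)

print(f"{len(pairs)} pairwise comparisons")
print(f"per-comparison threshold: {bonferroni_alpha:.4f}")
for a, b in pairs:
    print(f"Section {a} vs Section {b}")
```

With 5 sections there are 10 pairs, so each unadjusted pairwise p-value would be compared against 0.005 rather than 0.05. SPSS post hoc tables typically report already-adjusted "Sig." values, which are compared with α directly.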
Two-way ANOVA
Used to determine the effect of two factors (independent variables) on one dependent variable
  Occupation: Student, Lecturer, Doctor
  Age: less than 20, 20-30, 31-40, 41 or older
  Salary: dependent variable
Assumptions
  The dependent variable is interval or ratio (continuous)
  The dependent variable is approximately normally distributed for each combination of levels of the two independent variables
  Homogeneity of variances across the groups formed by the different combinations of levels of the two independent variables
  Independence of cases
Two-way ANOVA Concept
Two-way ANOVA compares
  Means between columns
  Means between rows
  Means from the interaction of factors
Sum of Squares Row (SSR): variation due to the 1st factor
Sum of Squares Column (SSC): variation due to the 2nd factor
Sum of Squares Row Column (SSRC): variation due to the interaction of the two factors
Sum of Squares Error (SSE): error caused by external factors
Sum of Squares Total (SST) = SSR + SSC + SSRC + SSE
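The four-way split of the total sum of squares can be checked numerically. This is a minimal sketch that assumes a balanced design (the same number of observations in every cell); the function name `two_way_sums_of_squares`, the nested-list layout, and the toy data are all hypothetical.

```python
from statistics import mean


def two_way_sums_of_squares(cells):
    """cells[i][j] holds the observations for row level i and column level j (balanced design)."""
    r = len(cells)            # number of row levels (1st factor)
    c = len(cells[0])         # number of column levels (2nd factor)
    m = len(cells[0][0])      # observations per cell, assumed equal in every cell

    all_values = [x for row in cells for cell in row for x in cell]
    grand = mean(all_values)
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(c)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ssr = c * m * sum((rm - grand) ** 2 for rm in row_means)
    ssc = r * m * sum((cm - grand) ** 2 for cm in col_means)
    # Interaction: what is left of each cell mean after removing row and column effects
    ssrc = m * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r) for j in range(c)
    )
    sse = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(r) for j in range(c) for x in cells[i][j]
    )
    sst = sum((x - grand) ** 2 for x in all_values)
    return {"SSR": ssr, "SSC": ssc, "SSRC": ssrc, "SSE": sse, "SST": sst}


# Hypothetical toy data: 2 row levels x 2 column levels, 2 observations per cell
toy_cells = [
    [[10, 12], [14, 16]],
    [[20, 22], [30, 32]],
]
ss = two_way_sums_of_squares(toy_cells)
print(ss)
print("parts add up to SST:", abs(ss["SSR"] + ss["SSC"] + ss["SSRC"] + ss["SSE"] - ss["SST"]) < 1e-9)
```

The additive identity holds exactly only for balanced data; with unequal cell sizes, software such as SPSS uses other types of sums of squares, which this sketch does not cover.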
Two-way ANOVA Table
r: number of rows
c: number of columns
n: number of samples
df: degrees of freedom
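For reference, the usual degrees of freedom and F-ratios for a two-way table are sketched below, using the sums of squares SSR, SSC, SSRC, and SSE. This assumes a fixed-effects model with a balanced design and more than one observation per cell; the subscripted symbols are introduced here for illustration and are not taken from the slides.

```latex
\begin{align*}
df_{R} &= r - 1, & MSR &= \frac{SSR}{r - 1}, & F_{R} &= \frac{MSR}{MSE} \\
df_{C} &= c - 1, & MSC &= \frac{SSC}{c - 1}, & F_{C} &= \frac{MSC}{MSE} \\
df_{RC} &= (r - 1)(c - 1), & MSRC &= \frac{SSRC}{(r - 1)(c - 1)}, & F_{RC} &= \frac{MSRC}{MSE} \\
df_{E} &= n - rc, & MSE &= \frac{SSE}{n - rc} \\
df_{T} &= n - 1 = df_{R} + df_{C} + df_{RC} + df_{E}
\end{align*}
```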
Two-way ANOVA: SPSS
Analyze -> General Linear Model -> Univariate
  (Multivariate is MANOVA)
Add the dependent variable and two or more factors (independent variables)
Option -> tick “Homogeneity tests” (optionally “Descriptive”)
Plot -> add one factor (the one containing more groups) to “Horizontal Axis” and the other to “Separate Lines”, then click “Add” to obtain the profile plot
Post Hoc: find the pairs that have different means (similar to One-way ANOVA, optional)
https://statistics.laerd.com/spss-tutorials/two-way-anova-using-spss-statistics.php
Example Determine the effect of major and gender on the total score
Result
Compare Error to Corrected Total
  Error should be less than 20% of the corrected total
  Here Error is very large compared to the corrected total, so total score is affected by other external factors
Gender row: Sig. = 0.024 < α, gender has an effect on total score
Major row: Sig. = 0.575 > α, major has no significant effect on total score
Major*Gender row: Sig. = 0.298 > α, the interaction between the two factors has no significant effect on total score
Result: Profile Plot
Example Determine the effect of section and gender on the total score