Analysis of variance (ANOVA)-the General Linear Model (GLM)


Analysis of variance (ANOVA)-the General Linear Model (GLM) Kazimieras Pukėnas

INTRODUCTION Analysis of variance (ANOVA) is used to uncover the main and interaction effects of categorical independent variables (called "factors") on an interval dependent variable. The General Linear Model is "general" in the sense that one may implement both regression and ANOVA models. The GLM Univariate procedure provides regression analysis and analysis of variance for one dependent variable by one or more factors and/or variables. The factor variables divide the population into groups. Using this GLM procedure, you can test null hypotheses about the effects of other variables on the means of various groupings of a single dependent variable. You can investigate interactions between factors as well as the effects of individual factors. In addition, the effects of covariates and covariate interactions with factors can be included.

INTRODUCTION The GLM Multivariate procedure provides analysis of variance for multiple dependent variables by one or more factor variables or covariates. The GLM Repeated Measures procedure provides analysis of variance when the same measurement is made several times on each subject or case. If between-subjects factors are specified, they divide the population into groups. Using this general linear model procedure, you can test null hypotheses about the effects of both the between-subjects factors and the within-subjects factors. You can investigate interactions between factors as well as the effects of individual factors. In addition, the effects of constant covariates and covariate interactions with the between-subjects factors can be included.

GLM Univariate, one-way ANOVA One-way ANOVA tests differences in a single interval dependent variable among two, three, or more groups formed by the categories of a single categorical independent variable (factor). Data requirements: In all GLM models, the dependent variable(s) X1…Xk is/are continuous. The independent variables may be categorical factors (including both numeric and string types) or quantitative covariates. The data are a random sample from a normal population. The variance(s) of the dependent variable(s) is/are assumed to be the same for each cell formed by the categories of the factor(s). Analysis of variance is robust to departures from normality, although the data should be symmetric. To check the equal-variance assumption, you can use homogeneity of variances tests.

GLM Univariate, one-way ANOVA In brief, one-way ANOVA can be explained as follows. The idea of the analysis of variance is to take a summary of the variability in all the observations and partition it into separate sources. The total sum of squares, SST, is partitioned into two separate, additive pieces: a sum of squares among (between) groups, SSA, and a sum of squares within groups, SSW: $SST = SSA + SSW$, where $SST = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{ij}-\bar{X})^2$; $SSA = \sum_{i=1}^{k} n_i(\bar{X}_i-\bar{X})^2$; $SSW = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{ij}-\bar{X}_i)^2$;

GLM Univariate, one-way ANOVA $X_{ij}$ - the jth observation in the ith group; $\bar{X}$ - the overall mean of all samples; $\bar{X}_i$ - the sample mean for the ith group; k - the number of independent groups (populations); $n_i$ - the size of the ith group; $n = \sum_{i=1}^{k} n_i$ - the total number of observations. The mean squares are $MSA = SSA/(k-1)$ and $MSW = SSW/(n-k)$. The ratio $F = MSA/MSW$ serves as a measure of the statistical significance of the differences among the group means, because $MSA \approx MSW$ if the null hypothesis is true, i.e. $\mu_1 = \mu_2 = \dots = \mu_k$ (the homogeneity of variances is assumed);
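
The partition of the total sum of squares described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_way_anova` and the three small groups are made up for the example and do not come from the presentation. The p-value is deliberately left out, since it would need the F distribution with (k - 1, n - k) degrees of freedom.

```python
# Minimal sketch of the one-way ANOVA partition SST = SSA + SSW.
# The data are hypothetical illustration values, not the presentation's data.

def one_way_anova(groups):
    """Return the sums of squares, mean squares and F ratio for a list of groups."""
    k = len(groups)                               # number of groups
    n = sum(len(g) for g in groups)               # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n  # overall mean of all samples
    group_means = [sum(g) / len(g) for g in groups]

    # Sum of squares among (between) groups
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Sum of squares within groups
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Total sum of squares, computed directly from the observations
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    msa = ssa / (k - 1)   # mean square among groups
    msw = ssw / (n - k)   # mean square within groups
    return {"SST": sst, "SSA": ssa, "SSW": ssw, "MSA": msa, "MSW": msw, "F": msa / msw}


groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
result = one_way_anova(groups)
print(result)
```

Computing SST directly, rather than as SSA + SSW, makes it easy to check numerically that the two pieces really add up to the total.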

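GLM Univariate, one-way ANOVA The statistical hypotheses under consideration: $H_0: \mu_1 = \mu_2 = \dots = \mu_k$; $H_1$: not all $\mu_i$ are equal. Decision rule: The null hypothesis H0 is rejected (not all means are equal) if $F = MSA/MSW > F_{\alpha;\,k-1,\,n-k}$ (equivalently, if $p < \alpha$); The null hypothesis H0 is not rejected (no significant difference between the means is found) if $F \le F_{\alpha;\,k-1,\,n-k}$ (equivalently, if $p \ge \alpha$); where $\alpha$ is the significance level, k is the number of groups, n is the total number of observations, and $F_{\alpha;\,k-1,\,n-k}$ is the upper critical value of the F distribution with k-1 and n-k degrees of freedom;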

GLM Univariate, two-way ANOVA Two-way ANOVA analyzes one interval dependent in terms of the categories (groups) formed by two independents (factors), one of which may be conceived as a control variable, and tests the interaction of two independent variables. Data requirements are similar to one-way ANOVA: The data are a random sample from a normal population; In the population, all cell variances are the same; Analysis of variance is robust to departures from normality, although the data should be symmetric.

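GLM Univariate, two-way ANOVA The two-way ANOVA tests three hypotheses: the main effect for factor A; the main effect for factor B; the interaction effect of the two factors. For an interval-scale dependent variable with unknown cell means $\mu_{ij}$ ($i = 1, \dots, a$; $j = 1, \dots, b$) and common variance $\sigma^2$, where a is the number of categories of factor A and b is the number of categories of factor B, we can test the hypotheses: $H_0: \mu_{1\cdot} = \mu_{2\cdot} = \dots = \mu_{a\cdot}$, where the null hypothesis H0 is that factor A has no influence on the response variable;

GLM Univariate, two-way ANOVA $H_0: \mu_{\cdot 1} = \mu_{\cdot 2} = \dots = \mu_{\cdot b}$, where the null hypothesis H0 is that factor B has no influence on the response variable; $H_0: \mu_{ij} - \mu_{i\cdot} - \mu_{\cdot j} + \mu = 0$ for all i and j, where the null hypothesis H0 is that there is no interaction effect of the two factors; $\mu_{i\cdot} = \frac{1}{b}\sum_{j=1}^{b}\mu_{ij}$; $\mu_{\cdot j} = \frac{1}{a}\sum_{i=1}^{a}\mu_{ij}$; $\mu$ - overall mean; Each null hypothesis H0 is rejected if $p < \alpha$, where p is the p-value of the corresponding F test and $\alpha$ is the significance level;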
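
The three tests of the two-way ANOVA (factor A, factor B, and their interaction) can be sketched as follows. This is a simplified illustration, not the SPSS algorithm: it assumes a balanced design (the same number of observations, at least two, in every cell), and the function name `two_way_anova` and the 2 x 2 data set are hypothetical.

```python
# Minimal sketch of a balanced two-way ANOVA with interaction.
# Assumption: every cell has the same number r >= 2 of observations.
# The data are hypothetical illustration values.

def mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """cells[i][j] holds the observations for level i of A and level j of B."""
    a = len(cells)          # number of categories of factor A
    b = len(cells[0])       # number of categories of factor B
    r = len(cells[0][0])    # observations per cell (balanced design)

    cell_means = [[mean(cells[i][j]) for j in range(b)] for i in range(a)]
    a_means = [mean(cell_means[i]) for i in range(a)]                      # level means of A
    b_means = [mean([cell_means[i][j] for i in range(a)]) for j in range(b)]  # level means of B
    grand = mean(a_means)                                                  # overall mean

    ss_a = b * r * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * r * sum((m - grand) ** 2 for m in b_means)
    ss_ab = r * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )

    ms_error = ss_error / (a * b * (r - 1))
    return {
        "SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SS_error": ss_error,
        "F_A": (ss_a / (a - 1)) / ms_error,
        "F_B": (ss_b / (b - 1)) / ms_error,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / ms_error,
    }


# Two levels of A (rows), two levels of B (columns), two observations per cell
cells = [[[1.0, 3.0], [5.0, 7.0]],
         [[2.0, 4.0], [10.0, 12.0]]]
res = two_way_anova(cells)
print(res)
```

Each F ratio would then be compared with the F distribution using its own numerator degrees of freedom (a - 1, b - 1, and (a - 1)(b - 1)) and the error degrees of freedom ab(r - 1).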

GLM Univariate, two-way ANOVA The one-way and two-way ANOVA procedures in SPSS are performed in a similar manner; therefore, we present step-by-step instructions for performing a two-way ANOVA. Open the data file to be analyzed. From the menus choose: Analyze → General Linear Model → Univariate... Select a dependent variable in the Univariate dialog box (Fig. 1) and select variables for Fixed Factor(s), Random Factor(s), and Covariate(s), as appropriate for your data. Covariates are interval-level independent variables, commonly used as control variables: they allow the main and interaction effects of categorical variables on a continuous dependent variable to be tested while controlling for the effects of selected other continuous variables that covary with the dependent variable.

GLM Univariate, two-way ANOVA Fig. 1. Univariate dialog box

GLM Univariate, two-way ANOVA Leave the default Full Factorial model in the Univariate: Model dialog box, i.e. you can skip Model... and Contrasts...; Click Plots... and specify a plot by selecting factors for the horizontal axis and, optionally, factors for separate lines and separate plots in the Univariate: Profile Plots dialog box (Fig. 2); the plot must be added to the Plots list. A profile plot is a line plot in which each point indicates the estimated marginal mean of a dependent variable at one level of a factor. A profile plot of one factor shows whether the estimated marginal means are increasing or decreasing across levels. For two factors, parallel lines indicate that there is no interaction between the factors. Nonparallel lines indicate an interaction. Click Continue.

GLM Univariate, two-way ANOVA Fig. 2 . Univariate: Profile Plots dialog box

GLM Univariate, two-way ANOVA Click Post Hoc... to select post hoc tests in the Univariate: Post Hoc Multiple Comparisons for Observed Means dialog box (Fig. 3); Once you have determined that differences exist among the means and the factor has more than two levels, post hoc range tests and pairwise multiple comparisons can determine which means differ. The Bonferroni and Tukey’s honestly significant difference tests are commonly used multiple comparison tests. However, the Bonferroni test becomes overly conservative when the factor has many levels, because the number of pairwise comparisons grows quickly. Move the corresponding variables (factors) into the Post Hoc Tests for box, check Tukey’s test and click Continue.
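
The remark about the Bonferroni test and factors with many levels can be illustrated numerically. The sketch below assumes the usual Bonferroni rule, in which the overall significance level is divided by the number of pairwise comparisons. The function name `bonferroni_alpha` is made up for the example.

```python
# Minimal sketch: how the Bonferroni per-comparison significance level shrinks
# as the number of factor levels grows.
# Assumption: all k(k-1)/2 pairwise comparisons are tested.

def bonferroni_alpha(levels, alpha=0.05):
    """Return (number of pairwise comparisons, per-comparison alpha)."""
    comparisons = levels * (levels - 1) // 2
    return comparisons, alpha / comparisons


for levels in (3, 5, 10):
    m, adjusted = bonferroni_alpha(levels)
    print(f"{levels} levels: {m} comparisons, per-comparison alpha = {adjusted:.5f}")
```

With many levels the per-comparison threshold becomes very small, which is why a test such as Tukey's is often preferred when all pairs are compared.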

GLM Univariate, two-way ANOVA Fig. 3. Univariate: Post Hoc Multiple Comparisons for Observed Means dialog box

GLM Univariate, two-way ANOVA Click Options... At the top of the Univariate: Options dialog box (Fig. 4) you can ask for Estimated Marginal Means to be displayed, by moving the variables (factors and interactions) to the right-hand box Display Means for. This is used when you want to adjust the means to remove the effect of a covariate. When you do not have a covariate, the Estimated Marginal Means will be the same as the means from your sample, which are displayed using the Descriptive Statistics option at the bottom of the Univariate: Options dialog box.

GLM Univariate, two-way ANOVA Fig. 4. Univariate: Options dialog box

GLM Univariate, two-way ANOVA Click the box next to Estimates of effect size. Estimates of effect size gives a Partial Eta-Squared value for each effect and each parameter estimate. The eta-squared statistic describes the proportion of total variability attributable to a factor; SPSS reports the partial version, $\eta_p^2 = SS_{effect}/(SS_{effect} + SS_{error})$; Select Observed power. Observed power is the probability of finding a significant difference between groups in a sample of the given size, assuming that the difference between groups in the population equals the one observed in the sample. In other words, Observed power is the probability of correctly rejecting a false statistical null hypothesis and is equal to 1-β, where β is the probability of a Type II error. Conventionally, a test with a power greater than 0.8 (or β ≤ 0.2) is considered statistically powerful. Select Homogeneity tests. Homogeneity tests produces the Levene test of the homogeneity of variance for each dependent variable across all level combinations of the between-subjects factors, for between-subjects factors only.
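
As a small illustration of the effect-size option, the sketch below compares classical eta-squared (share of the total sum of squares) with partial eta-squared (share of the effect plus error sums of squares). The sums of squares are hypothetical numbers chosen for the example, and the function names are not SPSS identifiers.

```python
# Minimal sketch: classical versus partial eta-squared.
# The sums of squares are hypothetical values for a two-factor design.

def eta_squared(ss_effect, ss_total):
    """Proportion of the total variability attributable to the effect."""
    return ss_effect / ss_total


def partial_eta_squared(ss_effect, ss_error):
    """Proportion of (effect + error) variability attributable to the effect."""
    return ss_effect / (ss_effect + ss_error)


ss = {"A": 30.0, "B": 50.0, "AB": 5.0, "error": 70.0}
ss_total = sum(ss.values())

for effect in ("A", "B", "AB"):
    print(effect,
          round(eta_squared(ss[effect], ss_total), 3),
          round(partial_eta_squared(ss[effect], ss["error"]), 3))
```

In a design with several effects, partial eta-squared is never smaller than classical eta-squared, because the variability explained by the other effects is left out of the denominator.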

Example Data are gathered for individual swimmers in the senior swimming championship over several years. The time in which each swimmer finishes is the dependent variable. Other factors include the date of the championship and age (categorical). You might find that age and date of championship have significant effects and that the interaction of age with date is significant. It is assumed that different individuals participated in different championships, i.e., the samples are independent. A fragment of the data file is shown in Fig. 5. The following basic tables are obtained from the GLM Univariate output.

Example Fig. 5. Data View

Example The table Between-Subjects Factors (Fig. 6) contains general information about the independent variables (influence factors); Levene's test of homogeneity of variance is computed by SPSS to test the GLM Univariate assumption that each group (category) of the independent variable(s) has the same variance. In our example, the resulting p-value of Levene's test is greater than the significance level (0.05), as shown in the table Levene’s Test of Equality of Error Variances (Fig. 6). That is, the assumption of homogeneity of variances is met. Note that Levene’s test is robust to departures from normality.
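
To make the idea behind Levene's test concrete, the sketch below computes the classical Levene statistic as a one-way ANOVA F ratio applied to the absolute deviations of each observation from its group mean. It is an illustration only: the data are hypothetical, the helper names are made up, and the exact variant that SPSS reports should be checked against its documentation. The p-value is omitted because it requires the F distribution with (k - 1, n - k) degrees of freedom.

```python
# Minimal sketch of the classical (mean-based) Levene statistic.
# Hypothetical data; helper names are illustrative only.

def f_ratio(groups):
    """One-way ANOVA F ratio MSA/MSW for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssa / (k - 1)) / (ssw / (n - k))


def levene_statistic(groups):
    """F ratio computed on absolute deviations from each group's own mean."""
    abs_dev = []
    for g in groups:
        m = sum(g) / len(g)
        abs_dev.append([abs(x - m) for x in g])
    return f_ratio(abs_dev)


# Three groups with the same mean but visibly different spread
groups = [[4.0, 5.0, 6.0], [2.0, 5.0, 8.0], [0.0, 5.0, 10.0]]
w = levene_statistic(groups)
print(round(w, 4))
```

If SciPy is available, `scipy.stats.levene(*groups, center='mean')` should give the same statistic together with a p-value; that cross-check is a suggestion and is not part of the presentation.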

Example Fig. 6. The main outputs of GLM Univariate

Example The Tests of Between-Subjects Effects table (Fig. 7) gives us information about the main and interaction effects. This table shows that for the Age main effect, p (Sig.) = 0.000 (SPSS displays 0.000 when p < 0.001), with a Partial Eta-Squared effect size of 0.300 and an Observed Power of 1.000. Since p < 0.05, we reject H0: there is a significant Age main effect on the dependent variable, Time. The table also shows that for the Championship main effect, p = 0.000, with a Partial Eta-Squared effect size of 0.475 and an Observed Power of 1.000. Since p < 0.05, we reject H0: there is a significant Championship main effect on the dependent variable, Time. Finally, the table shows that for the Age*Championship interaction, p = 0.319. Since p > 0.05, we do not reject H0: the interaction between Age and Championship is not statistically significant.

Example Fig. 7. The main outputs of GLM Univariate

Example The table Estimated Marginal Means (Fig. 8) shows the mean of the dependent variable (Time) for each level of Age and Championship, along with the standard error of the estimate of the mean. The Post Hoc Tests Multiple Comparisons table (Fig. 9) for the Tukey test displays all pairwise comparisons between the groups of the independent variable Age. Significant differences in Time scores were found between the age groups 25-29 years and 35-39 years, and between the age groups 30-34 years and 35-39 years. No significant difference was found between the age groups 25-29 years and 30-34 years. Each comparison appears twice in the table (once in each direction), so all results are repeated.

Fig. 8. The main outputs of GLM Univariate

Example Fig. 9. The main outputs of GLM Univariate: Post Hoc Tests (Age)

Example The table Homogeneous Subsets (Fig. 10) also shows that there are two significantly different homogeneous subsets. Similar results are obtained across the levels of the second independent variable (factor), Championship (not shown here).

Example Fig. 10. The main outputs of GLM Univariate: Homogeneous Subsets

Example Profile plots are an easy way to visualize the relationship of the factors to the dependent variable and to each other. The profile plot Estimated Marginal Means (Fig. 11) shows the marginal means of the continuous dependent variable Time for the groups of the factor Championship, using the values of the other factor, Age, on the X axis (the Y axis is the magnitude of the mean). Nonparallel profile lines suggest an interaction effect between Championship and Age. However, a difference in the shape of the curves only suggests an interaction; the final conclusion must be based on the Tests of Between-Subjects Effects table.

Example Fig. 11. The main outputs of GLM Univariate