Elementary Statistical Methods
Lecture 24: One-way Analysis of Variance
André L. Souza, Ph.D.
The University of Alabama
www.andreluizsouza.com

The basic idea
So far, we have been comparing two samples:
- boys vs. girls
- drinkers vs. non-drinkers
- beautiful vs. ugly
What if we want to compare more than two samples? For example, suppose I want to investigate whether study time influences students' grades. In other words, I want to know whether studying two hours vs. four hours vs. six hours significantly changes students' grades.
One way to investigate this would be to conduct three separate experiments:
- two hours vs. four hours
- two hours vs. six hours
- four hours vs. six hours
Why is that not a good idea?

The basic idea
A better way to investigate my hypothesis is to perform a single experiment that tests all three groups at the same time.
Why not carry out a separate t-test for each pairwise comparison?
- The more groups you have, the more t-tests you will need.
- Each t-test is associated with an α-level (the probability of making a Type I error).
- Multiple t-tests will increase the overall probability of making a Type I error.
A common method for comparing several means while keeping the Type I error rate small is known as Analysis of Variance (ANOVA). ANOVA is a statistical technique for testing differences in the means of several groups.
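
To see how quickly the overall Type I error rate can grow, here is a minimal sketch. It assumes the pairwise tests are independent, which is only an approximation because pairwise t-tests share groups; the function name `familywise_error` is illustrative and does not come from the lecture.

```python
from math import comb


def familywise_error(n_groups, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one Type I error).

    Assumes independent tests, so the rate is 1 - (1 - alpha) ** n_tests.
    """
    n_tests = comb(n_groups, 2)          # every pair of groups gets one t-test
    rate = 1 - (1 - alpha) ** n_tests    # complement of "no false alarm at all"
    return n_tests, rate


for k in (3, 4, 5):
    n_tests, rate = familywise_error(k)
    print(f"{k} groups: {n_tests} t-tests, overall Type I error rate about {rate:.3f}")
```

With three groups and α = 0.05, the three separate tests already push the overall rate well above the nominal 0.05, which is the motivation for a single test.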

Analysis of Variance
Analysis of Variance is probably the most used statistical technique in psychological research. If the analysis uses only one independent variable (i.e., just one explanatory variable), it is known as one-way Analysis of Variance (or one-way ANOVA). Examples:
- Influence of hours of study (independent variable) on students' grades (dependent variable)
- Influence of type of beer (independent variable) on people's perceptions of beauty (dependent variable)
- Influence of a person's origin (independent variable) on his/her mate preferences (dependent variable)
The groups that make up the independent variable are known as the levels of that variable:
- Hours of study (two, four and six)
- Types of beer (Budweiser, Corona, Heineken)
- Person's origin (Alabama, Texas, Michigan)

Analysis of Variance
Analysis of Variance focuses on the variance of each sample. Variance indicates how far a set of numbers is spread out; it is the standard deviation squared.
The fundamental strategy in ANOVA is this: we take the total variance of all the scores and split it into two parts, variance caused by the independent variable and variance caused by chance (error variance). We then form a ratio between these two variances. If this ratio is significantly bigger than one, then the variation due to the independent variable is significantly larger than the variation due to chance.

Variance Partitioning
Total variability = variability caused by the IV + variability due to chance

Hypotheses in ANOVA
What would be the null (H0) and alternative (H1) hypotheses for Analysis of Variance?
As in the two-sample case, the null hypothesis states that the means of all groups are the same:
$H_0: \mu_1 = \mu_2 = \mu_3$
The alternative hypothesis is a bit trickier. We are interested in any possibility in which at least one mean is different from another mean:
$H_1: \text{not } H_0$

Hours of Study Example
Two Hours   Four Hours   Six Hours
60          71           83
58          77           79
57          73           85
61          69           90
68          75           93
How many means can you calculate from this dataset?
- The mean grade for the two-hours group
- The mean grade for the four-hours group
- The mean grade for the six-hours group
- A new mean called the Grand Mean, which is the mean of all the numbers together
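
The four means can be computed directly from the table. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from statistics import mean

# Grades from the Hours of Study table, one list per level of the IV.
grades = {
    "two hours": [60, 58, 57, 61, 68],
    "four hours": [71, 77, 73, 69, 75],
    "six hours": [83, 79, 85, 90, 93],
}

# One mean per group.
group_means = {level: mean(scores) for level, scores in grades.items()}

# The Grand Mean pools all fifteen grades, ignoring group membership.
all_scores = [score for scores in grades.values() for score in scores]
grand_mean = mean(all_scores)

for level, m in group_means.items():
    print(f"mean for {level}: {m:.2f}")
print(f"grand mean: {grand_mean:.2f}")
```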

Sum of Squares
In Analysis of Variance, we measure total variability using the Sum of Squares (SS). SS is the sum of all the squared deviations from the mean.
In one-way ANOVA there are three different types of Sum of Squares:
- Total Sum of Squares (SST)
- Treatment Sum of Squares (SSA)
- Error Sum of Squares (SSE)

Total Sum of Squares
This represents the total variability regardless of any specific treatment. In the hours of study example, SST is the amount that the grades vary regardless of how many hours the person studied.
You take each grade, subtract the Grand Mean from it, square the result, and add them all up.
GM = 73.27
$SST = (60 - 73.27)^2 + (58 - 73.27)^2 + \dots + (93 - 73.27)^2 = 1826.93$
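
As a quick check of the arithmetic, SST can be computed from the fifteen grades. This is a sketch with illustrative variable names; it uses the unrounded Grand Mean.

```python
# Grades from the Hours of Study example, one list per group.
grades = [
    [60, 58, 57, 61, 68],  # two hours
    [71, 77, 73, 69, 75],  # four hours
    [83, 79, 85, 90, 93],  # six hours
]

# Pool every grade, ignoring which group it came from.
all_scores = [score for group in grades for score in group]
grand_mean = sum(all_scores) / len(all_scores)

# SST: squared deviation of every grade from the Grand Mean, summed.
sst = sum((score - grand_mean) ** 2 for score in all_scores)

print(f"GM = {grand_mean:.2f}, SST = {sst:.2f}")
```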

Treatment Sum of Squares
This represents the variability between treatment means. Each group has its own mean, and this mean lies a certain distance from the Grand Mean. SSA measures this variability.
You replace each score by the mean of its group, subtract the Grand Mean from it, square the result, and add them all up. Because each group has five scores, each group mean enters the sum five times.
Two Hours   Four Hours   Six Hours
60.8        73           86
GM = 73.27
$SSA = 5(60.8 - 73.27)^2 + 5(73 - 73.27)^2 + 5(86 - 73.27)^2 = 1588.13$
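
The "replace each score by its group mean" step can be sketched as follows. Weighting each group mean by the group size is equivalent to replacing every score by its mean; variable names are illustrative.

```python
# Grades from the Hours of Study example, one list per group.
grades = [
    [60, 58, 57, 61, 68],  # two hours
    [71, 77, 73, 69, 75],  # four hours
    [83, 79, 85, 90, 93],  # six hours
]

all_scores = [score for group in grades for score in group]
grand_mean = sum(all_scores) / len(all_scores)
group_means = [sum(group) / len(group) for group in grades]

# SSA: each group mean's squared distance from the Grand Mean,
# counted once for every score in that group.
ssa = sum(
    len(group) * (m - grand_mean) ** 2
    for group, m in zip(grades, group_means)
)

print(f"group means = {group_means}, SSA = {ssa:.2f}")
```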

Error Sum of Squares
This represents the variability within each treatment group. Each group has its own mean, and each score in that group deviates from its group mean by a certain amount. SSE measures this variability.
You take each score, subtract the mean of its group from it, square the result, and add them all up within the group. Then you add together the individual SS of each group.
Mean for two-hours group = 60.8
Mean for four-hours group = 73
Mean for six-hours group = 86
$SSE = (60 - 60.8)^2 + (71 - 73)^2 + (83 - 86)^2 + \dots + (93 - 86)^2 = 238.80$
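
A sketch of the within-group computation, one SS per group and then their sum. Variable names are illustrative.

```python
# Grades from the Hours of Study example, keyed by group.
grades = {
    "two hours": [60, 58, 57, 61, 68],
    "four hours": [71, 77, 73, 69, 75],
    "six hours": [83, 79, 85, 90, 93],
}

# SS inside each group: deviations are taken from that group's own mean.
within_ss = {}
for level, scores in grades.items():
    group_mean = sum(scores) / len(scores)
    within_ss[level] = sum((score - group_mean) ** 2 for score in scores)

# SSE is the sum of the individual within-group SS values.
sse = sum(within_ss.values())

for level, ss in within_ss.items():
    print(f"SS within {level}: {ss:.2f}")
print(f"SSE = {sse:.2f}")
```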

Variance (SS) partitioning
You should have noticed that SSA + SSE = SST. This means that the total variability can be partitioned into variability that can be attributed to the treatment groups (i.e., whatever makes the groups different) and variability that cannot be attributed to the treatment groups (variability due to chance).
Now, think about the null hypothesis for a second. If there is no difference between the groups, we should expect the variability between groups to be about the same as the variability within groups.
But because these two variabilities (between and within) are not "equally" represented (one is built from a few group means, the other from many individual scores), we need to take the average of each of them before comparing.

Mean Squares
To obtain an average variability, we divide each SS by its degrees of freedom to obtain what we call a mean square.
- SSA depends on the number of groups in the experiment. To average SSA, we divide it by the degrees of freedom related to the number of groups: dfA = number of groups - 1. The result is MSA.
- SSE depends on the number of people in each group. To average SSE, we divide it by the degrees of freedom related to the number of people in each group: dfE = (number of people in each group - 1) × number of groups. The result is MSE.
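
For the hours of study data, the two mean squares can be sketched like this. The helper name `mean_squares` is illustrative, and the error degrees of freedom are written as total scores minus number of groups, which equals (people per group - 1) × number of groups when the groups are the same size.

```python
def mean_squares(groups):
    """Return (MSA, MSE, dfA, dfE) for a list of groups of scores."""
    scores = [x for group in groups for x in group]
    grand_mean = sum(scores) / len(scores)
    means = [sum(group) / len(group) for group in groups]

    # Between-group and within-group sums of squares.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_a = len(groups) - 1            # number of groups - 1
    df_e = len(scores) - len(groups)  # total scores - number of groups
    return ssa / df_a, sse / df_e, df_a, df_e


grades = [
    [60, 58, 57, 61, 68],  # two hours
    [71, 77, 73, 69, 75],  # four hours
    [83, 79, 85, 90, 93],  # six hours
]

msa, mse, df_a, df_e = mean_squares(grades)
print(f"dfA = {df_a}, dfE = {df_e}, MSA = {msa:.2f}, MSE = {mse:.2f}")
```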

MSA and MSE ratio
Remember that MSE represents the average variation within each group, and MSA the average variation between groups. If the groups are the same, we expect the variation within groups (MSE) to be the same as the variation between groups (MSA).
- If MSA and MSE are the same, we expect the ratio MSA/MSE to be 1.
- If MSA is larger than MSE, we expect the ratio MSA/MSE to be larger than 1.
- If MSA is smaller than MSE, we expect the ratio MSA/MSE to be smaller than 1.

Analysis of Variance
A statistical technique used to look for differences between more than two groups. It splits the total variability in the dataset into two parts:
- variability caused by treatments
- variability caused by chance
If there is no difference between the groups, these two amounts of variability should be the same. This is the logic behind Analysis of Variance.

ANOVA Table
Source      SS    df    MS    F
Treatment   SSA   dfA   MSA   MSA/MSE
Error       SSE   dfE   MSE
Total       SST
The ANOVA table is a summary of the SS and MS for a given experiment. The equality of the two variances is tested through the ratio between MSA and MSE. This ratio between variances is called F.

F-statistic
The F-statistic is obtained by dividing MSA by MSE. In Analysis of Variance, we reject H0 only if the computed value of F is significantly greater than 1.
How much larger than 1 does F need to be before we decide to reject H0?
- If H0 is true, F follows an F distribution with dfA and dfE degrees of freedom.
- Just as we did to find t-critical, we find F-critical by looking up its value in the F-table.
- If the F-statistic is larger than F-critical, we reject the H0 that states that all the means are the same.
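
The decision rule can be sketched for the hours of study data. Instead of an F-table, this sketch uses a closed form that holds only when the numerator degrees of freedom equal 2 (three groups): in that case P(F > x) = (1 + 2x/dfE)^(-dfE/2). The function names are illustrative; for other numbers of groups an F-table or a statistics library is needed.

```python
def f_critical_df1_2(df_error, alpha=0.05):
    """F-critical for 2 numerator df, from inverting P(F > x) = alpha."""
    return (df_error / 2) * (alpha ** (-2 / df_error) - 1)


def p_value_df1_2(f_stat, df_error):
    """Upper-tail probability of F with 2 numerator df."""
    return (1 + 2 * f_stat / df_error) ** (-df_error / 2)


grades = [
    [60, 58, 57, 61, 68],  # two hours
    [71, 77, 73, 69, 75],  # four hours
    [83, 79, 85, 90, 93],  # six hours
]

scores = [x for group in grades for x in group]
grand_mean = sum(scores) / len(scores)
means = [sum(group) / len(group) for group in grades]

ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(grades, means))
sse = sum((x - m) ** 2 for g, m in zip(grades, means) for x in g)
df_a, df_e = len(grades) - 1, len(scores) - len(grades)

f_stat = (ssa / df_a) / (sse / df_e)   # MSA / MSE
f_crit = f_critical_df1_2(df_e)        # alpha = 0.05
p_value = p_value_df1_2(f_stat, df_e)

print(f"F({df_a},{df_e}) = {f_stat:.2f}, F-critical = {f_crit:.2f}, p = {p_value:.6f}")
print("reject H0" if f_stat > f_crit else "fail to reject H0")
```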

F-Table

Example
I suspect that the brand of cellphone you have affects the number of texts you send per day. To test this, I have randomly selected:
- 5 users of the Galaxy S5
- 5 users of the Nexus 6
- 5 users of the iPhone 6

Example
Galaxy S5   Nexus 6   iPhone 6
10          94        33
12          44        21
15          34        23
23          69        18
32          77        10
If the cellphone model does not affect the number of texts a person sends per day, then we would expect:
$H_0: \mu_{Galaxy} = \mu_{Nexus} = \mu_{iPhone}$
If the cellphone model does have an influence, then:
$H_1: \text{not } H_0$

Total Sum of Squares
GM = 34.33
$SST = (10 - 34.33)^2 + (12 - 34.33)^2 + \dots + (10 - 34.33)^2 = 9441.33$

Treatment Sum of Squares
Group means:
Galaxy S5   Nexus 6   iPhone 6
18.4        63.6      21
GM = 34.33
$SSA = 5(18.4 - 34.33)^2 + 5(63.6 - 34.33)^2 + 5(21 - 34.33)^2 = 6440.9$

Error Sum of Squares
Galaxy = 18.4
Nexus = 63.6
iPhone = 21
$SSE = (10 - 18.4)^2 + (94 - 63.6)^2 + (33 - 21)^2 + \dots + (10 - 21)^2 = 3000.4$

ANOVA Table
Source       SS       df   MS        F
Cell Phone   6440.9    2   3220.5    12.88
Error        3000.4   12    250.03
Total        9441.3   14
The F-critical for this test is F(2,12) = 3.89 (α = 0.05).
- If the F-statistic is larger than F-critical, we reject $H_0: \mu_{Galaxy} = \mu_{Nexus} = \mu_{iPhone}$.
- If the F-statistic is not larger than F-critical, we fail to reject H0.
Because 12.88 is larger than 3.89, we reject H0 and conclude that cellphone model does influence the number of texts people send per day.
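
The whole table can be reproduced with a short sketch. The function name `one_way_anova` is illustrative. The data are read column by column from the example table, taking its fourth and fifth rows as (23, 69, 18) and (32, 77, 10), the values consistent with the reported group means (18.4, 63.6, 21) and sums of squares. The critical value 3.89 is the one quoted from the F-table, not computed here.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table as a dict."""
    scores = [x for group in groups for x in group]
    grand_mean = sum(scores) / len(scores)
    means = [sum(group) / len(group) for group in groups]

    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_a = len(groups) - 1
    df_e = len(scores) - len(groups)
    msa, mse = ssa / df_a, sse / df_e
    return {
        "SSA": ssa, "SSE": sse, "SST": ssa + sse,
        "dfA": df_a, "dfE": df_e,
        "MSA": msa, "MSE": mse, "F": msa / mse,
    }


texts_per_day = [
    [10, 12, 15, 23, 32],  # Galaxy S5
    [94, 44, 34, 69, 77],  # Nexus 6
    [33, 21, 23, 18, 10],  # iPhone 6
]

result = one_way_anova(texts_per_day)
F_CRITICAL = 3.89  # F(2, 12) at alpha = 0.05, as read from the F-table

print(f"{'Source':<12}{'SS':>10}{'df':>5}{'MS':>10}{'F':>8}")
print(f"{'Cell Phone':<12}{result['SSA']:>10.1f}{result['dfA']:>5}"
      f"{result['MSA']:>10.2f}{result['F']:>8.2f}")
print(f"{'Error':<12}{result['SSE']:>10.1f}{result['dfE']:>5}{result['MSE']:>10.2f}")
print(f"{'Total':<12}{result['SST']:>10.1f}{result['dfA'] + result['dfE']:>5}")
print("reject H0" if result["F"] > F_CRITICAL else "fail to reject H0")
```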

Elementary Statistical Methods Lecture 24 Thank you! André L. Souza, Ph.D. The University of Alabama www.andreluizsouza.com