Topic 12 – Further Topics in ANOVA


Topic 12 – Further Topics in ANOVA Unequal Cell Sizes (Chapter 20)

Overview We’ll start with the Learning Activity: more practice in interpreting ANOVA results, a baby-step into 3-way ANOVA, and an illustration of the problems that an unbalanced design will cause. We’ll then continue with a discussion of unbalanced designs (Chapter 20).

Collaborative Learning Activity Take your time going through this. Ask questions as needed!

Question 1 Analyze the design elements.

Design Chart Unequal cell sizes, but there is SOME balance achieved. Single-factor analyses will be balanced. Gender*Age = 6 observations per cell; Time*Age = 6 observations per cell; Gender*Time = unbalanced.

Question 2 Analyze Age*Time

Interaction Plot (ignoring gender)

Interpretations No interaction is evident between age and time. The middle age group seems to get generally higher offers. Offers during the week seem generally higher than on the weekend (this effect is not as big as the age effect).

Main Effects Plots

ANOVA Type I vs Type III?

LSMeans #3 (middle aged, weekday) is the highest. Using Tukey comparisons, it is significantly higher than all others. “Slicing” will show the same things that we guessed from the plots.

LSMeans (sliced)

Slicing of LSMeans Sums of Squares add to??? DF add to??? Effect of slicing is to look at differences for one of the two factors at a specific level of the other factor. Interpretations???
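For a balanced layout such as Age*Time (6 observations per cell), the two "add to" questions can be checked numerically. The sketch below uses made-up cell means (not the data from the activity) and the usual balanced-design formulas; it slices time within each age group and compares the totals with SS(Time) + SS(Age*Time) and the matching degrees of freedom.

```python
# Hypothetical cell means for a balanced Age x Time layout (n = 6 per cell).
# Rows: young, middle, elderly; columns: weekday, weekend. Values are made up.
n = 6
cell_means = [
    [20.0, 19.0],  # young
    [24.0, 22.0],  # middle
    [19.0, 19.0],  # elderly
]
a = len(cell_means)      # number of age levels
b = len(cell_means[0])   # number of time levels

age_means = [sum(row) / b for row in cell_means]
time_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
grand = sum(age_means) / a

# Main-effect SS for time and interaction SS (balanced-design formulas).
ss_time = n * a * sum((m - grand) ** 2 for m in time_means)
ss_inter = n * sum(
    (cell_means[i][j] - age_means[i] - time_means[j] + grand) ** 2
    for i in range(a)
    for j in range(b)
)

# One slice per age level: SS for time within that age group.
slice_ss = [
    n * sum((cell_means[i][j] - age_means[i]) ** 2 for j in range(b))
    for i in range(a)
]
slice_df = [b - 1] * a

print("slice SS:", slice_ss, "sum =", sum(slice_ss))
print("SS(Time) + SS(Age*Time) =", ss_time + ss_inter)
print("slice df sum =", sum(slice_df),
      "; df(Time) + df(Age*Time) =", (b - 1) + (a - 1) * (b - 1))
```

In this made-up case the sliced SS and df add up to the time main effect plus the interaction, which is why a slice looks at one factor at a specific level of the other.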

Question 3 Analyze Age*Gender

Interaction Plot (ignoring Time)

Interpretations A small interaction is seen; it might be described as follows: There is still a clear main effect: the middle aged get higher offers in general. There seem to be no gender differences for the middle aged or young. For the elderly, women may be getting lower offers than men.

LSMeans (sliced comparisons)

ANOVA / LSMeans Only age differences show up in the ANOVA. “Sliced” LSMeans comparisons do pick up a gender difference within the elderly. Note: Type I error rate is uncontrolled. But on the other hand sample sizes are also fairly small. Conclusions?

Question 4 Analyze Time*Gender

Interaction Plot (ignoring age)

Interpretations Seems to be a clear interaction: For men, there is not much difference in the offer between weekday/weekend. Women should go on the weekdays, where it seems they average about $400 more. Interestingly, significance is not seen in the ANOVA table, but is seen in the ‘sliced’ LSMeans output. Remember Type I Error is uncontrolled.

ANOVA Table Why are Type I / Type III SS different here?

Sliced LSMeans

Conclusions This is an intriguing example, because the ANOVA output would lead you to believe there is a small time effect, but no gender effect. Looking at the interaction plot presents a completely different picture (and likely a more accurate one). Let’s reconsider that, showing the sample sizes.

Interaction Plot (ignoring age)

Confounding This picture illustrates how the effects of gender and time will be confounded. Suppose that women do get lower offers than men in general. Then because the women received more weekend offers (and men more offers on weekdays), the average offer on the weekend will by default be lower than the weekday. Simple example: Suppose men get $2 and women get $1. Then with the sample sizes, the weekday average will be 30/18 while the weekend average will be only 24/18.
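The arithmetic behind the simple example can be written out. The cell counts used here (12 men and 6 women on weekdays, 6 men and 12 women on weekends) are inferred from the stated totals of 30 and 24 over 18 offers each; they are an assumption, not read from the design chart.

```latex
\bar{y}_{\text{weekday}} = \frac{12(2) + 6(1)}{18} = \frac{30}{18} \approx 1.67,
\qquad
\bar{y}_{\text{weekend}} = \frac{6(2) + 12(1)}{18} = \frac{24}{18} \approx 1.33
```

Time has no effect on anyone's offer in this example, yet the weekday average is higher simply because more of the weekday offers went to men.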

Questions 5 & 6 3-way ANOVA: Is Gender Important?

Modeling Removing unimportant terms (starting at the interaction level) seems like a reasonable way to go. Use Type III SS to do this since cell sizes are not the same. The procedure leads to a model containing only Age and Time, suggesting that gender is unimportant. But we know this may not be accurate since gender/time are confounded.

Confounding What exactly does it mean to say that the time/gender effects are confounded? The biggest thing that it means is that the analysis we just did is inappropriate, since the time effect may have been seen because more women went on the weekend. It may well be a gender effect that is disguised as a time effect due to the unbalanced design. Due to the lack of balance, we were forced to use Type III SS, which (due to collinearity / confounding) may not tell the whole story.

Importance of Gender? Probably! Direct algorithmic analysis suggests both time and age are important, while gender is not. But due to confounding, that wasn’t really appropriate. The plot for time*gender indicates what is probably the real story (due to small sample sizes it is hard to get significance). With a balanced design – we would be much better off. The effects would not be confounded, and we could therefore see an accurate picture.

Importance of Gender? (2) Differing sample sizes mean that estimates for women on weekdays, and men on weekends, will have larger standard errors. This will reduce our power to detect differences, and the effects will “overlap” to some extent because of the unequal sample sizes. When we looked at the gender*time interaction, the plot suggested there was an important one. Further studies should be conducted to determine if this is the case.

Unbalanced Two-Way ANOVA Unequal Cell Sizes (Chapter 20 – skim only)

Differing Cell Sizes Encountered for a variety of reasons including: Convenience – usually if we have an observational study, we have very little control over the cell sizes. Cost Effectiveness – sometimes the cost of samples is different, and we may use larger sample sizes when the cost is less. Accident – in experimental studies, you may start with a balanced design, but lose that balance if some problem occurs.

Differing Cell Sizes (2) What changes? Loss of balance brings “intercorrelation” among the predictors. Type I and III SS will be different; typically Type III SS should be used for testing but as we have seen even that is not perfect! Standard errors for cell means and for multiple comparisons will be different (they depend on the cell size). For the same reason, confidence intervals will have different widths.
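A minimal sketch of why the standard errors differ, assuming the usual formulas sqrt(MSE/n) for a cell mean and sqrt(MSE(1/n1 + 1/n2)) for a difference of two cell means. The MSE value and the cell sizes (6 and 12) are hypothetical placeholders, not output from any analysis in these notes.

```python
import math

def se_cell_mean(mse, n):
    """Standard error of a cell mean based on n observations."""
    return math.sqrt(mse / n)

def se_difference(mse, n1, n2):
    """Standard error of the difference between two cell means."""
    return math.sqrt(mse * (1 / n1 + 1 / n2))

mse = 1.0             # hypothetical mean squared error
small, large = 6, 12  # hypothetical cell sizes

se_small = se_cell_mean(mse, small)
se_large = se_cell_mean(mse, large)

print("SE, n = 6: ", round(se_small, 4))
print("SE, n = 12:", round(se_large, 4))
print("ratio:", round(se_small / se_large, 4))  # sqrt(12 / 6) = sqrt(2)
print("SE of difference (6 vs 12):", round(se_difference(mse, small, large), 4))
```

Because a confidence interval's half-width is a critical value times the standard error, the cell with half as many observations gets an interval about 1.41 times as wide when the same critical value is used.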

Example Examine the effects of gender (A) and anxiety level (B) on a toxin level in the bloodstream. Three categories of anxiety (Severe, Moderate, and Mild). We categorize people on this basis after they are in the study (it is an observational factor). For cost effectiveness, we wouldn’t want to throw away data just to keep a balanced design.

Data

Interaction Plot

Interpretation Effect seems to be greater if anxiety is more severe. This is an interaction of the “enhancement type”. The effect of anxiety level on toxin levels is greater for women than it is for men. Remember, we aren’t saying anything about significance here – we’ll do that when we look at the ANOVA.

ANOVA Output

Type I / III SS

Differences in Type I / III SS The more unbalanced the design, the further apart these may be. There are actually four types of SS: I – Sequential; II – Added Last (Observation); III – Added Last (Cell); IV – Added Last (Empty Cells).

Type I SS Sequential Sums of Squares; Most appropriate for equal cell sizes. SS(A), SS(B|A), SS(A*B|A,B) Each observation is weighted equally. So the net result for an unbalanced design is that some treatments will be considered with greater weight than others.
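To see how sequential SS depend on the order of entry when the design is unbalanced, here is a small sketch using a hypothetical noise-free data set: men always get $2, women always get $1, and time has no effect. The cell sizes (12/6/6/12) are an assumption. The helper names are made up for illustration, and the adjusted SS is computed by removing group means of the first factor before fitting the second, which is valid here because the second factor has only two levels.

```python
from fractions import Fraction

# Hypothetical noise-free offers: men $2, women $1, no time effect at all.
# Cell sizes (12/6/6/12) are an assumption, chosen to be unbalanced.
rows = (
    [("men", "weekday", 2)] * 12 + [("women", "weekday", 1)] * 6 +
    [("men", "weekend", 2)] * 6 + [("women", "weekend", 1)] * 12
)
y = [Fraction(offer) for _, _, offer in rows]
gender = [g for g, _, _ in rows]
time = [t for _, t, _ in rows]

def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

def group_means(v, groups):
    """Mean of v within each level of a grouping factor."""
    return {lvl: mean(vi for vi, gi in zip(v, groups) if gi == lvl)
            for lvl in set(groups)}

def ss_alone(y, factor):
    """SS for a factor entered first (only the intercept precedes it)."""
    grand, means = mean(y), group_means(y, factor)
    return sum((means[f] - grand) ** 2 for f in factor)

def ss_after(y, factor, other):
    """SS for a two-level factor entered after another factor."""
    levels = sorted(set(factor))
    x = [Fraction(int(f == levels[0])) for f in factor]  # 0/1 indicator
    # Remove the part of y and x explained by the other factor.
    ym, xm = group_means(y, other), group_means(x, other)
    ry = [yi - ym[o] for yi, o in zip(y, other)]
    rx = [xi - xm[o] for xi, o in zip(x, other)]
    sxy = sum(p * q for p, q in zip(rx, ry))
    sxx = sum(p * p for p in rx)
    return sxy ** 2 / sxx

print("Time first:   SS(Time) =", ss_alone(y, time),
      " SS(Gender|Time) =", ss_after(y, gender, time))
print("Gender first: SS(Gender) =", ss_alone(y, gender),
      " SS(Time|Gender) =", ss_after(y, time, gender))
```

Entered first, time picks up a nonzero SS that belongs entirely to gender; entered after gender, its SS is zero. With equal cell sizes the two orders would agree.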

Type II SS Each effect is adjusted for all other effects that do not contain it; generally only used for regression because again each observation is weighted equally. SS(A|B), SS(B|A), SS(A*B|A,B)

Type III SS Variable Added Last SS, appropriate for unequal cell sizes. Type III SS adjusts for the fact that cell sizes are different. Each cell is weighted equally, with the result that treatments are weighted equally. This means that observations in “smaller” cells will carry more weight. SS(A|B,A*B), SS(B|A,A*B), SS(A*B|A,B)
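The difference between weighting observations equally and weighting cells equally can be sketched with marginal means. The layout below is hypothetical (men $2, women $1, no time effect, assumed cell sizes 12/6/6/12), and the function names are made up for illustration.

```python
from fractions import Fraction

# Hypothetical layout: (time, gender) -> (cell size, cell mean).
# Men always get $2 and women $1, so time has no real effect.
cells = {
    ("weekday", "men"): (12, Fraction(2)),
    ("weekday", "women"): (6, Fraction(1)),
    ("weekend", "men"): (6, Fraction(2)),
    ("weekend", "women"): (12, Fraction(1)),
}

def observation_weighted(time):
    """Marginal mean with every observation weighted equally."""
    total = sum(n * m for (t, _), (n, m) in cells.items() if t == time)
    count = sum(n for (t, _), (n, _m) in cells.items() if t == time)
    return total / count

def cell_weighted(time):
    """Marginal mean with every cell weighted equally."""
    means = [m for (t, _), (_n, m) in cells.items() if t == time]
    return sum(means) / len(means)

for t in ("weekday", "weekend"):
    print(t, "observation-weighted:", observation_weighted(t),
          " cell-weighted:", cell_weighted(t))
```

The observation-weighted means differ between weekday and weekend only because the gender mix differs; the cell-weighted means are identical, which is the sense in which treatments are weighted equally.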

Type IV SS Variable Added Last SS and similar to Type III SS but further allows for the possibility of empty cells. It is only necessary to use these if there are empty cells (which hopefully there won’t be if you’ve designed the experiment well). SS(A|B,A*B), SS(B|A,A*B), SS(A*B|A,B)

General Strategy Remember that Type I SS and Type III SS examine different null hypotheses. Type III SS are preferred when sample sizes are not equal, but can be somewhat misleading if sample sizes differ greatly. Type IV SS are appropriate if there are empty cells. Type IV SS can be obtained if necessary by using /ss4 in the MODEL statement.

Example (continued) The interaction is unimportant, and there is no apparent large effect of gender. Now look at comparing different levels of anxiety; we should not ‘change’ models at this point, so just average over gender (LSMeans).

LSMeans Must use LSMeans to adjust all means to the same “average level” of gender.

Comparisons The mild group has significantly lower toxin levels than the moderate and severe groups.

Confidence Intervals Could get CI’s for means and/or differences if you wanted them. They will be of different widths – why? It will be harder to detect differences for groups with fewer observations.
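One way to see where the different widths come from, assuming the full two-factor model with interaction is retained so that the LSMean for anxiety level j is the equal-weight average of the two gender cell means. Here n_{1j} and n_{2j} are the two gender cell sizes at anxiety level j, and n_T is the total sample size; the error degrees of freedom are n_T minus the 6 cells.

```latex
\hat{\mu}_{\cdot j} = \tfrac{1}{2}\left(\bar{y}_{1j} + \bar{y}_{2j}\right),
\qquad
s\{\hat{\mu}_{\cdot j}\} = \sqrt{\frac{MSE}{4}\left(\frac{1}{n_{1j}} + \frac{1}{n_{2j}}\right)},
\qquad
\hat{\mu}_{\cdot j} \pm t\left(1 - \alpha/2;\; n_T - 6\right)\, s\{\hat{\mu}_{\cdot j}\}
```

The cell sizes enter through the reciprocals, so an anxiety level with small gender cells has a larger standard error and a wider interval.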

Questions?

Upcoming in Topic 13... Random Effects (parts of chapters 17 & 19 that were previously skipped)