Chapter 9 Comparing More than Two Means. Review of Simulation-Based Tests  One proportion:  We created a null distribution by flipping a coin, rolling.

Slides:



Advertisements
Similar presentations
Chapter 7 Hypothesis Testing
Advertisements

Regression and correlation methods
ANALYZING MORE GENERAL SITUATIONS UNIT 3. Unit Overview  In the first unit we explored tests of significance, confidence intervals, generalization, and.
Hypothesis Testing A hypothesis is a claim or statement about a property of a population (in our case, about the mean or a proportion of the population)
Decision Errors and Power
Paired Data: One Quantitative Variable Chapter 7.
A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true
Significance Testing Chapter 13 Victor Katch Kinesiology.
The Simple Regression Model
Stat 512 – Day 8 Tests of Significance (Ch. 6). Last Time Use random sampling to eliminate sampling errors Use caution to reduce nonsampling errors Use.
Chapter 9 Hypothesis Testing.
BCOR 1020 Business Statistics Lecture 20 – April 3, 2008.
5-3 Inference on the Means of Two Populations, Variances Unknown
Review for Exam 2 Some important themes from Chapters 6-9 Chap. 6. Significance Tests Chap. 7: Comparing Two Groups Chap. 8: Contingency Tables (Categorical.
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Hypotheses STAT 101 Dr. Kari Lock Morgan SECTION 4.1 Statistical test Null and alternative.
Review Tests of Significance. Single Proportion.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 12 Analyzing the Association Between Quantitative Variables: Regression Analysis Section.
Chapter 13: Inference in Regression
Chapter 8 Hypothesis testing 1. ▪Along with estimation, hypothesis testing is one of the major fields of statistical inference ▪In estimation, we: –don’t.
QNT 531 Advanced Problems in Statistics and Research Methods
Section 10.1 ~ t Distribution for Inferences about a Mean Introduction to Probability and Statistics Ms. Young.
Copyright © 2013, 2010 and 2007 Pearson Education, Inc. Chapter Inference on the Least-Squares Regression Model and Multiple Regression 14.
More About Significance Tests
Confidence Intervals for the Regression Slope 12.1b Target Goal: I can perform a significance test about the slope β of a population (true) regression.
1 Today Null and alternative hypotheses 1- and 2-tailed tests Regions of rejection Sampling distributions The Central Limit Theorem Standard errors z-tests.
Comparing Two Proportions
+ Chapter 12: Inference for Regression Inference for Linear Regression.
1 rules of engagement no computer or no power → no lesson no SPSS → no lesson no homework done → no lesson GE 5 Tutorial 5.
+ Chapter 12: More About Regression Section 12.1 Inference for Linear Regression.
CHAPTER 9 Testing a Claim
Large sample CI for μ Small sample CI for μ Large sample CI for p
Section 9-1: Inference for Slope and Correlation Section 9-3: Confidence and Prediction Intervals Visit the Maths Study Centre.
Essential Statistics Chapter 141 Thinking about Inference.
Introduction to Inferece BPS chapter 14 © 2010 W.H. Freeman and Company.
10.1: Confidence Intervals Falls under the topic of “Inference.” Inference means we are attempting to answer the question, “How good is our answer?” Mathematically:
The z test statistic & two-sided tests Section
Copyright © 2010 Pearson Education, Inc. Chapter 22 Comparing Two Proportions.
Chapter 20 Testing Hypothesis about proportions
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 14 Comparing Groups: Analysis of Variance Methods Section 14.1 One-Way ANOVA: Comparing.
Introduction to Inference: Confidence Intervals and Hypothesis Testing Presentation 4 First Part.
CHAPTER 15: Tests of Significance The Basics ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture Presentation.
Ex St 801 Statistical Methods Inference about a Single Population Mean.
Agresti/Franklin Statistics, 1 of 88 Chapter 11 Analyzing Association Between Quantitative Variables: Regression Analysis Learn…. To use regression analysis.
Welcome to MM570 Psychological Statistics
AP Statistics Section 11.1 B More on Significance Tests.
Business Statistics for Managerial Decision Farideh Dehkordi-Vakil.
Hypothesis test flow chart frequency data Measurement scale number of variables 1 basic χ 2 test (19.5) Table I χ 2 test for independence (19.9) Table.
A 10 Step Quest Every Time!. Overview When we hypothesis test, we are trying to investigate whether or not a sample provides strong evidence of something.
Chapter 13 Understanding research results: statistical inference.
SUMMARY EQT 271 MADAM SITI AISYAH ZAKARIA SEMESTER /2015.
Uncertainty and confidence Although the sample mean,, is a unique number for any particular sample, if you pick a different sample you will probably get.
BIOL 582 Lecture Set 2 Inferential Statistics, Hypotheses, and Resampling.
CHAPTER 15: Tests of Significance The Basics ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture Presentation.
Statistical Inference for the Mean Objectives: (Chapter 8&9, DeCoursey) -To understand the terms variance and standard error of a sample mean, Null Hypothesis,
10.2 Comparing Two Means Objectives SWBAT: DESCRIBE the shape, center, and spread of the sampling distribution of the difference of two sample means. DETERMINE.
Class Seven Turn In: Chapter 18: 32, 34, 36 Chapter 19: 26, 34, 44 Quiz 3 For Class Eight: Chapter 20: 18, 20, 24 Chapter 22: 34, 36 Read Chapters 23 &
9.3 Hypothesis Tests for Population Proportions
CHAPTER 9 Testing a Claim
P-values.
Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.
Simulation-Based Approach for Comparing Two Means
Hypothesis Testing: Hypotheses
Week 11 Chapter 17. Testing Hypotheses about Proportions
Two-sided p-values (1.4) and Theory-based approaches (1.5)
Chapter 10 Analyzing the Association Between Categorical Variables
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
Presentation transcript:

Chapter 9 Comparing More than Two Means

Review of Simulation-Based Tests  One proportion:  We created a null distribution by flipping a coin, rolling a die, or some computer simulation.  We then found where our sample proportion was in this null distribution.

Simulation-Based Tests  Comparing two proportions: Assuming there was no was no association between explanatory and response variables (the difference in proportions is zero), we shuffled cards and dealt them into two piles. (This essentially scrambled the response variable.) We then calculated the difference in proportions many times and built a null distribution. We finally found where the difference in our original sample proportions was located in the null distribution.

Simulation-Based Tests  Comparing two means: Assuming there was no relationship between explanatory and response variables (so the difference in means should be zero), we scrambled the response variable and calculated the difference in means many times and built a null distribution. We the found where the difference in our original two sample means was located in the null distribution.

Simulation-Based Tests  Paired Test: Assuming there was no relationship between the explanatory and response variables (so the mean difference should be zero), we randomly switched some of the pairs and calculated the mean of the differences many times and built a null distribution. We then found where the original mean of the differences from the sample was located in the null distribution.

Simulation-Based Tests  Comparing more than two proportions: Assuming there was no was no association between explanatory and response variables (all the proportions are the same), we scrambled the response variable and calculated the MAD statistic (or χ 2 statistic) many times and built a null distribution. We finally found where the original MAD or χ 2 statistic from our sample was located in the null distribution.

Two more types of tests  We now want to compare multiple means (more than two).  In chapter 10 we will look at an association between two quantitative variables using correlation and regression.  Both of these processes are basically the same as most of the simulation-based tests we have already done. Just the data types and the statistic we use is different.

Follow up tests  In the last chapter, we tested multiple proportions and if we found significance, we followed this up with by calculating confidence intervals to find out exactly which proportions were different.  Why didn’t we just start out finding a number of confidence intervals?  Let’s go through the following example to answer this and introduce tests for multiple means.

Section 9.1. Comparing Multiple Means: Simulation-Based Approach  Suppose we wanted to compare how much various energy drinks increased people’s pulses.  We would end up with a number of means. (Caffiene amounts shown are mg per 12 oz.)

Controlling for Type I Error  We could do this with multiple tests where we compared two means at a time, but If we were comparing 3 means, we would have to use 3 two-sample tests to compare these three means. (A vs B, B vs C, and A vs C)  If each test has a 5% significance level, there’s a 5% chance making a Type I Error. (We can call this a false alarm. Rejecting the null when it is true. There really is no difference between our groups and we got a result out in the tail just by chance alone.)

Controlling for Type I Error  These type I errors “accumulate” when we do more tests on the same data. At the 5% significance level, the probability of making at least one type I error for three test would be 14%. Comparing 4 means (6 tests), this jumps to 26%. Comparing 5 means (10 tests), this jumps to 40%.  An alternative approach uses one over-all test that compares all means at once.

Overall Test  We used one overall test in the last chapter when we compared proportions and we will do the same for comparing means.  If I have two means to compare, we just need to look at their difference to measure how far apart they are.  Suppose we wanted to compare three means. How could I create something that would measure how different all three means are?

A measure to compare 3 means  We will use the same MAD statistic as before, but this time look at the mean absolute differences for averages.  MAD = (|avg1 – avg2|+|avg2 – avg3|+ |avg3 – avg1|)/3  Let’s try this on an example!

Comprehension Example  Students were read an ambiguous prose passage under one of the following conditions: Students were given a picture that could help them interpret the passage before they heard it. Students were given the picture after they heard the passage. Students were not shown any picture before or after hearing the passage.  They were then tested on their comprehension of the passage.

Comprehension Example  This experiment is a partial replication done here at Hope of a study done by Bransford and Johnson (1972).  The students were randomly assigned to one of the three groups.  They listened to the passage with either a picture before, a picture after, or neither.

Hypotheses  Null: In the population there is no association between whether or when a picture was shown and comprehension of the passage  Alternative: In the population there is an association between whether and when a picture was shown and comprehension of the passage

Hypotheses  Null: All three of the long term mean comprehension scores are the same. µ no picture = µ picture before = µ picture after  Alternative: At least one of the mean comprehension scores is different.

Results Means

Finding the measure MAD = (|3.21−4.95|+|3.21−3.37|+|4.95−3.37|)/3 = ( )/3 = 3.48/3 =  What is the likelihood of this happening by chance if there were really no difference in comprehension between the three groups?  What types of values (e.g., large, small, positive, negative) of this statistic will give evidence against the null hypothesis?

Let’s test this  Get the data from the website.  Go to the Analyzing Quantitative Response Applet and paste in the data.  Run the test. This applet is very similar to the one we used in the previous chapter.

Conclusion  Since we have a small p-value we can conclude at least one of the mean comprehension scores is different.  Can we tell which one or ones?  Go back to dotplots and take a look.  We can do pairwise confidence intervals to find which means are significantly different than the other means and will do that in the next section.

Expl 9.1: Exercise and Brain Volume  Brain size usually shrinks as we age and such shrinkage may be linked to dementia.  Can we do something to protect against this shrinkage?  A study done in China randomly assigned elderly volunteers to one four groups: tai chi, walking, social interaction, none.  Percentage of brain size increase or decrease was calculated after the study.

Section 9.2 Theory-Based Approach to Compare Multiple Means (ANalysis Of Variance ANOVA)

ANOVA  Like in chapter 8 when we compared multiple proportions, we need a statistic other than the MAD to make the transition to theory- based a smooth one.  This new statistic is called an F statistic and the theory-based distribution that estimates our null distribution is called an F distribution.  Unlike the MAD statistic, the F statistic takes into account the variability within each group.

F test statistic  The analysis of variance F test statistic is:

F test statistic  Remember measures of variation are always non- negative. (Our measure of variation can be zero when all values in the data set are the same.)  So our F statistic is also non-negative

F test statistic  Remember our ambiguous prose example? The researchers also had the students take a recall test several hours later to measure how well they could recall the content of the passage.

The difference in means matters and so does the individual group’s variation.  Original recall data on the left, hypothetical recall data on the right. Variation between groups is the same, variation within groups is different. How will this affect the F test statistic?

Hypotheses  Null: All three of the long-run mean recall scores for students under the different conditions are the same. (No association)  Alternative: At least one of the long-run mean recall scores for students under the different conditions is different. (Association)

Theory-Based ANOVA test  Just as with the simulation-based method, we are assuming we have independent groups.  Two extra conditions must be met to use traditional ANOVA: Normality: If sample sizes are small within each group, data shouldn’t be very skewed. If it is, use simulation approach Equal variation: Standard deviations of each group should be within a factor of 2 of each other

F test statistic  Are these conditions met for our recall data?

Let’s run the test  Let’s get the Recall data and run the test using the same applet we used last time (Analyzing Quantitative Response)  Let’s do simulation using the MAD statistic as well as the F statistic.  Then do theory-based methods using ANOVA.  If we get a small p-value, we will follow this overall test up with confidence intervals to determine exactly where the difference occurs.

Conclusion  Since we have a small p-value we have strong evidence against the null and can conclude at least one of the long-run mean recall scores is different.  From our confidence intervals,  After - Before: (-4.05, -1.74)*  After - None: (-2.42, -0.11)*  Before - None: (0.4756, )*  We can see that each is significant so µ picture after ≠ µ picture before µ pictureafter ≠ µ no picture µ picture before ≠ µ no picture

Strength of Evidence  As sample size increases, strength of evidence increases.  As the means move farther apart, strength of evidence increases. (This is the variability between groups.)  As the standard deviations increase, strength of evidence decreases. (This is the variability within groups.)

Exploration Exploration 9.2: Comparing Popular Diets