Week 10 Comparing Two Means or Proportions

Presentation transcript:

Week 10 Comparing Two Means or Proportions

Generalising from sample
Individuals | Measurement | Groups | Question
Children aged 10 | Mark in maths test | Boys & girls | Are male marks higher on average?
Plots in field | Yield of wheat | Varieties A & B | Which gives higher yields?
Cars leaving production line | CO emissions from exhaust | Production lines 1 & 2 | Are both lines the same?

Generalising from sample
Individuals | Measurement | Groups | Question
Children aged 10 | Pass/fail in maths test | Boys & girls | Are males more likely to pass?
Cabbages in field | Infected by cabbage butterfly | Varieties A & B | Which is less likely to be infected?
Cars leaving production line | Rattle in exhaust | Production lines 1 & 2 | Do both lines have the same chance of rattle?

Numerical measurements: means
What is the difference in average weight loss for those who diet compared to those who exercise to lose weight?
What difference is there between the mean foot lengths of men and women?
Population parameter: $\mu_2 - \mu_1$ = difference between population means
Sample estimate: $\bar{x}_2 - \bar{x}_1$ = difference between sample means

Categorical measurements: proportions
What is the difference between the proportions that would quit smoking if taking the antidepressant bupropion (Zyban) versus wearing a nicotine patch?
What is the difference between the proportions who have heart disease among men who snore and men who don't snore?
Population parameter: $\pi_2 - \pi_1$ = difference between population proportions
Sample estimate: $p_2 - p_1$ = difference between sample proportions

Requirement: independent samples
Two samples are called independent samples when the measurements in one sample are not related to the measurements in the other sample. They arise from:
- Random samples taken separately from 2 populations
- Randomised experiment with 2 treatments
- One random sample, but a categorical variable splits individuals into 2 groups

Model for numerical data
Sample 1 ~ population (mean $\mu_1$, s.d. $\sigma_1$)
Sample 2 ~ population (mean $\mu_2$, s.d. $\sigma_2$)
Estimation: estimate $\mu_2 - \mu_1$ with $\bar{x}_2 - \bar{x}_1$. Standard error? Confidence interval?
Testing: is $\mu_2 - \mu_1$ zero? p-value

Model for categorical data
Sample 1 ~ population (proportion $\pi_1$)
Sample 2 ~ population (proportion $\pi_2$)
Estimation: estimate $\pi_2 - \pi_1$ with $p_2 - p_1$. Standard error? Confidence interval?
Testing: is $\pi_2 - \pi_1$ zero? p-value

Distribution of difference
In both cases, we need to find the distribution of the difference, $p_2 - p_1$ or $\bar{x}_2 - \bar{x}_1$.
The samples are independent, so each is a difference of independent random variables.
We already know the distributions of the two parts; what is the distribution of their difference?

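Sum of 2 variables (independent $X_1$ and $X_2$)
Same distributions (each with mean $\mu$, s.d. $\sigma$):
Sample total: $X_1 + X_2$ has mean $2\mu$ and s.d. $\sqrt{2}\,\sigma$
Sample mean: $\bar{X} = (X_1 + X_2)/2$ has mean $\mu$ and s.d. $\sigma/\sqrt{2}$
Different distributions (means $\mu_1$, $\mu_2$; s.d.s $\sigma_1$, $\sigma_2$):
$X_1 + X_2$ has mean $\mu_1 + \mu_2$ and s.d. $\sqrt{\sigma_1^2 + \sigma_2^2}$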

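Difference between 2 variables
$X_1 - X_2$ has mean $\mu_1 - \mu_2$ and s.d. $\sqrt{\sigma_1^2 + \sigma_2^2}$ (same standard deviation as the sum).
If $X_1$ and $X_2$ are normal, then $X_1 - X_2$ is normal too.
Remember that $X_1$ and $X_2$ must be independent.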

Example
Husband height ~ normal(1.85, 0.1)
Wife height ~ normal(1.7, 0.08)
Assume independent. (Probably not!!)
Prob that wife is taller than husband? This is the probability that (Husband - Wife) is below 0.

Example
Husband height ~ normal(1.85, 0.1)
Wife height ~ normal(1.7, 0.08)
Husband - Wife ~ normal(0.15, 0.128), since $\sqrt{0.1^2 + 0.08^2} = 0.128$
P(diff ≤ 0) = area under this normal curve below 0
Prob = 0.121
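The husband and wife calculation can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and it reads normal(mean, s.d.) as in the slide and keeps the slide's (admittedly doubtful) independence assumption. With the stated standard deviations the probability comes out at roughly 0.12.

```python
from math import sqrt
from statistics import NormalDist

# Heights in metres, read as normal(mean, s.d.) as on the slide.
husband = NormalDist(mu=1.85, sigma=0.10)
wife = NormalDist(mu=1.70, sigma=0.08)

# Difference of independent normals: means subtract, variances add.
diff_mean = husband.mean - wife.mean
diff_sd = sqrt(husband.stdev ** 2 + wife.stdev ** 2)
diff = NormalDist(mu=diff_mean, sigma=diff_sd)

# Wife taller than husband  <=>  (Husband - Wife) <= 0.
p_wife_taller = diff.cdf(0.0)

print(f"Husband - Wife ~ normal({diff_mean:.2f}, {diff_sd:.3f})")
print(f"P(wife taller) = {p_wife_taller:.3f}")
```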

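Difference between proportions
If $X_1$ and $X_2$ are independent, $\text{s.d.}(X_1 - X_2) = \sqrt{\text{s.d.}(X_1)^2 + \text{s.d.}(X_2)^2}$.
If $p_1$ and $p_2$ are independent, $\text{s.d.}(p_1 - p_2) = \sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}}$.
For large samples, $p_1$ and $p_2$ are approx normal, so their difference is too.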

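Std error for difference in proportions
Nicotine patches vs Antidepressant (Zyban)?
$n_1 = n_2 = 244$ randomly assigned to each treatment
Zyban: 85 out of 244 quit smoking, $p_1 = .348$
Patch: 52 out of 244 quit smoking, $p_2 = .213$
So, $\text{s.e.}(p_1 - p_2) = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} = \sqrt{\frac{.348(.652)}{244} + \frac{.213(.787)}{244}} = .040$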

Approximate 95% C.I.
For sufficiently large samples, the interval Estimate ± 2 × Standard error is an approximate 95% C.I.
Best you can do for difference between proportions.
For means, CI can be improved by replacing '2' by a different value.

Patch vs Antidepressant
Study: $n_1 = n_2 = 244$ randomly assigned to each group
Zyban: 85 of the 244 Zyban users quit smoking, $p_1 = .348$
Patch: 52 of the 244 patch users quit smoking, $p_2 = .213$
So, $p_1 - p_2 = .348 - .213 = .135$ with s.e. = .040
Approx 95% C.I.: .135 ± 2(.040) => .135 ± .080 => .055 to .215
We are 95% confident that Zyban increases the probability of quitting smoking by between .055 and .215 (5.5 to 21.5 percentage points) compared with the patch.
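The Zyban versus patch interval can be reproduced with a few lines of Python. This is only a sketch: the function name `two_prop_ci` and its `multiplier` parameter are hypothetical names chosen here, and the multiplier of 2 is the rough value the slides use rather than the normal quantile 1.96.

```python
from math import sqrt

def two_prop_ci(x1, n1, x2, n2, multiplier=2.0):
    """Approximate CI for the difference in proportions (group 1 minus group 2)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of a difference of independent sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, se, diff - multiplier * se, diff + multiplier * se

# Group 1 = Zyban (85 of 244 quit), group 2 = patch (52 of 244 quit).
diff, se, lower, upper = two_prop_ci(85, 244, 52, 244)
print(f"difference = {diff:.3f}, s.e. = {se:.3f}")
print(f"approx 95% CI: {lower:.3f} to {upper:.3f}")
```

Passing `multiplier=1.96` gives a slightly narrower interval, which is the usual large-sample choice.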

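Difference between means
If $X_1$ and $X_2$ are independent, $\text{s.d.}(X_1 - X_2) = \sqrt{\text{s.d.}(X_1)^2 + \text{s.d.}(X_2)^2}$, so for independent sample means $\text{s.d.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.
If both populations are normal, so is the difference.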

Std error for difference in means
Lose More Weight by Diet or Exercise?
$n_1 = 42$ men on diet, $n_2 = 47$ men on exercise routine
Diet: Lost an average of 7.2 kg with std dev of 3.7 kg
Exercise: Lost an average of 4.0 kg with std dev of 3.9 kg
So, $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{3.7^2}{42} + \frac{3.9^2}{47}} = 0.81$

Diet vs Exercise
Study: $n_1 = 42$ men on diet, $n_2 = 47$ men on exercise routine
Diet: Lost an average of 7.2 kg with std dev of 3.7 kg
Exercise: Lost an average of 4.0 kg with std dev of 3.9 kg
So, $\bar{x}_1 - \bar{x}_2 = 7.2 - 4.0 = 3.2$ kg with s.e. = 0.81 kg
Approximate 95% Confidence Interval: 3.2 ± 2(.81) => 3.2 ± 1.62 => 1.58 to 4.82 kg
We are 95% confident that those who diet lose on average 1.58 to 4.82 kg more than those who exercise.
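The diet versus exercise standard error and rough interval can be sketched the same way. One assumption is flagged loudly here: the exercise group size is taken as 47, because that is the value that reproduces the stated standard error of 0.81 (a group of 27 would give about 0.94). The function name `two_mean_ci` is hypothetical.

```python
from math import sqrt

def two_mean_ci(mean1, sd1, n1, mean2, sd2, n2, multiplier=2.0):
    """Approximate CI for the difference in means (group 1 minus group 2)."""
    diff = mean1 - mean2
    # Standard error of a difference of independent sample means.
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff, se, diff - multiplier * se, diff + multiplier * se

# Group 1 = diet, group 2 = exercise.
# ASSUMPTION: n2 = 47, chosen because it matches the stated s.e. of 0.81.
diff, se, lower, upper = two_mean_ci(7.2, 3.7, 42, 4.0, 3.9, 47)
print(f"difference = {diff:.1f} kg, s.e. = {se:.2f} kg")
print(f"approx 95% CI: {lower:.2f} to {upper:.2f} kg")
```

The printed limits differ from the slide's in the second decimal place because the slide rounds the standard error to 0.81 before doubling it.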

A CI for the Difference Between Two Means (Independent Samples): better C.I. for means
$\bar{x}_1 - \bar{x}_2 \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
where t* is a value from t-tables with d.f. = min($n_1 - 1$, $n_2 - 1$).
Welch's approx gives a different (higher) d.f. but is a complicated formula.
t* is approx 1.96 if d.f. is high.

Effect of a stare on driving
Randomized experiment: Researchers either stared or did not stare at drivers stopped at a campus stop sign, and timed how long (sec) it took each driver to proceed from the sign to a mark on the other side of the intersection.
Estimate difference between the mean crossing times.
No Stare Group (n = 14): 8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7
Stare Group (n = 13): 5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8

Checking data
- No outliers; no strong skewness.
- Crossing times in stare group seem faster & less variable.

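Effect of stare on driving
Using df = min($n_1 - 1$, $n_2 - 1$) = 12 gives t* = 2.179.
A 95% CI for $\mu_1 - \mu_2$ (1 = no stare, 2 = stare) is $\bar{x}_1 - \bar{x}_2 \pm 2.179 \sqrt{\frac{s_1^2}{14} + \frac{s_2^2}{13}}$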
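A short Python sketch of this conservative interval, computed directly from the crossing times listed in the transcript. The value t* = 2.179 is the table value quoted on the slide for 12 d.f.; the variable names are illustrative only, and group 1 is taken to be the no-stare group.

```python
from math import sqrt
from statistics import mean, stdev

no_stare = [8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7]
stare = [5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8]

T_STAR = 2.179  # t-table value for 95% confidence with 12 d.f. (from the slide)

# Difference in sample means, no stare minus stare.
diff = mean(no_stare) - mean(stare)
# Unpooled standard error; stdev() is the sample s.d. (divisor n - 1).
se = sqrt(stdev(no_stare) ** 2 / len(no_stare) + stdev(stare) ** 2 / len(stare))

lower, upper = diff - T_STAR * se, diff + T_STAR * se
print(f"means: {mean(no_stare):.2f} vs {mean(stare):.2f}, difference {diff:.2f} sec")
print(f"s.e. = {se:.3f}; 95% CI with 12 d.f.: {lower:.2f} to {upper:.2f} sec")
```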

Effect of stare on driving: Minitab
N.B. C.I. is based on df = 21 (Welch's approx).
Slightly narrower C.I. than we got with d.f. = 12.
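The "complicated formula" behind Welch's d.f. is the Welch-Satterthwaite approximation. A minimal sketch follows, with `welch_df` a hypothetical helper name; applied to the crossing-time data it gives a value between 21 and 22, which rounds down to the 21 d.f. quoted for the Minitab output.

```python
from statistics import stdev

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom for two samples."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # squared standard errors of each mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

no_stare = [8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7]
stare = [5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8]

df = welch_df(stdev(no_stare), len(no_stare), stdev(stare), len(stare))
conservative_df = min(len(no_stare), len(stare)) - 1

print(f"Welch d.f. = {df:.1f} (rounded down: {int(df)})")
print(f"Conservative d.f. = {conservative_df}")
```

The Welch value always lies between min(n1 - 1, n2 - 1) and n1 + n2 - 2, which is why the simpler rule is the cautious choice.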

Interpretation
A 95% CI for $\mu_1 - \mu_2$ (no stare minus stare) is 0.17 to 1.91 sec.
We are 95% confident that it takes drivers between 0.17 and 1.91 seconds less on average to cross the intersection if someone stares at them.

Testing two proportions
Hypotheses: $H_0: \pi_1 - \pi_2 = 0$; $H_A: \pi_1 - \pi_2 \neq 0$ or $\pi_1 - \pi_2 < 0$ or $\pi_1 - \pi_2 > 0$
Watch how Population 1 and 2 are defined.
Data requirements: independent samples; $n_1 p_1$, $n_1(1 - p_1)$, $n_2 p_2$, $n_2(1 - p_2)$ all at least 5, preferably ≥ 10

Test statistic
Based on $p_1 - p_2$
Standardise: $z = \frac{(p_1 - p_2) - 0}{\text{s.e.}(p_1 - p_2)}$

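Test statistic
If $H_0$ is true, both populations share a common proportion $\pi$, and the best estimate of $\pi$ is the overall sample proportion $p = \frac{x_1 + x_2}{n_1 + n_2}$, where $x_1$ and $x_2$ are the numbers of successes.
So we use test statistic $z = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
If $H_0$ is true, this has a standard normal distribution.
p-value from normal distribution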

Prevention of Ear Infections
Does the use of the sweetener xylitol reduce the incidence of ear infections?
Randomized Experiment: Of 165 children on placebo, 68 got ear infection. Of 159 children on xylitol, 46 got ear infection.
Hypotheses: $H_0: \pi_1 - \pi_2 = 0$; $H_A: \pi_1 - \pi_2 > 0$ (1 = placebo, 2 = xylitol)
Data check: At least 5 successes & failures in each group

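Prevention of Ear Infections
Sample proportions: $p_1 = 68/165 = .412$, $p_2 = 46/159 = .289$
Overall proportion getting infection: $p = \frac{68 + 46}{165 + 159} = \frac{114}{324} = .352$
Test statistic: $z = \frac{.412 - .289}{\sqrt{.352(.648)\left(\frac{1}{165} + \frac{1}{159}\right)}} \approx 2.31$
p-value = 0.01
Conclusion: Strong evidence xylitol reduces chance of ear infection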
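The xylitol test can be sketched with the standard library alone. The function name `two_prop_z_test` is hypothetical, and group 1 is taken to be placebo so that the one-sided alternative is an upper-tail test; the result agrees with the p-value of about 0.01 on the slide.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """One-sided z test of H0: pi1 - pi2 = 0 against HA: pi1 - pi2 > 0."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # best estimate of the common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)  # upper tail of the standard normal
    return pooled, z, p_value

# Group 1 = placebo (68 of 165 infected), group 2 = xylitol (46 of 159 infected).
pooled, z, p_value = two_prop_z_test(68, 165, 46, 159)
print(f"overall proportion = {pooled:.3f}, z = {z:.2f}, p-value = {p_value:.3f}")
```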

Testing two means
Hypotheses: $H_0: \mu_1 - \mu_2 = 0$; $H_A: \mu_1 - \mu_2 \neq 0$ or $\mu_1 - \mu_2 < 0$ or $\mu_1 - \mu_2 > 0$
Watch how Population 1 and 2 are defined.
Data requirements: fairly large $n_1$ and $n_2$ (say 30 or more), or not much skewness & no outliers (normal model reasonable)

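Test statistic
Based on $\bar{x}_1 - \bar{x}_2$
Standardise: $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\text{s.e.}(\bar{x}_1 - \bar{x}_2)}$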

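Test
Test statistic: $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
If $H_0$ is true, this has approx t-distribution with d.f. = min($n_1 - 1$, $n_2 - 1$) (same d.f. as CI for $\mu_1 - \mu_2$).
p-value from t distribution: Minitab or Excel.
If $n_1$ and $n_2$ ≥ 30: use normal tables.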

Effect of a stare on driving
Randomized experiment: Researchers either stared or did not stare at drivers stopped at a campus stop sign, and timed how long (sec) it took each driver to proceed from the sign to a mark on the other side of the intersection.
Test whether stare speeds up crossing times.
No Stare Group (n = 14): 8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7
Stare Group (n = 13): 5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8

Checking data
- Small sample sizes, but
- No outliers; no strong skewness.

Effect of stare on driving
Hypotheses: $H_0: \mu_1 - \mu_2 = 0$; $H_A: \mu_1 - \mu_2 > 0$, where 1 = no-stare, 2 = stare

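Effect of stare on driving
Test statistic: $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ with df = min($n_1 - 1$, $n_2 - 1$) = 12
P-value: p = upper tail area of the t-distribution (12 d.f.) above the observed t
Strong evidence that stare speeds up crossing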
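The test statistic and its upper-tail p-value can be approximated from the listed crossing times without a statistics package. This is a sketch under stated assumptions: `t_pdf` and `t_upper_tail` are hypothetical helpers written here, the tail area is found by numerically integrating the t density with Simpson's rule, and the conservative 12 d.f. is used rather than Welch's value. In practice Minitab, Excel or a statistics library would be used instead.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, t] and the symmetry of the density."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

no_stare = [8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7]
stare = [5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8]

# Unpooled two-sample t statistic, group 1 = no stare, group 2 = stare.
se = sqrt(stdev(no_stare) ** 2 / len(no_stare) + stdev(stare) ** 2 / len(stare))
t_stat = (mean(no_stare) - mean(stare)) / se
df = min(len(no_stare), len(stare)) - 1

p_value = t_upper_tail(t_stat, df)
print(f"t = {t_stat:.2f} on {df} d.f., one-sided p-value = {p_value:.3f}")
```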

Effect of stare on driving: Minitab
N.B. Test is based on df = 21 (Welch's approx).
Very similar p-value and same conclusion: strong evidence that stare speeds up crossing.

Paired data and 2-sample data
Make sure you distinguish between:
- 2 measurements on each individual (e.g. before & after): paired data
- Measurements from 2 independent groups: 2 independent samples
Example: Different cars assessed for insurance claims in garages A and B: 2 independent samples. Same cars assessed by both garages: paired data.