Copyright © Cengage Learning. All rights reserved. 9 Inferences Based on Two Samples.

9.2 The Two-Sample t Test and Confidence Interval

Values of the population variances will usually not be known to an investigator. In the previous section, we illustrated for large sample sizes the use of a z test and CI in which the sample variances were used in place of the population variances. In fact, for large samples, the CLT allows us to use these methods even when the two populations of interest are not normal.

In practice, though, it will often happen that at least one sample size is small and the population variances have unknown values. Without the CLT at our disposal, we proceed by making specific assumptions about the underlying population distributions. The use of inferential procedures that follow from these assumptions is then restricted to situations in which the assumptions are at least approximately satisfied.

We could, for example, assume that both population distributions are members of the Weibull family or that they are both Poisson distributions. It shouldn’t surprise you to learn that normality is typically the most reasonable assumption.

Assumptions: Both population distributions are normal, and the two samples are random samples selected independently of one another.

The test statistic and confidence interval formula are based on the same standardized variable developed in Section 9.1, but the relevant distribution is now t rather than z.

Theorem: [formula not captured in the transcript]

Manipulating T in a probability statement to isolate $\mu_1 - \mu_2$ gives a CI, whereas a test statistic results from replacing $\mu_1 - \mu_2$ by the null value $\Delta_0$.
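The theorem's formula is not reproduced in the text above. As a hedged reconstruction, the standard (Welch) form of the two-sample t result is sketched here; it is an assumption that the slide used exactly this notation, with $S_1^2, S_2^2$ the sample variances and $m, n$ the sample sizes.

```latex
% Standardized variable: approximately t distributed when both populations are normal
T = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}}

% Degrees of freedom estimated from the data (round \nu down to the nearest integer)
\nu = \frac{\left(\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}\right)^{2}}
           {\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}}

% Resulting 100(1-\alpha)\% confidence interval for \mu_1 - \mu_2
\bar{x} - \bar{y} \;\pm\; t_{\alpha/2,\nu}\,\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}

% Test statistic for H_0: \mu_1 - \mu_2 = \Delta_0
t = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}
```

Isolating $\mu_1 - \mu_2$ in $P(-t_{\alpha/2,\nu} < T < t_{\alpha/2,\nu}) \approx 1 - \alpha$ gives the interval, and substituting $\Delta_0$ for $\mu_1 - \mu_2$ in $T$ gives the test statistic.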


Example 9.6 The void volume within a textile fabric affects comfort, flammability, and insulation properties. Permeability of a fabric refers to the accessibility of void space to the flow of a gas or liquid. The article “The Relationship Between Porosity and Air Permeability of Woven Textile Fabrics” (J. of Testing and Eval., 1997: 108–114) gave summary information on air permeability (cm³/cm²/sec) for a number of different fabric types.

Consider the following data on two different types of plain-weave fabric: [data table not captured in the transcript]

Assuming that the porosity distributions for both types of fabric are normal, let’s calculate a confidence interval for the difference between true average porosity for the cotton fabric and that for the acetate fabric, using a 95% confidence level. Before the appropriate t critical value can be selected, df must be determined: [calculation not captured in the transcript]

Thus we use $\nu = 9$; Appendix Table A.5 gives $t_{.025,9} = 2.262$. The resulting interval is [interval not captured in the transcript]. With a high degree of confidence, we can say that true average porosity for triacetate fabric specimens exceeds that for cotton specimens by between [lower and upper limits not captured] cm³/cm²/sec.
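The df and interval computation in an example of this kind can be sketched in a few lines. This is a minimal illustration, assuming the usual Welch–Satterthwaite df formula; the summary statistics below are hypothetical and are not the fabric data from the example, and the critical value is passed in by hand from a t table.

```python
import math


def welch_df(s1, m, s2, n):
    """Estimated df for the two-sample t procedures (before rounding down)."""
    v1 = s1 ** 2 / m  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (m - 1) + v2 ** 2 / (n - 1))


def two_sample_t_ci(xbar, ybar, s1, m, s2, n, t_crit):
    """CI for mu1 - mu2; t_crit is looked up in a t table for the rounded-down df."""
    se = math.sqrt(s1 ** 2 / m + s2 ** 2 / n)
    diff = xbar - ybar
    return diff - t_crit * se, diff + t_crit * se


# Hypothetical summary statistics (NOT the data from Example 9.6).
m, n = 10, 10
xbar, ybar = 50.0, 60.0
s1, s2 = 1.0, 4.0

df = math.floor(welch_df(s1, m, s2, n))  # always round df down
# 2.228 is the tabled t critical value for a 95% interval with 10 df.
lo, hi = two_sample_t_ci(xbar, ybar, s1, m, s2, n, t_crit=2.228)
print(df, round(lo, 3), round(hi, 3))
```

Rounding df down rather than to the nearest integer gives a slightly wider, more conservative interval.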

Pooled t Procedures

Alternatives to the two-sample t procedures just described result from assuming not only that the two population distributions are normal but also that they have equal variances. That is, the two population distribution curves are assumed normal with equal spreads, the only possible difference between them being where they are centered.

Let $\sigma^2$ denote the common population variance. Then standardizing $\bar{X} - \bar{Y}$ gives

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\sigma^2\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$$

which has a standard normal distribution. Before this variable can be used as a basis for making inferences about $\mu_1 - \mu_2$, the common variance must be estimated from sample data.

One estimator of $\sigma^2$ is $S_1^2$, the variance of the m observations in the first sample, and another is $S_2^2$, the variance of the second sample. Intuitively, a better estimator than either individual sample variance results from combining the two sample variances. A first thought might be to use $(S_1^2 + S_2^2)/2$. However, if m > n, then the first sample contains more information about $\sigma^2$ than does the second sample, and an analogous comment applies if m < n.

The following weighted average of the two sample variances, called the pooled (i.e., combined) estimator of $\sigma^2$, adjusts for any difference between the two sample sizes:

$$S_p^2 = \frac{m-1}{m+n-2}\,S_1^2 + \frac{n-1}{m+n-2}\,S_2^2$$

The first sample contributes m – 1 degrees of freedom to the estimate of $\sigma^2$, and the second sample contributes n – 1 df, for a total of m + n – 2 df.

Statistical theory says that if $S_p^2$ replaces $\sigma^2$ in the expression for Z, the resulting standardized variable has a t distribution based on m + n – 2 df. In the same way that earlier standardized variables were used as a basis for deriving confidence intervals and test procedures, this t variable immediately leads to the pooled t CI for estimating $\mu_1 - \mu_2$ and the pooled t test for testing hypotheses about a difference between means.
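A minimal sketch of the pooled computation follows, assuming the df-weighted form of the pooled variance described above. The function names and the summary statistics are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def pooled_variance(s1, m, s2, n):
    """Weighted average of the two sample variances, with weights m-1 and n-1."""
    return ((m - 1) * s1 ** 2 + (n - 1) * s2 ** 2) / (m + n - 2)


def pooled_t(xbar, ybar, s1, m, s2, n, delta0=0.0):
    """Pooled t statistic and its df for H0: mu1 - mu2 = delta0."""
    sp2 = pooled_variance(s1, m, s2, n)
    se = math.sqrt(sp2 * (1.0 / m + 1.0 / n))
    return (xbar - ybar - delta0) / se, m + n - 2


# Hypothetical summary statistics for illustration only.
sp2 = pooled_variance(s1=2.0, m=6, s2=4.0, n=11)
t_stat, df = pooled_t(xbar=10.0, ybar=7.0, s1=2.0, m=6, s2=4.0, n=11)
print(sp2, df, round(t_stat, 3))
```

Because the larger sample has weight 10 against 5, the pooled variance here lies closer to the second sample variance than the simple average of the two would.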

In the past, many statisticians recommended these pooled t procedures over the two-sample t procedures. The pooled t test, for example, can be derived from the likelihood ratio principle, whereas the two-sample t test is not a likelihood ratio test. Furthermore, the significance level for the pooled t test is exact, whereas it is only approximate for the two-sample t test.

However, recent research has shown that although the pooled t test does outperform the two-sample t test by a bit (smaller $\beta$'s for the same $\alpha$) when $\sigma_1^2 = \sigma_2^2$, the former test can easily lead to erroneous conclusions if applied when the variances are different. Analogous comments apply to the behavior of the two confidence intervals. That is, the pooled t procedures are not robust to violations of the equal variance assumption.
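One way to see why the pooled procedure can mislead is to compare the two standard errors when both the variances and the sample sizes differ. The sketch below uses hypothetical numbers (a small, highly variable first sample and a large, tight second sample); it is an illustration of the mechanism, not a result from the research mentioned above.

```python
import math


def se_unpooled(s1, m, s2, n):
    """Standard error used by the two-sample t procedures."""
    return math.sqrt(s1 ** 2 / m + s2 ** 2 / n)


def se_pooled(s1, m, s2, n):
    """Standard error used by the pooled t procedures (assumes equal variances)."""
    sp2 = ((m - 1) * s1 ** 2 + (n - 1) * s2 ** 2) / (m + n - 2)
    return math.sqrt(sp2 * (1.0 / m + 1.0 / n))


# Hypothetical case: small noisy sample versus large tight sample.
s1, m = 10.0, 5
s2, n = 1.0, 50

unpooled = se_unpooled(s1, m, s2, n)
pooled = se_pooled(s1, m, s2, n)
print(round(unpooled, 3), round(pooled, 3), round(unpooled / pooled, 2))
```

In this configuration the pooled standard error is far smaller than the unpooled one, so the pooled t statistic is inflated and the pooled interval is too narrow. When the larger sample instead has the larger variance, the distortion runs the other way.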

It has been suggested that one could carry out a preliminary test of $H_0: \sigma_1^2 = \sigma_2^2$ and use a pooled t procedure if this null hypothesis is not rejected. Unfortunately, the usual “F test” of equal variances is quite sensitive to the assumption of normal population distributions, much more so than t procedures. We therefore recommend the conservative approach of using two-sample t procedures unless there is really compelling evidence for doing otherwise, particularly when the two sample sizes are different.

Type II Error Probabilities

Determining type II error probabilities (or equivalently, power = $1 - \beta$) for the two-sample t test is complicated. There does not appear to be any simple way to use the $\beta$ curves of Appendix Table A.17. The most recent version of Minitab (Version 16) will calculate power for the pooled t test but not for the two-sample t test. However, the UCLA Statistics Department homepage permits access to a power calculator that will do this.

For example, we specified m = 10, n = 8, $\sigma_1 = 300$, $\sigma_2 = 225$ (as shown in the table below, whose sample standard deviations are somewhat smaller than these values of $\sigma_1$ and $\sigma_2$) and asked for the power of a two-tailed level .05 test of $H_0: \mu_1 - \mu_2 = 0$ when $\mu_1 - \mu_2$ = 100, 250, and 500.

The resulting values of the power were .1089, .4609, and .9635 (corresponding to $\beta$ = .89, .54, and .04), respectively. In general, $\beta$ will decrease as the sample sizes increase, as $\alpha$ increases, and as $\mu_1 - \mu_2$ moves farther from 0. The software will also calculate the sample size necessary to obtain a specified value of power for a particular value of $\mu_1 - \mu_2$.
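For a rough feel of how power responds to these inputs, a normal (z) approximation can be sketched with the standard library alone. This is not the method used by the power calculator mentioned above: it treats the standard deviations as known and ignores the t distribution, so for samples this small it tends to come out somewhat higher than the t-based powers quoted above. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma1, m, sigma2, n, alpha=0.05):
    """Approximate power of a two-tailed level-alpha test of H0: mu1 - mu2 = 0
    when the true difference is delta, using a normal approximation."""
    z = NormalDist()
    se = sqrt(sigma1 ** 2 / m + sigma2 ** 2 / n)
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta / se  # true difference in standard-error units
    # Probability of rejecting in the upper tail plus the lower tail.
    return z.cdf(-z_crit + shift) + z.cdf(-z_crit - shift)


# Settings taken from the text: m = 10, n = 8, sigma1 = 300, sigma2 = 225.
powers = [approx_power(d, 300, 10, 225, 8) for d in (100, 250, 500)]
for d, p in zip((100, 250, 500), powers):
    print(d, round(p, 4), "beta ~", round(1 - p, 4))
```

The qualitative pattern matches the text: power rises (so $\beta$ falls) as the true difference moves away from 0, and increasing m, n, or alpha in the call has the same effect.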