Data Mining 2016/2017 Fall MIS 331 Chapter 2 Sampling Distribution


Data Mining 2016/2017 Fall MIS 331
Chapter 2: Sampling Distribution, Confidence Interval Estimation, and Hypothesis Testing for the Variance of a Population
Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall

Outline
Sampling Distribution of Sample Variances
Confidence Interval Estimation for the Variance
Tests of the Variance of a Normal Distribution
Tests of Equality of Two Variances

6.4 Sampling Distributions of Sample Variances
Sampling distributions covered: sample means, sample proportions, sample variances.

Sample Variance
Let $x_1, x_2, \ldots, x_n$ be a random sample from a population. The sample variance is
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
The square root of the sample variance is called the sample standard deviation. The sample variance is different for different random samples from the same population.

Sampling Distribution of Sample Variances
The sampling distribution of $s^2$ has mean $\sigma^2$:
$$E[s^2] = \sigma^2$$
If the population distribution is normal, then
$$\mathrm{Var}[s^2] = \frac{2\sigma^4}{n-1}$$

Chi-Square Distribution of Sample and Population Variances
If the population distribution is normal, then
$$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$$
has a chi-square ($\chi^2$) distribution with n – 1 degrees of freedom.

The Chi-square Distribution
The chi-square distribution is a family of distributions, depending on degrees of freedom: d.f. = n – 1. Text Appendix Table 7 contains chi-square probabilities.
[Figure: chi-square density curves for d.f. = 1, d.f. = 5, and d.f. = 15, each plotted over $\chi^2$ from 0 to 28]

The expected value of a chi-square distribution with $v$ degrees of freedom is $v$: $E[\chi^2_v] = v$.
The variance of a chi-square distribution with $v$ degrees of freedom is $2v$: $\mathrm{Var}[\chi^2_v] = 2v$.

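Since $(n-1)s^2/\sigma^2$ has a chi-square distribution with $n-1$ degrees of freedom:
$$E\!\left[\frac{(n-1)s^2}{\sigma^2}\right] = n-1 \;\Rightarrow\; \frac{n-1}{\sigma^2}E[s^2] = n-1 \;\Rightarrow\; E[s^2] = \sigma^2$$
Similarly,
$$\mathrm{Var}\!\left[\frac{(n-1)s^2}{\sigma^2}\right] = 2(n-1) \;\Rightarrow\; \frac{(n-1)^2}{\sigma^4}\mathrm{Var}[s^2] = 2(n-1) \;\Rightarrow\; \mathrm{Var}[s^2] = \frac{2\sigma^4}{n-1}$$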
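
The two results derived here, $E[s^2] = \sigma^2$ and $\mathrm{Var}[s^2] = 2\sigma^4/(n-1)$, can be checked roughly by simulation. The sketch below is only an illustration: the sample size, the value of sigma, the number of replications, and the seed are arbitrary choices that do not come from the slides.

```python
import random
import statistics


def simulate_sample_variances(n, sigma, reps, seed=0):
    """Draw `reps` normal samples of size n and return their sample variances."""
    rng = random.Random(seed)
    return [
        statistics.variance([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(reps)
    ]


# Illustrative values (assumptions, not taken from the slides).
n, sigma, reps = 5, 2.0, 20000
s2_values = simulate_sample_variances(n, sigma, reps)

mean_s2 = statistics.fmean(s2_values)    # should be close to sigma**2
var_s2 = statistics.variance(s2_values)  # should be close to 2*sigma**4/(n-1)

print(f"mean of s^2: {mean_s2:.3f} (theory {sigma ** 2:.3f})")
print(f"var of s^2:  {var_s2:.3f} (theory {2 * sigma ** 4 / (n - 1):.3f})")
```

The simulated values will not match the theoretical ones exactly; they should only agree up to sampling noise.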

Degrees of Freedom (df)
Idea: the number of observations that are free to vary after the sample mean has been calculated.
Example: Suppose the mean of 3 numbers is 8.0. Let X1 = 7 and X2 = 8. What is X3? If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary).
Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2 (2 values can be any numbers, but the third is not free to vary for a given mean).

Table 7 in the Appendix: d.f. versus probabilities for critical values
$P(\chi^2_{10} < K_L) = 0.05$ gives $K_L = 3.940$, hence $P(\chi^2_{10} < 3.940) = 0.05$.
$P(\chi^2_{10} > K_U) = 0.05$ gives $K_U = 18.31$, hence $P(\chi^2_{10} > 18.31) = 0.05$.

Chi-square Example
A commercial freezer must hold a selected temperature with little variation. Specifications call for a standard deviation of no more than 4 degrees (a variance of 16 degrees²). A sample of 14 freezers is to be tested. What is the upper limit (K) for the sample variance such that the probability of exceeding this limit, given that the population standard deviation is 4, is less than 0.05?

Finding the Chi-square Value
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
is chi-square distributed with (n – 1) = 13 degrees of freedom. Use the chi-square distribution with area 0.05 in the upper tail: $\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)
[Figure: chi-square density with upper-tail probability α = .05 to the right of $\chi^2_{13} = 22.36$]

Chi-square Example (continued)
$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)
So:
$$P(s^2 > K) = P\!\left(\frac{(n-1)s^2}{16} > \chi^2_{13}\right) = 0.05$$
or
$$\frac{(n-1)K}{16} = 22.36 \quad (\text{where } n = 14)$$
so
$$K = \frac{(22.36)(16)}{14 - 1} = 27.52$$
If $s^2$ from the sample of size n = 14 is greater than 27.52, there is strong evidence to suggest the population variance exceeds 16.
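
The arithmetic for the freezer example can be reproduced with a short sketch. The function name is illustrative only; the critical value 22.36 is the table value quoted on the slide.

```python
def upper_limit_for_sample_variance(chi2_crit, sigma_sq, n):
    """Solve (n - 1) * K / sigma_sq = chi2_crit for K."""
    return chi2_crit * sigma_sq / (n - 1)


# Slide values: chi-square critical value 22.36 (13 d.f., alpha = .05),
# population variance 16, sample size 14.
K = upper_limit_for_sample_variance(chi2_crit=22.36, sigma_sq=16.0, n=14)
print(f"K = {K:.2f}")
```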

7.5 Confidence Interval Estimation for the Variance
Confidence intervals covered: population mean ($\sigma^2$ known, $\sigma^2$ unknown), population proportion, population variance (from a normally distributed population).

Confidence Intervals for the Population Variance
Goal: form a confidence interval for the population variance, $\sigma^2$.
The confidence interval is based on the sample variance, $s^2$.
Assumed: the population is normally distributed.

Confidence Intervals for the Population Variance (continued)
The random variable
$$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$$
follows a chi-square distribution with (n – 1) degrees of freedom, where the chi-square value $\chi^2_{n-1,\alpha}$ denotes the number for which
$$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha$$

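$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha/2}) = \alpha/2$
$P(\chi^2_{n-1} > \chi^2_{n-1,1-\alpha/2}) = 1 - \alpha/2$, or equivalently $P(\chi^2_{n-1} < \chi^2_{n-1,1-\alpha/2}) = \alpha/2$
Finally, $P(\chi^2_{n-1,1-\alpha/2} < \chi^2_{n-1} < \chi^2_{n-1,\alpha/2}) = 1 - \alpha/2 - \alpha/2 = 1 - \alpha$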

Example: find two numbers such that the probability that a chi-square variable with 6 d.f. lies between them is 0.90.
$P(\chi^2_{6,0.95} < \chi^2_6 < \chi^2_{6,0.05}) = 0.90$
The two numbers are $\chi^2_{6,0.95} = 1.635$ and $\chi^2_{6,0.05} = 12.592$, hence $P(1.635 < \chi^2_6 < 12.592) = 0.90$.

Confidence Intervals for the Population Variance (continued)
The $100(1 - \alpha)\%$ confidence interval for the population variance is given by
$$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$

Example
You are testing the speed of a batch of computer processors. You collect the following data (in MHz):
Sample size      17
Sample mean      3004
Sample std dev   74
Assume the population is normal. Determine the 95% confidence interval for $\sigma_x^2$.

Finding the Chi-square Values
n = 17, so the chi-square distribution has (n – 1) = 16 degrees of freedom. α = 0.05, so use the chi-square values with area 0.025 in each tail:
$\chi^2_{16,0.975} = 6.91$ (lower tail, probability α/2 = .025)
$\chi^2_{16,0.025} = 28.85$ (upper tail, probability α/2 = .025)

Calculating the Confidence Limits
The 95% confidence interval is
$$\frac{(17-1)(74)^2}{28.85} < \sigma^2 < \frac{(17-1)(74)^2}{6.91}$$
$$3037 < \sigma^2 < 12680 \quad (\text{approximately})$$
Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz.
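
The confidence limits for the processor example can be recomputed with the sketch below. The function and variable names are illustrative; the chi-square values 28.85 and 6.91 are the table values quoted on the slides.

```python
import math


def variance_confidence_interval(n, s, chi2_upper, chi2_lower):
    """Return ((n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower) as the CI for sigma^2."""
    numerator = (n - 1) * s ** 2
    return numerator / chi2_upper, numerator / chi2_lower


# Slide values: n = 17, s = 74, chi-square values for 16 d.f. at .025 and .975.
var_low, var_high = variance_confidence_interval(
    n=17, s=74.0, chi2_upper=28.85, chi2_lower=6.91
)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"variance interval: ({var_low:.0f}, {var_high:.0f})")
print(f"std dev interval:  ({sd_low:.1f}, {sd_high:.1f}) MHz")
```

The larger chi-square value goes in the denominator of the lower limit, which is why the arguments are named by tail rather than by position.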

9.6 Tests of the Variance of a Normal Distribution
Goal: test hypotheses about the population variance, $\sigma^2$ (e.g., $H_0: \sigma^2 = \sigma_0^2$).
If the population is normally distributed,
$$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$$
has a chi-square distribution with (n – 1) degrees of freedom.

Tests of the Variance of a Normal Distribution (continued)
The test statistic for hypothesis tests about one population variance is
$$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma_0^2}$$

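Decision Rules: Variance
Lower-tail test: $H_0: \sigma^2 \ge \sigma_0^2$, $H_1: \sigma^2 < \sigma_0^2$. Reject $H_0$ if $\chi^2_{n-1} < \chi^2_{n-1,1-\alpha}$.
Upper-tail test: $H_0: \sigma^2 \le \sigma_0^2$, $H_1: \sigma^2 > \sigma_0^2$. Reject $H_0$ if $\chi^2_{n-1} > \chi^2_{n-1,\alpha}$.
Two-tail test: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$. Reject $H_0$ if $\chi^2_{n-1} > \chi^2_{n-1,\alpha/2}$ or $\chi^2_{n-1} < \chi^2_{n-1,1-\alpha/2}$.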
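
The three decision rules can be written as one small function, sketched below. The function names and the tail labels are hypothetical, and critical values are passed in from a chi-square table rather than computed. The usage lines reuse the freezer setting (n = 14, $\sigma_0^2$ = 16, upper critical value 22.36); the sample variances 30 and 20 are made-up inputs for illustration.

```python
def chi_square_statistic(n, s2, sigma0_sq):
    """Test statistic (n - 1) * s^2 / sigma_0^2."""
    return (n - 1) * s2 / sigma0_sq


def reject_h0(stat, tail, crit_lower=None, crit_upper=None):
    """Apply the lower-tail, upper-tail, or two-tail rejection rule.

    crit_lower and crit_upper are chi-square table values supplied by the caller.
    """
    if tail == "lower":
        return stat < crit_lower
    if tail == "upper":
        return stat > crit_upper
    if tail == "two":
        return stat < crit_lower or stat > crit_upper
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


# Hypothetical sample variances tested against sigma_0^2 = 16 with n = 14.
stat_high = chi_square_statistic(n=14, s2=30.0, sigma0_sq=16.0)
stat_low = chi_square_statistic(n=14, s2=20.0, sigma0_sq=16.0)

print(stat_high, reject_h0(stat_high, "upper", crit_upper=22.36))
print(stat_low, reject_h0(stat_low, "upper", crit_upper=22.36))
```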

Newbold 9.47
Test the hypothesis $H_0: \sigma^2 \le 100$ against $H_1: \sigma^2 > 100$ for each case:
a) $s^2 = 165$, n = 25
b) $s^2 = 165$, n = 29
c) $s^2 = 159$, n = 25
d) $s^2 = 67$, n = 38

Solution

Solution
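
The solution content for Newbold 9.47 did not survive in the transcript. As a partial sketch, the test statistic $(n-1)s^2/\sigma_0^2$ for each of the four cases can be computed as follows. No significance level is stated for this exercise in the transcript, so the sketch stops at the statistic and makes no reject decision; the variable names are illustrative.

```python
sigma0_sq = 100.0

# (sample variance, sample size) for cases a) to d) of the exercise.
cases = {
    "a": (165.0, 25),
    "b": (165.0, 29),
    "c": (159.0, 25),
    "d": (67.0, 38),
}

# Chi-square statistic with n - 1 degrees of freedom for each case.
chi2_by_case = {
    label: (n - 1) * s2 / sigma0_sq for label, (s2, n) in cases.items()
}

for label, stat in chi2_by_case.items():
    n = cases[label][1]
    print(f"{label}) chi-square = {stat:.2f} with {n - 1} d.f.")
```

Each statistic would then be compared with the upper-tail chi-square table value for the matching degrees of freedom at the chosen significance level.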

Newbold 7.48
A new safety device is tested. A random sample of 8 days gave the following results:
618 660 638 625 571 598 639 582
Management is concerned about variability. Test the null hypothesis that the variance is no more than 500 ($H_0: \sigma^2 \le 500$) at a significance level of 10%.

Solution
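
The solution content for this exercise did not survive in the transcript. The sketch below works through the computation under two assumptions: the test is the upper-tail test $H_0: \sigma^2 \le 500$ against $H_1: \sigma^2 > 500$, and the critical value 12.017 (7 d.f., α = 0.10) is read from a standard chi-square table rather than taken from the slides.

```python
import statistics

# Eight daily observations from the exercise.
output = [618, 660, 638, 625, 571, 598, 639, 582]
n = len(output)

s2 = statistics.variance(output)  # sample variance with n - 1 in the denominator
sigma0_sq = 500.0
chi2_stat = (n - 1) * s2 / sigma0_sq

# Assumption: upper-tail table value for 7 d.f. at alpha = 0.10.
CHI2_CRIT_7_DF_ALPHA_10 = 12.017
reject = chi2_stat > CHI2_CRIT_7_DF_ALPHA_10

print(f"s^2 = {s2:.2f}, chi-square = {chi2_stat:.3f}, reject H0: {reject}")
```

Under these assumptions the statistic is slightly above the critical value, so the null hypothesis would be rejected at the 10% level.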

10.4 Tests of Equality of Two Variances
Goal: test hypotheses about two population variances, using the F test statistic.
Lower-tail test: $H_0: \sigma_x^2 \ge \sigma_y^2$, $H_1: \sigma_x^2 < \sigma_y^2$
Upper-tail test: $H_0: \sigma_x^2 \le \sigma_y^2$, $H_1: \sigma_x^2 > \sigma_y^2$
Two-tail test: $H_0: \sigma_x^2 = \sigma_y^2$, $H_1: \sigma_x^2 \ne \sigma_y^2$
The two populations are assumed to be independent and normally distributed.

Hypothesis Tests for Two Variances (continued)
The random variable
$$F = \frac{s_x^2/\sigma_x^2}{s_y^2/\sigma_y^2}$$
has an F distribution with $(n_x - 1)$ numerator degrees of freedom and $(n_y - 1)$ denominator degrees of freedom.
Denote an F value with $\nu_1$ numerator and $\nu_2$ denominator degrees of freedom by $F_{\nu_1,\nu_2}$.

Test Statistic
The test statistic for a hypothesis test about two population variances is
$$F = \frac{s_x^2}{s_y^2}$$
where F has $(n_x - 1)$ numerator degrees of freedom and $(n_y - 1)$ denominator degrees of freedom.

Decision Rules: Two Variances
Use $s_x^2$ to denote the larger variance.
Upper-tail test: $H_0: \sigma_x^2 \le \sigma_y^2$, $H_1: \sigma_x^2 > \sigma_y^2$. Reject $H_0$ if $F > F_{n_x-1,\,n_y-1,\,\alpha}$.
Two-tail test: $H_0: \sigma_x^2 = \sigma_y^2$, $H_1: \sigma_x^2 \ne \sigma_y^2$. Reject $H_0$ if $F > F_{n_x-1,\,n_y-1,\,\alpha/2}$, where $s_x^2$ is the larger of the two sample variances.

Example: F Test
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:
            NYSE    NASDAQ
Number      21      25
Mean        3.27    2.53
Std dev     1.30    1.16
Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.10 level?

F Test: Example Solution
Form the hypothesis test:
$H_0: \sigma_x^2 = \sigma_y^2$ (there is no difference between variances)
$H_1: \sigma_x^2 \ne \sigma_y^2$ (there is a difference between variances)
Find the F critical value for α/2 = .10/2 = .05.
Degrees of freedom:
Numerator (NYSE has the larger standard deviation): $n_x - 1 = 21 - 1 = 20$ d.f.
Denominator: $n_y - 1 = 25 - 1 = 24$ d.f.
From the F table, $F_{20,24,0.05} = 2.03$.

F Test: Example Solution (continued)
The test statistic is:
$$F = \frac{s_x^2}{s_y^2} = \frac{1.30^2}{1.16^2} = 1.256$$
$H_0: \sigma_x^2 = \sigma_y^2$, $H_1: \sigma_x^2 \ne \sigma_y^2$, with α/2 = .05 in the upper tail.
F = 1.256 is not in the rejection region, so we do not reject $H_0$.
Conclusion: There is not sufficient evidence of a difference in variances at α = .10.
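
The F statistic for the NYSE and NASDAQ example can be recomputed with the sketch below. The function name is illustrative. The critical value 2.03 for (20, 24) degrees of freedom at α/2 = .05 is an assumption read from a standard F table; it does not appear in the transcript text.

```python
def f_statistic(s_x, s_y):
    """F = larger sample variance / smaller sample variance."""
    larger, smaller = max(s_x, s_y), min(s_x, s_y)
    return larger ** 2 / smaller ** 2


# Slide values: NYSE std dev 1.30 (n = 21), NASDAQ std dev 1.16 (n = 25).
F = f_statistic(1.30, 1.16)

# Assumption: F table value for (20, 24) d.f. with .05 in the upper tail.
F_CRIT_20_24_ALPHA_05 = 2.03
reject = F > F_CRIT_20_24_ALPHA_05

print(f"F = {F:.3f}, reject H0: {reject}")
```

Putting the larger variance in the numerator means only the upper critical value is needed for the two-tail test.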