Data Mining 2016/2017 Fall MIS 331 Chapter 2 Sampling Distribution


1 Data Mining 2016/2017 Fall MIS 331 Chapter 2 Sampling Distribution
Confidence Interval Estimation
Hypothesis Testing for Variance of a Population
Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall

2 Outline
Sampling Distribution of Sample Variances
Confidence Interval Estimation for the Variance
Tests of the Variance of a Normal Distribution
Tests of Equality of Two Variances

3 Sampling Distributions of Sample Variances
6.4 Sampling Distributions
Sampling Distributions of Sample Means
Sampling Distributions of Sample Proportions
Sampling Distributions of Sample Variances

4 Sample Variance
Let $x_1, x_2, \ldots, x_n$ be a random sample from a population. The sample variance is
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
The square root of the sample variance is called the sample standard deviation.
The sample variance is different for different random samples from the same population.
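As a small illustration of the definition, the sketch below computes the sample variance with the n − 1 divisor and takes its square root to get the sample standard deviation. The data values and the function name `sample_variance` are hypothetical, chosen only for the example.

```python
import math
import statistics


def sample_variance(xs):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


data = [7.0, 8.0, 9.0]          # hypothetical sample
s2 = sample_variance(data)      # sample variance
s = math.sqrt(s2)               # sample standard deviation

# The standard library's statistics.variance uses the same n - 1 divisor.
assert math.isclose(s2, statistics.variance(data))
print(s2, s)
```

A different random sample from the same population would generally give a different value of `s2`, which is why the sample variance has a sampling distribution of its own.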

5 Sampling Distribution of Sample Variances
The sampling distribution of $s^2$ has mean $\sigma^2$: $E[s^2] = \sigma^2$
If the population distribution is normal, then $\mathrm{Var}(s^2) = \frac{2\sigma^4}{n-1}$

6 Chi-Square Distribution of Sample and Population Variances
If the population distribution is normal, then
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$
has a chi-square ($\chi^2$) distribution with n – 1 degrees of freedom.

7 The Chi-square Distribution
The chi-square distribution is a family of distributions, depending on degrees of freedom: d.f. = n – 1
Text Appendix Table 7 contains chi-square probabilities.
[Figure: chi-square density curves for d.f. = 1, d.f. = 5, and d.f. = 15]

8 The expected value of a chi-square distribution with $v$ degrees of freedom is $v$
$E[\chi^2_v] = v$
The variance of a chi-square distribution with $v$ degrees of freedom is $2v$
$\mathrm{Var}[\chi^2_v] = 2v$

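9 Since $(n-1)s^2/\sigma^2$ has a chi-square distribution with $n-1$ degrees of freedom:
$E[(n-1)s^2/\sigma^2] = n-1$
$\frac{n-1}{\sigma^2}E[s^2] = n-1$
$E[s^2] = \sigma^2$
Similarly,
$\mathrm{Var}[(n-1)s^2/\sigma^2] = 2(n-1)$
$\frac{(n-1)^2}{\sigma^4}\mathrm{Var}[s^2] = 2(n-1)$
$\mathrm{Var}[s^2] = \frac{2\sigma^4}{n-1}$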
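The two results of this derivation can be checked empirically. The following is a minimal simulation sketch, assuming a normal population with hypothetical parameters (sigma = 2, n = 5) and a fixed seed. It draws many samples, computes the sample variance of each, and compares the average and spread of those variances with σ² and 2σ⁴/(n − 1). All function names are illustrative.

```python
import random


def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def simulate_sample_variances(n, sigma, reps, seed=0):
    """Draw `reps` normal samples of size n and return their sample variances."""
    rng = random.Random(seed)
    return [
        sample_variance([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(reps)
    ]


def theoretical_moments(n, sigma):
    """E[s^2] = sigma^2 and Var[s^2] = 2 sigma^4 / (n - 1) for a normal population."""
    return sigma ** 2, 2.0 * sigma ** 4 / (n - 1)


variances = simulate_sample_variances(n=5, sigma=2.0, reps=20000)
empirical_mean = sum(variances) / len(variances)
empirical_var = sample_variance(variances)
print("simulated:", empirical_mean, empirical_var)
print("theory:   ", theoretical_moments(5, 2.0))
```

The simulated values only approximate the theoretical ones; the agreement tightens as `reps` grows.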

10 Degrees of Freedom (df)
Idea: the number of observations that are free to vary after the sample mean has been calculated.
Example: Suppose the mean of 3 numbers is 8.0.
Let X1 = 7
Let X2 = 8
What is X3?
If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary).
Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2 (2 values can be any numbers, but the third is not free to vary for a given mean).
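The arithmetic behind the example can be written out directly: once the mean and n − 1 of the values are fixed, the last value is determined. The helper name `last_value_given_mean` is hypothetical.

```python
def last_value_given_mean(known_values, mean, n):
    """Return the one remaining value forced by a fixed mean of n numbers."""
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 values must be known")
    # The n values must sum to n * mean, so the last one is what is left over.
    return n * mean - sum(known_values)


x3 = last_value_given_mean([7, 8], mean=8.0, n=3)
print(x3)  # the third value is not free to vary
```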

11 Table 7 in the Appendix: d.f. versus probabilities for critical values
$P(\chi^2_{10} < K_L) = 0.05$: $K_L = 3.940$, hence $P(\chi^2_{10} < 3.940) = 0.05$
$P(\chi^2_{10} > K_U) = 0.05$: $K_U = 18.31$, hence $P(\chi^2_{10} > 18.31) = 0.05$
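Table lookups like these can be cross-checked numerically. The sketch below is one way to compute chi-square upper-tail probabilities for integer degrees of freedom using only the Python standard library. It relies on the recurrence Q(a + 1, y) = Q(a, y) + yᵃe⁻ʸ/Γ(a + 1) for the regularized upper incomplete gamma function, starting from Q(1, y) = e⁻ʸ or Q(1/2, y) = erfc(√y). The function name `chi2_upper_tail` is an assumption for illustration; it is not part of the textbook or of any library.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with integer df > x), for x >= 0."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    y = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-y)                 # Q(1, y)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(y))      # Q(1/2, y)
    # Step a up by 1 until it reaches df / 2.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Table values quoted on the slide for 10 degrees of freedom.
print(1.0 - chi2_upper_tail(3.940, 10))   # lower-tail probability below K_L
print(chi2_upper_tail(18.31, 10))         # upper-tail probability above K_U
```

Both printed probabilities should come out close to 0.05, matching the table entries up to rounding of the critical values.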

12 Chi-square Example
A commercial freezer must hold a selected temperature with little variation. Specifications call for a standard deviation of no more than 4 degrees (a variance of 16 degrees²).
A sample of 14 freezers is to be tested.
What is the upper limit (K) for the sample variance such that the probability of exceeding this limit, given that the population standard deviation is 4, is less than 0.05?

13 Finding the Chi-square Value
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$ is chi-square distributed with (n – 1) = 13 degrees of freedom.
Use the chi-square distribution with area 0.05 in the upper tail:
$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)
[Figure: chi-square density with upper-tail probability α = .05 to the right of $\chi^2_{13} = 22.36$]

14 Chi-square Example
(continued) $\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)
So: $P(s^2 > K) = P\left(\frac{(n-1)s^2}{16} > \chi^2_{13}\right) = 0.05$
or $\frac{(n-1)K}{16} = 22.36$ (where n = 14)
so $K = \frac{(22.36)(16)}{14-1} = 27.52$
If $s^2$ from the sample of size n = 14 is greater than 27.52, there is strong evidence to suggest the population variance exceeds 16.
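The freezer calculation reduces to one line of arithmetic once the table value is known. A minimal sketch, taking the critical value 22.36 from the slide rather than computing it, with a hypothetical function name:

```python
def upper_limit_for_sample_variance(chi2_crit, sigma2, n):
    """Solve (n - 1) * K / sigma2 = chi2_crit for K."""
    return chi2_crit * sigma2 / (n - 1)


# Values from the freezer example: 13 d.f., alpha = .05, variance 16, n = 14.
K = upper_limit_for_sample_variance(chi2_crit=22.36, sigma2=16.0, n=14)
print(round(K, 2))
```

A sample variance above this limit would occur with probability less than 0.05 if the population variance really were 16.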

15 Confidence Interval Estimation for the Variance
7.5 Confidence Intervals
Population Mean ($\sigma^2$ Known, $\sigma^2$ Unknown)
Population Proportion
Population Variance (from a normally distributed population)

16 Confidence Intervals for the Population Variance
Goal: Form a confidence interval for the population variance, $\sigma^2$.
The confidence interval is based on the sample variance, $s^2$.
Assumed: the population is normally distributed.

17 Confidence Intervals for the Population Variance
(continued) The random variable
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$
follows a chi-square distribution with (n – 1) degrees of freedom,
where the chi-square value $\chi^2_{n-1,\alpha}$ denotes the number for which
$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha$

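18 $P(\chi^2_{n-1} > \chi^2_{n-1,\alpha/2}) = \alpha/2$
$P(\chi^2_{n-1} > \chi^2_{n-1,1-\alpha/2}) = 1 - \alpha/2$, or $P(\chi^2_{n-1} < \chi^2_{n-1,1-\alpha/2}) = \alpha/2$
Finally, $P(\chi^2_{n-1,1-\alpha/2} < \chi^2_{n-1} < \chi^2_{n-1,\alpha/2}) = 1 - \alpha/2 - \alpha/2 = 1 - \alpha$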

19 Two numbers such that the probability that a chi-square variable with 6 d.f. lies between them is 0.90
$P(\chi^2_{6,0.95} < \chi^2_6 < \chi^2_{6,0.05}) = 0.90$
The two numbers are $\chi^2_{6,0.95} = 1.635$ and $\chi^2_{6,0.05} = 12.59$,
hence $P(1.635 < \chi^2_6 < 12.59) = 0.90$

20 Confidence Intervals for the Population Variance
(continued) The 100(1 − α)% confidence interval for the population variance is given by
$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$

21 Example
You are testing the speed of a batch of computer processors. You collect the following data (in MHz):
Sample size: 17
Sample mean: 3004
Sample std dev: 74
Assume the population is normal. Determine the 95% confidence interval for $\sigma_x^2$.

22 Finding the Chi-square Values
n = 17, so the chi-square distribution has (n – 1) = 16 degrees of freedom.
α = 0.05, so use the chi-square values with area 0.025 in each tail:
$\chi^2_{16,0.975} = 6.91$
$\chi^2_{16,0.025} = 28.85$
[Figure: chi-square density with probability α/2 = .025 in each tail, below 6.91 and above 28.85]

23 Calculating the Confidence Limits
The 95% confidence interval is
$\frac{(17-1)(74)^2}{28.85} < \sigma^2 < \frac{(17-1)(74)^2}{6.91}$
$3037 < \sigma^2 < 12680$
Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz.
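The confidence limits follow from dividing (n − 1)s² by the two tabulated chi-square values. The sketch below reproduces that arithmetic for the processor example, using the table values 28.85 and 6.91 quoted for 16 degrees of freedom. The function name and argument names are illustrative assumptions.

```python
import math


def variance_confidence_interval(s, n, chi2_upper, chi2_lower):
    """Interval for the population variance.

    chi2_upper: chi-square value with area alpha/2 in the upper tail.
    chi2_lower: chi-square value with area alpha/2 in the lower tail.
    """
    scaled = (n - 1) * s ** 2
    # The larger chi-square value gives the lower limit, and vice versa.
    return scaled / chi2_upper, scaled / chi2_lower


var_low, var_high = variance_confidence_interval(
    s=74.0, n=17, chi2_upper=28.85, chi2_lower=6.91
)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)
print(var_low, var_high)   # limits for the variance
print(sd_low, sd_high)     # limits for the standard deviation
```

The interval is not symmetric around s² = 5476 because the chi-square distribution is skewed.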

24 Tests of the Variance of a Normal Distribution
9.6 Goal: Test hypotheses about the population variance, $\sigma^2$ (e.g., H0: $\sigma^2 = \sigma_0^2$)
If the population is normally distributed,
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$
has a chi-square distribution with (n – 1) degrees of freedom.

25 Tests of the Variance of a Normal Distribution
(continued) The test statistic for hypothesis tests about one population variance is
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma_0^2}$

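26 Decision Rules: Variance
Population variance
Lower-tail test: H0: $\sigma^2 \ge \sigma_0^2$, H1: $\sigma^2 < \sigma_0^2$. Reject H0 if $\chi^2_{n-1} < \chi^2_{n-1,1-\alpha}$
Upper-tail test: H0: $\sigma^2 \le \sigma_0^2$, H1: $\sigma^2 > \sigma_0^2$. Reject H0 if $\chi^2_{n-1} > \chi^2_{n-1,\alpha}$
Two-tail test: H0: $\sigma^2 = \sigma_0^2$, H1: $\sigma^2 \ne \sigma_0^2$. Reject H0 if $\chi^2_{n-1} > \chi^2_{n-1,\alpha/2}$ or $\chi^2_{n-1} < \chi^2_{n-1,1-\alpha/2}$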
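The three decision rules can be sketched as a small function that compares the test statistic (n − 1)s²/σ₀² with critical values read from a chi-square table. The function names, the `tail` labels, and the sample variance of 30 used in the usage lines are hypothetical. The critical value 22.36 (13 d.f., α = .05) is the one quoted in the freezer example.

```python
def chi2_statistic(s2, n, sigma0_sq):
    """Test statistic (n - 1) * s^2 / sigma0^2 for one population variance."""
    return (n - 1) * s2 / sigma0_sq


def reject_h0(stat, tail, lower_crit=None, upper_crit=None):
    """Apply the lower-tail, upper-tail, or two-tail rule.

    lower_crit and upper_crit are table values supplied by the caller,
    already chosen for the right tail area (alpha or alpha/2).
    """
    if tail == "lower":
        return stat < lower_crit
    if tail == "upper":
        return stat > upper_crit
    if tail == "two":
        return stat < lower_crit or stat > upper_crit
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


# Hypothetical upper-tail test: H0: sigma^2 <= 16, n = 14, observed s^2 = 30.
stat = chi2_statistic(s2=30.0, n=14, sigma0_sq=16.0)
print(stat, reject_h0(stat, "upper", upper_crit=22.36))
```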

27 Newbold 9.47: Test the hypothesis H0: $\sigma^2 \le 100$ against H1: $\sigma^2 > 100$
a) $s^2 = 165$, n = 25
b) $s^2 = 165$, n = 29
c) $s^2 = 159$, n = 25
d) $s^2 = 67$, n = 38
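As a starting point for the exercise, the test statistic (n − 1)s²/σ₀² can be computed for each part. This sketch stops at the statistic: the significance level is not stated here, so the comparison against a chi-square critical value with n − 1 degrees of freedom is left to the reader. The dictionary layout is just one convenient arrangement.

```python
SIGMA0_SQ = 100.0  # hypothesized variance from H0

# (sample variance, sample size) for parts a) through d)
cases = {
    "a": (165.0, 25),
    "b": (165.0, 29),
    "c": (159.0, 25),
    "d": (67.0, 38),
}

statistics_by_case = {}
for label, (s2, n) in cases.items():
    df = n - 1
    stat = df * s2 / SIGMA0_SQ
    statistics_by_case[label] = (stat, df)
    print(f"{label}) chi-square statistic = {stat:.2f} with {df} d.f.")
```

In part d) the sample variance is below 100, so the statistic falls below its degrees of freedom and cannot support the upper-tail alternative.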

28 Solution

29 Solution

30 Newbold 7.48: new safety device, random sample for 8 days
Management is concerned about variability. Test the null hypothesis that the variance is less than 500 at a significance level of 10%.

31 Solution

32 Tests of Equality of Two Variances
10.4 Tests for Two Population Variances (F test statistic)
Goal: Test hypotheses about two population variances
Lower-tail test: H0: $\sigma_x^2 \ge \sigma_y^2$, H1: $\sigma_x^2 < \sigma_y^2$
Upper-tail test: H0: $\sigma_x^2 \le \sigma_y^2$, H1: $\sigma_x^2 > \sigma_y^2$
Two-tail test: H0: $\sigma_x^2 = \sigma_y^2$, H1: $\sigma_x^2 \ne \sigma_y^2$
The two populations are assumed to be independent and normally distributed.

33 Hypothesis Tests for Two Variances
(continued) The random variable
$F = \frac{s_x^2/\sigma_x^2}{s_y^2/\sigma_y^2}$
has an F distribution with ($n_x$ – 1) numerator degrees of freedom and ($n_y$ – 1) denominator degrees of freedom.
Denote an F value with $\nu_1$ numerator and $\nu_2$ denominator degrees of freedom by $F_{\nu_1,\nu_2}$

34 Test Statistic
The test statistic for a hypothesis test about two population variances is
$F = \frac{s_x^2}{s_y^2}$
where F has ($n_x$ – 1) numerator degrees of freedom and ($n_y$ – 1) denominator degrees of freedom.

35 Decision Rules: Two Variances
Use $s_x^2$ to denote the larger variance.
Upper-tail test: H0: $\sigma_x^2 \le \sigma_y^2$, H1: $\sigma_x^2 > \sigma_y^2$. Reject H0 if $F > F_{n_x-1,n_y-1,\alpha}$
Two-tail test: H0: $\sigma_x^2 = \sigma_y^2$, H1: $\sigma_x^2 \ne \sigma_y^2$. The rejection region for a two-tail test is $F > F_{n_x-1,n_y-1,\alpha/2}$, where $s_x^2$ is the larger of the two sample variances.
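The convention of putting the larger sample variance in the numerator can be sketched as follows. The function returns the F ratio together with the numerator and denominator degrees of freedom, which are then used to look up the critical value in an F table. The function name and the variances 4.0 and 1.0 in the usage lines are hypothetical; only the sample sizes 21 and 25 echo the worked example in these slides.

```python
def f_statistic(s2_a, n_a, s2_b, n_b):
    """Return (F, numerator d.f., denominator d.f.) with the larger variance on top."""
    if s2_a >= s2_b:
        return s2_a / s2_b, n_a - 1, n_b - 1
    return s2_b / s2_a, n_b - 1, n_a - 1


# Hypothetical sample variances; the order of the two samples does not matter.
F, df_num, df_den = f_statistic(s2_a=1.0, n_a=25, s2_b=4.0, n_b=21)
print(F, df_num, df_den)
```

Because the larger variance is always on top, F is at least 1 and only the upper critical value is needed, even for a two-tail test.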

36 Example: F Test
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:
NYSE NASDAQ Number Mean Std dev
Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.10 level?

37 F Test: Example Solution
Form the hypothesis test:
H0: $\sigma_x^2 = \sigma_y^2$ (there is no difference between variances)
H1: $\sigma_x^2 \ne \sigma_y^2$ (there is a difference between variances)
Find the F critical values for α = .10/2:
Degrees of freedom:
Numerator (NYSE has the larger standard deviation): $n_x$ – 1 = 21 – 1 = 20 d.f.
Denominator: $n_y$ – 1 = 25 – 1 = 24 d.f.

38 F Test: Example Solution
(continued) The test statistic is: $F = \frac{s_x^2}{s_y^2}$
H0: $\sigma_x^2 = \sigma_y^2$
H1: $\sigma_x^2 \ne \sigma_y^2$
[Figure: F density with upper-tail area α/2 = .05; "Do not reject H0" to the left of the critical value, "Reject H0" to the right]
F is not in the rejection region, so we do not reject H0.
Conclusion: There is not sufficient evidence of a difference in variances at α = .10

