Chapter Nine McGraw-Hill/Irwin © 2006 The McGraw-Hill Companies, Inc., All Rights Reserved. Estimation and Confidence Intervals.


1 Chapter Nine McGraw-Hill/Irwin © 2006 The McGraw-Hill Companies, Inc., All Rights Reserved. Estimation and Confidence Intervals

2 A point estimate is a single value (a statistic) computed from a sample and used to estimate a population value (a parameter). E.g., the sample mean X̄ is a point estimate of the population mean μ. We cannot be sure that the point estimate equals the population mean, but we can calculate an interval around the estimate and assert with a stated confidence that the true population mean lies inside it. A confidence interval (CI) is a range of values within which the population parameter (e.g., μ) is expected to occur at a specified level of confidence, generally expressed as a percentage.

3 [Figure: level of confidence and confidence interval]

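4 Let us recall from Chapter 8 that: the best estimator of μ is the sample mean X̄; the SD of the distribution of X̄ is σ/√n; and almost any X̄ you calculate from a sample will be within 3·(σ/√n) of μ (by the Empirical Rule, about 99.7% of sample means fall in that range). [Figure: distribution of X̄ around μ, labeled with σ/√n and 3·(σ/√n)]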

5 We also know from Chapter 8 that Z = (X̄ − μ) / (σ/√n) and that Sampling Error = X̄ − μ. Combining the two, Sampling Error = X̄ − μ = Z·(σ/√n). How much width do we need around X̄? If we add and subtract this sampling-error term to X̄, we get the range (called the CI) within which μ is expected to lie: from X̄ − Z·(σ/√n) to X̄ + Z·(σ/√n). If σ is not known and n > 30, the sample SD, s, is used instead, and the CI for the population mean μ is: X̄ ± z·(s/√n).

6 Problem (page 250): The AM Association wants information on the mean income of managers working in the retail industry. A random sample of 256 managers had a mean of $45,420 with a standard deviation of $2,050. What is the interval in which the population mean would lie at a 95% confidence level? Since Z for 95% is 1.96*, the CI formula gives: X̄ ± z·(s/√n) = 45420 ± 1.96·(2050/√256) = 45420 ± 251. So, the CI is $45,169 to $45,671. *See next slide.
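The arithmetic in this problem can be reproduced in a few lines. The following is a minimal Python sketch, not part of the original slides; the function name `z_interval` is an assumption made for illustration, and z is computed from the standard library's normal distribution rather than read from Appendix D.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sd, n, confidence):
    """Return (lower, upper) for a large-sample CI of the population mean."""
    # Two-tailed critical value: 95% confidence leaves 2.5% in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sd / sqrt(n)  # z times the standard error s/sqrt(n)
    return mean - margin, mean + margin

# Numbers from the slide: mean $45,420, s = $2,050, n = 256, 95% confidence.
low, high = z_interval(45420, 2050, 256, 0.95)
print(round(low), round(high))
```

Rounded to whole dollars, the result should agree with the interval worked out on the slide.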

7 Why use Z = 1.96 for a CI at 95%? Because the area under the curve between Z = −1.96 and Z = +1.96 is 95% (see Appendix D). Question: What would be the value of Z for a CI at 99%? Z = 2.58! Notice that the CI widens when the confidence level is increased from 95% to 99%.

8 What does a CI at a 95% level of confidence mean? It means that if we drew many samples and built an interval from each, about 95% of those sample intervals would contain the population mean μ. Try experimenting with the Visual Statistics software.
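For readers without the Visual Statistics software, the same experiment can be approximated with a short simulation. This is a hedged sketch: the population values (μ = 50, σ = 10), the sample size, the number of trials, and the function name `coverage` are all hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

def coverage(mu=50.0, sigma=10.0, n=40, trials=2000, z=1.96, seed=1):
    """Fraction of sample-based 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample from a normal population with known mu and sigma.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        margin = z * stdev(sample) / sqrt(n)  # z * s / sqrt(n)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should land close to 0.95; it will not be exactly 0.95 because the number of trials is finite and s is used in place of σ.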

9 How do we increase our confidence? Let us say that, based on past exams, I claim with 75% confidence that in the coming test the class average (μ) will be between 70 and 80 points. If I want to raise my confidence to 95%, I can do two things: 1) widen the interval (a larger Z), for example from 70-80 to 60-90, or 2) increase n to reduce the dispersion of the distribution.

10 2. Increase the sample size (n): A larger n squeezes the area (and therefore the probabilities) of the distribution of X̄ into a thinner peak around μ, because SD = σ/√n shrinks as n grows; so the level of confidence will be a high percentage even with a smaller interval.

11 t-Distribution: Use the t-distribution when: n < 30 (e.g., you are crash-testing expensive autos!); only s is known (i.e., σ is unknown); and the underlying population is approximately normal. In general, if you see n < 30 in the exam problem, you must think t-distribution!

12 The Story of t-Distribution: Once upon a time, there was a statistician called Gosset … When you don’t know σ, you have to use s instead. But the problem is, when n is small (n < 30), s has a wide dispersion and is not a good estimator of σ. Gosset created a new distribution called ‘t’ that spreads the area under the curve wider when n is small but approaches the normal curve as n increases (beyond about 30 the two are nearly identical)!

13 Compare with Chart 9-2 in the text (page 255). Note: for n = 5 at 95% confidence, Z = 1.96 but t = 2.776.

14 Visual Statistics Demo Using Continuous Distribution module

15 t vs. Z

16 Look at it this way: Since n is small, we are not sure s would be a good estimate of σ; so, we play it safe by widening the CI for the same confidence level. Observe how the ±1.96 (95%) in Z is stretched outward to ±2.776 in t to keep the area under the curve the same at 0.95 when the sample size is only 5.
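The stretching from 1.96 to 2.776 can be checked numerically without a t table. The sketch below is illustrative only: it integrates the Student's t density with Simpson's rule and uses bisection to find the critical value. The function names are assumptions, and the approach is meant for small df (very large df would overflow `math.gamma`).

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_area(t, df, steps=2000):
    """Area under the t curve between -t and +t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # double the 0..t integral; the curve is symmetric

def t_critical(df, confidence=0.95):
    """Bisection search for t such that the central area equals confidence."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)
for n in (5, 10, 30):
    print(n, round(z, 3), round(t_critical(n - 1), 3))
```

For n = 5 (df = 4) the computed t should match the 2.776 quoted on the slide, and the gap between t and Z narrows as n grows.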

17 Practice! (problem on page 256) A tire manufacturer wishes to investigate the tread life of its tires. A sample of 10 tires driven 50,000 miles revealed a sample mean of 0.32 inch of tread remaining with a standard deviation of 0.09 inch. Construct a 95% CI for the population mean. What is the formula to be used? X̄ ± t·(s/√n). What is the value of t for df = 9* and CI = 95% (page 498)? t = 2.262. What is the 95% CI? 0.32 ± 2.262·(0.09/√10) = 0.32 ± 0.064 = 0.256 to 0.384. *df = (n − 1)
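The tire calculation can be verified with a small helper. This is a minimal sketch; the function name `t_interval` is hypothetical, and the t value is simply the tabled 2.262 quoted on the slide rather than something the code computes.

```python
from math import sqrt

def t_interval(mean, sd, n, t_value):
    """Return (lower, upper) for a small-sample CI using a tabled t value."""
    margin = t_value * sd / sqrt(n)  # t times the standard error s/sqrt(n)
    return mean - margin, mean + margin

# Numbers from the slide: mean 0.32 inch, s = 0.09 inch, n = 10, t(df=9) = 2.262.
low, high = t_interval(0.32, 0.09, 10, 2.262)
print(round(low, 3), round(high, 3))
```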

18 Degrees of Freedom You are in a room with 10 chairs and you are sitting in one of them. The other chairs are empty. How many other chairs can you move to? Ans: 9 So in general, df = n-1

19 CI for a population proportion So far we studied variables that use a ratio scale. There we can calculate the means. Eg. Manager’s $ income & Tire wear What if we have to work with a nominal scale variable where values are categorized into one of two groups? Eg. CSUN career center reports that 75% of its graduates get a job related to their major. You cannot calculate the mean of Yes & No’s. But, you can calculate a proportion of students who said Yes.

20 Getting the job in your major can be termed a ‘success’; if the student got a job in a different field, then it is a ‘failure’. So, the Binomial distribution formulas we studied in Chapter 6 can be used to describe the sampling distribution of a proportion RV! The mean number of successes in a Binomial distribution is nπ [Ch 6; Page 167]. The SD for the Binomial is √(nπ(1−π)) [Page 167].

21 Binomial Distribution (see page 170): Number of heads (successes) in 10 tosses of a coin. Mean (expected number of heads) = 5 [notice the peak at X = 5]. If the X-axis is redrawn as X/n = X/10 (i.e., the proportion of successes, running from 0 through 0.1, 0.2, 0.3, ... to 1.0), the curve shrinks by a factor of 10, and so does its SD.

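22 Estimating population proportion: Here, we focus on the proportion of successes; so, we divide the number of successes, X, by the total number of trials, n: p = X/n. The sample proportion p estimates the population proportion π, and the SD of its sampling distribution is estimated by √(p(1−p)/n).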

23 π has to be within about 3 σp of p (Empirical Rule), where σp = √(p(1−p)/n). CI for the population proportion π: CI = p ± Z·√(p(1−p)/n). (Note the pattern: CI = Sample Statistic ± (Z for the confidence level) × (SD of the sampling distribution).)

24 A sample of 500 executives who own their own home revealed 175 planned to sell their homes and retire to Arizona. Develop a 98% confidence interval for the proportion of executives that plan to sell and move to Arizona.
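One way to work this problem is to apply CI = p ± Z·√(p(1−p)/n) directly. The sketch below is not from the slides; the function name `proportion_interval` is an assumption, and z is computed from the standard library's normal distribution (about 2.33 for 98%) instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Return (lower, upper) for a CI of a population proportion."""
    p = successes / n  # sample proportion, p = X/n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Numbers from the slide: 175 of 500 executives, 98% confidence.
low, high = proportion_interval(175, 500, 0.98)
print(round(low, 3), round(high, 3))
```

With p = 175/500 = 0.35, the interval comes out at roughly 0.30 to 0.40.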

25 A word of caution: The normal approximation to the Binomial works well when the following two conditions are satisfied: n·p ≥ 5 and n·(1−p) ≥ 5. Here is why: (see page 170)

26 Calculating the sample size: Three factors affect the sample size: the level of confidence desired; the margin of error the researcher will tolerate; and the variability in the population being studied.

27 The formula for estimated sample size is: n = (z·s/E)², where n is the size of the sample; E is the allowable error; z is the z-value corresponding to the selected level of confidence (for 99%, from the Appendix, z = 2.58); and s is the sample standard deviation from the pilot survey.

28 P(r)oof! [Ch 8; Page 235] Z = (X̄ − μ) / (s/√n), so X̄ − μ = Z·(s/√n). Writing E for the allowable error X̄ − μ: E = Z·(s/√n); E² = Z²·s²/n; n = Z²·s²/E²; n = (Z·s/E)².

29 A utility company would like to estimate the mean monthly electricity charge for a single family house within $5 using a 99% level of confidence. The standard deviation is estimated to be $20.00. How large a sample is required?
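Plugging the slide's numbers into n = (z·s/E)² gives the required sample size. This is a minimal sketch; the function name `sample_size_mean` is hypothetical, z = 2.58 is the 99% value quoted earlier in the deck, and rounding up to the next whole observation is the usual convention.

```python
from math import ceil

def sample_size_mean(z, s, E):
    """Smallest whole n such that z * s / sqrt(n) does not exceed E."""
    return ceil((z * s / E) ** 2)  # always round up, never down

# Numbers from the slide: 99% confidence (z = 2.58), s = $20, E = $5.
n = sample_size_mean(2.58, 20, 5)
print(n)
```

Here (2.58 × 20 / 5)² = 10.32² ≈ 106.5, which rounds up to 107 houses.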

30 The formula for determining the sample size in the case of a proportion is n = p(1−p)·(z/E)², where p is the estimated proportion, based on past experience or a pilot survey; z is the z value associated with the degree of confidence selected; and E is the maximum allowable error the researcher will tolerate. Study the example worked out on page 267. [You can derive this by rearranging Formula 9-6 on page 262.]
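The proportion formula n = p(1−p)·(z/E)² can be sketched the same way. The input values below (p = 0.30, E = 0.03, z = 1.96) are hypothetical and chosen only for illustration; they are not claimed to be the textbook's example, and the function name `sample_size_proportion` is likewise an assumption.

```python
from math import ceil

def sample_size_proportion(p, z, E):
    """Required n to estimate a proportion within E at the confidence given by z."""
    return ceil(p * (1 - p) * (z / E) ** 2)  # round up to a whole observation

# Hypothetical inputs: estimated p = 0.30, 95% confidence, error of 0.03.
n_pilot = sample_size_proportion(0.30, 1.96, 0.03)
# With no prior estimate, p = 0.5 maximizes p(1 - p) and so gives the largest n.
n_worst = sample_size_proportion(0.5, 1.96, 0.03)
print(n_pilot, n_worst)
```

Using p = 0.5 when no pilot estimate exists is a conservative choice, since it can only overstate the sample needed.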

31 Finite Population Correction: If the population is finite (i.e., of a known size N), multiply the SD by the correction factor √((N − n)/(N − 1)), where N is the population size and n is the sample size. When n is small relative to N, the value of the factor is close to 1. As n gets larger, the value of the correction factor gets smaller; the logic is that if the sample is a substantial percentage of the population, the estimate is more precise (Table 9-1, p. 264). Rule of thumb: ignore the correction factor if n/N < 0.05.
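The behavior of the correction factor √((N − n)/(N − 1)) is easy to see numerically. In this sketch the population size of 1,000 and the sample sizes are hypothetical, and the function name `fpc` is an assumption made for illustration.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction factor for population size N, sample size n."""
    return sqrt((N - n) / (N - 1))

# Hypothetical population of 1,000: the factor shrinks as n/N grows.
for n in (10, 50, 200, 500):
    print(n, n / 1000, round(fpc(1000, n), 4))
```

For n = 10 (n/N = 0.01) the factor stays very close to 1, which is why the rule of thumb lets you ignore it when n/N < 0.05.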

