# Chapter 12 Inference About a Population

## Presentation transcript:

1 Chapter 12 Inference About a Population

2 Introduction
In this chapter we use the approach developed earlier to describe a population:
–Identify the parameter to be estimated or tested.
–Specify the parameter’s estimator and its sampling distribution.
–Construct a confidence interval estimator or perform a hypothesis test.

3 Introduction
We shall develop techniques to estimate and test three population parameters:
–The expected value $\mu$
–The variance $\sigma^2$
–The population proportion $p$ (for qualitative data)

4 12.1 Inference About a Population Mean When the Population Standard Deviation is Unknown
Recall: by the central limit theorem, when $\sigma^2$ is known, $\bar{x}$ is normally distributed if:
–the sample is drawn from a normal population, or
–the population is not normal but the sample is sufficiently large.
When $\sigma^2$ is unknown, another random variable describes the distribution of $\bar{x}$.

5 The t-Statistic
When $\sigma$ is unknown, we use $s$ instead, and the $Z$ statistic

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

is replaced by the t-statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

When the sampled population is normally distributed, the statistic $t$ is Student t distributed.

6 The t-Statistic: using the t-table
The Student t distribution is mound-shaped and symmetrical around zero. The degrees of freedom determine the shape of the distribution.
[Figure: two Student t densities centered at 0, one with $n_1$ degrees of freedom and one with $n_2$, where $n_1 < n_2$.]

7 Testing $\mu$ when $\sigma$ is unknown
Example 12.1: Productivity of newly hired trainees

8 Testing $\mu$ when $\sigma$ is unknown
Example 12.1
–To determine the number of workers required to meet demand, the productivity of newly hired trainees is studied.
–It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring.
–Can we conclude that this belief is correct, based on productivity observations of 50 trainees? (The raw data are presented later in the file Xm12-01.)

9 Testing $\mu$ when $\sigma$ is unknown
Example 12.1 – Solution
–The problem objective is to describe the population of the number of packages processed in one hour.
–The data are quantitative.
–The hypotheses are $H_0: \mu = 450$ and $H_1: \mu > 450$. We want to prove that the trainees reach 90% productivity of experienced workers.
–The t statistic is $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, with d.f. $= n - 1 = 49$.

10 Testing $\mu$ when $\sigma$ is unknown
Solution, continued
After transforming $\bar{x}$ into a t-statistic, we express the rejection region in terms of the statistic $t$. Observe that $H_1$ has the form $\mu > \mu_0$; thus the rejection region is $t > t_{\alpha,\,n-1}$.

11 Testing $\mu$ when $\sigma$ is unknown
Solution, continued (solving by hand)
The rejection region is $t > t_{\alpha,\,n-1}$, where $t_{\alpha,\,n-1} = t_{.05,\,49}$. The critical value (table entry) is $t_{.05,\,49} \approx t_{.05,\,50} = 1.676$.
You can use the Excel function =tinv to obtain the critical value. This function takes a two-tail probability: for a two-tail test with significance level $\alpha$, it returns the critical value $t_{\alpha/2,\,n-1}$. Since our test is one-tail, we enter $2\alpha$ instead of $\alpha$, that is, $2(.05) = .1$. Thus, type =tinv(.1,49) to obtain the result 1.676551.

12 Testing $\mu$ when $\sigma$ is unknown
The test statistic, calculated from the data provided in Xm12-01, is $t = 1.89$.
Since $1.89 > 1.676$, we reject the null hypothesis in favor of the alternative.
Conclusion: there is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages at the .05 significance level.
[Figure: t distribution with the rejection region to the right of 1.676; the test statistic 1.89 falls inside it.]

13 Testing $\mu$ when $\sigma$ is unknown
Using Data Analysis Plus and the p-value approach to test the mean (Xm12-01.xls).

t-Test: Mean

| | Packages |
|---|---|
| Mean | 460.38 |
| Standard Deviation | 38.8271 |
| Hypothesized Mean | 450 |
| df | 49 |
| t Stat | 1.8904 |
| P(T<=t) one-tail | 0.0323 |
| t Critical one-tail | 1.6766 |

Since $.0323 < .05$, we reject the null hypothesis in favor of the alternative. There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages at the .05 significance level.
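As a rough check on the printout, the t statistic can be recomputed from the summary statistics alone. The sketch below is illustrative only: the function name `one_sample_t` is made up for this example, and the critical value is copied from the printout rather than computed.

```python
import math

def one_sample_t(mean, sd, n, mu0):
    """Return the t statistic for H0: mu = mu0 from summary statistics."""
    return (mean - mu0) / (sd / math.sqrt(n))

# Summary statistics from the Data Analysis Plus output for Xm12-01.
t_stat = one_sample_t(mean=460.38, sd=38.8271, n=50, mu0=450)

# One-tail critical value t_{.05,49}, copied from the printout (not computed here).
t_critical = 1.6766

print(f"t = {t_stat:.4f}")
print("reject H0" if t_stat > t_critical else "do not reject H0")
```

Because the test is right-tailed, the decision only needs the comparison `t_stat > t_critical`.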

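14 Estimating $\mu$ when $\sigma$ is unknown
Confidence interval estimator of $\mu$ when $\sigma^2$ is unknown:

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}, \qquad \text{d.f.} = n - 1$$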

15 Estimating $\mu$ when $\sigma$ is unknown
Example 12.2
–An investor is trying to estimate the return on investment in companies that won quality awards last year.
–A random sample of 83 such companies is selected, and the return the investor would have earned by investing in each of them is calculated.
–Construct a 95% confidence interval for the mean return.

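16 Estimating $\mu$ when $\sigma$ is unknown
Solution (solving by hand)
–The problem objective is to describe the population of annual returns from buying shares of quality award-winners.
–The data are quantitative.
–From the data we determine $\bar{x} \approx 15.02$ and $s \approx 8.31$. The critical value is $t_{.025,\,82} \approx t_{.025,\,80} = 1.990$, so

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} = 15.02 \pm 1.990\,\frac{8.31}{\sqrt{83}} = 15.02 \pm 1.82$$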

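17 Estimating $\mu$ when $\sigma$ is unknown
Using Data Analysis Plus:

t-Estimate: Mean

| | Returns |
|---|---|
| Mean | 15.0172 |
| Standard Deviation | 8.3054 |
| LCL | 13.2037 |
| UCL | 16.8307 |

The interval is symmetric about the mean: UCL − Mean = 1.8135, so LCL = 15.0172 − 1.8135 = 13.2037.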
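The interval can be reproduced approximately from the summary statistics. This is a minimal sketch, assuming the table value $t_{.025,\,80} = 1.990$ as a stand-in for $t_{.025,\,82}$ (as in the hand solution); the helper name `t_interval` is hypothetical.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (LCL, UCL) for the mean, given a t critical value from a table."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Summary statistics from the Data Analysis Plus output for the 83 returns.
# t_crit is the table entry for 80 degrees of freedom, used as an approximation.
lcl, ucl = t_interval(mean=15.0172, sd=8.3054, n=83, t_crit=1.990)

print(f"LCL = {lcl:.2f}, UCL = {ucl:.2f}")
```

Software that uses the exact 82 degrees of freedom will give limits that differ from these in the third decimal place.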

18 Checking the required conditions
We need to check that the population is normally distributed, or at least not extremely non-normal. There are statistical methods for testing normality (introduced later in the book, but not discussed here). From the sample histograms we see…

19 [Figures: a histogram for XM-11-01 (Packages) and a histogram for XM-11-02 (Returns).]

20 12.2 Inference About a Population Variance
Sometimes we are interested in making inferences about the variability of processes. Examples:
–The consistency of a production process for quality control purposes.
–The risk associated with different investments.
To draw inferences about variability, the parameter of interest is $\sigma^2$.

21 Inference About a Population Variance
The population variance can be estimated, or its value tested, using the sample variance $s^2$. The sample variance $s^2$ is an unbiased, consistent and efficient point estimator for $\sigma^2$. The inference about $\sigma^2$ is made by using a sample statistic that incorporates $s^2$ and $\sigma^2$.

22 Inference About a Population Variance
This statistic is

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

It has a distribution called Chi-squared, with $n - 1$ degrees of freedom, if the population is normally distributed.

23 Inference About a Population Variance
The Chi-squared distribution: the degrees of freedom (df) determine the shape of the distribution.

24 Testing the population variance – Left hand tail test
Example 1 (operation management application)
–A container-filling machine is believed to fill 1 liter containers so consistently that the variance of the fills will be less than 1 cc (.001 liter).
–To test this belief, a random sample of 25 1-liter fills was taken and the results recorded (Xm12-03.xls).
–Do these data support the belief that the variance is less than 1 cc at the 5% significance level?

25 Testing the population variance
Solution
–The problem objective is to describe the population of 1-liter fills from a filling machine.
–The data are quantitative, and we are interested in the variability of the fills.
–The two hypotheses are $H_0: \sigma^2 = 1$ and $H_1: \sigma^2 < 1$. We want to prove that the process is consistent.
–The rejection region has the form: $s^2 <$ critical value.

26 Testing the population variance
Solution
–The two hypotheses are $H_0: \sigma^2 = 1$ and $H_1: \sigma^2 < 1$.
–The rejection region in terms of $\chi^2$ is $\chi^2 < \chi^2_{1-\alpha,\,n-1}$.

27 Testing the population variance
Solving by hand
–Note that $(n-1)s^2 = \sum (x_i - \bar{x})^2 = \sum x_i^2 - (\sum x_i)^2/n$.
–From the sample (data is presented in units of cc-1000 to avoid rounding) we can calculate $\sum x_i = 24{,}996.4$ and $\sum x_i^2 = 24{,}992{,}821.3$.
–Then $(n-1)s^2 = 24{,}992{,}821.3 - (24{,}996.4)^2/25 = 20.78$.

28 Testing the population variance
Using the $\chi^2$ table:

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{20.78}{1} = 20.78, \qquad \chi^2_{1-\alpha,\,n-1} = \chi^2_{.95,\,25-1} = 13.8484$$

The rejection region is $\chi^2 < 13.8484$. Since $20.78 > 13.8484$, do not reject the null hypothesis. There is insufficient evidence to reject the hypothesis that the variance is equal to 1 cc.
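The hand calculation can be retraced in a few lines. This sketch works only from the two sums quoted in the example; the function name `sum_of_squares` is made up here, and the critical value 13.8484 is the table entry rather than a computed quantile.

```python
def sum_of_squares(sum_x, sum_x2, n):
    """Return (n - 1) * s^2 using the shortcut sum(x^2) - (sum(x))^2 / n."""
    return sum_x2 - sum_x ** 2 / n

# Sums reported for the 25 fills in the example.
ss = sum_of_squares(sum_x=24_996.4, sum_x2=24_992_821.3, n=25)

sigma2_0 = 1.0                 # hypothesized variance under H0
chi2_stat = ss / sigma2_0      # chi-squared statistic (n - 1) s^2 / sigma^2
chi2_critical = 13.8484        # table entry for chi^2_{.95, 24}

# Left-tail test: reject H0 only if the statistic falls below the critical value.
reject = chi2_stat < chi2_critical
print(f"chi-squared = {chi2_stat:.2f}, reject H0: {reject}")
```

Note that the shortcut formula subtracts two large, nearly equal numbers, so it is sensitive to rounding in the reported sums.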

29 Testing the population variance – Right hand tail test; Two tail test
A right hand tail test:
–$H_0: \sigma^2 = \text{value}$, $H_1: \sigma^2 > \text{value}$
–Rejection region: $\chi^2 > \chi^2_{\alpha,\,n-1}$

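30 Testing the population variance – Right hand tail test; Two tail test
A right hand tail test:
–$H_0: \sigma^2 = \text{value}$, $H_1: \sigma^2 > \text{value}$
–Rejection region: $\chi^2 > \chi^2_{\alpha,\,n-1}$
A two tail test:
–$H_0: \sigma^2 = \text{value}$, $H_1: \sigma^2 \neq \text{value}$
–Rejection region: $\chi^2 < \chi^2_{1-\alpha/2,\,n-1}$ or $\chi^2 > \chi^2_{\alpha/2,\,n-1}$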

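31 Estimating the population variance
From the probability statement

$$P\left(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}\right) = 1 - \alpha$$

we have, by substituting $\chi^2 = (n-1)s^2/\sigma^2$,

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

This is the confidence interval for $\sigma^2$ with a $(1-\alpha)100\%$ confidence level.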

32 Estimating the population variance
Example 2
–Estimate the variance of fills in example 12.3 with 99% confidence.
Solution
–We have $(n-1)s^2 = 20.78$. From the Chi-squared table we have
$\chi^2_{\alpha/2,\,n-1} = \chi^2_{.005,\,24} = 45.5585$
$\chi^2_{1-\alpha/2,\,n-1} = \chi^2_{.995,\,24} = 9.88623$

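33 Estimating the population variance
The confidence interval is

$$\frac{20.78}{45.5585} < \sigma^2 < \frac{20.78}{9.88623}, \qquad \text{that is,} \qquad .4561 < \sigma^2 < 2.1019$$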
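The limits follow directly from dividing $(n-1)s^2$ by the two table values. A small sketch, with the hypothetical helper `variance_interval` and both critical values copied from the Chi-squared table rather than computed:

```python
def variance_interval(ss, chi2_upper, chi2_lower):
    """Return (LCL, UCL) for sigma^2, where ss = (n - 1) * s^2.

    chi2_upper is chi^2_{alpha/2, n-1}; chi2_lower is chi^2_{1-alpha/2, n-1}.
    """
    return ss / chi2_upper, ss / chi2_lower

# (n - 1) s^2 from the filling-machine example, 99% confidence, 24 d.f.
var_lcl, var_ucl = variance_interval(ss=20.78, chi2_upper=45.5585, chi2_lower=9.88623)

print(f"{var_lcl:.4f} < sigma^2 < {var_ucl:.4f}")
```

The larger critical value produces the lower limit, which is why the two table entries appear in swapped order in the interval.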

34 12.4 Inference About a Population Proportion
When the population consists of nominal or categorical data, the only inference we can make is about the proportion of occurrence of a certain value. The parameter $p$ was used before to calculate these proportions under the binomial distribution.

35 12.4 Inference About a Population Proportion
Statistic and sampling distribution
–The statistic used when making inference about $p$ is $\hat{p} = x/n$, where $x$ is the number of successes and $n$ is the sample size.
–Under certain conditions [$np > 5$ and $n(1-p) > 5$], $\hat{p}$ is approximately normally distributed, with $\mu = p$ and $\sigma^2 = p(1-p)/n$.

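36 Testing and estimating the Proportion
Test statistic for $p$:

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$

Interval estimator for $p$ ($1-\alpha$ confidence level):

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$$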

37 Testing the Proportion
Example 12.5 (Predicting the winner on election day)
–Voters are asked by a certain network to participate in an exit poll in order to predict the winner on election day.
–Based on the data presented in Xm12.5.xls (where 1=Democrat, and 2=Republican), can the network conclude that the Republican candidate will win the state college vote?

38 Testing the Proportion
Solution
–The problem objective is to describe the population of votes in the state.
–The parameter to be tested is $p$.
–Success is defined as “Republican vote”.
–The hypotheses are $H_0: p = .5$ and $H_1: p > .5$ (more than 50% vote Republican).

39 Testing the Proportion
Solving by hand
–The rejection region is $z > z_{\alpha} = z_{.05} = 1.645$.
–From file Xm12.5.xls we count 407 successes. The number of voters participating is 765.
–The sample proportion is $\hat{p} = 407/765 = .532$.
–The value of the test statistic is $z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \dfrac{.532 - .5}{\sqrt{.5(1-.5)/765}} = 1.77$.
–The p-value is $P(Z > 1.77) = .0382$.

40 Testing the Proportion
Using Data Analysis Plus, the p-value is less than 0.05. There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. At the 5% significance level we can conclude that more than 50% voted Republican.
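The exit-poll test can be retraced with Python's standard library. This is only a sketch: `proportion_z_test` is a name invented for the example, and the counts (407 successes out of 765) are the ones quoted in the hand solution.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Return (z, right-tail p-value) for H0: p = p0 against H1: p > p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)   # standard normal upper-tail area
    return z, p_value

z_stat, p_value = proportion_z_test(successes=407, n=765, p0=0.5)

alpha = 0.05
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The standard error uses the hypothesized $p_0$, not $\hat{p}$, because the sampling distribution is evaluated under the null hypothesis.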

41 Estimating the Proportion
Example (marketing application)
–In a survey of 2000 TV viewers at 11:40 p.m. on a certain night, 226 indicated they watched “The Tonight Show”.
–Estimate the number of TVs tuned to the Tonight Show on a typical night, if there are 100 million potential television sets. Use a 95% confidence level.
Solution
The sample proportion is $\hat{p} = 226/2000 = .113$, so $1 - \hat{p} = 1 - .113 = .887$ and

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n} = .113 \pm 1.96\sqrt{.113(.887)/2000} = .113 \pm .0139$$

42 Estimating the Proportion
Solution
Using Excel, we obtain a confidence interval for the number of viewers who watched the Tonight Show:
LCL = .0991(100,000,000) = 9.9 million
UCL = .1269(100,000,000) = 12.7 million
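The same interval can be sketched in Python and then scaled by the 100 million potential television sets stated in the example. The helper name `proportion_interval` is hypothetical; the normal quantile comes from the standard library rather than from a table.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Return (LCL, UCL) for p using the normal approximation."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

p_lcl, p_ucl = proportion_interval(successes=226, n=2000, confidence=0.95)

total_sets = 100_000_000   # potential television sets, as stated in the example
print(f"proportion: {p_lcl:.4f} to {p_ucl:.4f}")
print(f"sets: {p_lcl * total_sets / 1e6:.1f} to {p_ucl * total_sets / 1e6:.1f} million")
```

Scaling both limits by the population size converts an interval for the proportion into an interval for the count.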

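43 Selecting the Sample Size to Estimate the Proportion
Recall: the confidence interval for the proportion is $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. Thus, to estimate the proportion to within $W$, we can write

$$W = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$$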

44 Selecting the Sample Size to Estimate the Proportion
The required sample size is

$$n = \left(\frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{W}\right)^2$$

45 Sample Size to Estimate the Proportion
Example
–Suppose we want to estimate the proportion of customers who prefer our company’s brand to within .03 with 95% confidence.
–Find the sample size needed to guarantee that this requirement is met.
Solution
$W = .03$ and $1 - \alpha = .95$, therefore $\alpha/2 = .025$, so $z_{.025} = 1.96$.
Since the sample has not yet been taken, the sample proportion is still unknown. We proceed using either one of the following two methods:

46 Sample Size to Estimate the Proportion
Method 1:
–There is no knowledge about the value of $\hat{p}$. Let $\hat{p} = .5$. This results in the largest possible $n$ needed for a $1-\alpha$ confidence interval of the form $\hat{p} \pm .03$:

$$n = \left(\frac{1.96\sqrt{.5(1-.5)}}{.03}\right)^2 = 1067.1, \text{ rounded up to } 1{,}068$$

If the sample proportion does not equal .5, the actual $W$ will be narrower than .03 with this $n$.
Method 2:
–There is some idea about the value of $\hat{p}$. Use that value of $\hat{p}$ to calculate the sample size.
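Both methods use the same sample-size formula and differ only in the planning value of $\hat{p}$. The sketch below assumes a hypothetical prior guess of $\hat{p} = .2$ for Method 2; that value and the function name `sample_size_for_proportion` are illustrative, not taken from the slides.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p_guess, w, confidence):
    """Return the smallest n that estimates p to within w at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    n = (z * math.sqrt(p_guess * (1 - p_guess)) / w) ** 2
    return math.ceil(n)   # round up so the bound on W is guaranteed

# Method 1: no prior knowledge, so use the worst case p_guess = .5.
n_worst_case = sample_size_for_proportion(p_guess=0.5, w=0.03, confidence=0.95)

# Method 2: a prior guess is available (0.2 is a made-up value for illustration).
n_with_guess = sample_size_for_proportion(p_guess=0.2, w=0.03, confidence=0.95)

print(n_worst_case, n_with_guess)
```

Because $\hat{p}(1-\hat{p})$ peaks at $\hat{p} = .5$, Method 1 never gives a smaller sample size than Method 2.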
