Published by Ezekiel Blush. Modified over 2 years ago.

1
Chapter 4: Inference About Process Quality
Motivation
Estimation
– Point estimation
– Interval estimation
Hypothesis Testing
– Definition
– Testing on means with known and unknown variance
– Testing on variance

2
The Need for “Statistical Inference”
In statistical quality control, a probability distribution is used to model a quality characteristic (which is related to the process parameters).
The parameters of the probability distribution are unknown. How do we find them?
– Estimation of process parameters
– Point estimation / interval estimation
The parameters of a process can vary over time. How do we identify a process change?
– Hypothesis testing

3
Observations in a sample are used to draw conclusions about the population.

4
Random Samples
Random sample:
– Sampling from an infinite population, or from a finite population with replacement: the sample is selected so that the observations are independently and identically distributed.
– Sampling n items from a finite population of N items without replacement: each of the possible samples has an equal probability of being chosen.
Random sample = independently and identically distributed (i.i.d.)

5
Terminology and Definitions
Statistic:
– Any function of the sample data that does not contain unknown parameters.
Estimate: a particular numerical value of an estimator, computed from sample data. (An estimator is a statistic; an estimate is its value for a particular sample.)
– Point estimate: a single numerical value used as the estimate of the unknown parameter
– Interval estimate: a random interval (also called a confidence interval) that contains the true value of the parameter with some level of probability
Sampling distribution:
– The probability distribution of a statistic.

6
Point Estimation
Methods [1]:
– Method of Moments (MOM)
– Maximum Likelihood Estimation (MLE)
[1] George Casella and Roger L. Berger, “Statistical Inference”, 2nd edition.

7
Interval Estimation
Estimate an interval between two statistics that includes the true value of the parameter with some probability.
– Example: $\Pr\{L \le \mu \le U\} = 1-\alpha$ ($0 < \alpha < 1$)
– The interval $L \le \mu \le U$ is called a $100(1-\alpha)\%$ confidence interval (C.I.) for the unknown mean $\mu$.
– Two-sided C.I.: L is the lower confidence limit, U is the upper confidence limit.
– One-sided C.I.:
  lower $100(1-\alpha)\%$ C.I.: $L \le \mu$, $\Pr\{L \le \mu\} = 1-\alpha$
  upper $100(1-\alpha)\%$ C.I.: $\mu \le U$, $\Pr\{\mu \le U\} = 1-\alpha$
Analysis procedure:
– Get the samples
– Compute the statistic
– Determine the reference distribution of the statistic
– Select the confidence level
– Find the lower and/or upper confidence limits based on the reference distribution
[Figure: sampling distribution of $\bar{x}$ with limits L and U, tail area $\alpha/2$, and the real mean of the population marked]

8
[Figure: interval from L to U]
Q: How do we determine the width?

9
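Interval Estimation
If x is a random variable with unknown mean $\mu$ and known variance $\sigma^2$, what is the interval estimate for the mean $\mu$?
– Select the statistic $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
– The approximate distribution of $\bar{x}$ is $N(\mu, \sigma^2/n)$ regardless of the distribution of x, due to the central limit theorem.
– Given the significance level $\alpha$:
  the $100(1-\alpha)\%$ two-sided confidence interval on $\mu$ is: $\bar{x} - Z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + Z_{\alpha/2}\,\sigma/\sqrt{n}$
  the $100(1-\alpha)\%$ upper confidence interval on $\mu$ is: $\mu \le \bar{x} + Z_{\alpha}\,\sigma/\sqrt{n}$
  the $100(1-\alpha)\%$ lower confidence interval on $\mu$ is: $\bar{x} - Z_{\alpha}\,\sigma/\sqrt{n} \le \mu$
  where $Z_{\alpha}$ is the upper-$\alpha$ percentage point of the standard normal distribution, i.e. $\Pr\{Z \ge Z_{\alpha}\} = \alpha$.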

10
Example: The strength of a disposable plastic beverage container is being investigated. The strengths are normally distributed, with a known standard deviation of 15 psi. A sample of 20 plastic containers has a mean strength of 246 psi. Compute a 95% confidence interval for the process mean.
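The container-strength example can be checked numerically. The following is a minimal sketch, assuming the standard known-variance interval $\bar{x} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$; the function name `z_confidence_interval` is illustrative and not from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1.0 - confidence
    # Z_{alpha/2}: upper alpha/2 percentage point of the standard normal
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Values stated in the example: n = 20, xbar = 246 psi, sigma = 15 psi
lower, upper = z_confidence_interval(xbar=246.0, sigma=15.0, n=20)
print(f"95% C.I.: {lower:.2f} <= mu <= {upper:.2f} psi")
```

With these inputs the interval is roughly 239.4 to 252.6 psi.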

11

12
Example: A chemical process converts lead to gold. However, the production varies due to the powers of the alchemist. It is known that the process yield is normally distributed, with a standard deviation of 2.5 g. How many samples must be taken to be 90% certain that an estimate of the process mean is within 1.5 g of the true but unknown mean yield?
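One way to answer the sample-size question is to require the half-width $Z_{\alpha/2}\,\sigma/\sqrt{n}$ of the two-sided interval to be at most the allowed error E, which gives $n \ge (Z_{\alpha/2}\,\sigma/E)^2$. The sketch below assumes that reading of "within 1.5 g"; the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with Z_{alpha/2} * sigma / sqrt(n) <= margin (sigma known)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return ceil((z * sigma / margin) ** 2)


# Values stated in the example: sigma = 2.5 g, margin = 1.5 g, 90% confidence
n_required = sample_size_for_mean(sigma=2.5, margin=1.5, confidence=0.90)
print(n_required)
```

Under these assumptions the requirement works out to about 7.5, so 8 samples are needed after rounding up.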

13
Interval Estimation of the Binomial Distribution Parameter with a Large Sample Size
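The formula on this slide was not preserved. As a hedged reminder, the usual large-sample interval based on the normal approximation is shown below, where $\hat{p} = x/n$ is the sample fraction (the symbols $\hat{p}$, x and n are assumptions, not taken from the slide).

```latex
\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
\;\le\; p \;\le\;
\hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
```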

14
Hypothesis Testing
Statistical hypothesis:
– A statement about the values of the parameters of a probability distribution
Hypothesis testing:
– Making a hypothesis about what we believe to be true and then using sampled data to test it
Two hypotheses (two competing propositions):
– Null hypothesis $H_0$: rejected if the sample data do not support it
– Alternative hypothesis $H_1$: a hypothesis different from the null hypothesis
Conclusion:
– By comparing the test statistic with the critical value, determine whether to reject or not reject the null hypothesis

15
Hypothesis Testing Procedures
1) State the null and alternative hypotheses, and define the test statistic.
2) Specify the significance level $\alpha$.
3) Find the distribution of the test statistic and the rejection region of $H_0$.
4) Collect data and calculate the test statistic.
5) Compare the test statistic with the rejection region.
6) Assess the risk.

16
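Inference on the MEAN of a Normal Population – Variance Known
Test statistic: $Z_0 = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, a function of the data and the hypothesized value $\mu_0$
Critical value: $Z_{\alpha/2}$ for a two-sided test, $Z_{\alpha}$ for a one-sided test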

17
Example: The response time of a distributed computer system is an important quality characteristic. The system manager wants to know whether the mean response time to a specific type of command exceeds 75 millisec. From past experience, he knows that the standard deviation of response time is 8 millisec. The command is executed 25 times and the response time for each trial is recorded. The sample average response time is [value missing] millisec. Formulate an appropriate hypothesis and test it. Find the (lower/upper?) bound of the 95% C.I.
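Because the question is whether the mean exceeds 75 millisec, a natural formulation is $H_0: \mu = 75$ against $H_1: \mu > 75$, with a lower confidence bound to match. The sketch below assumes that formulation. The sample mean is missing from the slide text, so `HYPOTHETICAL_XBAR` is a made-up placeholder used only to show the mechanics.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """One-sided z-test of H0: mu = mu0 vs H1: mu > mu0 with sigma known."""
    std_error = sigma / sqrt(n)
    z0 = (xbar - mu0) / std_error
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # one-sided critical value
    reject = z0 > z_alpha
    # Matching one-sided interval: lower bound <= mu
    lower_bound = xbar - z_alpha * std_error
    return z0, z_alpha, reject, lower_bound


# ASSUMPTION: placeholder only, the real sample average is not in the source
HYPOTHETICAL_XBAR = 78.0
z0, z_alpha, reject, lower_bound = upper_tail_z_test(
    HYPOTHETICAL_XBAR, mu0=75.0, sigma=8.0, n=25
)
print(z0, z_alpha, reject, lower_bound)
```

Rejecting $H_0$ in this test is equivalent to the lower confidence bound lying above 75.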

18
Inference on the MEAN of a Normal Population – Variance Unknown
[Table: rejection region for each alternative hypothesis $H_1$]

19

20

21
Example: The mean time it takes a crew to restart an aluminum rolling mill after a failure is of interest. The crew was observed over 25 occasions, and the results were mean $\bar{x}$ = [value missing] minutes and variance $S^2 = 12.28$ minutes$^2$. If repair time is normally distributed, find a 95% confidence interval on the true but unknown mean repair time. Test the hypothesis that the true mean repair time is 30 minutes.
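With the variance unknown, the reference distribution is Student's t with n - 1 = 24 degrees of freedom. The sample mean is missing from the slide text, so this sketch computes only the parts that follow from the stated numbers: the standard error and the half-width of the 95% interval. The critical value 2.064 is the usual tabulated $t_{0.025,24}$, hard-coded here as an assumption to keep the example free of third-party libraries.

```python
from math import sqrt

# ASSUMPTION: tabulated upper 2.5% point of t with 24 degrees of freedom
T_CRIT_0025_24 = 2.064


def t_half_width(s2, n, t_crit):
    """Half-width of the two-sided t interval: t_crit * S / sqrt(n)."""
    return t_crit * sqrt(s2 / n)


def t_statistic(xbar, mu0, s2, n):
    """Test statistic t0 = (xbar - mu0) / (S / sqrt(n))."""
    return (xbar - mu0) / sqrt(s2 / n)


# Values stated in the example: S^2 = 12.28, n = 25
half_width = t_half_width(s2=12.28, n=25, t_crit=T_CRIT_0025_24)
print(f"95% C.I.: xbar +/- {half_width:.3f} minutes")
```

Once the sample mean is known, `t_statistic(xbar, 30.0, 12.28, 25)` can be compared with ±2.064 to test $\mu = 30$ at the 5% level.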

22
Example 6-1: The life of a battery used in a cardiac pacemaker is assumed to be normally distributed. A random sample of 10 batteries is subjected to an accelerated life test by running them continuously at an elevated temperature until failure, and the following lives are obtained. Construct a 90% two-sided confidence interval on mean life in the accelerated test. Test the hypothesis, with $\alpha = 0.1$, that the mean battery life is 26.5 h.

23

24
The Use of P-Values in Hypothesis Testing
1. Traditional hypothesis testing:
– Given $\alpha$, determine whether the null hypothesis is rejected
– Disadvantages: it gives no information on how close to or far from the rejection region the test statistic is, in a probability sense; a predefined $\alpha$ may not reflect different decision makers’ risk assessments
2. P-value approach:
– P-value: the smallest level of significance that would lead to rejection of the null hypothesis
– If the predefined $\alpha > P = \alpha_{\min}$, reject the null hypothesis
Underlying idea: “If $H_0$ is really true, how likely is it for the test statistic to be this large/small?”

25
Two-sided p-value

26
Use of P-Value for the Normal Distribution
$H_0: \mu = \mu_0$, standard normal statistic $Z_0 \sim N(0,1)$
– $P = 2[1-\Phi(|Z_0|)]$ for the two-sided alternative $H_1: \mu \ne \mu_0$
– $P = 1-\Phi(Z_0)$ for the one-sided alternative $H_1: \mu > \mu_0$
– $P = \Phi(Z_0)$ for the one-sided alternative $H_1: \mu < \mu_0$
[Figure: standard normal density f(x) centered at 0, showing the tail area $1-\Phi(Z_0)$ for $Z_0 > 0$ and $\Phi(Z_0)$ for $Z_0 < 0$]
A small p-value is evidence against the null hypothesis, while a large p-value means little or no evidence against the null hypothesis.
If the p-value is small, a test statistic this extreme would be unlikely if $H_0$ were true, so the data suggest that $H_0$ is NOT true.
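The three p-value formulas for a standard normal test statistic translate directly into code. This is a minimal sketch; the function name and the `alternative` labels are illustrative choices, not notation from the slides.

```python
from statistics import NormalDist


def z_p_value(z0, alternative="two-sided"):
    """P-value for a standard normal test statistic z0."""
    phi = NormalDist().cdf  # standard normal CDF, Phi
    if alternative == "two-sided":   # H1: mu != mu0
        return 2.0 * (1.0 - phi(abs(z0)))
    if alternative == "greater":     # H1: mu > mu0
        return 1.0 - phi(z0)
    if alternative == "less":        # H1: mu < mu0
        return phi(z0)
    raise ValueError(f"unknown alternative: {alternative!r}")


print(z_p_value(1.96))             # close to 0.05
print(z_p_value(1.96, "greater"))  # close to 0.025
```

The null hypothesis is rejected at level $\alpha$ whenever the returned value is below $\alpha$.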

27
Inference on the MEAN of a Normal Population – Variance Known
[Table: p-value for each alternative hypothesis $H_1$]

28
Example: The response time of a distributed computer system is an important quality characteristic. The system manager wants to know whether the mean response time to a specific type of command exceeds 75 millisec. From past experience, he knows that the standard deviation of response time is 8 millisec. The command is executed 25 times and the response time for each trial is recorded. The sample average response time is [value missing] millisec. Formulate an appropriate hypothesis and test it. Calculate the p-value for the hypothesis that the true mean response time is as low as 75 millisec.

29
Inference on the MEAN of a Normal Population – Variance Unknown
[Table: p-value for each alternative hypothesis $H_1$, expressed through the CDF of the test statistic]

30
Example: The mean time it takes a crew to restart an aluminum rolling mill after a failure is of interest. The crew was observed over 25 occasions, and the results were mean $\bar{x}$ = [value missing] minutes and variance $S^2 = 12.28$ minutes$^2$. If repair time is normally distributed, find a 95% confidence interval on the true but unknown mean repair time. Test H0: μ = 30 vs. H1: μ ≠ 30. Calculate the p-value of the hypothesis that μ =

31
Example: The life of a battery used in a cardiac pacemaker is assumed to be normally distributed. A random sample of 10 batteries is subjected to an accelerated life test by running them continuously at an elevated temperature until failure, and the following lives are obtained. Construct a 90% two-sided confidence interval on mean life in the accelerated test. Test H0: μ = 26.5 vs. H1: μ ≠ Calculate the p-value of the hypothesis that μ =

32
Confidence Interval vs. Hypothesis Testing
If the value of the parameter specified by the null hypothesis is contained in the $100(1-\alpha)\%$ interval, then the null hypothesis cannot be rejected at the $\alpha$ level.
If the value specified by the null hypothesis is not in the interval, then the null hypothesis will be rejected at the $\alpha$ level.
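The equivalence can be illustrated for the two-sided z case. The sketch below reuses the numbers of the container-strength example ($\bar{x} = 246$, $\sigma = 15$, n = 20); the candidate values of $\mu_0$ are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def ci_contains(mu0, xbar, sigma, n, alpha):
    """True if mu0 lies inside the two-sided 100(1-alpha)% z interval."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width <= mu0 <= xbar + half_width


def z_test_rejects(mu0, xbar, sigma, n, alpha):
    """True if the two-sided z-test rejects H0: mu = mu0 at level alpha."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    return abs(z0) > z


# Hypothetical null values on both sides of the interval
candidates = [235.0, 240.0, 246.0, 252.0, 255.0]
agreement = [
    ci_contains(m, 246.0, 15.0, 20, 0.05) != z_test_rejects(m, 246.0, 15.0, 20, 0.05)
    for m in candidates
]
print(agreement)  # each entry is True when the two criteria agree
```

For every candidate, being inside the interval coincides with failing to reject, which is the relationship stated on the slide.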

33
Understanding the Result of a Hypothesis Test
When we reject the null hypothesis, it is a strong conclusion: there is strong evidence that the null hypothesis is false.
When we fail to reject the null hypothesis, it is a weak conclusion: it does not mean that the null hypothesis is correct. It only means we do not have strong evidence to reject it.

34
Court System and Hypothesis Testing
Hypothesis testing in science is a lot like the criminal court system in the United States. How do we decide guilt?
– Assume innocence until “proven” guilty.
– Evidence is presented at a trial.
– Proof has to be “beyond a reasonable doubt.”
A jury's possible decisions: guilty or not guilty.
Note that a jury cannot declare somebody “innocent,” just “not guilty.” This is an important point.

35
Interrelationships between statistical inferences
[Diagram linking: statistical distribution, confidence interval, hypothesis testing, p-value]

36
Interrelationships between statistical inferences
$P = 2[1-\Phi(|Z_0|)]$ for the two-sided alternative $H_1: \mu \ne \mu_0$

37
Inference on the Difference in Means of Two Populations – Variance Known
Assumption: observations in the TWO samples are all i.i.d.
[Table: p-value for each alternative hypothesis $H_1$]

38

39
Inference on the Difference in Means of Two Populations – Variance Unknown
Assumptions: i.i.d. observations, equal variance
[Table: p-value for each alternative hypothesis $H_1$]

40
Inference on the Difference in Means of Two Populations – Variance Unknown
[Table: rejection region for each alternative hypothesis $H_1$]

41

42

43
Text Book Page P

44
Review
– P-value: a probability value
– With the same α value, the C.I., hypothesis testing, and the p-value give the same inferential conclusion.
– Inferences on the means of two populations: What statistics are used? What are the reference distributions? How is the rejection region defined?

45
Inference on the Variance of a Normal Distribution
[Table: rejection region for each alternative hypothesis $H_1$, and the corresponding C.I.]
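The formulas on this slide were not preserved. As a hedged reminder of the standard results for a normal population, the test statistic for $H_0: \sigma^2 = \sigma_0^2$ and the two-sided interval are given below; here $\chi^2_{\alpha/2,\,n-1}$ denotes the upper $\alpha/2$ percentage point of the chi-square distribution with $n-1$ degrees of freedom (this notation is an assumption about the slide's convention).

```latex
\chi_0^2 = \frac{(n-1)S^2}{\sigma_0^2},
\qquad
\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}}
\;\le\; \sigma^2 \;\le\;
\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}
```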

46

47
Text P

48
Inference on the Variances of Two Normal Distributions
[Table: rejection regions and C.I. Note: the two d.f. are exchanged.]

49
[Table: data by technician]

50

51
Text P

52
Testing on Binomial Parameters
To test whether the parameter p of a binomial distribution equals a standard value $p_0$.
The test is based on the normal approximation to the binomial distribution, or equivalently on the central limit theorem.
$H_0$ is rejected if [rejection criterion missing]
See Examples 4.5 and 4.6 on p. 122.
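The rejection rule on this slide was not preserved. A common form of the normal-approximation test is sketched below, assuming a two-sided alternative and no continuity correction (textbook versions often add one). The counts used in the call are hypothetical and serve only to show the mechanics.

```python
from math import sqrt
from statistics import NormalDist


def binomial_z_test(x, n, p0, alpha=0.05):
    """Two-sided test of H0: p = p0 via the normal approximation.

    ASSUMPTION: plain approximation without continuity correction.
    """
    z0 = (x - n * p0) / sqrt(n * p0 * (1.0 - p0))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z0, abs(z0) > z_crit


# Hypothetical data: 30 nonconforming items out of 200, standard p0 = 0.10
z0, reject = binomial_z_test(x=30, n=200, p0=0.10)
print(z0, reject)
```

The approximation is usually considered reasonable only when both $np_0$ and $n(1-p_0)$ are fairly large.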

53
Test on Poisson Distribution
A random sample of n observations is taken, say $x_1, x_2, \ldots, x_n$. Each $x_i$ is Poisson distributed with parameter $\lambda$. Then the sum $x = x_1 + x_2 + \cdots + x_n$ is Poisson distributed with parameter $n\lambda$.
If n is large, $\bar{x} = x/n$ is approximately normal with mean $\lambda$ and variance $\lambda/n$.
Test hypothesis $H_0: \lambda = \lambda_0$, $H_1: \lambda \ne \lambda_0$.
The null hypothesis would be rejected if $|Z_0| > Z_{\alpha/2}$, where $Z_0 = \dfrac{\bar{x} - \lambda_0}{\sqrt{\lambda_0/n}}$.
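The large-sample Poisson test can be sketched as follows. Under $H_0$ the sample mean is approximately normal with mean $\lambda_0$ and variance $\lambda_0/n$, so the statistic standardizes $\bar{x}$ accordingly. The counts in the call are hypothetical and only illustrate the calculation.

```python
from math import sqrt
from statistics import NormalDist


def poisson_z_test(total_count, n, lambda0, alpha=0.05):
    """Two-sided large-sample test of H0: lambda = lambda0."""
    xbar = total_count / n
    z0 = (xbar - lambda0) / sqrt(lambda0 / n)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z0, abs(z0) > z_crit


# Hypothetical data: 240 events in n = 50 observations, lambda0 = 4
z0, reject = poisson_z_test(total_count=240, n=50, lambda0=4.0)
print(z0, reject)
```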

54
Two Types of Hypothesis Test Errors
Type I error (producer’s risk):
– $\alpha$ = P{type I error} = P{reject $H_0$ | $H_0$ is true} = P{conclude bad | actually good}
Type II error (consumer’s risk):
– $\beta$ = P{type II error} = P{fail to reject $H_0$ | $H_0$ is false} = P{conclude good | actually bad}
Power of the test:
– Power = $1-\beta$ = P{reject $H_0$ | $H_0$ is false}
[Figure: densities f(x) centered at $\mu_0$ and $\mu_1$ with LCL and UCL and tail area $\alpha/2$; $H_0: \mu = \mu_0$, $H_1: \mu = \mu_1 \ne \mu_0$, with known $\sigma^2$]

55
Summary of Type I and Type II Errors

56
Properties of Type I & Type II Errors
– Both types of errors can be reduced by increasing the sample size, at the price of increased inspection costs.
– For a given sample size, one risk can only be reduced at the expense of increasing the other risk.

57
The Probability of Type II Error: Detection of a Mean Shift with Known $\sigma$
Type II error = $\beta$ = Pr{fail to reject $H_0$ | $H_1$ is true} = Pr{within the control limits | mean shift}
$H_0: \mu = \mu_0$, $H_1: \mu = \mu_1 \ne \mu_0$, with known $\sigma^2$
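The expression for $\beta$ was not preserved on this slide. For the two-sided z-test with known $\sigma$, the standard result is sketched below, writing $\delta = \mu_1 - \mu_0$ for the mean shift (the symbol $\delta$ is an assumption about the slide's notation). Under $H_1$ the statistic $Z_0$ is normal with mean $\delta\sqrt{n}/\sigma$ and unit variance, and $\beta$ is the probability that it still falls between $-Z_{\alpha/2}$ and $Z_{\alpha/2}$.

```latex
\beta
= \Phi\!\left(Z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)
- \Phi\!\left(-Z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)
```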

58
OC curve with $\alpha = 0.05$
– The larger the mean shift, the smaller the type II error
– The larger the sample size, the smaller the type II error

59
OC Curves
For the OC curve, see Fig. 4.7, p. 126.
– The larger the mean shift, the smaller the type II error
– The larger the sample size, the smaller the type II error
$\alpha$ =

60
OC Curves
For the OC curve, see Fig. 4.7, p. 126.
– The larger the mean shift, the smaller the type II error
– The larger the sample size, the smaller the type II error
$\alpha$ =

61
Example: Suppose we wish to test the hypotheses $H_0: \mu = 15$, $H_1: \mu \ne 15$, where we know that $\sigma^2 = 9.0$. If the true mean is really 20, what sample size must be used to ensure that the probability of type II error is no greater than 0.10? Assume that $\alpha$ =

62
Use the OC curve: n = 4
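The OC-curve reading can be cross-checked numerically by searching for the smallest n whose type II error is at most 0.10. The value of $\alpha$ is cut off in the example text, so $\alpha = 0.05$ below is an assumption; it is the value used for the OC curve on an earlier slide. The $\beta$ expression is the standard one for a two-sided z-test with known $\sigma$.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_sided(delta, sigma, n, alpha):
    """Type II error of the two-sided z-test for a mean shift delta."""
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = delta * sqrt(n) / sigma
    return nd.cdf(z - shift) - nd.cdf(-z - shift)


def smallest_n(delta, sigma, alpha, beta_max, n_max=1000):
    """Smallest sample size with beta <= beta_max (linear search)."""
    for n in range(1, n_max + 1):
        if beta_two_sided(delta, sigma, n, alpha) <= beta_max:
            return n
    raise ValueError("no sample size up to n_max meets the requirement")


# From the example: mu0 = 15, true mean 20, sigma^2 = 9.0, beta <= 0.10
# ASSUMPTION: alpha = 0.05 (truncated in the source)
n_needed = smallest_n(delta=20.0 - 15.0, sigma=3.0, alpha=0.05, beta_max=0.10)
print(n_needed)
```

Under this assumption the search agrees with the n = 4 read from the OC curve.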

63
Example 7-4: The mean contents of coffee cans filled on a particular production line are being studied. Standards specify that the mean contents must be 16.0 oz, and from past experience it is known that the standard deviation of the can contents is 0.1 oz. The hypotheses are $H_0: \mu = 16.0$, $H_1: \mu \ne 16.0$. A random sample of nine cans is to be used, and the type I error probability is specified as $\alpha = 0.05$. What is the type II error if the true mean contents are $\mu_1 = 16.1$ oz?
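The type II error for this example can be computed directly. This is a minimal sketch, assuming the standard expression for a two-sided z-test with known $\sigma$, where the shift is $\delta = \mu_1 - \mu_0$; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, alpha):
    """beta = P{fail to reject H0 | true mean is mu1}, two-sided z-test."""
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)       # Z_{alpha/2}
    shift = (mu1 - mu0) * sqrt(n) / sigma   # delta * sqrt(n) / sigma
    return nd.cdf(z - shift) - nd.cdf(-z - shift)


# Values stated in the example: mu0 = 16.0, mu1 = 16.1, sigma = 0.1, n = 9
beta = type_ii_error(mu0=16.0, mu1=16.1, sigma=0.1, n=9, alpha=0.05)
print(f"beta = {beta:.4f}, power = {1.0 - beta:.4f}")
```

With these inputs $\beta$ comes out near 0.149, so the test detects the 0.1 oz shift about 85% of the time.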
