 ## Presentation on theme: "Inferences About Process Quality"— Presentation transcript:

Chapter 3 Inferences About Process Quality Introduction to Statistical Quality Control, 4th Edition

3-1. Statistics and Sampling Distributions
Statistical methods are used to make decisions about a process: Is the process out of control? Is the process average you were given the true value? What is the true process variability?

3-1. Statistics and Sampling Distributions
Statistics are quantities calculated from a random sample taken from a population of interest. The probability distribution of a statistic is called a sampling distribution.

3-1.1 Sampling from a Normal Distribution
Let $x$ represent measurements taken from a normal distribution, $x \sim N(\mu, \sigma^2)$. Select a sample of size $n$, at random, and calculate the sample mean, $\bar{x}$. Then $\bar{x} \sim N(\mu, \sigma^2/n)$.

3-1.1 Sampling from a Normal Distribution
Probability Example: The life of an automotive battery is normally distributed with mean 900 days and standard deviation 35 days. What is the probability that a random sample of 25 batteries will have an average life of more than 1000 days?
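The battery question can be worked numerically. The sketch below assumes only the figures on the slide (mean 900, standard deviation 35, n = 25) and uses the fact that the sample mean of a normal population is normal with standard deviation $\sigma/\sqrt{n}$. The helper name `normal_sf` is illustrative, not from the text.

```python
import math

def normal_sf(z: float) -> float:
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    # erfc stays accurate far out in the tail, where 1 - cdf(z) would round to 0.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

mu, sigma, n = 900.0, 35.0, 25   # values given on the slide
threshold = 1000.0               # average life asked about, in days

std_error = sigma / math.sqrt(n)     # standard deviation of the sample mean
z = (threshold - mu) / std_error     # standardized distance from the mean
p = normal_sf(z)                     # P(sample mean > threshold)

print(f"standard error = {std_error:.2f} days")
print(f"z = {z:.2f}")
print(f"P(sample mean > {threshold:.0f}) = {p:.3e}")
```

Because the threshold lies more than 14 standard errors above the mean, the probability is effectively zero.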

3-1.1 Sampling from a Normal Distribution
Chi-square ($\chi^2$) Distribution: If $x_1, x_2, \ldots, x_n$ are normally and independently distributed random variables with mean zero and variance one, then the random variable $y = x_1^2 + x_2^2 + \cdots + x_n^2$ is distributed as chi-square with $n$ degrees of freedom.

3-1.1 Sampling from a Normal Distribution
Chi-square ($\chi^2$) Distribution: Furthermore, the sampling distribution of $y = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2}$ is chi-square with $n - 1$ degrees of freedom when sampling from a normal population.

3-1.1 Sampling from a Normal Distribution
Chi-square ($\chi^2$) distribution for various degrees of freedom.

3-1.1 Sampling from a Normal Distribution
t-distribution: If $x$ is a standard normal random variable and $y$ is a chi-square random variable with $k$ degrees of freedom, independent of $x$, then $t = \frac{x}{\sqrt{y/k}}$ is distributed as $t$ with $k$ degrees of freedom.

3-1.1 Sampling from a Normal Distribution
F-distribution: If $w$ and $y$ are two independent chi-square random variables with $u$ and $v$ degrees of freedom, respectively, then $F = \frac{w/u}{y/v}$ is distributed as $F$ with $u$ numerator degrees of freedom and $v$ denominator degrees of freedom.

3-1.2 Sampling from a Bernoulli Distribution
A random variable, $x$, with probability function $p(x) = p$ for $x = 1$ and $p(x) = 1 - p = q$ for $x = 0$ is called a Bernoulli random variable. The sum of a sample of size $n$ from a Bernoulli process has a binomial distribution with parameters $n$ and $p$.

3-1.2 Sampling from a Bernoulli Distribution
Let $x_1, x_2, \ldots, x_n$ be a random sample taken from a Bernoulli process. The sample mean is a discrete random variable given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The mean and variance of $\bar{x}$ are $\mu_{\bar{x}} = p$ and $\sigma^2_{\bar{x}} = \frac{p(1-p)}{n}$.

3-1.3 Sampling from a Poisson Distribution
Consider a random sample of size $n$, $x_1, x_2, \ldots, x_n$, taken from a Poisson process with parameter $\lambda$. The sum, $x = x_1 + x_2 + \cdots + x_n$, is also Poisson with parameter $n\lambda$. The sample mean is a discrete random variable given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The mean and variance of $\bar{x}$ are $\mu_{\bar{x}} = \lambda$ and $\sigma^2_{\bar{x}} = \frac{\lambda}{n}$.

3-2. Point Estimation of Process Parameters
Parameters are values representing the population. Ex) $\mu$ and $\sigma^2$, the population mean and variance, respectively. Parameters in reality are often unknown and must be estimated. Statistics are estimates of parameters. Ex) $\bar{x}$ and $S^2$, the sample mean and sample variance, respectively.

3-2. Point Estimation of Process Parameters
Two properties of good point estimators: (1) The point estimator should be unbiased. (2) The point estimator should have minimum variance.

3-3. Statistical Inference for a Single Sample
Two categories of statistical inference: (1) Parameter Estimation. (2) Hypothesis Testing.

3-3. Statistical Inference for a Single Sample
A statistical hypothesis is a statement about the values of the parameters of a probability distribution.

3-3. Statistical Inference for a Single Sample
Steps in Hypothesis Testing: (1) Identify the parameter of interest. (2) State the null hypothesis, $H_0$, and the alternative hypothesis, $H_1$. (3) Choose a significance level. (4) State the appropriate test statistic. (5) State the rejection region. (6) Compare the value of the test statistic to the rejection region. Can the null hypothesis be rejected?

3-3. Statistical Inference for a Single Sample
Example: An automobile manufacturer claims a particular automobile can average 35 mpg (highway). Suppose we are interested in testing this claim. We will sample 25 of these particular autos and, under identical conditions, calculate the average mpg for this sample. Before actually collecting the data, we decide that if we get a sample average less than 33 mpg or more than 37 mpg, we will reject the maker's claim. (33 and 37 are the critical values.)

3-3. Statistical Inference for a Single Sample
Example (continued): $H_0: \mu = 35$, $H_1: \mu \neq 35$. From the sample of 25 cars, the average mpg was found to be [value missing]. What is your conclusion?

3-3. Statistical Inference for a Single Sample
Choice of Critical Values: How are the critical values chosen? Wouldn’t it be easier to decide “how much room for error you will allow” instead of finding the exact critical values for every problem you encounter? Or, wouldn’t it be easier to set the size of the rejection region, rather than setting the critical values for every problem?

3-3. Statistical Inference for a Single Sample
Significance Level: The level of significance, $\alpha$, determines the size of the rejection region. The level of significance is a probability. It is also known as the probability of a “Type I error” (we want this to be small). Type I error: rejecting the null hypothesis when it is true. How small? Usually want [value missing].

3-3. Statistical Inference for a Single Sample
Types of Error: Type I error: rejecting the null hypothesis when it is true. Pr(Type I error) = $\alpha$. Sometimes called the producer’s risk. Type II error: not rejecting the null hypothesis when it is false. Pr(Type II error) = $\beta$. Sometimes called the consumer’s risk.

3-3. Statistical Inference for a Single Sample
An Engine Explodes: $H_0$: An automobile engine explodes when started. $H_1$: An automobile engine does not explode when started. Which error would you take action to avoid? Whose risk is higher, the producer’s or the consumer’s?

3-3. Statistical Inference for a Single Sample
Power of a Test: The power of a test of hypothesis is given by $1 - \beta$. That is, $1 - \beta$ is the probability of correctly rejecting the null hypothesis, or the probability of rejecting the null hypothesis when the alternative is true.

3-3.1 Inference on the Mean of a Population, Variance Known
Hypothesis Testing. Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$. Test Statistic: $Z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$. Significance Level: $\alpha$. Rejection Region: $Z_0 > Z_{\alpha/2}$ or $Z_0 < -Z_{\alpha/2}$. If $Z_0$ falls into either of these two regions, reject $H_0$.

3-3.1 Inference on the Mean of a Population, Variance Known
Example 3-1. Hypotheses: $H_0: \mu = 175$, $H_1: \mu > 175$. Test Statistic: $Z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = 3.50$. Significance Level: $\alpha = 0.05$. Rejection Region: $Z_0 > Z_{0.05} = 1.645$. Since 3.50 > 1.645, reject $H_0$ and conclude that the lot mean pressure strength exceeds 175 psi.

3-3.1 Inference on the Mean of a Population, Variance Known
Confidence Intervals: A general $100(1-\alpha)\%$ two-sided confidence interval on the true population mean, $\mu$, is $L \le \mu \le U$, where $P(L \le \mu \le U) = 1 - \alpha$. The $100(1-\alpha)\%$ one-sided confidence intervals are: Upper: $\mu \le U$. Lower: $L \le \mu$.

3-3.1 Inference on the Mean of a Population, Variance Known
Confidence Interval on the Mean with Variance Known. Two-Sided: $\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$. See the text for one-sided confidence intervals.

3-3.1 Inference on the Mean of a Population, Variance Known
Example 3-2: Reconsider Example 3-1. Suppose a 95% two-sided confidence interval is specified. Using Equation (3-28) we compute $182 - 3.92 \le \mu \le 182 + 3.92$, that is, $178.08 \le \mu \le 185.92$. Our estimate of the mean bursting strength is 182 psi ± 3.92 psi with 95% confidence.
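The half-width in Example 3-2 can be checked with a few lines of Python. This is a sketch under one assumption: the transcript does not state $\sigma$ or $n$, so the standard error $\sigma/\sqrt{n} = 2$ psi used below is inferred from Example 3-1 ($Z_0 = 3.50$ with a sample mean of 182 and a hypothesized mean of 175).

```python
from statistics import NormalDist

xbar = 182.0        # sample mean in psi, from the example
std_error = 2.0     # sigma / sqrt(n); ASSUMED, inferred from Z0 = (182 - 175) / 3.50
confidence = 0.95

# Two-sided interval: put alpha/2 in each tail.
alpha = 1.0 - confidence
z_half_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)

half_width = z_half_alpha * std_error
lower, upper = xbar - half_width, xbar + half_width

print(f"Z_(alpha/2) = {z_half_alpha:.3f}")
print(f"half-width = {half_width:.2f} psi")
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```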

3-3.2 The Use of P-Values in Hypothesis Testing
If it is not enough to know whether your test statistic, $Z_0$, falls into a rejection region, a measure of just how significant the test statistic is can be computed: the P-value. P-values are probabilities associated with the test statistic, $Z_0$.

3-3.2 The Use of P-Values in Hypothesis Testing
Definition: The P-value is the smallest level of significance that would lead to rejection of the null hypothesis $H_0$.

3-3.2 The Use of P-Values in Hypothesis Testing
Example: Reconsider Example 3-1. The test statistic was calculated to be $Z_0 = 3.50$ for a right-tailed hypothesis test. The P-value for this problem is then $P = 1 - \Phi(3.50) = 0.00023$. Thus, $H_0: \mu = 175$ would be rejected at any level of significance $\alpha \ge P = 0.00023$.
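A short sketch of the right-tailed Z-test and its P-value follows. The sample mean (182 psi) and hypothesized mean (175 psi) come from the examples; the standard error of 2 psi is an assumption inferred from $Z_0 = 3.50$, since the transcript gives neither $\sigma$ nor $n$.

```python
from statistics import NormalDist

xbar, mu0 = 182.0, 175.0   # sample mean and hypothesized mean, in psi
std_error = 2.0            # sigma / sqrt(n); ASSUMED, not stated in the transcript
z_crit = 1.645             # Z_0.05, the critical value quoted in Example 3-1

z0 = (xbar - mu0) / std_error           # test statistic
p_value = 1.0 - NormalDist().cdf(z0)    # right-tailed P-value, 1 - Phi(z0)
reject = z0 > z_crit                    # rejection-region decision

print(f"Z0 = {z0:.2f}")
print(f"P-value = {p_value:.5f}")
print("reject H0" if reject else "fail to reject H0")
```

Comparing the P-value with $\alpha$ gives the same decision as comparing $Z_0$ with the critical value, and it also shows how far inside the rejection region the statistic falls.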

3-3.3 Inference on the Mean of a Population, Variance Unknown
Hypothesis Testing. Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$. Test Statistic: $t_0 = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$. Significance Level: $\alpha$. Rejection Region: Reject $H_0$ if $|t_0| > t_{\alpha/2,\,n-1}$.

3-3.3 Inference on the Mean of a Population, Variance Unknown
Confidence Interval on the Mean with Variance Unknown. Two-Sided: $\bar{x} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$. See the text for the one-sided confidence intervals.

3-3.3 Inference on the Mean of a Population, Variance Unknown
Computer Output

3-3.4 Inference on the Variance of a Normal Distribution
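Hypothesis Testing. Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \neq \sigma_0^2$. Test Statistic: $\chi_0^2 = \frac{(n-1)S^2}{\sigma_0^2}$. Significance Level: $\alpha$. Rejection Region: $\chi_0^2 > \chi^2_{\alpha/2,\,n-1}$ or $\chi_0^2 < \chi^2_{1-\alpha/2,\,n-1}$.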

3-3.4 Inference on the Variance of a Normal Distribution
Confidence Interval on the Variance. Two-Sided: $\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$. See the text for the one-sided confidence intervals.

3-3.5 Inference on a Population Proportion
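Hypothesis Testing. Hypotheses: $H_0: p = p_0$, $H_1: p \neq p_0$. Test Statistic: $Z_0 = \frac{x - np_0}{\sqrt{np_0(1-p_0)}}$, where $x$ is the number of observations in a sample of size $n$ that belong to the class of interest (normal approximation to the binomial). Significance Level: $\alpha$. Rejection Region: $|Z_0| > Z_{\alpha/2}$.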

3-3.5 Inference on a Population Proportion
Confidence Interval on the Population Proportion. Two-Sided: $\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$. See the text for the one-sided confidence intervals.

3-3.6 The Probability of Type II Error
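Calculation of P(Type II Error): Assume the test of interest is $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$, with known variance, and that the true mean is $\mu_1 = \mu_0 + \delta$. P(Type II Error) is found to be $\beta = \Phi\left(Z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right) - \Phi\left(-Z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)$. The power of the test is then $1 - \beta$.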
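The Type II error probability for the two-sided test on a mean with known variance can be sketched as follows. The code assumes the standard expression $\beta = \Phi(Z_{\alpha/2} - \delta\sqrt{n}/\sigma) - \Phi(-Z_{\alpha/2} - \delta\sqrt{n}/\sigma)$, where $\delta$ is the shift in the mean to be detected. The function name and the numbers in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist

def type_ii_error(delta: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Beta for a two-sided Z-test when the true mean is mu0 + delta."""
    std_normal = NormalDist()
    z_half_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = delta * math.sqrt(n) / sigma   # shift measured in standard errors
    return std_normal.cdf(z_half_alpha - shift) - std_normal.cdf(-z_half_alpha - shift)

# Hypothetical illustration: detect a shift of one standard deviation.
for n in (4, 9, 16):
    beta = type_ii_error(delta=1.0, sigma=1.0, n=n)
    print(f"n = {n:2d}: beta = {beta:.4f}, power = {1.0 - beta:.4f}")
```

Evaluating the function over a grid of $\delta$ values for several $n$ traces out the operating characteristic curves.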

3-3.6 The Probability of Type II Error
Operating Characteristic (OC) Curves: An operating characteristic (OC) curve is a graph representing the relationship between $\alpha$, $\beta$, $\delta$ and $n$. OC curves are useful in determining how large a sample is required to detect a specified difference with a particular probability.

3-3.6 The Probability of Type II Error
Operating Characteristic (OC) Curves

3-3.7 Probability Plotting
Probability plotting is a graphical method for determining whether sample data conform to a hypothesized distribution based on a subjective visual examination of the data. Probability plotting uses special graph paper known as probability paper. Probability paper is available for the normal, lognormal, and Weibull distributions, among others. Probability plots can also be produced by computer.

3-3.7 Probability Plotting
Example 3-8

3-4. Statistical Inference for Two Samples
The previous section presented hypothesis testing and confidence intervals for a single population parameter. These results are extended to the case of two independent populations: statistical inference on the difference in population means, $\mu_1 - \mu_2$.

3-4.1 Inference For a Difference in Means, Variances Known
Assumptions: (1) $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample from population 1. (2) $X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample from population 2. (3) The two populations represented by $X_1$ and $X_2$ are independent. (4) Both populations are normal, or if they are not normal, the conditions of the central limit theorem apply.

3-4.1 Inference For a Difference in Means, Variances Known
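Point estimator for $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2$, where $E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$ and $V(\bar{x}_1 - \bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$.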

3-4.1 Inference For a Difference in Means, Variances Known
Hypothesis Tests for a Difference in Means, Variances Known. Null Hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$. Test Statistic: $Z_0 = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$.

3-4.1 Inference For a Difference in Means, Variances Known
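Hypothesis Tests for a Difference in Means, Variances Known. Alternative Hypotheses and Rejection Criteria: $H_1: \mu_1 - \mu_2 \neq \Delta_0$: reject if $Z_0 > Z_{\alpha/2}$ or $Z_0 < -Z_{\alpha/2}$. $H_1: \mu_1 - \mu_2 > \Delta_0$: reject if $Z_0 > Z_{\alpha}$. $H_1: \mu_1 - \mu_2 < \Delta_0$: reject if $Z_0 < -Z_{\alpha}$.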

3-4.1 Inference For a Difference in Means, Variances Known
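Confidence Interval on a Difference in Means, Variances Known: A $100(1-\alpha)\%$ confidence interval on the difference in means is given by $\bar{x}_1 - \bar{x}_2 - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.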

3-4.2 Inference For a Difference in Means, Variances Unknown
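Hypothesis Tests for a Difference in Means, Case I: $\sigma_1^2 = \sigma_2^2 = \sigma^2$. Point estimator for $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2$, where $V(\bar{x}_1 - \bar{x}_2) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2} = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$.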

3-4.2 Inference For a Difference in Means, Variances Unknown
Hypothesis Tests for a Difference in Means, Case I: The pooled estimate of $\sigma^2$, denoted by $S_p^2$, is defined by $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$.

3-4.2 Inference For a Difference in Means, Variances Unknown
Hypothesis Tests for a Difference in Means, Case I: Null Hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$. Test Statistic: $t_0 = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$.
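A minimal sketch of the Case I (equal variances) calculation: pool the two sample variances, then form the t statistic with $n_1 + n_2 - 2$ degrees of freedom. The function name and the sample data are hypothetical; comparing $t_0$ with a critical value still requires a t table or a statistics library.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(sample1, sample2, delta0=0.0):
    """Return (t0, degrees of freedom, pooled variance) for H0: mu1 - mu2 = delta0."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq, s2_sq = variance(sample1), variance(sample2)   # sample variances (n - 1 divisor)
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df    # pooled variance estimate
    std_error = math.sqrt(sp_sq * (1.0 / n1 + 1.0 / n2))
    t0 = (mean(sample1) - mean(sample2) - delta0) / std_error
    return t0, df, sp_sq

# Hypothetical measurements from two processes.
process_a = [91.5, 94.2, 92.2, 95.4, 91.8, 89.1, 94.7, 89.2]
process_b = [89.2, 91.0, 90.5, 93.2, 97.2, 97.0, 91.1, 92.8]
t0, df, sp_sq = pooled_t_statistic(process_a, process_b)
print(f"t0 = {t0:.3f} with {df} degrees of freedom (pooled variance {sp_sq:.3f})")
```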

3-4.2 Inference For a Difference in Means, Variances Unknown
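Hypothesis Tests for a Difference in Means, Variances Unknown (Case I). Alternative Hypotheses and Rejection Criteria: $H_1: \mu_1 - \mu_2 \neq \Delta_0$: reject if $|t_0| > t_{\alpha/2,\,n_1+n_2-2}$. $H_1: \mu_1 - \mu_2 > \Delta_0$: reject if $t_0 > t_{\alpha,\,n_1+n_2-2}$. $H_1: \mu_1 - \mu_2 < \Delta_0$: reject if $t_0 < -t_{\alpha,\,n_1+n_2-2}$.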

3-4.2 Inference For a Difference in Means, Variances Unknown
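Hypothesis Tests for a Difference in Means, Case II: $\sigma_1^2 \neq \sigma_2^2$. Null Hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$. Test Statistic: $t_0^* = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$.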

3-4.2 Inference For a Difference in Means, Variances Unknown
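Hypothesis Tests for a Difference in Means, Case II: The degrees of freedom for $t_0^*$ are given (in the Welch-Satterthwaite form) by $\nu = \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$.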

3-4.2 Inference For a Difference in Means, Variances Unknown
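Confidence Intervals on a Difference in Means, Case I: A $100(1-\alpha)\%$ confidence interval on the difference in means is given by $\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.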

3-4.2 Inference For a Difference in Means, Variances Unknown
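Confidence Intervals on a Difference in Means, Case II: A $100(1-\alpha)\%$ confidence interval on the difference in means is given by $\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,\nu}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,\nu}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$.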

3-4.2 Paired Data
Observations in an experiment are often paired to prevent extraneous factors from inflating the estimate of the variance. A difference is obtained on each pair of observations, $d_j = x_{1j} - x_{2j}$, where $j = 1, 2, \ldots, n$. Test the hypothesis that the mean of the differences, $\mu_d$, is zero.

3-4.2 Paired Data
The differences, $d_j$, represent the “new” set of data with the summary statistics: $\bar{d} = \frac{1}{n}\sum_{j=1}^{n} d_j$ and $S_d^2 = \frac{\sum_{j=1}^{n}(d_j - \bar{d})^2}{n - 1}$.

3-4.2 Paired Data
Hypothesis Testing. Hypotheses: $H_0: \mu_d = 0$, $H_1: \mu_d \neq 0$. Test Statistic: $t_0 = \frac{\bar{d}}{S_d/\sqrt{n}}$. Significance Level: $\alpha$. Rejection Region: $|t_0| > t_{\alpha/2,\,n-1}$.
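The paired test reduces to a one-sample t-test on the differences, which the sketch below illustrates. It assumes the usual statistic $t_0 = \bar{d}/(S_d/\sqrt{n})$ with $n - 1$ degrees of freedom. The function name and the readings are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(x1, x2):
    """Return (t0, degrees of freedom) for H0: mean difference = 0."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x1, x2)]   # d_j = x_1j - x_2j
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)                        # sample standard deviation of the differences
    t0 = d_bar / (s_d / math.sqrt(n))
    return t0, n - 1

# Hypothetical readings of the same eight specimens on two instruments.
instrument_1 = [7.0, 3.0, 3.0, 4.0, 8.0, 3.0, 2.0, 9.0]
instrument_2 = [6.0, 3.0, 5.0, 3.0, 8.0, 2.0, 4.0, 9.0]
t0, df = paired_t_statistic(instrument_1, instrument_2)
print(f"t0 = {t0:.3f} with {df} degrees of freedom")
```

Pairing removes specimen-to-specimen variation from the error estimate, which is why only the differences enter the calculation.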

3-4.3 Inferences on the Variances of Two Normal Distributions
Hypothesis Testing: Consider testing the hypothesis that the variances of two independent normal distributions are equal. Assume random samples of sizes $n_1$ and $n_2$ are taken from populations 1 and 2, respectively.

3-4.3 Inferences on the Variances of Two Normal Distributions
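Hypothesis Testing. Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$. Test Statistic: $F_0 = \frac{S_1^2}{S_2^2}$. Significance Level: $\alpha$. Rejection Region: $F_0 > F_{\alpha/2,\,n_1-1,\,n_2-1}$ or $F_0 < F_{1-\alpha/2,\,n_1-1,\,n_2-1}$.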

3-4.3 Inferences on the Variances of Two Normal Distributions
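One-sided alternatives (Alternative Hypothesis, Test Statistic, Rejection Region): $H_1: \sigma_1^2 < \sigma_2^2$: $F_0 = \frac{S_2^2}{S_1^2}$, reject if $F_0 > F_{\alpha,\,n_2-1,\,n_1-1}$. $H_1: \sigma_1^2 > \sigma_2^2$: $F_0 = \frac{S_1^2}{S_2^2}$, reject if $F_0 > F_{\alpha,\,n_1-1,\,n_2-1}$.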

3-4.3 Inferences on the Variances of Two Normal Distributions
Confidence Intervals on the Ratio of the Variances of Two Normal Distributions: A $100(1-\alpha)\%$ two-sided confidence interval on the ratio of variances is given by $\frac{S_1^2}{S_2^2}F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}F_{\alpha/2,\,n_2-1,\,n_1-1}$.

3-4.4 Inference on Two Population Proportions
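Large-Sample Hypothesis Testing. Hypotheses: $H_0: p_1 = p_2$, $H_1: p_1 \neq p_2$. Test Statistic: $Z_0 = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$ is the pooled estimate of the common proportion. Significance Level: $\alpha$. Rejection Region: $|Z_0| > Z_{\alpha/2}$.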
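The large-sample test for two proportions can be sketched as below. It assumes the usual pooled form of the statistic, in which the common proportion under $H_0$ is estimated by $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$. The function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Return (Z0, two-sided P-value) for H0: p1 = p2 using the pooled estimate."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)      # estimate of the common proportion under H0
    std_error = math.sqrt(p_pooled * (1.0 - p_pooled) * (1.0 / n1 + 1.0 / n2))
    z0 = (p1_hat - p2_hat) / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z0)))
    return z0, p_value

# Hypothetical counts of nonconforming units from two production lines.
z0, p_value = two_proportion_z(x1=24, n1=300, x2=12, n2=300)
print(f"Z0 = {z0:.3f}, two-sided P-value = {p_value:.4f}")
```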

3-4.4 Inference on Two Population Proportions
One-sided alternatives (Alternative Hypothesis, Rejection Region): $H_1: p_1 > p_2$: reject if $Z_0 > Z_{\alpha}$. $H_1: p_1 < p_2$: reject if $Z_0 < -Z_{\alpha}$.

3-4.4 Inference on Two Population Proportions
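Confidence Interval on the Difference in Two Population Proportions. Two-Sided: $\hat{p}_1 - \hat{p}_2 - Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$. See the text for the one-sided confidence intervals.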

3-5. What If We Have More Than Two Populations?
Example: Investigating the effect of one factor (with several levels) on some response. See Table 3-5 (hardwood concentration levels, starting at 5%, with the observations, totals, and averages for each level and overall).

3-5. What If We Have More Than Two Populations?
Analysis of Variance: It is always good practice to compare the levels of the factor using graphical methods such as boxplots. Comparative boxplots show the variability of the observations within a factor level and the variability between factor levels.

3-5. What If We Have More Than Two Populations?
Figure 3-14 (a)

3-5. What If We Have More Than Two Populations?
The observations $y_{ij}$ can be modeled by $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, $i = 1, 2, \ldots, a$, $j = 1, 2, \ldots, n$, where $a$ = number of factor levels and $n$ = number of replicates (number of observations per treatment (factor) level).

3-5. What If We Have More Than Two Populations?
The hypotheses being tested are $H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$ and $H_1: \tau_i \neq 0$ for at least one $i$. Total variability can be measured by the “total corrected sum of squares”: $SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{..})^2$.

3-5. What If We Have More Than Two Populations?
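The sum of squares identity is $\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{..})^2 = n\sum_{i=1}^{a}(\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2$. Notationally, this is often written as $SS_T = SS_{Treatments} + SS_E$.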

3-5. What If We Have More Than Two Populations?
The expected value of the treatment sum of squares is $E(SS_{Treatments}) = (a-1)\sigma^2 + n\sum_{i=1}^{a}\tau_i^2$. If the null hypothesis is true, then $E\left(\frac{SS_{Treatments}}{a-1}\right) = E(MS_{Treatments}) = \sigma^2$.

3-5. What If We Have More Than Two Populations?
The error mean square is $MS_E = \frac{SS_E}{a(n-1)}$. If the null hypothesis is true, the ratio $F_0 = \frac{MS_{Treatments}}{MS_E}$ has an F-distribution with $a - 1$ and $a(n - 1)$ degrees of freedom.

3-5. What If We Have More Than Two Populations?
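The following formulas can be used to calculate the sums of squares, with $N = an$ the total number of observations. Total Sum of Squares ($SS_T$): $SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{N}$. Sum of Squares for the Treatments ($SS_{Treatments}$): $SS_{Treatments} = \sum_{i=1}^{a}\frac{y_{i.}^2}{n} - \frac{y_{..}^2}{N}$. Sum of Squares for Error ($SS_E$): $SS_E = SS_T - SS_{Treatments}$.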
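The computing formulas for a balanced one-way layout can be sketched as follows. The code assumes the standard shortcut forms $SS_T = \sum\sum y_{ij}^2 - y_{..}^2/N$ and $SS_{Treatments} = \sum y_{i.}^2/n - y_{..}^2/N$ with $N = an$. The function name and the data are hypothetical and are not the values of Table 3-5.

```python
def one_way_anova(groups):
    """Sums of squares, mean squares and F0 for a balanced one-way layout."""
    a = len(groups)                    # number of factor levels
    n = len(groups[0])                 # replicates per level
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal replicates per level")
    total_obs = a * n                  # N
    grand_total = sum(sum(g) for g in groups)      # y..
    correction = grand_total ** 2 / total_obs      # y..^2 / N
    ss_total = sum(y * y for g in groups for y in g) - correction
    ss_treatments = sum(sum(g) ** 2 for g in groups) / n - correction
    ss_error = ss_total - ss_treatments
    ms_treatments = ss_treatments / (a - 1)
    ms_error = ss_error / (a * (n - 1))
    return {
        "SS_T": ss_total,
        "SS_Treatments": ss_treatments,
        "SS_E": ss_error,
        "MS_Treatments": ms_treatments,
        "MS_E": ms_error,
        "F0": ms_treatments / ms_error,
        "df": (a - 1, a * (n - 1)),
    }

# Hypothetical responses at three levels of a factor, four replicates each.
data = [[8.0, 9.0, 7.5, 8.5], [10.0, 11.5, 10.5, 12.0], [14.0, 13.0, 15.5, 14.5]]
for name, value in one_way_anova(data).items():
    print(name, value)
```

$F_0$ is then compared with the upper percentage point of the F-distribution with the degrees of freedom returned under `df`.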

3-5. What If We Have More Than Two Populations?
Analysis of Variance Table 3-7

3-5. What If We Have More Than Two Populations?
Analysis of Variance Table 3-8

3-5. What If We Have More Than Two Populations?
Residual Analysis: Assumptions: model errors are normally and independently distributed with equal variance. Check the assumptions by looking at residual plots.

3-5. What If We Have More Than Two Populations?
Residual Analysis: Plot of residuals versus factor levels

3-5. What If We Have More Than Two Populations?
Residual Analysis: Normal probability plot of residuals