MONITORING HIGH-YIELD PROCESSES
Cesar Acosta-Mejia
June 2011



EDUCATION
– B.S. Catholic University of Peru
– M.A. Monterrey Tech, Mexico
– Ph.D. Texas A&M University

RESEARCH
– Quality Engineering: SPC, process monitoring
– Applied Probability and Statistics: sequential analysis
– Probability modeling: change point detection, process surveillance

MOTIVATION
– High-yield processes
– Monitor the fraction of nonconforming units $p$
– Very small $p$ (ppm)
– To detect increases or decreases in $p$
– A very sensitive procedure

ASSUMPTIONS
– Process is observed continuously
– Process can be characterized by Bernoulli trials
– Fraction of nonconforming units $p$ is constant, but may change at an unknown point in time

Hypothesis Testing
For level-$\alpha$ two-sided tests, the rejection region $R$ is made up of two subregions $R_1$ and $R_2$ with limits $L$ and $U$ such that
$P[X \le L] = \alpha/2$
$P[X \ge U] = \alpha/2$

Hypothesis Testing
Consider testing the proportion $p$

Hypothesis Testing
The test may be based on different random variables:
– Binomial (n, p)
– Geometric (p)
– Negative Binomial (r, p)
– Binomial of order k (n, p)
– Geometric of order k (p)
– Negative Binomial of order k (r, p)

Binomial tests when $p$ is very small

Test 1
– Proportion $p_0 = 0.025$ (25000 ppm)
– Test $H_0: p = p_0$ against $H_1: p \ne p_0$
– $X$: number of nonconforming units in 500 items

Test 1
Let $X \sim \text{Binomial}(500, p)$. To test the hypothesis $H_0: p = p_0$ against $H_1: p \ne p_0$, the rejection region is $R = \{x \le 2\} \cup \{x \ge 25\}$, since
$P[X \le 2] < \alpha/2$
$P[X \ge 25] < \alpha/2$
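The limits of Test 1 can be checked numerically. The sketch below is only an illustration: the helper names `binom_pmf` and `two_sided_region` are hypothetical, and the level $\alpha = 0.0027$ is an assumption inferred from the value $E[T] = 1/\alpha = 370.4$ quoted later in the slides.

```python
from math import exp, lgamma, log

def binom_pmf(k, n, p):
    """P[X = k] for X ~ Binomial(n, p), evaluated in log space for stability."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_comb + k * log(p) + (n - k) * log(1 - p))

def two_sided_region(n, p0, alpha):
    """Return (L, U): the largest L with P[X <= L] <= alpha/2 and the
    smallest U with P[X >= U] <= alpha/2. Either limit is None if no
    value of X satisfies its condition."""
    pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]
    lower, cum = None, 0.0
    for k in range(n + 1):            # accumulate the lower tail
        cum += pmf[k]
        if cum > alpha / 2:
            break
        lower = k
    upper, tail = None, 0.0
    for k in range(n, -1, -1):        # accumulate the upper tail
        tail += pmf[k]
        if tail > alpha / 2:
            break
        upper = k
    return lower, upper

ALPHA = 0.0027  # assumed level, not stated on this slide
region = two_sided_region(500, 0.025, ALPHA)
print(region)
```

Under these assumptions the limits agree with the rejection region $\{x \le 2\} \cup \{x \ge 25\}$ given on the slide.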

Test 1
[Figure: plot of $P[\text{rejecting } H_0]$ vs. $p$]

Hypothesis Testing
Now consider testing $p_0 = 0.0001$ (100 ppm)

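Test 1
Let $X \sim \text{Binomial}(n = 500, p)$. To test the hypothesis $H_0: p = p_0$ against $H_1: p \ne p_0$, the rejection region is $R = \{X \ge 2\}$, based on the upper-tail probability $P[X \ge 2]$. For $n = 500$ there is no two-sided test for $p_0 = 0.0001$: $P[X = 0]$ already exceeds $\alpha/2$, so no lower limit exists.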
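A short calculation shows why the binomial test loses its lower limit at 100 ppm. This is a sketch under the assumption $\alpha = 0.0027$ (inferred from $1/\alpha = 370.4$ later in the slides); the variable names are illustrative only.

```python
n, p0 = 500, 0.0001      # 500 items, p0 = 100 ppm
alpha = 0.0027           # assumed level, not stated on this slide
q = 1 - p0

p_zero = q ** n                      # P[X = 0]
p_one = n * p0 * q ** (n - 1)        # P[X = 1]
upper_tail_2 = 1 - p_zero - p_one    # P[X >= 2]

print(f"P[X = 0]  = {p_zero:.4f}")
print(f"P[X >= 1] = {1 - p_zero:.4f}")
print(f"P[X >= 2] = {upper_tail_2:.6f}")
```

Since almost all of the probability mass sits at $X = 0$, a sample of 500 items can only signal an increase in $p$, never a decrease.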

Test 1
[Figure: probability functions of Binomial (n = 500, p = 0.025) and Binomial (n = 500, p = 0.0001)]

Test 1
[Figure: plot of $P[\text{rejecting } H_0]$ vs. $p$ for this test]

Consider a geometric test for $p$ when $p$ is very small

Test 2
Let $X \sim \text{Geo}(p)$. To test at level $\alpha$ the hypothesis $H_0: p = p_0$ against $H_1: p \ne p_0$, the rejection region is $R = \{X \le 13\} \cup \{X \ge 66075\}$, since $P[X \le 13]$ and $P[X \ge 66075]$ are each at most $\alpha/2$. An observation in $\{X \le 13\}$ leads to the conclusion that $p > p_0$.
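The two limits follow directly from the geometric distribution function $P[X \le x] = 1 - (1 - p_0)^x$. The sketch below assumes $p_0 = 0.0001$ (100 ppm) and $\alpha = 0.0027$ (inferred from $1/\alpha = 370.4$ later in the slides); `geometric_limits` is a hypothetical helper name.

```python
from math import ceil, floor, log

def geometric_limits(p0, alpha):
    """Limits for X ~ Geo(p0) on {1, 2, ...}.

    lower: largest x with P[X <= x] = 1 - q**x <= alpha/2
    upper: smallest x with P[X >= x] = q**(x - 1) <= alpha/2
    """
    log_q = log(1 - p0)
    lower = floor(log(1 - alpha / 2) / log_q)
    upper = ceil(log(alpha / 2) / log_q) + 1
    return lower, upper

print(geometric_limits(0.0001, 0.0027))
```

With these inputs the function reproduces the limits 13 and 66075 shown on the slide.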

Test 2
[Figure: plot of $P[\text{rejecting } H_0]$ vs. $p$ for this test]

Another performance measure of a sequential testing procedure

Hypothesis Testing
Let $X_1, X_2, \ldots \sim \text{Geo}(p)$ iid, and let $T$ be the number of observations until $H_0$ is rejected. For $j = 1, 2, \ldots$ consider the random variables
$A_j = 1$ if $X_j \in R$, and $A_j = 0$ otherwise, with $P[A_j = 1] = P_R$.
Then the probability function of $T$ is
$P[T = t] = P[A_1 = 0] \, P[A_2 = 0] \cdots P[A_{t-1} = 0] \, P[A_t = 1] = P_R \, [1 - P_R]^{t-1}$

Hypothesis Testing
Therefore $T \sim \text{Geo}(P_R)$. Let us consider $E[T] = 1/P_R$, the mean number of tests until $H_0$ is rejected, as a performance measure. When $p = p_0$, $E[T] = 1/\alpha$.
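For completeness, the mean of $T$ can be worked out from its probability function. The step below uses the standard series $\sum_{t \ge 1} t x^{t-1} = 1/(1-x)^2$ for $|x| < 1$, which is not stated on the slides.

```latex
E[T] = \sum_{t=1}^{\infty} t \, P_R \, [1 - P_R]^{t-1}
     = P_R \cdot \frac{1}{\bigl(1 - [1 - P_R]\bigr)^{2}}
     = \frac{1}{P_R}
```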

Test 2
Let $X \sim \text{Geo}(p)$ and $q = 1 - p$, so that $P[X \le x] = 1 - q^x$. Let the rejection region be $R = \{X < L\} \cup \{X > U\}$. Then
$P_A = P[\text{not rejecting } H_0] = P[L \le X \le U] = 1 - q^U - (1 - q^{L-1}) = q^{L-1} - q^U$
$P_R = 1 - (1 - p)^{L-1} + (1 - p)^U$

Test 2
Let $X \sim \text{Geo}(p)$. To test the hypothesis ($\alpha = 0.0027$) $H_0: p = p_0$ against $H_1: p \ne p_0$, the rejection region is $R = \{X < 14\} \cup \{X > 66074\}$. Then $P[\text{rejecting } H_0]$ is
$P_R = 1 - (1 - p)^{13} + (1 - p)^{66074}$
and $E[T] = 1/P_R$. When $p = p_0$, $E[T] = 1/\alpha = 370.4$.
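The formula for $P_R$ makes it easy to tabulate $E[T]$ over a range of $p$. The following is a minimal sketch; the function names and the grid of $p$ values are illustrative choices, and $p_0 = 0.0001$ is assumed from the 100 ppm example.

```python
def rejection_prob(p, lower=13, upper=66075):
    """P_R = P[X <= lower] + P[X >= upper] for X ~ Geo(p)."""
    q = 1 - p
    return 1 - q ** lower + q ** (upper - 1)

def mean_tests_to_signal(p, lower=13, upper=66075):
    """E[T] = 1 / P_R, the mean number of tests until H0 is rejected."""
    return 1 / rejection_prob(p, lower, upper)

for p in (0.00005, 0.0001, 0.0002, 0.001):
    print(f"p = {p:.5f}   E[T] = {mean_tests_to_signal(p):8.1f}")
```

Because $X$ is discrete, the achieved $E[T]$ at $p_0$ differs slightly from the nominal $1/\alpha$. With these limits the sketch also gives a somewhat larger $E[T]$ at $p = 0.0002$ than at $p_0$, so a moderate increase in $p$ is not detected any faster than a false alarm occurs.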

Test 2
How can we improve upon this test? We want E[T]

Run sum procedure

Geometric chart
A sequence of tests of hypotheses

THE RUN SUM
– Interval between limits is divided into regions
– A score is assigned to each region
– A sum is accumulated according to the region in which the statistic falls
– Sum is reset when the last mean falls on the other side of the center line
– Reject $H_0$ when the cumulative score equals or exceeds a limit value

THE RUN SUM
– Interval between limits is divided into eight regions
– A score is assigned to each region (0, 1, 2, 3)
– A sum is accumulated according to the region in which the statistic falls
– Sum is reset when the last mean falls on the other side of the center line
– Reject $H_0$ when the cumulative score equals or exceeds a limit value $L = 5$

[Figure: THE RUN SUM – for the mean]

[Figure: THE GEOMETRIC RUN SUM]

THE GEOMETRIC RUN SUM - DEFINITION
Let us denote the following cumulative sums:
$SU_t = SU_{t-1} + q_t$ if $X_t$ falls above the center line, $SU_t = 0$ otherwise
$SL_t = SL_{t-1} - q_t$ if $X_t$ falls below the center line, $SL_t = 0$ otherwise
where $q_t$ is the score assigned to the region in which $X_t$ falls

THE GEOMETRIC RUN SUM - DEFINITION
The run sum statistic is defined, for $t = 1, 2, \ldots$, by
$S_t = \max\{SU_t, -SL_t\}$
with $SU_0 = 0$, $SL_0 = 0$ and limit sum $L$
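The definition above can be sketched in a few lines. Several details are assumptions made for illustration: the scores (3, 2, 1, 0, 0, 1, 2, 3) are assigned to regions 1 to 8 from the lowest to the highest, an observation equal to the center line $l_4$ counts as below it, and the three upper limits in the usage example are hypothetical placeholders, since only $l_1$ to $l_4$ appear in the slides.

```python
from bisect import bisect_left

SCORES = (3, 2, 1, 0, 0, 1, 2, 3)   # assumed scores of regions 1..8

def region_score(x, limits):
    """Score of the region containing x, where region i is l_{i-1} < x <= l_i."""
    return SCORES[bisect_left(limits, x)]

def run_sum_signal(xs, limits, limit_sum=5):
    """Return the index t (1-based) at which S_t >= limit_sum, or None."""
    center = limits[3]               # center line l_4
    su = sl = 0
    for t, x in enumerate(xs, start=1):
        q = region_score(x, limits)
        if x > center:               # above the center line
            su, sl = su + q, 0
        else:                        # below (or on) the center line
            su, sl = 0, sl - q
        if max(su, -sl) >= limit_sum:
            return t
    return None

# l_1..l_4 from the slides; the last three values are hypothetical.
LIMITS = (13, 220, 1701, 6932, 20000, 40000, 66074)
print(run_sum_signal([100, 150, 5], LIMITS))          # three short runs in a row
print(run_sum_signal([100, 10000, 150, 150], LIMITS)) # sum reset by a long run
```

In the second sequence the observation 10000 falls above the center line, which resets the lower sum before it can reach the limit.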

THE GEOMETRIC RUN SUM - DESIGN
Need to define
– region limits ($l_1, l_2, l_3$ and $l_5, l_6, l_7$)
– region scores ($q_1, q_2, q_3$ and $q_4$)
– limit sum $L$

THE GEOMETRIC RUN SUM - DESIGN
– Region limits above and below the center line are not symmetric around the center line
– To define the region limits we use the cumulative probabilities of the distribution of $X \sim \text{Geo}(p_0)$
– Such probabilities were chosen to be the same as those of a run sum for the mean with the same scores


THE GEOMETRIC RUN SUM - EXAMPLE
If $X \sim \text{Geo}(p_0)$, the region limits $l_1, \ldots, l_7$ are determined by the cumulative probabilities $P[X \le l_1], P[X \le l_2], \ldots, P[X \le l_7]$.

THE GEOMETRIC RUN SUM - EXAMPLE
If $X \sim \text{Geo}(p_0)$, the first four region limits correspond to the cumulative probabilities $P[X \le 13]$, $P[X \le 220]$, $P[X \le 1701]$ and $P[X \le 6932]$ (the limits $l_5$, $l_6$, $l_7$ and the probability values are not preserved in this transcript).
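The cumulative probabilities attached to the four limits that survive can be recomputed. This sketch assumes $p_0 = 0.0001$ (100 ppm), as in the earlier geometric test; the value of $p_0$ on this slide is not preserved, so that is an assumption.

```python
p0 = 0.0001                      # assumed: 100 ppm
limits = [13, 220, 1701, 6932]   # the limits that appear in the slides

# Geometric distribution function P[X <= l] = 1 - (1 - p0)**l
cdf = {l: 1 - (1 - p0) ** l for l in limits}

for l, prob in cdf.items():
    print(f"P[X <= {l:5d}] = {prob:.5f}")
```

Under this assumption the fourth limit sits at the median of the distribution, which is consistent with its role as the center line.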

THE GEOMETRIC RUN SUM - EXAMPLE
– Conclude $H_1: p \ne p_0$ when $S_t \ge L$
– Let $T$ be the number of samples until $H_0$ is rejected
– What is the distribution of $T$?
– What are its mean and standard deviation?

RUN SUM (0,1,2,3) L = 5 - MODELING
– Markov chain
– States defined by the values that $S_t$ can assume
– State space $\{-4, -3, -2, -1, 0, 1, 2, 3, 4, C\}$, where $C = \{n \in \mathbb{Z} : n = \ldots, -6, -5, 5, 6, \ldots\}$ is an absorbing state
– Transition probabilities

RUN SUM (0,1,2,3) L = 5 - MODELING
Let
$p_1 = P[X \le l_1]$
$p_2 = P[l_1 < X \le l_2]$
$p_3 = P[l_2 < X \le l_3]$
$p_4 = P[l_3 < X \le l_4]$
$p_5 = P[l_4 < X \le l_5]$
$p_6 = P[l_5 < X \le l_6]$
$p_7 = P[l_6 < X \le l_7]$
$p_8 = P[X > l_7]$
where $X \sim \text{Geo}(p_0)$

RUN SUM (0,1,2,3) L = 5 - MODELING
[Figures: transitions from $S_t = 0$, from $S_t = 1$ and from $S_t = 2$]

RUN SUM (0,1,2,3) L = 5 - MODELING
Let $T$ be the first passage time to state $C$, that is, the number of observations until the run sum rejects $H_0$. Let $Q$ be the submatrix of transient states; then
$P[T \le t] = e \, (I - Q^t) \, J$
$G(s) = s \, e \, (I - sQ)^{-1} (I - Q) \, J$
$E[T] = e \, (I - Q)^{-1} J$
where $e$ is a row vector defining the initial state $S_0$
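The expression for $E[T]$ can be evaluated by solving the linear system $(I - Q)\,m = J$ and reading off the entry for the initial state. The sketch below rests on several assumptions: $J$ is taken to be a column vector of ones, the chain starts in state 0, states are signed sums (negative below the center line), and the region probabilities in the usage example are the standard normal zone probabilities for limits at 1, 2 and 3 sigma, used only as illustrative inputs. All function and variable names are hypothetical.

```python
SCORES = (3, 2, 1, 0, 0, 1, 2, 3)      # assumed scores of regions 1..8
SIDES = (-1, -1, -1, -1, 1, 1, 1, 1)   # -1 below the center line, +1 above

def next_state(s, side, score):
    """Signed run sum after one observation; restart when the side changes."""
    if s * side < 0:
        s = 0
    return s + side * score

def mean_time_to_signal(probs, limit_sum=5):
    """E[T] from state 0 for region probabilities probs = (p_1, ..., p_8)."""
    states = list(range(-(limit_sum - 1), limit_sum))   # transient states
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    # Augmented matrix [I - Q | 1]
    a = [[float(i == j) for j in range(n)] + [1.0] for i in range(n)]
    for s in states:
        for prob, side, score in zip(probs, SIDES, SCORES):
            new = next_state(s, side, score)
            if abs(new) < limit_sum:                    # still transient
                a[index[s]][index[new]] -= prob
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return a[index[0]][n] / a[index[0]][index[0]]

# Illustrative inputs: standard normal zone probabilities (an assumption).
normal_zones = (0.00135, 0.02140, 0.13591, 0.34134,
                0.34134, 0.13591, 0.02140, 0.00135)
print(mean_time_to_signal(normal_zones))
```

A quick sanity check is the degenerate case where every observation falls in the top region: the sum goes 3, then 6, so the chain is absorbed after exactly two observations.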

Geometric Run Sum
[Figure: plot of $E[T]$ vs. $p$ for this chart]

Geometric Run Sum
[Figure: a comparison with Test 2]

RUN SUM – FURTHER IMPROVEMENT
Consider a geometric run sum
– No regions
– Center line equal to $l_4$
– Scores are equal to $X$
– Design: limit sum $L$

NEW GEOMETRIC RUN SUM - DEFINITION
Let us denote the following cumulative sums:
$SU_t = SU_{t-1} + X_t$ if $X_t$ falls above the center line, $SU_t = 0$ otherwise
$SL_t = SL_{t-1} - X_t$ if $X_t$ falls below the center line, $SL_t = 0$ otherwise

NEW GEOMETRIC RUN SUM - DEFINITION
The run sum statistic is defined, for $t = 1, 2, \ldots$, by
$S_t = \max\{SU_t, -SL_t\}$
with $SU_0 = 0$, $SL_0 = 0$ and limit sum $L$
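The region-free variant can be sketched the same way, with the observation itself used as the score. The center line 6932 is the value of $l_4$ from the earlier example; the limit sum 15000 is a hypothetical value chosen only to make the example run, since the slides leave $L$ as a design parameter. Treating an observation equal to the center line as below it is also an assumption.

```python
def new_run_sum_signal(xs, center, limit_sum):
    """Return the index t (1-based) at which S_t >= limit_sum, or None.

    SU accumulates X_t while observations stay above the center line,
    SL accumulates -X_t while they stay below it; each resets to 0
    when an observation falls on the other side.
    """
    su = sl = 0
    for t, x in enumerate(xs, start=1):
        if x > center:
            su, sl = su + x, 0
        else:
            su, sl = 0, sl - x
        if max(su, -sl) >= limit_sum:
            return t
    return None

CENTER = 6932        # l_4 from the example slides
LIMIT_SUM = 15000    # hypothetical limit sum
print(new_run_sum_signal([7000, 8000], CENTER, LIMIT_SUM))
print(new_run_sum_signal([7000, 100, 8000], CENTER, LIMIT_SUM))
```

Because the sum can take any integer value up to the limit, the number of possible states grows with $L$, which is the difficulty with a Markov chain model noted on the next slide.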

NEW GEOMETRIC RUN SUM - MODELING
– Markov chain: not possible, huge number of states
– Need to derive the distribution of $T$
– Can show that [expression not reproduced]


CONCLUSIONS
– The run sum is an effective procedure for two-sided monitoring
– For monitoring very small $p$, it is more effective than a sequence of geometric tests
– With a limited number of regions, it can be modeled by a Markov chain

TOPICS OF INTEREST
– Estimate the change point (the time at which $p$ changes)
– Bayesian tests
– Lack of independence (chain-dependent Bernoulli trials)
– Run sum can be applied to other instances: monitoring, arrival processes
