NCTS Industrial Statistics Research Group Seminar


Monitoring Nonlinear Profiles with Random Effects by Nonparametric Regression. Jyh-Jen Horng Shiau, Institute of Statistics, National Chiao Tung University (交通大學統計所 洪志真). Sept. 25, 2009, NCTS Industrial Statistics Research Group Seminar.

Outline: Introduction; Linear Profile Monitoring; Fixed Effects vs. Random Effects; Nonlinear Profile Monitoring; Parametric Regression vs. Nonparametric Regression; Phase I Monitoring; Phase II Monitoring; Examples; Conclusions.

Introduction

SPC: Variables vs. Profiles. Classical SPC uses one or more quality characteristics (a single univariate or multivariate variable) to measure process quality. However, in many situations the response of interest is not a single variable but a function of one or more explanatory variables. This functional response is called a profile, and monitoring it is called profile monitoring.

Other Terms for Profiles. Waveform signal (Jin and Shi, 2001); signature (Gardner et al., 1997). Example: vertical board density profile data from Walker and Wright (JQT, 2002), with 24 profiles of vertical density, each consisting of 314 measurements.

Profile Monitoring. The objective is to monitor functional data over time. [Figure: k sample profiles, j = 1, 2, …, k, collected over time; each panel plots the response (Y) against the explanatory variable (X), illustrated with n = 10 points, with n > 2 observations in each profile.]

Applications: Dissolving Process of Aspartame. An example of a product characterized by a profile is aspartame, an artificial sweetener. An important characteristic of the product is the amount of aspartame that can be dissolved per liter of water at different temperatures (Kang and Albin, 2000).

Semiconductor Gate-Oxide Thickness Surface, by Gardner et al. (1997). Fig. (a) shows the gate-oxide thickness surface of a wafer that was processed under fault-free conditions. Fig. (b) shows the gate-oxide thickness surface of a wafer processed under known equipment faults. X and Y are the distances from the center of the wafer. Not only is there an apparent decrease in thickness from (a) to (b), but there is also a change in spatial pattern.

Tonnage Stamping Process. The figure shows the complicated form of a stamping force profile given by Jin and Shi (1999). Different local features need to be monitored in each interval. Jin and Shi used the term "waveform signals" to refer to profiles.

Bioassay Dose-Response Curve. A dose-response curve, given by Williams, Birch, Woodall, and Ferry (2006), for different doses and different time periods. Indexing: ith profile, jth dose, kth replication.

Boeing (1998): A Location Control Chart. USL/LSL: specification limits. UNTL/LNTL: sample mean ± 3 × sample standard deviation.

Two Approaches. Parametric regression: fit a parametric model of known form to each profile, then either monitor each parameter with a separate chart or use a multivariate chart based on the vectors of parameter estimates. Nonparametric regression: smooth each profile, then use various metrics to detect changes in profile shape.

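Linear Profile Monitoring. Fixed-effect model: $y_{ij} = A_0 + A_1 x_i + \varepsilon_{ij}$, $i = 1, \dots, n$, $j = 1, \dots, k$, where the $\varepsilon_{ij}$ are independent and normally distributed with mean 0 and variance $\sigma^2$. References: Kang and Albin (2000); Kim, Mahmoud, and Woodall (2003); Mahmoud and Woodall (2004); Mahmoud (2004).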

Linear Profiles with Fixed-Effect Model: Kang and Albin (2000). Monitor the slope and intercept jointly with a multivariate chart. Alternatively, treat the residuals between the sample and reference lines as a rational subgroup and monitor them with a combined EWMA/R chart. The Phase I statistics are dependent, so control limits cannot be determined directly from marginal distributions.

Linear Profiles with Fixed-Effect Model: Kim et al. (2003) considered the same model but coded the X-values by centering, so that the least-squares estimators of the intercept and slope are independent. They use three charts: a two-sided EWMA chart to monitor the intercept, a two-sided EWMA chart to monitor the slope, and a one-sided EWMA chart to monitor the error variance.

Why Random Effects? Under the fixed-effect model, the batch effect, changes in humidity or temperature, the characteristics of the measuring equipment, etc., are all included in the error term. This may not be appropriate, since these time-varying factors may affect the values of the intercept and slope of the linear profile. By their nature, these hard-to-control factors should be considered common causes of variation, so the model should allow profile-to-profile variation.

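Linear Profiles with Random Effects (Shiau, Lin, and Chen, 2006). For the ith observation of the jth profile, assume $y_{ij} = A_{0j} + A_{1j} x_i + \varepsilon_{ij}$, where the intercept $A_{0j} \sim N(\alpha_0, \sigma_0^2)$, the slope $A_{1j} \sim N(\alpha_1, \sigma_1^2)$, the error $\varepsilon_{ij} \sim N(0, \sigma^2)$, and $A_{0j}$, $A_{1j}$, and $\varepsilon_{ij}$ are mutually independent. Let the set points be pre-coded so that $\bar{x} = 0$, where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.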

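A Simulated Example. Intercept $A_{0j} \sim N(3, 0.09)$, slope $A_{1j} \sim N(2, 0.09)$, and error $\varepsilon_{ij} \sim N(0, 1)$. [Figure: simulated profiles, $y_{ij}$ plotted against $x_i$.]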

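Phase II Method. In Phase II, the in-control parameters (the intercept mean and variance $\alpha_0$, $\sigma_0^2$; the slope mean and variance $\alpha_1$, $\sigma_1^2$; and the error variance $\sigma^2$) are usually assumed known. Adopt the combined-chart approach, monitoring for each profile the least-squares estimates of the intercept and slope, $\hat{A}_{0j}$ and $\hat{A}_{1j}$, and the mean squared error $MSE_j$. Since $\hat{A}_{0j}$, $\hat{A}_{1j}$, and $MSE_j$ are mutually independent, set the individual false-alarm rate of each chart at $\alpha' = 1 - (1 - \alpha)^{1/3}$ to achieve an overall false-alarm rate of $\alpha$. The control limits are $\alpha_0 \pm z_{\alpha'/2}\sqrt{\sigma_0^2 + \sigma^2/n}$ for $\hat{A}_{0j}$ and $\alpha_1 \pm z_{\alpha'/2}\sqrt{\sigma_1^2 + \sigma^2/S_{xx}}$ for $\hat{A}_{1j}$, with $S_{xx} = \sum_{i=1}^{n} x_i^2$ for the coded $x_i$, and the upper limit $\sigma^2 \chi^2_{\alpha', n-2}/(n-2)$ for $MSE_j$, where $z_q$ and $\chi^2_{q,\nu}$ denote upper-$q$ quantiles.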

When the Fixed-Effect Model Is Mistakenly Used
                                         Real (random effect)   Misuse (fixed effect)
Upper control limit of A0                4.101054               3.469491
Lower control limit of A0                1.898946               2.530509
Upper control limit of A1                2.996472               2.032534
Lower control limit of A1                1.003528               1.967466
Upper control limit of error variance    1.759881               (single value listed)
ARL0                                     370.370370             1.078406
Using the wrong model causes an enormous number of false alarms: with ARL0 = 1.078406, an in-control profile signals with probability 1/1.078406 = 92.73%.
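
The limits listed above can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes n = 50 set points x = 1, …, 50 coded by centering (a design the slides do not state, chosen here because it appears consistent with the tabulated values), an overall false-alarm rate of 0.0027 (from ARL0 = 370.37), and a per-chart rate of 1 - (1 - 0.0027)^(1/3) for three independent charts. It then evaluates how often fixed-effect limits signal when the data actually follow the random-effect model.

```python
from statistics import NormalDist

# Assumed design (not stated in the slides): n = 50 set points 1..50, centered.
n = 50
x = [i - (n + 1) / 2 for i in range(1, n + 1)]
sxx = sum(xi * xi for xi in x)

# In-control parameters of the simulated example.
alpha0, var0 = 3.0, 0.09   # intercept mean and variance
alpha1, var1 = 2.0, 0.09   # slope mean and variance
var_e = 1.0                # error variance

alpha = 0.0027                           # overall false-alarm rate (ARL0 = 370.37)
alpha_ind = 1 - (1 - alpha) ** (1 / 3)   # per-chart rate, three independent charts
z = NormalDist().inv_cdf(1 - alpha_ind / 2)


def limits(center, sd):
    """Two-sided control limits center -/+ z * sd."""
    return center - z * sd, center + z * sd


# Standard deviations of the least-squares estimates under each model.
sd_a0_random = (var0 + var_e / n) ** 0.5
sd_a1_random = (var1 + var_e / sxx) ** 0.5
sd_a0_fixed = (var_e / n) ** 0.5
sd_a1_fixed = (var_e / sxx) ** 0.5

random_limits = {"A0": limits(alpha0, sd_a0_random), "A1": limits(alpha1, sd_a1_random)}
fixed_limits = {"A0": limits(alpha0, sd_a0_fixed), "A1": limits(alpha1, sd_a1_fixed)}


def prob_inside(lims, center, sd):
    """Probability that a N(center, sd^2) estimate falls inside the limits."""
    dist = NormalDist(center, sd)
    return dist.cdf(lims[1]) - dist.cdf(lims[0])


# Fixed-effect limits applied to random-effect data. The variance chart is
# unaffected because the per-profile residuals depend only on the errors.
p_no_signal = (prob_inside(fixed_limits["A0"], alpha0, sd_a0_random)
               * prob_inside(fixed_limits["A1"], alpha1, sd_a1_random)
               * (1 - alpha_ind))
arl0_misuse = 1 / (1 - p_no_signal)

print(random_limits, fixed_limits, arl0_misuse)
```

Under these assumptions the slope chart is the main culprit: the fixed-effect limits ignore the slope's profile-to-profile variance, so they are far too narrow.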

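Linear Profiles with Random Effects: Phase I Estimation. Under the random-effect model with coded $x_i$'s, the least-squares intercept estimates $\hat{A}_{0j}$ are i.i.d. $N(\alpha_0, \sigma_0^2 + \sigma^2/n)$ and the least-squares slope estimates $\hat{A}_{1j}$ are i.i.d. $N(\alpha_1, \sigma_1^2 + \sigma^2/S_{xx})$, where $S_{xx} = \sum_{i=1}^{n} x_i^2$. These statistics are mutually independent.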

Phase-I Monitoring Statistics. The three monitoring statistics: [equations not preserved].

Bonferroni Method. Suppose there are $k$ profiles in the Phase I historical data set. If we control each individual false-alarm rate at level $\alpha/k$, then the overall false-alarm rate for the $k$ profiles is controlled at level $\alpha$. Control limits using the Bonferroni method: [equations not preserved].

Evaluation Criteria for Phase I Methods. The main concern in evaluating Phase I methods is effectiveness in detecting out-of-control profiles correctly. The commonly used criterion, “signal probability”, includes both true and false alarms. Proposed criteria: the “true-alarm rate”, the rate of detecting real out-of-control profiles, and the “false-alarm rate”, the rate of declaring in-control profiles out of control.

Bonferroni vs. Multiple FDR. The Multiple FDR method is an extension of FDR (false discovery rate). It is better than the Bonferroni method in terms of detecting power, especially when there are more out-of-control profiles in the historical data. The tradeoff is a slightly larger false-alarm rate, which is still very small (less than 0.003).

Other Related Works
Mahmoud, M. A. and Woodall, W. H. (2004). “Phase I Analysis of Linear Profiles with Calibration Applications”. Technometrics, Nov. 2004.
Mahmoud, M. A., Parker, P. A., Woodall, W. H., and Hawkins, D. M. (2006). “A Change Point Method for Linear Profile Data”. Qual. Reliab. Engng. Int. 2006.
Staudhammer, C. L., LeMay, V. M., Kozak, R. A., and Maness, T. C. (2005). “Mixed-Model Development for Real-Time Statistical Process Control Data in Wood Products Manufacturing”. FBMIS Volume 1, 2005, 19-35.
Wang, K. and Tsung, F. (2005). “Using Profile Techniques for a Data-Rich Environment with Huge Sample Size”. Quality and Reliability Engineering International, 21, 7, 677-688.
Jensen, W. A., Birch, J. B., and Woodall, W. H. (2006). “Profile Monitoring via Linear Mixed Models”. JSM 2006 Online Program.

Nonlinear Profile Monitoring by Parametric Regression

Related Works
Williams, J. D., Woodall, W. H., and Birch, J. B. (2003). “Phase I Monitoring of Nonlinear Profiles”.
Ding, Y., Zeng, L., and Zhou, S. (2005). “Phase I Analysis for Monitoring Nonlinear Profile Signals in Manufacturing Processes”. Journal of Quality Technology, 38(3), 199-216.
Jensen, W. A. and Birch, J. B. (2006). “Profile Monitoring via Nonlinear Mixed Models”. Technical Report.
Williams, J. D., Birch, J. B., Woodall, W. H., and Ferry, N. M. (2006). “Statistical Monitoring of Heteroscedastic Dose-Response Profiles from High-Throughput Screening”. JSM 2006 Online Program.
Shiau, J.-J. H., Yen, C.-L., and Feng, Y.-W. (2006). “A New Robust Phase I Analysis for Monitoring of Nonlinear Profiles”. Technical Report.

Nonlinear Profile Monitoring via Nonparametric Regression: Fixed Effects and Random Effects

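Nonparametric Regression. Consider the following nonparametric regression model: $y_i = m(x_i) + \varepsilon_i$, $i = 1, \dots, n$, where $m(x)$ is a smooth regression curve and the $\varepsilon_i$'s are i.i.d. normal variates with mean zero and common variance $\sigma^2$. With B-spline regression, the model is replaced by $y_i = \sum_{l=1}^{p} \beta_l B_l(x_i) + \varepsilon_i$, where $\beta_l$ is the unknown coefficient of the lth B-spline basis function $B_l$, to be estimated from data. Estimate $m(x)$ by $\hat{m}(x) = \sum_{l=1}^{p} \hat{\beta}_l B_l(x)$, where the $\hat{\beta}_l$ are the least-squares estimates.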
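
B-spline regression as described above amounts to ordinary least squares on a B-spline design matrix. The sketch below is illustrative only: the test curve sin(2πx), the noise level, the cubic degree, and the three interior knots are all assumptions, and the helper `bspline_basis` is a hypothetical name that implements the standard Cox-de Boor recursion with NumPy.

```python
import numpy as np


def bspline_basis(x, knots, degree):
    """Design matrix of B-spline basis functions via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator of each half-open knot interval [t_i, t_{i+1}).
    B = np.zeros((x.size, t.size - 1))
    for i in range(t.size - 1):
        B[:, i] = (t[i] <= x) & (x < t[i + 1])
    # Put the right endpoint into the last non-empty interval.
    last = np.max(np.nonzero(t[:-1] < t[1:])[0])
    B[x == t[-1], last] = 1.0
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        B_new = np.zeros((x.size, t.size - 1 - d))
        for i in range(t.size - 1 - d):
            left_den = t[i + d] - t[i]
            right_den = t[i + d + 1] - t[i + 1]
            if left_den > 0:
                B_new[:, i] += (x - t[i]) / left_den * B[:, i]
            if right_den > 0:
                B_new[:, i] += (t[i + d + 1] - x) / right_den * B[:, i + 1]
        B = B_new
    return B


rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
m_true = np.sin(2 * np.pi * x)                       # hypothetical smooth curve m(x)
y = m_true + rng.normal(scale=0.1, size=x.size)      # one noisy profile

degree = 3
interior = [0.25, 0.5, 0.75]                         # assumed interior knots
knots = np.r_[[0.0] * (degree + 1), interior, [1.0] * (degree + 1)]

B = bspline_basis(x, knots, degree)                  # n x p design matrix
beta_hat, *_ = np.linalg.lstsq(B, y, rcond=None)     # least-squares coefficients
m_hat = B @ beta_hat                                 # smoothed profile
```

The number of basis functions p equals the number of interior knots plus degree plus one (here 7), which controls the degree of smoothness of the fit.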

Nonparametric Fixed-Effect Model (Shiau and Weng, 2004). Simulated example: a profile function whose coefficients are fixed constants [equation not preserved]. Apply B-spline regression to each sample profile. To monitor mean shifts, use an EWMA chart based on the EWMA statistic of the jth profile with a chosen smoothing constant [equation not preserved].
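
For orientation, the textbook EWMA recursion and its time-varying limits can be sketched as follows. This is a generic illustration, not the specific statistic of Shiau and Weng: the per-profile summary values, the smoothing constant `lam`, the limit width `L`, and the in-control standard deviation `sigma` are all assumed inputs.

```python
def ewma(values, lam, start):
    """EWMA recursion: z_j = lam * v_j + (1 - lam) * z_{j-1}, with z_0 = start."""
    z = start
    out = []
    for v in values:
        z = lam * v + (1 - lam) * z
        out.append(z)
    return out


def ewma_limits(center, sigma, lam, L, j):
    """Exact control limits for the j-th EWMA value of independent inputs."""
    half = L * sigma * (lam / (2 - lam) * (1 - (1 - lam) ** (2 * j))) ** 0.5
    return center - half, center + half


# Hypothetical per-profile summaries (e.g. an average deviation from a reference curve).
summaries = [0.1, -0.3, 0.2, 1.4, 1.1, 1.6]
z_values = ewma(summaries, lam=0.2, start=0.0)
signals = []
for j, zj in enumerate(z_values, start=1):
    lcl, ucl = ewma_limits(center=0.0, sigma=1.0, lam=0.2, L=3.0, j=j)
    signals.append(not (lcl <= zj <= ucl))
```

A small smoothing constant gives more weight to past profiles and tends to detect small sustained shifts sooner, at the cost of reacting more slowly to large sudden ones.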

Nonparametric Fixed-Effect Model. To monitor changes in variation, use an R chart, where the R statistic of the jth profile is based on the range. Another chart for changes in variation is the EWMSD chart. [Equations for the R and EWMSD statistics not preserved.]

Fixed vs. Random Effects. Fixed-effect model: there is no profile-to-profile (subject-to-subject) variation; the profile function is a fixed function, the same for each profile. Random-effect model: there is profile-to-profile variation due to common causes; the profile function is a random function, and profiles are modeled as realizations of a stochastic process with a mean curve and a covariance function.

Nonparametric Random-Effect Model Shiau, J.-J. H., Huang, H.-L., Lin, S.-H., and Tsai, M.-Y. (2009). “Monitoring Nonlinear Profiles with Random Effects by Nonparametric Regression”. Communications in Statistics-Theory and Methods. 38, 1664-1679.

Nonparametric Random-Effect Model. Adopt the random-effect model to accommodate the extra variability often observed in profile data. Motivating example: aspartame. In the original model used to generate aspartame profiles [equation not preserved], the coefficients are i.i.d. random variables rather than constants; they represent common-cause variations among profiles.

Original Model

Problems with the Original Model: it is not Gaussian; its covariance matrix depends on [symbol not preserved]; it is too complicated to analyze.

Stochastic Gaussian Process Model. The in-control process is modeled as a Gaussian process with a mean function and a covariance function.

Out-of-control Process When the mean function is shifted, say,

Data Smoothing. Preprocessing: smooth each profile using smoothing splines or B-spline regression. Other smoothing techniques include kernel smoothing, local polynomial smoothing, and wavelets. After the sample profiles are de-noised (i.e., the effects of the random errors are eliminated), the variation remaining among the smoothed profiles represents the profile-to-profile variation.

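Principal Component Analysis (PCA). Method: apply PCA to the smoothed profiles to obtain the principal modes of variation. PCA finds an orthogonal matrix $V$ such that $V^\top \Sigma V = \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_p)$ with $\lambda_1 \ge \dots \ge \lambda_p \ge 0$, i.e., an eigen-analysis of the covariance matrix $\Sigma$. The eigenvectors (the columns of $V$) are the principal components.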

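Phase I Monitoring (1). A set of historical profiles is available. Smooth each profile. Apply PCA to the sample covariance matrix of the smoothed profiles. The eigenvectors are the principal components (PCs). The PC-score of a profile on a given PC is the projection of the smoothed profile onto that PC.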

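Phase I Monitoring (2). Select the “effective” principal components by two considerations. Total variation explained: choose the first K such that $\sum_{r=1}^{K} \lambda_r / \sum_{r=1}^{p} \lambda_r$ reaches a desired level, where $\lambda_1 \ge \dots \ge \lambda_p$ are the eigenvalues. Parsimony: keep K small. The score vector of the jth profile collects its K PC-scores, $s_j = (s_{j1}, \dots, s_{jK})^\top$.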

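Phase I Monitoring (3). Hotelling $T^2$ statistics: $T_j^2 = (s_j - \bar{s})^\top S^{-1} (s_j - \bar{s})$, where $s_j$ is the score vector of the jth profile and $\bar{s}$ and $S$ are the usual sample mean and sample covariance matrix of the score vectors.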
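
Phase I steps (1) to (3) can be sketched with NumPy as below. This is an illustration under assumptions, not the authors' implementation: the simulated "smoothed profiles" (a random level plus a random sine component plus a little residual noise), the 90% explanation level, and the function name `pca_scores` are all hypothetical. Because the scores are centered and uncorrelated with variances equal to the eigenvalues, the Hotelling statistic reduces to a sum of squared standardized scores.

```python
import numpy as np


def pca_scores(profiles, level=0.9):
    """PCA of the sample covariance matrix; keep the first K PCs reaching `level`."""
    X = np.asarray(profiles, dtype=float)          # rows: profiles, columns: set points
    mean = X.mean(axis=0)
    S = np.cov(X, rowvar=False)                    # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum() # cumulative proportion explained
    K = int(np.searchsorted(explained, level) + 1) # first K reaching the level
    scores = (X - mean) @ eigvecs[:, :K]           # PC-scores of each profile
    return eigvals[:K], eigvecs[:, :K], scores, explained


# Hypothetical smoothed historical profiles: m profiles observed at n set points.
rng = np.random.default_rng(1)
m, n = 30, 20
x = np.linspace(0.0, 1.0, n)
profiles = (rng.normal(size=(m, 1)) * np.ones(n)
            + rng.normal(size=(m, 1)) * np.sin(2 * np.pi * x)
            + 0.05 * rng.normal(size=(m, n)))

lam, pcs, scores, explained = pca_scores(profiles, level=0.9)
K = lam.size

# Hotelling T^2 per profile: the score vectors have sample mean zero and
# sample covariance diag(lam), so T^2_j = sum_r s_jr^2 / lam_r.
T2 = np.sum(scores ** 2 / lam, axis=1)
```

The upper control limit for these Phase I statistics still has to come from the appropriate reference distribution, which this sketch does not compute.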

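Phase I Monitoring (4). Since the score vectors are asymptotically multivariate normal, the approximate distribution of the $T^2$ statistics is available [equation not preserved], which gives the upper control limit of the $T^2$ chart [equation not preserved].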

Phase I Monitoring (5). Note that the monitoring statistics across curves in Phase I are not independent, so the prescribed overall false-alarm rate cannot be achieved from the marginal distribution of the monitoring statistics. We can adopt the Bonferroni approach to control the overall false-alarm rate (i.e., type I error) at level $\alpha$.

Phase II Monitoring (1). In Phase II, we usually assume that the covariance matrix $\Sigma$ is known. In practice, $\Sigma$ is estimated by the sample covariance matrix of the Phase I in-control profiles. Apply PCA to $\Sigma$ to obtain its eigenvalues and eigenvectors, then choose K effective PCs.

Phase II Monitoring (2). For a new incoming profile, first smooth it, then project it onto the K PCs to obtain K independent PC-scores. If the process is in control, the PC-scores follow their in-control distribution [equation not preserved].

Individual PC-score Charts. For the rth PC-score chart, the monitoring statistic is the rth PC-score of the incoming profile. Control limits: [equation not preserved].

A Combined Chart. Signals when any of the K individual charts signals. Equivalent to monitoring a single statistic [equation not preserved]. In the control limits, the individual false-alarm rate is set at $\alpha' = 1 - (1 - \alpha)^{1/K}$ so that the overall false-alarm rate is $\alpha$.
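
A combined chart of this kind can be sketched as follows. The sketch assumes centered PC-scores that are independent and normal with in-control mean zero and variance equal to the corresponding eigenvalue, and it assumes the per-chart rate 1 - (1 - α)^(1/K) for K independent charts. The function names and the example eigenvalues are hypothetical.

```python
from statistics import NormalDist


def combined_chart_limits(eigenvalues, alpha=0.0027):
    """Per-chart false-alarm rate and symmetric limit half-widths for K PC-score charts."""
    K = len(eigenvalues)
    alpha_ind = 1 - (1 - alpha) ** (1 / K)        # so that 1 - (1 - alpha_ind)^K = alpha
    z = NormalDist().inv_cdf(1 - alpha_ind / 2)   # two-sided normal quantile
    half_widths = [z * lam ** 0.5 for lam in eigenvalues]
    return alpha_ind, half_widths


def combined_chart_signals(scores, half_widths):
    """The combined chart signals when any individual PC-score chart signals."""
    return any(abs(s) > h for s, h in zip(scores, half_widths))


# Hypothetical eigenvalues of the K = 3 effective PCs and one incoming score vector.
alpha_ind, half_widths = combined_chart_limits([4.0, 1.0, 0.25])
out_of_control = combined_chart_signals([1.2, -3.9, 0.3], half_widths)
```

When the combined chart signals, the index of the offending PC-score chart indicates which mode of variation has changed.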

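A $T^2$ Chart. Monitoring statistic: $T^2 = \sum_{r=1}^{K} s_r^2 / \lambda_r$, where $s_r$ is the rth PC-score of the incoming profile and $\lambda_r$ is the corresponding eigenvalue. When the process is in control, it follows a chi-square distribution with K degrees of freedom. Upper control limit: $\chi^2_{\alpha, K}$, the upper-$\alpha$ quantile of that distribution.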

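Performance Evaluation for Phase II. Average run length (ARL): when the mean function shifts away from its in-control form, let p be the probability that a profile signals, i.e., the probability of detecting the shift. Then ARL = 1/p.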
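
The relation ARL = 1/p can be illustrated for a single two-sided chart. The sketch assumes a standardized monitoring statistic that is N(0, 1) in control and whose mean shifts by `delta` standard deviations out of control. The function name is hypothetical and the shift sizes are arbitrary.

```python
from statistics import NormalDist


def arl_individual(delta, alpha=0.0027):
    """ARL of a two-sided chart on a N(0, 1) statistic whose mean shifts by delta."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)                      # control limits at -z and +z
    p = 1 - (nd.cdf(z - delta) - nd.cdf(-z - delta))   # probability that a point signals
    return 1 / p


for delta in (0.0, 1.0, 2.0, 3.0):
    print(delta, round(arl_individual(delta), 2))
```

With no shift, p equals the false-alarm rate and the ARL is the in-control value 1/α. Larger shifts raise p and shorten the run length.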

Individual chart; combined chart; $T^2$ chart.

Do more PC-scores give more detecting power? [Figure: detecting power versus percentage of explanation.] Among 50 curves, there is one outlier with [parameter not preserved] shifted from 1 to 1 + 5 × 0.2.

A Simulated Aspartame Example--Phase I Monitoring

ARL Comparisons--Phase II Monitoring

A Case Study--VDP Example

Conclusions. We propose and discuss monitoring schemes for nonlinear profiles based on PCA. Phase I: Hotelling $T^2$ control chart. Phase II: individual PC-score charts, a combined chart, and a $T^2$ chart.

Conclusions. When the shift corresponds to a mode of variation that a particular principal component represents, use the individual PC-score chart for better power. Unfortunately, this ideal situation is rare in practice.

Conclusions. The $T^2$ chart performs somewhat better than the combined chart in terms of average run length, but the difference is small. However, by providing charts for all of the effective components, the combined chart gives more clues for finding assignable causes than the $T^2$ chart.

Conclusions. The degree of smoothness in the data smoothing step has a great impact on the result of the subsequent PCA step. A high degree of smoothness leads to high total explanation power for the first few principal components. For B-spline regression, the number of B-spline bases equals the number of principal components with nonzero eigenvalues. If the underlying profiles (i.e., without noise) are fairly smooth, then the data dimension can be reduced effectively by PCA.

Conclusions. Profile monitoring has become a popular and promising area of research in statistical process control in recent years. At the same time, functional data analysis (FDA) is gaining considerable attention and many applications. We believe many techniques developed for FDA may be extended to develop new profile monitoring techniques in SPC.

More Recent Works. Two master's theses of 2009. Monitoring profiles by the data depths of their PC-scores.

The End Thank you for listening

Other Related Works
Jin, J. and Shi, J. (2001). “Automatic Feature Extraction of Waveform Signals for In-Process Diagnostic Performance Improvement”. Journal of Intelligent Manufacturing, 12, 257-268.
Lada, E. K., Lu, J.-C., and Wilson, J. R. (2002). “A Wavelet-Based Procedure for Process Fault Detection”. IEEE Transactions on Semiconductor Manufacturing, 15, 79-90.
Jeong, M. K., Lu, J.-C., and Wang, N. (2006). “Wavelet-Based SPC Procedure for Complicated Functional Data”. International Journal of Production Research, Vol. 44, No. 4, 729-744.
Zhou, S., Sun, B., and Shi, J. (2006). “An SPC Monitoring System for Cycle-Based Waveform Signals Using Haar Transform”. IEEE Transactions on Automation Science and Engineering, Vol. 3, No. 1.

Phase I Monitoring of Nonlinear Profiles (James D. Williams, William H. Woodall, and Jeffrey B. Birch, 2003)
• Method 1 (sample covariance matrix) does not take into account the sequential sampling structure of the data. The overall probability of detecting a shift in the mean vector will decrease (see Sullivan and Woodall, 1996), so it should not be used.
• Method 2 (successive differences) accounts for the sequential sampling scheme and gives a more robust estimate of the covariance matrix.
• In the VDP example, Methods 1 and 2 gave the same result because there was no apparent shift in the mean vector and there were only about two outliers.
• Method 3 (intra-profile pooling) should be used when there is no profile-to-profile common-cause variability.
• Comparison of the three methods: Method 1 assumes all variability is due to common causes; Method 3 assumes that no variability is due to common causes; Method 2 is somewhere in the middle.