The Application of Partial Least Squares to Non-linear Systems in the Process Industries Elaine Martin and Julian Morris Centre for Process Analytics and.


The Application of Partial Least Squares to Non-linear Systems in the Process Industries Elaine Martin and Julian Morris Centre for Process Analytics and Control Technology CPACT School of Chemical Engineering and Advanced Materials University of Newcastle, England

Overview of the Presentation
- Motivation for the Application of “Data Mining” in Non-linear Process Systems
- Process Modelling and Analysis of Non-linear Systems
  - Constrained Partial Least Squares
  - Local Linear Modelling
- Prediction Intervals for Non-linear Partial Least Squares
- Conclusions

Data Rich, Information Poor
[Diagram labels: Modern Process Control Systems; Process Optimisation; Process Monitoring for Early Warning and Fault Detection; Enhanced Profitability and Improved Customer Satisfaction]

Process Modelling
- Mechanistic models developed from process mass and energy balances and kinetics provide the ideal form, given that:
  - process understanding exists
  - time is available to construct the model.
- Data-based models are useful alternatives when there is:
  - limited process understanding
  - process data available from a range of operating conditions.
- Hybrid models combine the two approaches.

Process Modelling
- Traditionally, two types of variables have been used in the development of a process model or process performance monitoring scheme:
  - Process variables (X)
  - Quality variables (Y)
- In practice, a third class exists:
  - Confounding variables (Z).
- A confounding variable is any extraneous factor that is related to, and affects, the two sets of variables under study, (X) and (Y).
- It can distort the true relationship between the two sets of variables, which is the relationship of primary interest.

Global Process Variation
[Figure: confidence ellipse including confounding variation; confidence ellipse excluding confounding variation; trajectory of the confounding variable; mal-operation points marked X]

Partial Least Squares
- X-block outer relationship (monitoring)
- Y-block outer relationship (monitoring)
- Inner relationship (prediction)
- X- and Y-block scores are calculated recursively
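The outer and inner relationships can be illustrated with a minimal NIPALS sketch. This is the textbook linear PLS algorithm rather than code from the presentation; the function name `nipals_pls`, its arguments and the tolerance are assumptions made for illustration.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """Textbook NIPALS PLS sketch (illustrative, not the authors' code)."""
    E = np.asarray(X, float) - np.mean(X, axis=0)   # centred X-block
    F = np.asarray(Y, float) - np.mean(Y, axis=0)   # centred Y-block
    T, U, P, Q, B = [], [], [], [], []
    for a in range(n_components):
        u = F[:, 0].copy()                  # start from one Y column
        for _ in range(max_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)          # X weights
            t = E @ w                       # X-block scores
            q = F.T @ t
            q /= np.linalg.norm(q)          # Y loadings
            u_new = F @ q                   # Y-block scores
            done = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if done:
                break
        p = E.T @ t / (t @ t)               # X loadings (outer relationship)
        b = (u @ t) / (t @ t)               # linear inner relationship u ~ b t
        E = E - np.outer(t, p)              # deflate X-block
        F = F - b * np.outer(t, q)          # deflate Y-block
        T.append(t); U.append(u); P.append(p); Q.append(q); B.append(b)
    return (np.column_stack(T), np.column_stack(U),
            np.column_stack(P), np.column_stack(Q), np.array(B))
```

The scores `T` and `U` are what the monitoring charts are built on, while the coefficients `B` carry the prediction from the X-block to the Y-block.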

Constrained PLS
- To exclude the nuisance source of variability, a necessary condition is that the derived latent variables of the X-block and the Y-block are not correlated with the confounding variables (Z).
- The idea of constrained PLS is to apply these constraints to ordinary PLS.

Constrained PLS
- Standard constrained optimisation techniques can be used to solve the equations in each iteration.
- An algorithm has been developed that enhances the efficiency of the constrained PLS algorithm.
- The other steps of constrained PLS are as for ordinary PLS.
- The resulting latent variables can then be used for process monitoring with the knowledge that they are not confounded with the nuisance source of variability.
- Any unusual variation detected from these latent variables can then be assumed to be related to abnormal process behaviour.
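One way to picture the constraint is as a single weight-vector step in which the X-block score is forced to be orthogonal to the confounding variables. The sketch below is an assumed formulation for illustration only, not the algorithm developed by the authors: it maximises the covariance between `X @ w` and `u` subject to `Z.T @ X @ w = 0` by projecting onto the null space of the constraint. The function name `constrained_weight` is hypothetical, and all blocks are assumed to be mean-centred.

```python
import numpy as np

def constrained_weight(X, u, Z):
    """Hypothetical constrained weight step (assumed formulation).

    Maximises (X w)' u over unit vectors w subject to Z' X w = 0,
    so the score t = X w is uncorrelated with every column of Z
    when X and Z are mean-centred.
    """
    A = Z.T @ X                                # constraint matrix: A w = 0
    g = X.T @ u                                # unconstrained PLS direction
    g_null = g - np.linalg.pinv(A) @ (A @ g)   # remove the part in the row space of A
    return g_null / np.linalg.norm(g_null)
```

The remaining steps (loadings, inner relationship, deflation) would follow ordinary PLS, as the slide states.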

Industrial Application
- An industrial semi-discrete batch manufacturing operation is used to illustrate the advantages of the constrained PLS algorithm over ordinary PLS.
- The process involves the production of a variety of products (recipes), some of which are only manufactured in small quantities to meet the requirements of specialised markets.
- The objective of the analysis was to build a monitoring scheme to detect the onset of subtle changes in production and final product quality.

An Industrial Application
- For simplicity, three recipes are selected to demonstrate the methodology.
- A total of thirty-six process variables, including flow rates, pressures and temperatures, are recorded every minute, whilst five quality variables are measured off-line in the quality laboratory every two hours.
- A nominal process monitoring scheme was developed using both ordinary PLS and constrained PLS from 41 ‘ideal’ batches.
- A further 6 batches, A4, A10, A29, A35, A38 and B32, were used for model validation. These batches were known to lie outside the desirable specification limits.

Industrial Application: Ordinary Partial Least Squares
[Figure: scores plots of latent variable 1 versus latent variable 2 and latent variable 3 versus latent variable 4]

Industrial Application: Ordinary Partial Least Squares
[Figure: bivariate scores plot; Hotelling’s T² and SPE charts]

Industrial Application: Constrained Partial Least Squares
[Figure: scores plots of LV 1 versus LV 2 and LV 3 versus LV 4]

Industrial Application: Constrained Partial Least Squares
[Figure: Hotelling’s T² chart; squared prediction error chart]

Constrained PLS - Conclusions
- Constrained PLS possesses the following important characteristics:
  - It removes the information correlated with the confounding variables.
  - The information excluded by constrained PLS contains only variation associated with the confounding variables.
  - The derived constrained PLS latent variables are optimal in the sense that they extract as much as possible of the available information contained in the process and quality data.

Local Linear and Non-linear Multi-way Partial Least Squares Batch Monitoring

Batch Process Modelling and Monitoring
- Batch processes exhibit non-linear, time-variant and dynamic behaviour.
- These characteristics challenge the linear multivariate statistical technique of multi-way Partial Least Squares (PLS) that has traditionally been applied in batch process performance monitoring.
- A local model based approach has been developed to overcome these limitations.

Local Model Approach
- Batch processes often exhibit distinct phases of process operation. Instead of modelling a non-linear, time-variant batch process with a single global model, batch trajectories are therefore sub-divided into individual operating regions.
- A local linear PLS model is then developed for each operating region.
  - Each model can comprise a different number of latent variables.
- A validity function then creates a smooth transition between the local models to build a global non-linear model.

Validity Function
- The validity function determines which operating region the process lies within at each time point:
  - Identification of the most appropriate local model
  - Weighting of local models if two or more are applicable
- The validity function is based on a fuzzy logic rule based function:
  - Rules based on process variable behaviour, for example:
    IF x1 is LOW AND x2 is HIGH THEN model 1 is valid
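A rule of this kind can be sketched as follows. The sigmoid membership functions, their centres and widths, the second rule and the function names are all assumptions chosen for illustration; the presentation does not specify them. The fuzzy AND is taken as the minimum of the memberships, and the rule strengths are normalised so that the local model weights sum to one.

```python
import numpy as np

def low(x, centre=0.5, width=0.1):
    """Assumed sigmoid membership for 'x is LOW' (close to 1 well below centre)."""
    return 1.0 / (1.0 + np.exp((x - centre) / width))

def high(x, centre=0.5, width=0.1):
    """Assumed membership for 'x is HIGH' (complement of LOW)."""
    return 1.0 - low(x, centre, width)

def validity_weights(x1, x2):
    """Normalised weights for two hypothetical local models."""
    rule1 = min(low(x1), high(x2))    # IF x1 is LOW AND x2 is HIGH THEN model 1 is valid
    rule2 = high(x1)                  # hypothetical second rule: IF x1 is HIGH THEN model 2
    strengths = np.array([rule1, rule2])
    return strengths / max(strengths.sum(), 1e-12)

def blended_prediction(x1, x2, local_predictions):
    """Weight the local model predictions by their validity."""
    return float(validity_weights(x1, x2) @ np.asarray(local_predictions, float))
```

When only one rule fires strongly, its local model dominates; where two regions overlap, the prediction moves smoothly from one model to the other.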

Dynamic Feature Addition
- Batch process variables also exhibit serial and cross correlation.
- The Auto Regressive with eXogenous inputs (ARX) structure is a time series structure used to model such data.
- Including past input and output process variables in the X data matrix of a PLS model encapsulates some of the dynamic features within the model.
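Building the lagged X matrix can be sketched as below. The function name `arx_matrix`, the choice of equal lag orders for inputs and outputs, and the column ordering are assumptions for illustration.

```python
import numpy as np

def arx_matrix(u, y, n_lags):
    """Stack past outputs and inputs into an ARX-style X matrix.

    Row k holds [y(k-1), ..., y(k-n_lags), u(k-1), ..., u(k-n_lags)],
    and the matching target is y(k).
    """
    u = np.asarray(u, float).reshape(len(u), -1)
    y = np.asarray(y, float).reshape(len(y), -1)
    rows = []
    for k in range(n_lags, len(y)):
        past_y = y[k - n_lags:k][::-1].ravel()   # most recent sample first
        past_u = u[k - n_lags:k][::-1].ravel()
        rows.append(np.concatenate([past_y, past_u]))
    return np.array(rows), y[n_lags:]
```

The returned matrix can then be passed to a PLS routine in place of the original X-block.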

Application to an Industrial Process
- A fed-batch fermentation process is used to demonstrate local model performance monitoring.
- 17 batches with good operating conditions and high yield were selected for the nominal model.
- 30 batches with standard operating conditions but mid to low yield were used to assess the monitoring charts.
- A model was developed using local dynamic PLS and global dynamic PLS.

Operating Region Specification
- Operating regions specified using process knowledge
  - 4 operating regions identified
- Regions based on conditions within the fermenter
  - Operating region 1: initial start-up of the fermenter before optimum conditions are reached
  - Operating region 2: initialisation of product growth
  - Operating region 3: maximum growth rate of product
  - Operating region 4: reactions are complete

Operating Region Specification
[Figure: addition rate of chemical A, pH and potency]

Validity Function
- Fuzzy logic rules used to determine movement between operating regions
  - Rules applied to:
    - Power, Substrate Addition Rate, Respiration Rate

Global Dynamic PLS
[Figure: predicted and actual values of potency; residuals of the global dynamic PLS model]

Prediction using Local Dynamic PLS Model
[Figure: predicted and actual potency for each model, plotted against observation number; residuals of the local dynamic PLS models]

Performance Monitoring and Fault Detection
[Figure: local SPE chart with varying control limit; global SPE chart with constant control limit]

Fault Detection
[Figure: local SPE chart and global SPE chart, annotated “False alarm” and “Process fault detected”]

Conclusions
- Inclusion of dynamic behaviour improves model performance through the removal of process structure within the model.
- The fuzzy rule based validity function approach allows batch-specific movement between models.
- The local model approach to performance monitoring leads to control charts with improved model limits.
- Local model monitoring charts detect faults and process deviations earlier than the global model equivalent.

Non-linear Partial Least Squares Prediction Intervals and Leverage

Non-linear Partial Least Squares
- A simple approach to non-linear PLS has been to extend the input matrix (X) by including non-linear combinations of the original variables (such as logarithms, squared values, cross-products, etc.) and then performing linear PLS.
- If there is no a priori knowledge, then there is no limitation as to the number (and kind) of transformations that might be applied.
- Thus, by pre-treating data sets in this way, the number of non-linear terms can increase excessively, resulting in large input and output matrices and results that are difficult to interpret.
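The growth in the number of terms is easy to see in a small sketch. The function name `augment_inputs` is hypothetical, and only squares and pairwise cross-products are added here; logarithms are left out because they require strictly positive data.

```python
from itertools import combinations

import numpy as np

def augment_inputs(X):
    """Append squared terms and pairwise cross-products to X."""
    X = np.asarray(X, float)
    squares = X ** 2
    cross = [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
    return np.column_stack([X, squares] + cross)
```

With m original variables this produces m + m + m(m-1)/2 columns, so thirty-six process variables would already give 702 columns before any other transformation is considered.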

Non-linear Partial Least Squares
- A more structured approach to the development of a non-linear PLS model is to modify the NIPALS algorithm by introducing a non-linear function that relates the output scores u to the input scores t, without modifying the input and output variables.
- Wold et al (1989) proposed a non-linear PLS algorithm which retained the framework of linear PLS but used second order polynomial (quadratic) regression:
    u_j = c_{0j} + c_{1j} t_j + c_{2j} t_j^2 + e_j
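The quadratic inner relationship for one latent variable can be fitted by ordinary least squares, as sketched below. This shows only the inner regression step; the update of the weights inside the non-linear NIPALS loop is not shown, and the function names are assumptions.

```python
import numpy as np

def fit_quadratic_inner(t, u):
    """Least-squares fit of u = c0 + c1*t + c2*t**2 for one latent variable."""
    c2, c1, c0 = np.polyfit(t, u, deg=2)   # polyfit returns the highest power first
    return c0, c1, c2

def predict_inner(t, coeffs):
    """Evaluate the fitted quadratic inner relationship at scores t."""
    c0, c1, c2 = coeffs
    t = np.asarray(t, float)
    return c0 + c1 * t + c2 * t ** 2
```

The residual `u - predict_inner(t, coeffs)` plays the role of e_j in the equation above.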

Prediction Intervals for Non-linear PLS
- As for every regression technique, a measure for assessing the reliability of the predicted values is required.
- A common approach is through the use of prediction intervals. These are the upper and lower confidence limits of the predicted values.
- The larger the magnitude of these intervals, the less precise the prediction.
- A methodology used to evaluate prediction intervals for neural network models has been extended to linear and non-linear partial least squares algorithms.

Calculation of Prediction Intervals
- The prediction intervals are computed using a first order Taylor series expansion and the Jacobian matrix of the functional mapping provided by the PLS algorithms.
- Given a set of input and output training data, X and Y respectively, a PLS regression model is built and the Jacobian matrix F is computed for the same set of training data.
- When the PLS regression model is used to predict a new output value, corresponding to a new sample of input variables, the vector of partial derivatives is computed and the prediction interval is evaluated.
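The transcript does not preserve the interval formula itself. A minimal sketch, assuming the standard first-order (linearisation) result from non-linear regression, is given below: the half-width is t_crit * s * sqrt(1 + f0' (F'F)^-1 f0), where s² is the residual variance with n - p degrees of freedom. The function name, the argument names and the use of a caller-supplied critical value `t_crit` are assumptions.

```python
import numpy as np

def prediction_interval(F, residuals, f0, y0_hat, t_crit):
    """First-order prediction interval (assumed standard linearisation form).

    F         : (n, p) Jacobian evaluated on the training data
    residuals : (n,) training residuals
    f0        : (p,) vector of partial derivatives at the new sample
    y0_hat    : predicted output for the new sample
    t_crit    : critical value of Student's t with n - p degrees of freedom
    """
    F = np.asarray(F, float)
    residuals = np.asarray(residuals, float)
    f0 = np.asarray(f0, float)
    n, p = F.shape
    s2 = residuals @ residuals / (n - p)            # residual variance estimate
    h0 = f0 @ np.linalg.solve(F.T @ F, f0)          # leverage-like term
    half_width = t_crit * np.sqrt(s2 * (1.0 + h0))
    return y0_hat - half_width, y0_hat + half_width
```

The Jacobian `F` must have full column rank for `F.T @ F` to be invertible; a pseudo-inverse could be substituted otherwise.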

Case Study
- The data were generated from the simulation of a pH neutralisation system.
- Samples were collected under steady state operating conditions, thus no time correlation existed between any two consecutive samples.
- The data included four input variables (flowrates of the inlet and outlet streams of the neutralisation tank) and one output variable (pH value measured in the outlet stream) and were noise free.

pH Neutralisation Process

Radial Basis Function PLS
- An error-based updating radial basis function PLS model was built using 350 data samples.
- It was constructed from one latent variable, with twenty-one nodes included in the inner radial basis function model.
- In excess of 99% of the total variance of the output variable was captured by this representation.

Radial Basis Function PLS
[Figure: time series plot for the test data with predictions]

Leverage
- The quantity that arises in the prediction interval calculation is similar in form to leverage.
- It can be used to provide an additional metric for assessing the quality of the regression model.
- This is achieved by computing the critical value of the chi-square distribution, with the appropriate degrees of freedom, for predefined confidence levels, e.g. 95% and 99%, and plotting the value of the quantity for each sample together with the critical value of the distribution divided by (n-p).

Leverage
- When the ‘leverage’ is smaller than the critical value, the corresponding predicted value is considered to be reliable at the predefined confidence level. Conversely, when the ‘leverage’ is larger than the limit, the predicted value is considered to be unreliable.
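The reliability check can be sketched as follows. The exact form of the quantity and the degrees of freedom are not preserved in the transcript, so the expression f0' (F'F)^-1 f0 used here is an assumption based on the usual definition of leverage, and the chi-square critical value is passed in by the caller rather than computed. The function names are hypothetical.

```python
import numpy as np

def leverage(F, f0):
    """Leverage-like quantity f0' (F'F)^-1 f0 (assumed form)."""
    F = np.asarray(F, float)
    f0 = np.asarray(f0, float)
    return float(f0 @ np.linalg.solve(F.T @ F, f0))

def is_reliable(F, f0, chi2_crit):
    """Compare the leverage with the chi-square critical value divided by (n - p)."""
    n, p = np.asarray(F).shape
    return leverage(F, f0) < chi2_crit / (n - p)
```

A new sample whose derivative vector lies far from those seen in training gives a large leverage and is flagged as unreliable.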

Radial Basis Function PLS
[Figure: leverage for the test data; prediction intervals]

Conclusions - PLS Prediction Intervals
- A methodology proposed for prediction intervals in neural network modelling was extended to non-linear PLS algorithms.
- This approach was known to give approximate, but generally reliable, results whilst being less computationally expensive than other more mathematically precise approaches such as the likelihood, lack-of-fit, jackknife and bootstrap.
- The development of the algorithm led to the definition of a metric, the leverage, which can be used in conjunction with, or as an alternative to, prediction intervals.

Conclusions
[Diagram: data rich, information poor; data to information to knowledge]

Acknowledgements
- EBM acknowledges Dr Pino Baffi, Dr Baibing Li, Miss Nicola Fletcher and colleagues in CPACT for the many stimulating discussions.
- EBM acknowledges colleagues at BASF AG for stimulating the research, in particular Gerhard Krennrich and Pekka Teppola.
- EBM acknowledges Pfizer for providing the data.