1 Statistical Tools for Multivariate Six Sigma Dr. Neil W. Polhemus CTO & Director of Development StatPoint, Inc.

Presentation transcript:

1 Statistical Tools for Multivariate Six Sigma Dr. Neil W. Polhemus CTO & Director of Development StatPoint, Inc.

2 The Challenge The quality of an item or service usually depends on more than one characteristic. When the characteristics are not independent, considering each characteristic separately can give a misleading estimate of overall performance.

3 The Solution Proper analysis of data from such processes requires the use of multivariate statistical techniques.

4 Outline
• Multivariate SPC
  - Multivariate control charts
  - Multivariate capability analysis
• Data exploration and modeling
  - Principal components analysis (PCA)
  - Partial least squares (PLS)
  - Neural network classifiers
• Design of experiments (DOE)
  - Multivariate optimization

5 Example #1 Textile fiber Characteristic #1: tensile strength ± 1 Characteristic #2: diameter ± 0.05

6 Sample Data n = 100

7 Individuals Chart - strength

8 Individuals Chart - diameter

9 Capability Analysis - strength

10 Capability Analysis - diameter

11 Scatterplot

12 Multivariate Normal Distribution

13 Control Ellipse

14 Multivariate Capability Determines joint probability of being within the specification limits on all characteristics
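
The joint probability described here can be estimated by simulation. The sketch below is a minimal illustration, assuming a bivariate normal process; the means, standard deviations, correlation, and specification limits are hypothetical values, not the textile fiber data from the slides.

```python
import math
import random


def joint_in_spec_fraction(mean, sd, rho, lower, upper, n=100_000, seed=1):
    """Monte Carlo estimate of P(both characteristics are within spec)
    for a bivariate normal process with correlation rho."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Build a correlated pair from two independent standard normals.
        x1 = mean[0] + sd[0] * z1
        x2 = mean[1] + sd[1] * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        if lower[0] <= x1 <= upper[0] and lower[1] <= x2 <= upper[1]:
            inside += 1
    return inside / n


def dpm(fraction_in_spec):
    """Defects per million implied by a fraction of in-spec items."""
    return (1.0 - fraction_in_spec) * 1_000_000


# Hypothetical process: both specs sit at +/- 2 standard deviations.
mean, sd = (10.0, 1.0), (0.5, 0.02)
lower, upper = (9.0, 0.96), (11.0, 1.04)
independent = joint_in_spec_fraction(mean, sd, 0.0, lower, upper)
correlated = joint_in_spec_fraction(mean, sd, 0.9, lower, upper)
print(f"rho = 0.0: in spec {independent:.4f}, DPM {dpm(independent):.0f}")
print(f"rho = 0.9: in spec {correlated:.4f}, DPM {dpm(correlated):.0f}")
```

Multiplying the two univariate in-spec probabilities matches the joint figure only when the characteristics are independent, which is why the correlated case gives a different DPM.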

15 Multivariate Capability

16 Capability Ellipse

17 Multivariate Capability Indices Defined to give the same DPM (defects per million) as in the univariate case.

18 Test for Normality

19 More than 2 Characteristics Calculate T-squared: $T_i^2 = (x_i - \bar{x})' S^{-1} (x_i - \bar{x})$, where $S$ = sample covariance matrix and $\bar{x}$ = vector of sample means.
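
The T-squared calculation can be sketched in plain Python. This is a minimal illustration that assumes the usual Hotelling form (deviation from the mean vector, weighted by the inverse sample covariance matrix); the helper names and the data are hypothetical.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows):
    """Sample covariance matrix S (divisor n - 1)."""
    n = len(rows)
    m = mean_vector(rows)
    p = len(m)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def t_squared(x, mean, cov):
    """T-squared = d' S^-1 d, where d = x - mean."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    w = solve(cov, d)  # w = S^-1 d, without forming the inverse
    return sum(di * wi for di, wi in zip(d, w))


# Hypothetical observations on three characteristics.
data = [(9.8, 0.49, 3.1), (10.1, 0.51, 2.9), (10.3, 0.52, 3.4),
        (9.9, 0.50, 3.0), (10.0, 0.48, 3.3), (10.4, 0.53, 2.8)]
xbar = mean_vector(data)
S = covariance_matrix(data)
t2 = [t_squared(row, xbar, S) for row in data]
print([round(v, 3) for v in t2])
```

Each value would be plotted against the observation number on the T-squared chart; the control limit is not computed in this sketch.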

20 T-Squared Chart

21 T-Squared Decomposition For each variable, subtracts from T-squared the value that T-squared would have if that variable were removed. A large difference indicates that the variable makes an important contribution.
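
A minimal sketch of this decomposition, assuming each contribution is the full T-squared minus the T-squared recomputed without that variable. The function names and the example numbers are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def quad_form(d, cov):
    """d' S^-1 d for a deviation vector d."""
    if not d:
        return 0.0
    w = solve(cov, d)
    return sum(di * wi for di, wi in zip(d, w))


def t_squared_contributions(x, mean, cov):
    """Return (full T-squared, list of per-variable contributions)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    p = len(d)
    full = quad_form(d, cov)
    contributions = []
    for j in range(p):
        keep = [i for i in range(p) if i != j]
        d_sub = [d[i] for i in keep]
        cov_sub = [[cov[r][c] for c in keep] for r in keep]
        contributions.append(full - quad_form(d_sub, cov_sub))
    return full, contributions


# Hypothetical case: two strongly correlated variables, each only one
# standard deviation from its mean, but in opposite directions.
full, contrib = t_squared_contributions(
    [1.0, -1.0], [0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]])
print(round(full, 3), [round(c, 3) for c in contrib])
```

In this example neither variable looks unusual on its own, yet the point breaks the correlation pattern, so removing either variable lowers T-squared sharply.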

22 Control Ellipsoid

23 Multivariate EWMA Chart

24 Generalized Variance Chart Plots the determinant of the variance-covariance matrix for data that is sampled in subgroups.
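
The plotted statistic can be sketched as follows: one covariance-matrix determinant per subgroup. This is a minimal illustration with hypothetical subgroup data; control limits for the chart are not computed here.

```python
def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of one subgroup."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    p = len(means)
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in a]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return det


def generalized_variances(subgroups):
    """One determinant |S| per subgroup, in subgroup order."""
    return [determinant(covariance_matrix(g)) for g in subgroups]


# Hypothetical subgroups of three items, two characteristics each.
subgroups = [
    [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    [(1.0, 1.0), (2.0, 2.5), (3.0, 2.5)],
]
print(generalized_variances(subgroups))
```

Each subgroup needs more items than characteristics; otherwise its sample covariance matrix is singular and the determinant is zero.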

25 Data Exploration and Modeling When the number of variables is large, the dimensionality of the problem often makes it difficult to determine the underlying relationships. Reduction of dimensionality can be very helpful.

26 Example #2

27 Matrix Plot

28 Analysis Methods
• Predicting certain characteristics based on others (regression and ANOVA)
• Separating items into groups (classification)
• Detecting unusual items

29 Multiple Regression

30 Principal Components The goal of a principal components analysis (PCA) is to construct k linear combinations of the p variables X that contain the greatest variance.
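
A minimal sketch of how such components can be extracted from the sample covariance matrix. It assumes power iteration with deflation, which is chosen here only because it needs no external library and is not necessarily what any particular package does; the data and function names are hypothetical.

```python
import math


def covariance_matrix(rows):
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    p = len(means)
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def leading_eigenpair(a, iters=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix.
    The fixed start vector could, in contrived cases, be orthogonal to
    the leading direction; a production routine would guard against that."""
    n = len(a)
    v = [1.0 / (i + 1) for i in range(n)]
    for _ in range(iters):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(a[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def principal_components(rows, k):
    """Return k (variance, loading vector) pairs, largest variance first."""
    cov = covariance_matrix(rows)
    p = len(cov)
    components = []
    for _ in range(k):
        lam, v = leading_eigenpair(cov)
        components.append((lam, v))
        # Deflate: remove the variance already captured by this component.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    return components


# Hypothetical data on p = 2 variables.
rows = [(2.0, 0.0), (-2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
components = principal_components(rows, 2)
total = sum(lam for lam, _ in components)
for lam, v in components:
    print(f"variance {lam:.4f} ({100 * lam / total:.1f}%), loadings {v}")
```

The printed percentages are the quantities summarized in a percentage-explained table, and the variances in decreasing order are what a scree plot displays.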

31 Scree Plot Shows the number of significant components.

32 Percentage Explained

33 Components

34 Interpretation

35 Principal Component Regression

36 Partial Least Squares (PLS) Similar to PCA, except that it finds components that explain the variance in both the X’s and the Y’s by maximizing the covariance between them. May be used with many X variables, even when their number exceeds n.

37 Component Extraction Starts with number of components equal to the minimum of p and (n-1).

38 Coefficient Plot

39 Model in Original Units

40 Classification Principal components can also be used to classify new observations. A useful method for classification is a Bayesian classifier, which can be expressed as a neural network.

41 6 Types of Automobiles

42 Neural Networks

43 Bayesian Classifier
• Begins with prior probabilities for membership in each group
• Uses a Parzen-like density estimator of the density function for each group

44 Options
• The prior probabilities may be determined in several ways.
• A training set is usually used to find a good value for σ.
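
A minimal sketch of a Bayesian classifier of this kind, assuming a Parzen density estimate built from spherical Gaussian kernels with a single smoothing parameter (called `sigma` here). The training points, priors, and function names are hypothetical.

```python
import math


def parzen_density(x, samples, sigma):
    """Average of spherical Gaussian kernels centered on the samples."""
    p = len(x)
    norm = (2.0 * math.pi * sigma ** 2) ** (p / 2.0)
    total = 0.0
    for s in samples:
        d2 = sum((xi - si) ** 2 for xi, si in zip(x, s))
        total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total / (len(samples) * norm)


def classify(x, training, priors, sigma):
    """Return (most probable group, posterior probability per group)."""
    scores = {g: priors[g] * parzen_density(x, pts, sigma)
              for g, pts in training.items()}
    total = sum(scores.values())
    if total == 0.0:
        # x is far from every training point: fall back on the priors.
        posteriors = dict(priors)
    else:
        posteriors = {g: s / total for g, s in scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Hypothetical training set with two groups and equal priors.
training = {
    "a": [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3)],
    "b": [(3.0, 3.0), (3.1, 2.9), (2.8, 3.2)],
}
priors = {"a": 0.5, "b": 0.5}
group, post = classify((0.1, 0.1), training, priors, sigma=0.5)
print(group, post)
```

A small `sigma` makes the estimated densities spiky around the training points, while a large one smooths the classification regions, which is why the value is usually tuned on a training set.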

45 Output

46 Classification Regions

47 Changing Sigma

48 Overlay Plot

49 Outlier Detection

50 Cluster Analysis

51 Design of Experiments When more than one characteristic is important, finding the optimal operating conditions usually requires a tradeoff of one characteristic for another. One approach to finding a single solution is to use desirability functions.

52 Example #3 Myers and Montgomery (2002) describe an experiment on a chemical process:

Response variable       Goal
Conversion percentage   Maximize
Thermal activity        Maintain between 55 and 60

Input factor   Low         High
Time           8 minutes   17 minutes
Temperature    160 °C      210 °C
Catalyst       1.5%        3.5%

53 Experiment

54 Step #1: Model Conversion

55 Step #2: Optimize Conversion

56 Step #3: Model Activity

57 Step #4: Optimize Activity

58 Step #5: Select Desirability Fcns. Maximize

59 Desirability Function Hit Target
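
The two desirability shapes named on these slides can be sketched as below. This is a minimal illustration assuming the common piecewise power form for desirability functions; the limits used in the example (80 to 100 for conversion, a target of 57.5 between 55 and 60 for activity) and the shape exponents are assumptions for illustration, not values taken from the slides.

```python
def desirability_maximize(y, low, high, s=1.0):
    """0 at or below `low`, 1 at or above `high`, power curve in between."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** s


def desirability_target(y, low, target, high, s=1.0, t=1.0):
    """1 at `target`, falling to 0 at `low` and `high`, 0 outside."""
    if y < low or y > high:
        return 0.0
    if y <= target:
        return ((y - low) / (target - low)) ** s
    return ((high - y) / (high - target)) ** t


# Hypothetical predicted responses at one setting of the factors.
d_conversion = desirability_maximize(90.0, low=80.0, high=100.0)
d_activity = desirability_target(56.25, low=55.0, target=57.5, high=60.0)
print(d_conversion, d_activity)
```

Exponents above 1 make the function stricter (desirability stays low until the response is close to its goal); exponents below 1 make it more lenient.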

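60 Combined Desirability $D = \left( d_1^{I_1} d_2^{I_2} \cdots d_m^{I_m} \right)^{1 / \sum_{j=1}^{m} I_j}$, where $d_j$ is the desirability of response $j$, $I_j$ is its impact with $0 \le I_j \le 5$, and $m$ = number of responses. D ranges from 0 to 1.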
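
A minimal sketch of a combined desirability, assuming it is the weighted geometric mean of the individual desirabilities, with each one raised to its impact and the product taken to the power of one over the sum of the impacts. The function name and example values are hypothetical.

```python
def combined_desirability(d, impacts):
    """Weighted geometric mean of desirabilities d (each in [0, 1]),
    using impact weights (each between 0 and 5)."""
    if len(d) != len(impacts):
        raise ValueError("need one impact per desirability")
    total = sum(impacts)
    if total <= 0:
        raise ValueError("at least one impact must be positive")
    product = 1.0
    for dj, ij in zip(d, impacts):
        product *= dj ** ij  # an impact of 0 leaves the product unchanged
    return product ** (1.0 / total)


# Hypothetical desirabilities for two responses.
print(combined_desirability([0.5, 1.0], [3, 3]))  # equal impacts
print(combined_desirability([0.5, 1.0], [5, 1]))  # first response dominates
print(combined_desirability([0.0, 0.9], [3, 3]))  # one unacceptable response
```

Because the responses are multiplied, a single response with zero desirability and positive impact drives the combined value to zero, so a high score on one response cannot compensate for an unacceptable result on another.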

61 Example

62 Desirability Contours

63 Desirability Surface

64 Overlaid Contours

65 References
• Johnson, R. A. and Wichern, D. W. (2002). Applied Multivariate Statistical Analysis. Upper Saddle River: Prentice Hall.
• Mason, R. L. and Young, J. C. (2002). Multivariate Statistical Process Control with Industrial Applications. Philadelphia: SIAM.
• Montgomery, D. C. (2005). Introduction to Statistical Quality Control, 5th edition. New York: John Wiley and Sons.
• Myers, R. H. and Montgomery, D. C. (2002). Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 2nd edition. New York: John Wiley and Sons.

66 PowerPoint Slides Available at: