
1 RENSSELAER PLS: PARTIAL-LEAST SQUARES
PLS:
- Partial-Least Squares
- Projection to Latent Structures
- Please listen to Svante Wold
Error metrics, cross-validation:
- LOO
- n-fold X-validation
- Bootstrap X-validation
Examples:
- 19 amino-acid QSAR
- Cherkassky's nonlinear function
- y = sin|x|/|x|
Comparison with SVMs

2

3 IMPORTANT EQUATIONS FOR PLS
- The $t$'s are scores (latent variables); the $p$'s are loadings
- $w_1$ is the dominant eigenvector of $X^T Y Y^T X$
- $t_1$ is the dominant eigenvector of $X X^T Y Y^T$
- The subsequent $w$'s and $t$'s come from deflations
- The $w$'s are orthonormal; the $t$'s are orthogonal; the $p$'s are not orthogonal, but are orthogonal to earlier $w$'s
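
These relations are easy to check numerically. A minimal sketch, assuming only NumPy; the matrix shapes (19 samples, 7 descriptors, 2 responses) are illustrative and not from the slide:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((19, 7))   # illustrative: 19 samples, 7 descriptors
Y = rng.standard_normal((19, 2))   # illustrative: 2 response variables

# w1: dominant eigenvector of the symmetric matrix X^T Y Y^T X
vals, vecs = np.linalg.eigh(X.T @ Y @ Y.T @ X)   # eigh sorts ascending
w1, lam = vecs[:, -1], vals[-1]

# Equivalently, w1 is the first left singular vector of X^T Y,
# since X^T Y Y^T X = (X^T Y)(X^T Y)^T
u = np.linalg.svd(X.T @ Y)[0]
assert np.allclose(np.abs(w1), np.abs(u[:, 0]))  # same direction up to sign

# t1 = X w1 is an eigenvector of X X^T Y Y^T with the same eigenvalue:
# (X X^T Y Y^T)(X w1) = X (X^T Y Y^T X) w1 = lam * (X w1)
t1 = X @ w1
assert np.allclose(X @ X.T @ Y @ Y.T @ t1, lam * t1)
```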

4 IMPORTANT EQUATIONS FOR PLS

5 NIPALS ALGORITHM FOR PLS (with just one response variable y)
Do for h latent variables; for each PLS component:
- Calculate the weight: $w = X^T y / \|X^T y\|$
- Calculate the score: $t = X w$
- Calculate $c'$: $c' = t^T y / (t^T t)$
- Calculate the loading: $p = X^T t / (t^T t)$
- Store $t$ in $T$, store $p$ in $P$, store $w$ in $W$
- Deflate the data matrix and the response variable: $X \leftarrow X - t p^T$, $y \leftarrow y - c' t$
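
A runnable Python sketch of this loop; the function name nipals_pls1 and the return layout are my own conventions, not from the slide:

```python
import numpy as np

def nipals_pls1(X, y, n_components):
    """NIPALS PLS with one response y: weight, score, c', loading, deflate."""
    X, y = X.astype(float).copy(), y.astype(float).copy()
    W, T, P, C = [], [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)          # weight (normalized, per slide 3)
        t = X @ w                       # score
        tt = t @ t
        c = (t @ y) / tt                # inner coefficient c'
        p = (X.T @ t) / tt              # loading
        X -= np.outer(t, p)             # deflate the data matrix
        y = y - c * t                   # deflate the response
        W.append(w); T.append(t); P.append(p); C.append(c)
    return (np.column_stack(W), np.column_stack(T),
            np.column_stack(P), np.asarray(C))
```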

6 The geometric representation of PLSR. The X-matrix can be represented as N points in the K-dimensional space where each column of X ($x_k$) defines one coordinate axis. The PLSR model defines an A-dimensional hyperplane, which, in turn, is defined by one line, one direction, per component. The direction coefficients of these lines are the $p_{ak}$. The coordinates of each object i, when its data (row i in X) are projected down onto this plane, are the $t_{ia}$. These positions are related to the values of Y.
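
The coordinates $t_{ia}$ can be computed directly for (centered) data rows. A minimal sketch using the W and P matrices from the NIPALS sketch above and the standard deflated-PLS identity $T = X W (P^T W)^{-1}$, which I am supplying here; it is not text from the slide:

```python
import numpy as np

def project_scores(X_new, W, P):
    """Project (centered) rows of X_new onto the A-dimensional PLS plane."""
    R = W @ np.linalg.inv(P.T @ W)   # rotation so that T = X @ R on training X
    return X_new @ R                 # each row holds the coordinates t_ia
```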

7 QSAR DATA SET EXAMPLE: 19 Amino Acids. From Svante Wold, Michael Sjöström, Lennart Eriksson, "PLS-regression: a basic tool of chemometrics," Chemometrics and Intelligent Laboratory Systems, Vol. 58, pp. 109-130 (2001)

8 INXIGHT VISUALIZATION PLOT

9

10 QSAR.BAT: SCRIPT FOR BOOTSTRAP VALIDATION FOR AAs (see the sketch below)
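
QSAR.BAT itself is not shown on the slide. Below is a minimal Python sketch of the bootstrap validation idea (fit on a resample, test on the out-of-bag rows), reusing the nipals_pls1 sketch above; the function name and the Q²-style statistic are illustrative assumptions, not the script's actual behavior:

```python
import numpy as np

def bootstrap_validate(X, y, n_components, n_boot=100, seed=0):
    """Bootstrap X-validation sketch (not QSAR.BAT itself): fit PLS on a
    bootstrap resample, predict the out-of-bag rows, accumulate errors."""
    rng = np.random.default_rng(seed)
    n = len(y)
    press = tss = 0.0
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                # rows drawn with replacement
        oob = np.setdiff1d(np.arange(n), idx)      # unseen rows -> test set
        if oob.size == 0:
            continue
        mx, my = X[idx].mean(axis=0), y[idx].mean()
        W, T, P, C = nipals_pls1(X[idx] - mx, y[idx] - my, n_components)
        R = W @ np.linalg.inv(P.T @ W)
        y_hat = my + ((X[oob] - mx) @ R) @ C       # y is approx. sum_a c_a t_a
        press += np.sum((y[oob] - y_hat) ** 2)
        tss += np.sum((y[oob] - my) ** 2)
    return 1.0 - press / tss                       # Q^2-style score
```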

11 1 latent variable

12 2 latent variables

13 3 latent variables

14 1 latent variable (no aromatic AAs)

15

16 KERNEL PLS HIGHLIGHTS
Linear PLS:
- $w_1$ is the dominant eigenvector of $X^T Y Y^T X$; $t_1$ is the dominant eigenvector of $X X^T Y Y^T$
- The $w$'s and $t$'s come from deflations: the $w$'s are orthonormal, the $t$'s are orthogonal, and the $p$'s are not orthogonal but are orthogonal to earlier $w$'s
Kernel PLS:
- Invented by Rosipal and Trejo (Journal of Machine Learning Research, December 2001)
- They first altered linear PLS to deal with eigenvectors of $X X^T$, and they made the NIPALS PLS formulation resemble PCA more
- The trick is a different normalization: now the $t$'s rather than the $w$'s are normalized
- $t_1$ is the dominant eigenvector of $K(X X^T) Y Y^T$; the $w$'s and $t$'s come from deflations of $X X^T$
- The nonlinear correlation matrix $K(X X^T)$ is used rather than $X X^T$ itself: it contains nonlinear similarities of the datapoints rather than linear inner products
- An example is the Gaussian kernel similarity measure $K_{ij} = \exp\!\left(-\|x_i - x_j\|^2 / (2\sigma^2)\right)$
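
A small sketch of this kernel, assuming the common parameterization $\exp(-\|x_i - x_j\|^2 / (2\sigma^2))$; the slide gives the formula only as an image, so the exact normalization of sigma is an assumption, and sigma = 1.3 mirrors slide 18:

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.3):
    """K[i, j] = exp(-||x_i - z_j||^2 / (2 sigma^2))."""
    sq = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def first_kernel_score(X, Y, sigma=1.3):
    """t1 per the slide: dominant eigenvector of K Y Y^T (Y is n x m).
    The product is not symmetric, so a general eigensolver is needed."""
    K = gaussian_kernel(X, X, sigma)
    vals, vecs = np.linalg.eig(K @ Y @ Y.T)
    return np.real(vecs[:, np.argmax(np.abs(vals))])   # normalized t1
```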

17

18 1 latent variable, Gaussian kernel PLS (sigma = 1.3), with aromatic AAs

19

20

21

22 CHERKASSKY'S NONLINEAR BENCHMARK DATA: generate 500 datapoints (400 training; 100 testing). Script: Cherkas.bat

23 Bootstrap validation, kernel PLS: 8 latent variables, Gaussian kernel with sigma = 1

24

25 True test set for kernel PLS: 8 latent variables, Gaussian kernel with sigma = 1

26 y = sin|x|/|x|: generate 500 datapoints (100 training; 500 testing) for this function
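
A minimal data-generation sketch for this benchmark; the input range [-10, 10] is an assumption (the slide does not state one), and the split follows the slide's 100 training / 500 testing counts:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-10.0, 10.0, 600)     # assumed range; 100 train + 500 test
y = np.sin(np.abs(x)) / np.abs(x)     # y = sin|x| / |x|
x_train, y_train = x[:100], y[:100]
x_test,  y_test  = x[100:], y[100:]
```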

27 Comparison of kernel PLS with PLS: 4 latent variables, sigma = 0.08. (Figure panels: PLS; Kernel-PLS)

28

