
Robustness of Multiway Methods in Relation to Homoscedastic and Heteroscedastic Noise
T. Khayamian, Department of Chemistry, Isfahan University of Technology


1 Robustness of Multiway Methods in Relation to Homoscedastic and Heteroscedastic Noise
T. Khayamian
Department of Chemistry, Isfahan University of Technology, Isfahan 84154, Iran

2 Outline
Introduction
Prediction of component concentrations in Claus data using PARAFAC and N-PLS multi-way methods
  - original data
  - (original + noise) data
  - denoised data (using wavelet as a denoising method)
      - Homoscedastic noise (level independent method)
      - Heteroscedastic noise (level dependent and minimum description length)
Conclusions

3 Noise definition
● Noise is any component of a signal which impedes observation, detection or utilization of the information that the signal is carrying.
● Noise is measured by its standard deviation or peak-to-peak fluctuation.

4 Different types of noise
Noise: homoscedastic or heteroscedastic

5 Homoscedastic and Heteroscedastic Noise
● Homoscedastic noise: the noise is independent of variable, sample and signal, with a normal distribution and a constant variance.
● Heteroscedastic noise: the noise is dependent on the variable, sample and signal. (Noise from different variables or samples can be correlated.)

6 Least squares method
Homoscedastic noise: σ_ij is constant, uniform and independent of the signal, variables and samples
Heteroscedastic noise: σ_ij is dependent on signal, variables or samples

7 Heteroscedastic Noise in Univariate and Multivariate Calibration Methods
● Zeroth order calibration: weighted linear regression
● First order calibration: weighted principal component analysis
● Second order calibration: positive matrix factorization, maximum likelihood PARAFAC
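As an illustration of the zeroth order case, weighted linear regression can be sketched as follows. This is a minimal sketch, assuming each calibration point has a known noise standard deviation and is weighted by 1/sigma^2; the function name and the toy numbers are hypothetical and do not come from the slides.

```python
def weighted_line_fit(x, y, sigma):
    """Fit y = intercept + slope * x, weighting each point by 1 / sigma_i**2.

    Assumption: sigma holds the (known) noise standard deviation of each point.
    """
    w = [1.0 / (s * s) for s in sigma]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
    intercept = (swy - slope * swx) / sw
    return slope, intercept


# Toy calibration set (hypothetical): noise grows with the signal.
conc = [1.0, 2.0, 4.0, 8.0]
signal = [3.1, 4.9, 9.2, 16.7]
sigma = [0.1, 0.2, 0.4, 0.8]
slope, intercept = weighted_line_fit(conc, signal, sigma)
```

With equal sigma values the weights cancel and the result reduces to ordinary least squares, which is the homoscedastic case.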

8 Claus data (fluorescence spectroscopy)
           Analyte 1 (tyrosine)   Analyte 2 (tryptophan)   Analyte 3 (phenylalanine)
Sample 1   2.7×10^-6              0                        0
Sample 2   0                      1.33×10^-5               0
Sample 3   0                      0                        9.0×10^-4
Sample 4   1.6×10^-6              5.4×10^-6                3.55×10^-4
Sample 5   9.0×10^-7              4.4×10^-6                2.97×10^-4
C. A. Andersson and R. Bro, The N-way Toolbox for MATLAB, Chemom. Intell. Lab. Syst. 2000, 52 (1), 1-4. http://www.models.kvl.dk

9 Fluorescence excitation and emission spectra of the five samples

10 Claus data PARAFAC: four samples were used for modeling
X (4 × 201 × 61) = a1 ⊗ b1 ⊗ c1 + a2 ⊗ b2 ⊗ c2 + a3 ⊗ b3 ⊗ c3
Score (a1) → concentration of analyte 1
Score (a2) → concentration of analyte 2
Score (a3) → concentration of analyte 3
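The three-component trilinear model on this slide can be written out element by element. The sketch below only rebuilds an array from given loading matrices (it does not fit a PARAFAC model); the function name and the tiny example matrices are hypothetical.

```python
def parafac_reconstruct(A, B, C):
    """Rebuild X from PARAFAC loadings: X[i][j][k] = sum_f A[i][f] * B[j][f] * C[k][f].

    A is I x F (sample scores), B is J x F and C is K x F (spectral loadings).
    """
    n_components = len(A[0])
    return [
        [
            [sum(a[f] * b[f] * c[f] for f in range(n_components)) for c in C]
            for b in B
        ]
        for a in A
    ]


# Tiny hypothetical example: 1 sample, 2 x 2 landscape, 2 components.
A = [[1, 2]]
B = [[1, 0], [0, 1]]
C = [[1, 1], [2, 3]]
X = parafac_reconstruct(A, B, C)
```

For the Claus data the same expression would run over 4 samples, 201 and 61 spectral channels, and 3 components.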

11 Calculation of Score for a New Sample
Un = new sample (dimensions 61 and 201)
Z = kr(B, C)
Un = reshape(Un, 12261, 1)
Score_Un = pinv(Z) * Un
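The score calculation above is a least-squares projection of the unfolded new sample onto the Khatri-Rao product of the two spectral loading matrices. A minimal standard-library sketch follows; it assumes Z has full column rank (so that solving the normal equations gives the same result as pinv(Z) * Un), and the helper names and toy loadings are hypothetical.

```python
def khatri_rao(B, C):
    """Column-wise Kronecker product: row (j * K + k) holds B[j][f] * C[k][f]."""
    return [[bj[f] * ck[f] for f in range(len(bj))] for bj in B for ck in C]


def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def new_sample_scores(B, C, landscape):
    """Least-squares scores of one J x K landscape given loadings B (J x F), C (K x F)."""
    Z = khatri_rao(B, C)
    u = [value for row in landscape for value in row]  # row-major, same order as Z rows
    F = len(Z[0])
    ZtZ = [[sum(z[p] * z[q] for z in Z) for q in range(F)] for p in range(F)]
    Ztu = [sum(z[p] * ui for z, ui in zip(Z, u)) for p in range(F)]
    return solve(ZtZ, Ztu)


# Toy check: build a landscape with known scores [2.0, 0.5] and recover them.
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
C = [[1.0, 2.0], [3.0, 1.0]]
true_scores = [2.0, 0.5]
landscape = [
    [sum(s * b[f] * c[f] for f, s in enumerate(true_scores)) for c in C] for b in B
]
scores = new_sample_scores(B, C, landscape)
```

The unfolding order matters: the flattened sample must run over the same index order as the rows of Z, otherwise the projection mixes up the two spectral modes.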

12 Relative Errors of Predicted Concentrations for Samples 4 & 5 (without adding noise)
           Analyte 1   Analyte 2   Analyte 3
Sample 4   -0.75       -7.3        9.4
Sample 5   -0.56       -6.1        14.3
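The slides do not state how the relative error is defined. A common convention, assumed here, is the signed percentage deviation of the predicted from the true concentration. The predicted value in the example is hypothetical and chosen only to show the arithmetic.

```python
def relative_error_percent(predicted, true):
    """Signed relative error in percent (assumed definition): 100 * (predicted - true) / true."""
    return 100.0 * (predicted - true) / true


# Hypothetical prediction for a true concentration of 1.6e-6 (analyte 1, sample 4).
error = relative_error_percent(1.588e-6, 1.6e-6)
```

Under this convention a negative value means the model under-predicts the concentration.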

13 Generation of the Noise Matrix
Claus data: 5 × 201 × 61; a noise array of the same size is generated.
● Homoscedastic noise: standard deviation of the noise = 2%, 5% or 10% of the maximum value in the Claus data
● Heteroscedastic noise: N = N(0,1) .* (1/10) X, i.e. standard normal noise multiplied element by element by one-tenth of the Claus data
Claus data + noise, then unfolding of the 4 × 201 × 61 array
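The two noise recipes can be sketched with the standard library as below. The toy 2 × 3 array stands in for one 201 × 61 landscape, and the function names are hypothetical; only the two rules (constant standard deviation tied to the data maximum, and standard normal noise scaled element by element by a fraction of the data) are taken from the slide.

```python
import random


def homoscedastic_noise(data, fraction, rng):
    """Gaussian noise with one constant standard deviation: fraction * max(data)."""
    peak = max(max(row) for row in data)
    sd = fraction * peak
    return [[rng.gauss(0.0, sd) for _ in row] for row in data]


def heteroscedastic_noise(data, fraction, rng):
    """Standard normal noise multiplied element by element by fraction * data."""
    return [[rng.gauss(0.0, 1.0) * fraction * x for x in row] for row in data]


def add_noise(data, noise):
    """Element-wise sum of a data matrix and a noise matrix of the same shape."""
    return [[x + n for x, n in zip(drow, nrow)] for drow, nrow in zip(data, noise)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = [[0.0, 1.0, 4.0], [2.0, 10.0, 0.5]]  # toy stand-in for a landscape
noisy_homo = add_noise(sample, homoscedastic_noise(sample, 0.10, rng))
noisy_hetero = add_noise(sample, heteroscedastic_noise(sample, 0.10, rng))
```

Note the practical difference: homoscedastic noise also perturbs elements where the signal is zero, whereas the signal-proportional noise leaves them untouched.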

14 Homoscedastic and heteroscedastic noise were added to the original data
Panels: heteroscedastic noise (10%); homoscedastic noise (10%)

15 The effect of adding homoscedastic noise
Panels: reshape of sample one; sample one with added homoscedastic noise

16 The effect of adding heteroscedastic noise
Panels: reshape of sample one; sample one with added heteroscedastic noise

17 Wavelet Denoising
Wavelets can be used as a powerful tool for signal denoising.
● Wavelet decomposition of the signal
● Selecting the threshold
● Applying the threshold to the wavelet coefficients
● Inverse transformation to the native domain
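The four steps can be sketched with a single-level Haar transform. The slides do not say which wavelet, decomposition depth or thresholding rule was used, so the Haar filter, the single level and soft thresholding are all assumptions made for illustration; the threshold is passed in as a parameter.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform (signal length must be even)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, returning the signal in the native domain."""
    s = 1.0 / math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) * s, (a - d) * s])
    return signal


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient towards zero by threshold; small ones become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    approx, detail = haar_forward(signal)          # 1. wavelet decomposition
    detail = soft_threshold(detail, threshold)     # 2-3. apply the chosen threshold
    return haar_inverse(approx, detail)            # 4. inverse transformation


smoothed = denoise([1.0, 3.0, 2.0, 2.0], threshold=10.0)
unchanged = denoise([1.0, 3.0, 2.0, 2.0], threshold=0.0)
```

With a threshold of zero the signal is reproduced; with a very large threshold every detail coefficient is removed and each pair of points collapses to its mean.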

18 Thresholding methods
● Global thresholding
● Level dependent thresholding
● Data dependent thresholding
● Cycle-spin thresholding
● Wavelet packet thresholding

19 Universal threshold
N = length of the data array
Xi = detail part of the coefficients
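The formula itself is not preserved in this transcript. The sketch below assumes the usual Donoho-Johnstone form, threshold = sigma * sqrt(2 ln N), with sigma estimated from the detail coefficients as median(|Xi|) / 0.6745; this matches the two symbols defined on the slide but should be read as an assumption, not as the author's exact expression.

```python
import math
import statistics


def universal_threshold(detail, n):
    """Universal threshold (assumed Donoho-Johnstone form).

    detail: detail coefficients Xi used to estimate the noise level.
    n: length N of the data array.
    """
    sigma = statistics.median([abs(x) for x in detail]) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))


# Toy coefficients chosen so that the estimated sigma is exactly 1.
threshold = universal_threshold([0.6745, -0.6745, 0.6745, -0.6745], n=4)
```

Because the noise level is estimated once and applied to all coefficients, this is a level independent rule, suited to homoscedastic noise.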

21 Prediction of Analyte Concentrations for Samples 4 & 5 using PARAFAC

22 Comparison of the Sum of Squared Residuals (Homoscedastic noise - PARAFAC)
                                  SSR Model 1   SSR Model 2
Without noise                     10.5          10.2
Noisy data      Homo. noise 2%    168.56        168.57
                Homo. noise 5%    168.56        168.57
                Homo. noise 10%   168.56        168.57
Denoised data   Homo. noise 2%    14.34         15.05
                Homo. noise 5%    37.24         37.29
                Homo. noise 10%   105.36        103.97
Model 1: samples 1, 2, 3, 4 / Model 2: samples 1, 2, 3, 5
Each number × 10^5

23 Comparison of explained variation (Homoscedastic noise - PARAFAC)
                                  Var. Model 1   Var. Model 2
Without noise                     99.94          99.95
Noisy data      Homo. noise 2%    99.07          99.19
                Homo. noise 5%    99.07          99.19
                Homo. noise 10%   99.07          99.19
Denoised data   Homo. noise 2%    99.93          99.92
                Homo. noise 5%    99.82          99.79
                Homo. noise 10%   99.49          99.42
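The two fit measures in these tables are related. A minimal sketch, assuming explained variation is computed as 100 × (1 - SSR / total sum of squares of the data), which is the usual convention in multi-way software; the slides do not spell out the definition, and the numbers below are toy values.

```python
def sum_squared_residuals(data, model):
    """SSR between the measured values and the values reconstructed by the model."""
    return sum((x - m) ** 2 for x, m in zip(data, model))


def explained_variation(data, model):
    """Percentage of the total sum of squares of the data captured by the model (assumed definition)."""
    total = sum(x * x for x in data)
    return 100.0 * (1.0 - sum_squared_residuals(data, model) / total)


# Toy unfolded data and model reconstruction.
data = [3.0, 4.0]
model = [3.0, 3.0]
ssr = sum_squared_residuals(data, model)
explained = explained_variation(data, model)
```

A larger SSR therefore always means a lower explained variation for the same data array, which is the pattern seen when moving from the noise-free to the noisy rows.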

24 Relative Errors of Predicted Concentrations for Sample 4 (Homoscedastic noise - PARAFAC)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -0.75       -7.3        9.4
Noisy data      2 % noise    -0.41       -7.1        9.2
                5 % noise    -0.41       -7.1        9.26
                10 % noise   -0.41       -7.1        9.26
Denoised data   2 % noise    -0.77       -7.5        9.3
                5 % noise    -0.66       -7.5        8.5
                10 % noise   -1.3        -7.1        7.3

25 Relative Errors of Predicted Concentrations for Sample 5 (Homoscedastic noise - PARAFAC)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -0.75       -7.3        9.4
Noisy data      2 % noise    -0.71       -6.1        14.2
                5 % noise    -0.71       -6.1        14.2
                10 % noise   -0.71       -6.1        16.9
Denoised data   2 % noise    -0.55       -6.1        14.3
                5 % noise    -1.7        -6.4        13.6
                10 % noise   -2.8        -5.3        11.6

26 Comparison of the Sum of Squared Residuals (Heteroscedastic noise)
Wavelet denoising (level dependent method)
                                    SSR Model 1   SSR Model 2
Without noise                       10.5          10.2
Noisy data      Hetero. noise 10%   35.52         39.26
                Hetero. noise 20%   110.27        126.17
Denoised data   Hetero. noise 10%   33.52         36.56
                Hetero. noise 20%   101.32        113.78
Each number × 10^5

27 Comparison of explained variation (Heteroscedastic noise - PARAFAC)
Wavelet denoising (level dependent method)
                                    Var. Model 1   Var. Model 2
Without noise                       99.94          99.95
Noisy data      Hetero. noise 10%   99.80          99.81
                Hetero. noise 20%   99.39          99.39
Denoised data   Hetero. noise 10%   99.81          99.82
                Hetero. noise 20%   99.44          99.45

28 Relative Errors of Predicted Concentrations for Sample 4 (Heteroscedastic noise - PARAFAC)
Wavelet denoising (level dependent method)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -0.7        -7.3        9.4
Noisy data      10 % noise   -0.85       -7.3        9.3
                20 % noise   -0.94       -7.0        9.3
Denoised data   10 % noise   -0.86       -7.16       9.28
                20 % noise   -0.94       -7.10       9.15

29 Relative Errors of Predicted Concentrations for Sample 5 (Heteroscedastic noise - PARAFAC)
Wavelet denoising (level dependent method)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -0.56       -6.1        14.3
Noisy data      10 % noise   -0.53       -6.1        14.2
                20 % noise   -0.45       -6.2        14.1
Denoised data   10 % noise   -0.54       -6.2        14.15
                20 % noise   -0.47       -6.04       14.00

30 Minimum Description Length
The MDL is an approach to simultaneous noise suppression and signal compression.
It is free from any parameter setting such as threshold selection, which can be particularly useful for real data where the noise level is difficult to estimate.
m = filter type
l = the number of major coefficients retained
γ_j,k = the vector of wavelet coefficients of transform type m
γ_j,k (with indices m, l) = the vector of the contracted wavelet coefficients
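The MDL expression is not preserved in this transcript. The sketch below assumes one widely used form, Saito's approximate MDL, in which the cost of keeping the k largest of N coefficients is (3/2) k log2 N + (N/2) log2(energy of the discarded coefficients). It handles a single coefficient vector and does not search over filter types; the function name and test vector are hypothetical.

```python
import math


def mdl_select(coeffs):
    """Pick the number k of largest-magnitude coefficients that minimises the
    assumed MDL cost, and return (k, contracted coefficient vector)."""
    n = len(coeffs)
    order = sorted(range(n), key=lambda i: abs(coeffs[i]), reverse=True)
    total_energy = sum(c * c for c in coeffs)
    best_k, best_cost = 0, math.inf
    kept_energy = 0.0
    for k in range(1, n):
        kept_energy += coeffs[order[k - 1]] ** 2
        residual = total_energy - kept_energy
        if residual <= 0.0:
            break  # nothing left to discard; log of zero is undefined
        cost = 1.5 * k * math.log2(n) + 0.5 * n * math.log2(residual)
        if cost < best_cost:
            best_k, best_cost = k, cost
    keep = set(order[:best_k])
    contracted = [c if i in keep else 0.0 for i, c in enumerate(coeffs)]
    return best_k, contracted


# Two large coefficients (signal) and six small ones (noise).
k, contracted = mdl_select([10.0, -8.0, 0.1, -0.1, 0.05, 0.1, -0.05, 0.1])
```

No threshold is supplied by the user: the trade-off between the cost of describing k coefficients and the cost of describing the residual decides how many are retained.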

31 Signal Denoising with the MDL Method

32 Comparison of the Sum of Squared Residuals (Heteroscedastic noise - PARAFAC)
Wavelet denoising (MDL)
                                    SSR Model 1   SSR Model 2
Without noise                       10.5          10.2
Noisy data      Hetero. noise 10%   35.52         39.26
                Hetero. noise 20%   110.27        126.17
Denoised data   Hetero. noise 10%   35.52         39.25
                Hetero. noise 20%   110.26        126.17
Each number × 10^5

33 Comparison of explained variation (Heteroscedastic noise - PARAFAC)
Wavelet denoising (MDL)
                                    Var. Model 1   Var. Model 2
Without noise                       99.94          99.95
Noisy data      Hetero. noise 10%   99.80          99.81
                Hetero. noise 20%   99.39          99.39
Denoised data   Hetero. noise 10%   99.80          99.81
                Hetero. noise 20%   99.39          99.39

34 Relative Errors of Predicted Concentrations for Sample 4 (Heteroscedastic noise - PARAFAC)
Wavelet denoising (MDL)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -0.75       -7.3        9.4
Noisy data      10 % noise   -0.85       -7.3        9.3
                20 % noise   -0.94       -7.0        9.3
Denoised data   10 % noise   -0.85       -7.28       9.3
                20 % noise   -0.94       -7.01       9.3

35 Relative Errors of Predicted Concentrations for Sample 5 (Heteroscedastic noise - PARAFAC)
Wavelet denoising (MDL)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -0.5        -6.1        14.3
Noisy data      10 % noise   -0.5        -6.1        14.2
                20 % noise   -0.4        -6.2        14.1
Denoised data   10 % noise   -0.5        -6.1        14.2
                20 % noise   -0.4        -6.2        14.1

36 Prediction of Analyte Concentrations for Samples 4 & 5 using N-PLS

37 Relative Errors of Predicted Concentrations for Sample 4 (Homoscedastic noise - NPLS model)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -2.88       -0.44       4.87
Noisy data      2 % noise    -2.91       -0.39       4.91
                5 % noise    -2.97       -0.33       4.95
                10 % noise   -3.07       -0.22       4.99
Denoised data   2 % noise    -2.91       -0.44       4.92
                5 % noise    -3.00       -0.20       4.67
                10 % noise   -3.29       0.36        4.36
X-block > 99, Y-block > 99

38 Relative Errors of Predicted Concentrations for Sample 5 (Homoscedastic noise - NPLS model)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -1.67       0.44        11.24
Noisy data      2 % noise    -1.65       0.35        11.38
                5 % noise    -1.62       0.22        11.57
                10 % noise   -1.57       -0.01       11.86
Denoised data   2 % noise    -1.69       0.31        11.34
                5 % noise    -2.08       0.27        11.56
                10 % noise   -2.89       0.93        10.66
X-block > 99, Y-block > 99

39 Relative Errors of Predicted Concentrations for Sample 4 (Heteroscedastic noise - NPLS model)
Wavelet denoising (MDL)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -2.88       -0.44       4.87
Noisy data      10 % noise   -3.00       -0.28       4.89
                20 % noise   -3.12       -0.12       4.91
Denoised data   10 % noise   -3.00       -0.28       4.91
                20 % noise   -3.12       -0.12       4.91
X-block > 99, Y-block > 99

40 Comparison of the Sum of Squared Residuals (Heteroscedastic noise - PARAFAC)
                                    SSR Model 1   SSR Model 2
Without noise                       10.5          10.2
Noisy data      Hetero. noise 10%   35.52         39.26
                Hetero. noise 20%   110.27        126.17
Denoised data   Hetero. noise 10%   33.52         36.56
                Hetero. noise 20%   101.32        113.78
Each number × 10^5

41 Relative Errors of Predicted Concentrations for Sample 4 (Heteroscedastic noise - NPLS model)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -2.88       -0.44       4.87
Noisy data      10 % noise   -3.00       -0.28       4.89
                20 % noise   -3.12       -0.12       4.91
Denoised data   10 % noise   -3.00       -0.36       4.88
                20 % noise   -3.12       -0.23       5.78

42 Relative Errors of Predicted Concentrations for Sample 5 (Heteroscedastic noise - NPLS model)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -1.67       0.44        11.24
Noisy data      10 % noise   -1.61       0.43        11.23
                20 % noise   -1.56       0.40        11.21
Denoised data   10 % noise   -1.61       0.42        11.19
                20 % noise   -1.56       0.39        11.15

43 Relative Errors of Predicted Concentrations for Sample 5 (Heteroscedastic noise - NPLS model)
Wavelet denoising (MDL)
                             Analyte 1   Analyte 2   Analyte 3
0 % noise                    -1.67       0.44        11.24
Noisy data      10 % noise   -1.61       0.43        11.23
                20 % noise   -1.56       0.40        11.21
Denoised data   10 % noise   -1.61       0.43        11.21
                20 % noise   -1.56       0.40        11.21

44 Comparison of explained variation (Homoscedastic noise - NPLS model)
                            Model 1             Model 2
                            X-Block   Y-Block   X-Block   Y-Block
noise 0%                    99.94     99.85     99.93     99.94
Noisy data      noise 2%    99.80     99.85     99.79     99.94
                noise 5%    99.06     99.85     99.04     99.94
                noise 10%   96.50     99.84     96.51     99.94
Denoised data   noise 2%    99.85     99.97     99.91     99.85
                noise 5%    99.67     99.97     99.77     99.85
                noise 10%   99.10     99.97     99.32     99.88

45 Comparison of explained variation (Heteroscedastic noise - NPLS model)
                            Model 1             Model 2
                            X-Block   Y-Block   X-Block   Y-Block
noise 0%                    99.94     99.85     99.93     99.94
Noisy data      noise 10%   99.80     99.85     99.78     99.94
                noise 20%   99.38     99.85     99.34     99.94
Denoised data   noise 10%   99.81     99.85     99.79     99.94
                noise 20%   99.43     99.86     99.39     99.94

46 Comparison of explained variation (Heteroscedastic noise - NPLS model)
Wavelet denoising (MDL)
                            Model 1             Model 2
                            X-Block   Y-Block   X-Block   Y-Block
noise 0%                    99.94     99.85     99.93     99.94
Noisy data      noise 10%   99.80     99.85     99.78     99.94
                noise 20%   99.38     99.85     99.34     99.94
Denoised data   noise 10%   99.80     99.85     99.78     99.94
                noise 20%   99.38     99.85     99.34     99.94

