# In the Name of God. An Introduction to Multi-way Analysis. Mohsen Kompany-Zareh, IASBS, Nov 1-3, 2010. Session one.


The main source: [not preserved in this transcript]

Outline:

- Kronecker product
- Khatri-Rao product
- Multi-way data
- Matricizing the data
- Interaction triad, G
- PARAFAC
- Panel performance
- Matricizing and subarray
- Rank
- Dimensionality vector
- Rank-deficiency in three-way arrays
- Tucker3 rotational freedom
- Unique solution
- Tucker2 model
- Tucker1 model

Kronecker product (A ⊗ B), built block by block:

    >> A = [2 3 4; 2 3 4]
    >> B = [3 4; 3 5]
    >> krnAB = [A(1,1)*B A(1,2)*B A(1,3)*B; A(2,1)*B A(2,2)*B A(2,3)*B]
    krnAB =
         6     8     9    12    12    16
         6    10     9    15    12    20
         6     8     9    12    12    16
         6    10     9    15    12    20

Kronecker product with the built-in `kron`:

    >> A = [2 3 4; 2 3 4]
    >> B = [3 4; 3 5]
    >> p = kron(A,B)
    p =
         6     8     9    12    12    16
         6    10     9    15    12    20
         6     8     9    12    12    16
         6    10     9    15    12    20

All columns in A see all columns in B.

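Khatri-Rao product inside the Kronecker product. Every column of the Kronecker product is the Kronecker product of one column of A with one column of C:

    >> A = [2 3 4; 2 3 4]
    >> C = [3 4 5; 3 5 2]
    >> krnAC = [kron(A(:,1),C(:,1)) ...   % column 1
                kron(A(:,1),C(:,2)) ...   % column 2
                kron(A(:,1),C(:,3)) ...
                kron(A(:,2),C(:,1)) ...
                kron(A(:,2),C(:,2)) ...   % column 5
                kron(A(:,2),C(:,3)) ...
                kron(A(:,3),C(:,1)) ...
                kron(A(:,3),C(:,2)) ...
                kron(A(:,3),C(:,3))]      % column 9
    krnAC =
         6     8    10     9    12    15    12    16    20
         6    10     4     9    15     6    12    20     8
         6     8    10     9    12    15    12    16    20
         6    10     4     9    15     6    12    20     8

Columns 1, 5 and 9, where the column indices of A and C match (1-1, 2-2, 3-3), make up the Khatri-Rao product.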

Columns of the Kronecker product (b1, b2, b3 denote the columns of the second matrix):

    >> A = [2 3 4; 2 3 4]
    >> C = [3 4 5; 3 5 2]
    krnAC =
         6     8    10     9    12    15    12    16    20
         6    10     4     9    15     6    12    20     8
         6     8    10     9    12    15    12    16    20
         6    10     4     9    15     6    12    20     8

Columns 1, 5 and 9 are vec(a1 ⊗ b1), vec(a2 ⊗ b2) and vec(a3 ⊗ b3). The other six columns, vec(a1 ⊗ b2), vec(a1 ⊗ b3), vec(a2 ⊗ b1), vec(a2 ⊗ b3), vec(a3 ⊗ b1) and vec(a3 ⊗ b2), are the interaction terms.

Khatri-Rao product:

    >> A = [2 3 4; 2 3 4]
    >> B = [3 4 5; 3 5 2]
    khtrAB =
         6    12    20
         6    15     8
         6    12    20
         6    15     8

The number of columns in A must be the same as the number of columns in B.
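The slides do not show the command that produced `khtrAB`. A minimal sketch, assuming a plain column-by-column loop over `kron` (the variable names `R` and `picked` are illustrative only), reproduces the matrix and confirms that it consists of columns 1, 5 and 9 of the full Kronecker product:

```matlab
% Column-wise Kronecker (Khatri-Rao) product built from kron.
A = [2 3 4; 2 3 4];
B = [3 4 5; 3 5 2];
assert(size(A,2) == size(B,2));            % both need the same number of columns

R = size(A,2);
khtrAB = zeros(size(A,1)*size(B,1), R);
for r = 1:R
    khtrAB(:,r) = kron(A(:,r), B(:,r));    % column r of A meets only column r of B
end

% The same columns sit inside kron(A,B) at positions 1, R+2, 2R+3, ...
krnAB  = kron(A, B);
picked = krnAB(:, 1:(R+1):R^2);

disp(khtrAB)
```

The Kronecker product has R² columns while the Khatri-Rao product keeps only the R "matching" ones, which is why the latter appears in models without interaction terms.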


Multi-way data (a generalization of matrix algebra):

- a zero-order tensor: a scalar;
- a first-order tensor: a vector;
- a second-order tensor (a matrix) for each sample ⇒ three-way data for analysis;
- a third-order tensor (a three-way array) for each sample ⇒ four-way data for analysis;
- a fourth-order tensor: a four-way array, and so on.

One component, HPLC-DAD: a1 ⊗ b1

One component, HPLC-DAD, at different concentrations (elution profile): only the intensities change. These 9 matrices form a TRIAD, the simplest trilinear data.

A triad, X: a cube of data (12x7x7), third-order data for one sample, obtained from the tensor product of three vectors, a1 ⊗ b1 ⊗ c1.

    >> a1'
        0.0033    0.0971    0.8131    1.9506    1.3406    0.2640    0.0149
    >> b1'
        0.0222    1.7650    0.4060    0.8826    0.0111    0.0000
    >> c1'
        1  2  3  4  5  6  7  8  9  10  11  12

    % A triad by outer product
    % X111 = a1 (x) b1 (x) c1
    Xtriad = zeros(length(a1), length(b1), length(c1));   % preallocate
    for l = 1:length(a1)
        for m = 1:length(b1)
            for n = 1:length(c1)
                disp([l m n])
                Xtriad(l,m,n) = a1(l)*b1(m)*c1(n);
            end
        end
    end
    X = Xtriad;
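The triple loop can also be written without loops. The sketch below assumes column vectors and MATLAB's column-major storage, and uses small made-up vectors (not the slide's profiles) so the result is easy to check by hand:

```matlab
% Triad a1 (x) b1 (x) c1 computed two ways: explicit loops and kron + reshape.
a1 = [1; 2; 3];          % hypothetical profiles, chosen for easy arithmetic
b1 = [4; 5];
c1 = [6; 7; 8; 9];
L = numel(a1); M = numel(b1); N = numel(c1);

% Loop version: X(l,m,n) = a1(l)*b1(m)*c1(n)
Xloop = zeros(L, M, N);
for l = 1:L
    for m = 1:M
        for n = 1:N
            Xloop(l,m,n) = a1(l)*b1(m)*c1(n);
        end
    end
end

% Vectorized version: kron(c1, kron(b1, a1)) lists the elements with the
% first index running fastest, which is exactly how reshape fills an array.
Xvec = reshape(kron(c1, kron(b1, a1)), L, M, N);

disp(isequal(Xloop, Xvec))
```

The order of the factors in `kron` is reversed relative to a1 ⊗ b1 ⊗ c1 because the last mode has to vary slowest in memory.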

Matricizing the data (in three directions), for the first chemical component:

    X111 = Unfold3D(X111, 1)
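`Unfold3D` is not a built-in MATLAB function and its source is not shown in the slides. As a rough stand-in, the three matricized forms can be sketched with `permute` and `reshape`; the column ordering chosen here (frontal slices placed side by side) is an assumption and may differ from what `Unfold3D` does:

```matlab
% Matricizing a triad X (I x J x K) in three directions.
a = [1; 2; 3]; b = [4; 5]; c = [6; 7; 8; 9];      % hypothetical vectors
I = numel(a); J = numel(b); K = numel(c);
X = reshape(kron(c, kron(b, a)), I, J, K);        % X(i,j,k) = a(i)*b(j)*c(k)

X_IxJK = reshape(X, I, J*K);                      % first mode kept as rows
X_JxIK = reshape(permute(X, [2 1 3]), J, I*K);    % second mode kept as rows
X_KxIJ = reshape(permute(X, [3 1 2]), K, I*J);    % third mode kept as rows

% For a single triad every matricized form is a rank-one matrix.
ranks = [rank(X_IxJK), rank(X_JxIK), rank(X_KxIJ)];
disp(ranks)
```

With this ordering the first form of a triad equals `a * kron(c, b)'`, which is the pattern the model equations later in the session build on.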

...and for the 2nd and the following chemical components:

$X_{111} = a_1 \otimes b_1 \otimes c_1$

$X_{222} = a_2 \otimes b_2 \otimes c_2$

$X_{333} = a_3 \otimes b_3 \otimes c_3$

Each component is in a separate triad (no interaction):

$X = X_{111} + X_{222} + X_{333}$

Trilinear: PARAFAC.

In the presence of interaction: the interaction triad.

$X_{111} = a_1 \otimes b_1 \otimes c_1$

$X_{222} = a_2 \otimes b_2 \otimes c_2$

$X_{121} = a_1 \otimes b_2 \otimes c_1$

$X = X_{111} + X_{222} + X_{121}$

Non-trilinear!! Tucker.

How many interaction triads? For two components in three modes:

| Triad | Core element G |
|---|---|
| X111 = a1 ⊗ b1 ⊗ c1 | G(111) = 2 |
| X112 = a1 ⊗ b1 ⊗ c2 | G(112) = 0 |
| X121 = a1 ⊗ b2 ⊗ c1 | G(121) = 1 |
| X122 = a1 ⊗ b2 ⊗ c2 | G(122) = 0 |
| X211 = a2 ⊗ b1 ⊗ c1 | G(211) = 0 |
| X212 = a2 ⊗ b1 ⊗ c2 | G(212) = 0 |
| X221 = a2 ⊗ b2 ⊗ c1 | G(221) = 0 |
| X222 = a2 ⊗ b2 ⊗ c2 | G(222) = -3 |

There are 6 possible interaction triads; in this G only 1 interaction triad (X121) is present.

A(11x2), B(100x2), C(3x2), G(2x2x2), with G(111) = 2, G(222) = -3, G(121) = 1.
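To make the role of G concrete, the sketch below assembles X as a G-weighted sum of triads for this two-component case. The loading matrices are random placeholders (the slides give only their sizes), and the helper `triad` is a hypothetical name introduced for illustration:

```matlab
% X as a weighted sum of triads, one per nonzero core element.
A = rand(11,2); B = rand(100,2); C = rand(3,2);    % placeholder loadings
G = zeros(2,2,2);
G(1,1,1) = 2; G(2,2,2) = -3; G(1,2,1) = 1;         % core values from the slide

% Hypothetical helper: outer product of three column vectors as a 3-D array.
triad = @(a,b,c) reshape(kron(c, kron(b, a)), numel(a), numel(b), numel(c));

X = zeros(size(A,1), size(B,1), size(C,1));
[p, q, r] = ind2sub(size(G), find(G));             % subscripts of nonzero core elements
for t = 1:numel(p)
    X = X + G(p(t),q(t),r(t)) * triad(A(:,p(t)), B(:,q(t)), C(:,r(t)));
end

% A triad is an interaction triad when its three indices are not all equal.
nInteraction = sum(~(p == q & q == r));
fprintf('%d triads used, %d of them interaction triads\n', numel(p), nInteraction);
```

Only three of the eight possible triads contribute here; setting G(1,2,1) to zero would leave a purely trilinear (PARAFAC-type) array.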

For three components in three modes: (3 × 3 × 3) − 3 = 24 possible interactions.

A(15x4), B(100x3), C(20x2), G(?x?x?): how many G elements?

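    % Tucker3 outer product (A, B and C are the loading matrices)
    G = rand(4,3,2);
    X = zeros(size(A,1), size(B,1), size(C,1));
    for p = 1:size(G,1)
        for q = 1:size(G,2)
            for r = 1:size(G,3)
                % one triad: a_p (x) b_q (x) c_r, weighted by G(p,q,r)
                Xtriad = zeros(size(X));
                for i = 1:size(A,1)
                    for j = 1:size(B,1)
                        for k = 1:size(C,1)
                            Xtriad(i,j,k) = A(i,p)*B(j,q)*C(k,r)*G(p,q,r);
                        end
                    end
                end
                X = X + Xtriad;
            end
        end
    end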
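The same Tucker3 array can be obtained in one line from the matricized form X(I×JK) = A G(P×QR) (C ⊗ B)' listed in the model summary at the end of the session. The sketch below uses random placeholder matrices with the sizes from the slides and spot-checks one element against the triple sum; the check indices are arbitrary:

```matlab
% Tucker3 via the matricized (Kronecker) form.
A = rand(15,4); B = rand(100,3); C = rand(20,2);   % placeholder loadings
G = rand(4,3,2);                                   % core array, P x Q x R

Gmat = reshape(G, size(G,1), []);                  % P x QR
Xmat = A * Gmat * kron(C, B)';                     % I x JK
X    = reshape(Xmat, size(A,1), size(B,1), size(C,1));

% Spot check: one element computed from the definition of the model.
i = 3; j = 7; k = 5;
s = 0;
for p = 1:size(G,1)
    for q = 1:size(G,2)
        for r = 1:size(G,3)
            s = s + G(p,q,r) * A(i,p) * B(j,q) * C(k,r);
        end
    end
end
fprintf('numel(G) = %d, difference = %g\n', numel(G), abs(X(i,j,k) - s));
```

The core has one element per combination of components, so its size follows directly from the column counts of A, B and C.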


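    % PARAFAC outer product: the same loops with a superdiagonal core
    G = zeros(3,3,3); G(1,1,1) = 1; G(2,2,2) = 1; G(3,3,3) = 1;
    X = zeros(size(A,1), size(B,1), size(C,1));
    for p = 1:size(G,1)
        for q = 1:size(G,2)
            for r = 1:size(G,3)
                % one triad: a_p (x) b_q (x) c_r, weighted by G(p,q,r)
                Xtriad = zeros(size(X));
                for i = 1:size(A,1)
                    for j = 1:size(B,1)
                        for k = 1:size(C,1)
                            Xtriad(i,j,k) = A(i,p)*B(j,q)*C(k,r)*G(p,q,r);
                        end
                    end
                end
                X = X + Xtriad;
            end
        end
    end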

PARAFAC: A(15x3), B(100x3), C(20x3). Simple interpretation.
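Because a superdiagonal core switches off every interaction triad, the PARAFAC array can be written with a Khatri-Rao product instead of a full Kronecker product. A minimal sketch with random placeholder loadings (the names `KR`, `Xtucker` and `Xparafac` are illustrative):

```matlab
% PARAFAC as Tucker3 with a superdiagonal core, and via the Khatri-Rao product.
R = 3;
A = rand(15,R); B = rand(100,R); C = rand(20,R);   % placeholder loadings

G = zeros(R,R,R);
for r = 1:R
    G(r,r,r) = 1;                                  % ones on the superdiagonal
end
Xtucker = A * reshape(G, R, []) * kron(C, B)';     % I x JK, Tucker3 form

KR = zeros(size(C,1)*size(B,1), R);                % Khatri-Rao product of C and B
for r = 1:R
    KR(:,r) = kron(C(:,r), B(:,r));
end
Xparafac = A * KR';                                % I x JK, PARAFAC form

disp(norm(Xtucker - Xparafac, 'fro'))
```

The matricized core simply selects the R matching columns of `kron(C, B)`, which is the "simple interpretation": each component has exactly one profile in each mode.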

Monitoring panel performance within and between experiments by multi-way models. Rosaria Romano and Mohsen Kompany-Zareh, Copenhagen Univ., 2007.

Organic milk of high quality: sensory studies, 2007, University of Copenhagen.

Two different experiments were conducted in 2007:
- Spring experiment (May, weeks 21 & 22)
- Autumn experiment (September, weeks 36 & 37)

The objective is to establish knowledge about the production of high-quality organic milk with a composition and flavour different from conventionally produced milk.

Spring experiment data. Data description: 7 varieties of milk with respect to:
- 2 cow races: Holstein-Friesian (HF), Jersey (JE);
- 7 farms: WB, EMC, UGJ, JP, HM, OA, KI.

Panel: 9 assessors, 2 sessions (focus on the second!), 3 replicates for each session.

12 descriptors: odor (green); appearance (yellow); flavor (creamy, boiled-milk, sweet, bitter, metallic, sourness, stald-feed); aftertaste (astringent0, fatness, astringent20).

Measurement scale: continuous scale anchored at 0 and 15.

Parafac on the spring experiment (1). Model: Parafac with two components (27.9% ExpVar), on data averaged across the samples mode. Groups: HF, JE.
- High reproducibility of the replicates in both groups.
- Big variation in the JE group: WB is the least yellow JE milk; UGJ seems to have something in common with the HF group.

Parafac on the spring experiment (2). Model: Parafac with two components (27.9% ExpVar), on data averaged across the samples mode. Best: Reliability on Multi-way Assessment (Bro and Romano, 2008).


Rank. A (I × J) has full rank if and only if r(A) = min(I, J). If r(A) = R, then [Schott 1997]

$A = t_1 p_1' + \cdots + t_R p_R'$

that is, A is a sum of R rank-one matrices (the components $t_r p_r'$). The bases are not unique: rotational freedom, intensity (or scale) indeterminacy, sign indeterminacy.

If X (I × J) is generated with I × J random numbers, the probability that X has less than full rank is zero. Measured data sets in chemistry therefore always have full (mathematical) rank, because of measurement noise.

Example: UV spectra (100 wavelengths) of ten different samples, each containing the same absorbing species at a different concentration, give X (10 × 100). If the Lambert-Beer law holds, X has rank one; with measurement errors added, the mathematical rank is ten.

Model of X: $X = cs' + E = \hat{X} + E$, where c holds the concentrations, s is the pure UV spectrum of the absorbing species and E is the noise part. X thus contains (1) systematic variation, $\hat{X}$, and (2) noise, E (undesirable).

Pseudo-rank = mathematical rank of $\hat{X}$ = one < mathematical rank of X.

'Chemical rank': the number of chemical sources of variation in the data.
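The difference between pseudo-rank and mathematical rank is easy to see numerically. The sketch below simulates the ten-sample UV example under assumed values: a made-up Gaussian band for s, concentrations 1 to 10 for c, and normally distributed noise of size 1e-3, none of which come from the slides:

```matlab
% One absorbing species, ten samples, 100 wavelengths.
c = (1:10)';                                   % hypothetical concentrations
s = exp(-((1:100) - 50).^2 / 200)';            % hypothetical pure spectrum
Xhat = c * s';                                 % systematic part (Lambert-Beer)
E = 1e-3 * randn(10, 100);                     % assumed measurement noise
X = Xhat + E;

sv = svd(X);                                   % singular values of the noisy data
fprintf('rank(Xhat) = %d, rank(X) = %d\n', rank(Xhat), rank(X));
fprintf('first two singular values: %.3g, %.3g\n', sv(1), sv(2));
```

The noisy matrix has full mathematical rank, but only the first singular value stands clear of the noise level, which is how the pseudo-rank is usually judged in practice.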

Rank deficiency: pseudo-rank < chemical rank (linear relations in, or restrictions on, the data).

Example: $X = c_1 s_1' + c_2 s_2' + c_3 s_3' + E$ with $s_1 = s_2$ (a linear relation) gives $X = (c_1 + c_2) s_1' + c_3 s_3' + E$.

Chemical rank (X) = 3, pseudo-rank (X) = 2: rank deficient.
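A short numerical check of this example, using random placeholder profiles (the sizes 10 and 100 are borrowed from the UV example and are otherwise arbitrary):

```matlab
% Three chemical sources, but two of them share the same spectrum.
c1 = rand(10,1); c2 = rand(10,1); c3 = rand(10,1);   % placeholder concentrations
s1 = rand(100,1); s3 = rand(100,1);                  % placeholder spectra
s2 = s1;                                             % the linear relation s1 = s2

Xhat      = c1*s1' + c2*s2' + c3*s3';                % noise-free part of X
Xcombined = (c1 + c2)*s1' + c3*s3';                  % same array, two terms only

chemicalRank = 3;                                    % number of chemical sources
pseudoRank   = rank(Xhat);
fprintf('chemical rank %d, pseudo-rank %d\n', chemicalRank, pseudoRank);
```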


A randomly generated 2 × 2 × 2 array has a positive probability of having a rank lower than three [Kruskal 1989]:
- probability 0.79 of obtaining a rank-two array;
- probability 0.21 of obtaining a rank-three array;
- the probability of obtaining rank one or lower is zero.

This has been generalized to 2 × n × n arrays [Ten Berge 1991].
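These probabilities can be approximated by simulation. The sketch below assumes standard normal entries and relies on a known property of real 2 × 2 × 2 arrays with frontal slices X1 and X2: when X1 is nonsingular, the array has rank two if the eigenvalues of X1⁻¹X2 are real and distinct, and rank three if they are complex. The number of trials is arbitrary:

```matlab
% Monte Carlo estimate of the rank distribution of random 2 x 2 x 2 arrays.
nTrials = 5000;
nRank2  = 0;
for t = 1:nTrials
    X = randn(2,2,2);                        % assumed: standard normal entries
    M = X(:,:,1) \ X(:,:,2);                 % inv(X1)*X2
    disc = trace(M)^2 - 4*det(M);            % discriminant of the 2x2 eigenproblem
    if disc > 0                              % real, distinct eigenvalues -> rank two
        nRank2 = nRank2 + 1;
    end
end
fracRank2 = nRank2 / nTrials;
fracRank3 = 1 - fracRank2;
fprintf('rank two: %.2f, rank three: %.2f\n', fracRank2, fracRank3);
```

With enough trials the two fractions should land close to the figures quoted above; a different distribution for the entries may give different proportions.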

For a 2 × 2 × 2 array:
- maximum rank: three;
- typical rank: {2, 3} (almost all arrays);
- individual rank: very hard to establish.

Three-way rank is important in second-order calibration and curve resolution, for degrees of freedom, and for significance testing.

Matricizing and sub-arrays: matricizing. X(4 × 3 × 2); boldface elements are in the foremost frontal slice.

Sub-arrays

Dimensionality vector: row-rank, column-rank, tube-rank.

For a two-way X, rank(X) = rank(X'): column rank = row rank. This does not hold for three-way arrays.

A three-way array X(I × J × K) can be matricized in three different ways:
(i) row-wise, giving X(J × IK), a two-way array;
(ii) column-wise, giving X(I × JK);
(iii) tube-wise, giving X(K × IJ);
and three more with the same ranks, not mentioned here.

The ranks of the arrays X(J × IK), X(I × JK) and X(K × IJ) form (P, Q, R), the dimensionality vector of X.

P, Q and R are not necessarily equal, in contrast with the two-way case, where P = Q = r(X).

The dimensionality vector (P, Q, R) of a three-way array X with rank S obeys certain inequalities [Kruskal 1989]:
(i) P ≤ QR; Q ≤ PR; R ≤ PQ
(ii) max(P, Q, R) ≤ S ≤ min(PQ, QR, PR)

Three matricized forms: these arrays have rank 4, 3 and 2, so the dimensionality vector is [4 3 2]. P, Q and R can be unequal.
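A dimensionality vector with unequal entries is easy to generate. The sketch below builds a random array from a 4 × 3 × 2 core (all sizes and matrices are made up for illustration) and reads the three ranks off the matricized forms; listing the first mode first is an assumption about ordering:

```matlab
% Dimensionality vector of a three-way array built from a 4 x 3 x 2 core.
G = randn(4,3,2);                                    % hypothetical core
A = randn(6,4); B = randn(5,3); C = randn(4,2);      % hypothetical loadings
X = reshape(A * reshape(G, 4, []) * kron(C, B)', 6, 5, 4);

P = rank(reshape(X, 6, []));                         % X(I x JK)
Q = rank(reshape(permute(X, [2 1 3]), 5, []));       % X(J x IK)
R = rank(reshape(permute(X, [3 1 2]), 4, []));       % X(K x IJ)

% Kruskal's inequalities for the dimensionality vector.
okDims = (P <= Q*R) && (Q <= P*R) && (R <= P*Q);
fprintf('dimensionality vector: [%d %d %d]\n', P, Q, R);
```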

Pseudo-rank, rank deficiency and chemical sources of variation.

The pseudo-rank of three-way arrays is a straightforward generalization of the two-way definition: $X = \hat{X} + E$, where E is the array of residuals. The pseudo-rank of X is the minimum number of PARAFAC components necessary to fit $\hat{X}$ exactly.

Rank-deficiency in three-way arrays.

Spectrophotometric acid-base titration of mixtures of three weak monoprotic acids (or flow injection analysis with a pH gradient):

HA2 ⇌ H⁺ + A2⁻

HA3 ⇌ H⁺ + A3⁻

HA4 ⇌ H⁺ + A4⁻

Six components. Models of the separate titrations of the three analytes (HA2, HA3, HA4):

$X_{HA2} = c_{a,2} s_{a,2}' + c_{b,2} s_{b,2}' + E_{HA2}$

$X_{HA3} = c_{a,3} s_{a,3}' + c_{b,3} s_{b,3}' + E_{HA3}$

$X_{HA4} = c_{a,4} s_{a,4}' + c_{b,4} s_{b,4}' + E_{HA4}$

10 samples, 15 titration points and 20 wavelengths ⇒ X(10 × 15 × 20).

$X = \hat{X} + E$

$c_{a,2} + c_{b,2} = \alpha (c_{a,3} + c_{b,3}) = \beta (c_{a,4} + c_{b,4})$

so there are only four independently varying concentration profiles:
- pseudo-rank (X(IJ × K)) = four;
- pseudo-rank (X(3 × JK)) = three;
- six different ultraviolet spectra, so pseudo-rank (X(6 × KI)) = six.

⇒ A Tucker3 (6,4,3) model is needed to fit X.

3 × 6 × 4 = 72 nonzero elements!!

Inequality laws:
(i) P ≤ QR; Q ≤ PR; R ≤ PQ
(ii) max(3, 6, 4) ≤ S ≤ min(PQ, QR, PR), that is, 6 ≤ S ≤ 12
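The bounds on the three-way rank S follow from simple arithmetic on the dimensionality vector. A small sketch, taking (P, Q, R) = (3, 6, 4) in the order used in the inequality above:

```matlab
% Bounds on the three-way rank S from the dimensionality vector.
P = 3; Q = 6; R = 4;

okDims = (P <= Q*R) && (Q <= P*R) && (R <= P*Q);   % inequality (i)
lowerS = max([P Q R]);                             % left side of inequality (ii)
upperS = min([P*Q, Q*R, P*R]);                     % min(18, 24, 12)
nCore  = P*Q*R;                                    % number of core elements

fprintf('%d <= S <= %d, core elements: %d\n', lowerS, upperS, nCore);
```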

The three-way rank of X is ≥ 6 (six PARAFAC components fit the data). The pseudo-rank (S = 6) is not less than the chemical rank (6) ⇒ no three-way rank deficiency.

Rank deficiencies in one loading matrix of a three-way array are not the same as a three-way rank deficiency.

How is it possible to have rank-deficient three-way data?


Tucker component models. Ledyard Tucker was one of the pioneers in multi-way analysis. He proposed a series of models nowadays called N-mode PCA or Tucker models [Tucker 1964-1966].

Tucker3 models: nonzero off-diagonal elements in the core.

In Kronecker product notation, the Tucker3 model is X(I×JK) = A G(P×QR) (C ⊗ B)'.

Properties of the Tucker3 model: Tucker3 rotational freedom. $T_A$ is an arbitrary nonsingular matrix. Such a transformation of the loading matrix A can be defined similarly for B and C, using $T_B$ and $T_C$, respectively.
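The transformation itself is not preserved in this transcript. A minimal numerical sketch of the idea, assuming the matricized form X(I×JK) = A G(P×QR) (C ⊗ B)' and placeholder matrices throughout: the loadings are multiplied by nonsingular matrices and the core is counter-transformed, leaving the fitted array unchanged.

```matlab
% Rotational freedom: transformed loadings plus counter-rotated core give the same X.
A = randn(15,4); B = randn(100,3); C = randn(20,2);   % placeholder loadings
G = randn(4,3,2);                                     % placeholder core
Gmat = reshape(G, 4, []);                             % P x QR
X = A * Gmat * kron(C, B)';

% Arbitrary nonsingular transformations (shifted to keep them well conditioned).
TA = randn(4) + 4*eye(4);
TB = randn(3) + 3*eye(3);
TC = randn(2) + 2*eye(2);

At = A*TA; Bt = B*TB; Ct = C*TC;                      % transformed loadings
Gt = (TA \ Gmat) / kron(TC, TB)';                     % counter-transformed core

Xt = At * Gt * kron(Ct, Bt)';
relErr = norm(X - Xt, 'fro') / norm(X, 'fro');
fprintf('relative difference between the two fits: %g\n', relErr);
```

The identity behind the counter-transformation is (C·T_C) ⊗ (B·T_B) = (C ⊗ B)(T_C ⊗ T_B), so the inverse transformations can be absorbed into the core.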

The Tucker3 model has rotational freedom, but in general it is not possible to rotate the Tucker3 core array to a superdiagonal form (and so obtain a PARAFAC model)! Because of this rotational freedom, the Tucker3 model does not give unique component matrices.

rotational freedom Orthogonal component matrices (at no cost in fit by defining proper matrices T A, T B and T C ) convenient : to make the component matrices orthogonal  easy interpretation of the elements of the core- array and of the loadings by the loading plots 58

The sum of squares (SS) of the elements of the core array gives the amount of variation explained by combinations of factors in the different modes. The variation in X splits into a part explained by the model and an unexplained part. Using a proper rotation, all the variance of the explained part can be gathered in the core.
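With orthonormal component matrices the link between the core and the explained variation becomes an exact identity: the sum of squares of the core equals the sum of squares of the fitted array. A sketch with random placeholders, using an economy-size QR decomposition only as a convenient way to obtain orthonormal columns:

```matlab
% Orthonormal loadings: SS of the core equals SS of the modelled part of X.
[A, ~] = qr(randn(15,4), 0);        % 15 x 4 with orthonormal columns
[B, ~] = qr(randn(100,3), 0);       % 100 x 3
[C, ~] = qr(randn(20,2), 0);        % 20 x 2
G = randn(4,3,2);                   % placeholder core

Xhat = A * reshape(G, 4, []) * kron(C, B)';   % modelled part, I x JK

ssCore = sum(G(:).^2);
ssXhat = sum(Xhat(:).^2);
share  = G.^2 / ssCore;             % fraction of explained SS per core element
fprintf('SS core = %.6f, SS Xhat = %.6f\n', ssCore, ssXhat);
```

Each squared core element, divided by the total, can then be read as the share of explained variation carried by one combination of factors.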

The rotational freedom of Tucker3 models can also be used to rotate the core array to a simple structure, as is also common in two-way analysis (will be explained).

Unique solution. Imposing the restrictions A'A = B'B = C'C = I is not sufficient for obtaining a unique solution. To obtain unique estimates of the parameters:
1. the loading matrices should be orthogonal;
2. A should also contain the eigenvectors of X(CC' ⊗ BB')X', ordered by decreasing eigenvalues of that same matrix; similar restrictions should be put on B and C [De Lathauwer 1997, Kroonenberg et al. 1989].

Unique Tucker. Simulated data: two components, PARAFAC model.

Unique Tucker3 component model, P = Q = R = 3: only two significant elements in the core.

Not exactly unique!

Not exactly unique! But very similar.


In Tucker3, all three modes are reduced.

Tucker2 model: data reduction in only two of the modes.

Tucker1 model: Tucker1 models reduce only one of the modes. X (and accordingly G) are matricized.

Different three-way component models for X (I × J × K) [Kiers 1991, Smilde 1997]:

| Model | Matricized form |
|---|---|
| PARAFAC | X(I×JK) = A G(R×RR) (C ⊗ B)' |
| Tucker3 | X(I×JK) = A G(P×QR) (C ⊗ B)' |
| Tucker2 | X(I×JK) = A G(P×QK) (I ⊗ B)' |
| Tucker1 | X(I×JK) = A G(P×JK) (I ⊗ I)' |

Notation: X(I×JK) is the matricized X; A, B and C are component matrices, with A the (I × P) component matrix of the first (reduced) mode; G stands for the different matricized core arrays. In the PARAFAC row the core is the matricized superdiagonal array (ones on the superdiagonal); in the Tucker2 and Tucker1 rows, I is an identity matrix standing in for a mode that is not reduced. The component matrices, core arrays and residual error arrays differ for each model.

⇒ The PARAFAC model is a special case of the Tucker3 model.
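To see what the identity matrices do in the Tucker2 form, the sketch below builds a small Tucker2 array from placeholder matrices (sizes chosen arbitrarily) and checks that every frontal slice is A G_k B', that is, the third mode is left unreduced:

```matlab
% Tucker2: only the first two modes are reduced; the third keeps all K levels.
I = 6; J = 5; K = 4; P = 2; Q = 3;           % hypothetical sizes
A = randn(I,P); B = randn(J,Q);              % placeholder component matrices
G = randn(P,Q,K);                            % extended core, one P x Q slice per k

Xmat = A * reshape(G, P, []) * kron(eye(K), B)';   % X(I x JK) = A G(P x QK) (I (x) B)'
X = reshape(Xmat, I, J, K);

% Slice-wise check: X(:,:,k) should equal A * G(:,:,k) * B'.
maxDiff = 0;
for k = 1:K
    maxDiff = max(maxDiff, norm(X(:,:,k) - A*G(:,:,k)*B', 'fro'));
end
fprintf('largest slice-wise difference: %g\n', maxDiff);
```

Replacing `B` by `eye(J)` as well (with a P × J × K core) gives the Tucker1 form, which amounts to an ordinary two-way decomposition of the matricized array.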

Thanks, and see you in the next session...
