

Analysis of Variance (ANOVA)

Hypothesis H₀: μᵢ = μ_G for every group i. H₁: μᵢ ≠ μ_G for at least one group i. Logic: S²within = error variability; S²between = error variability + treatment variability. Where x̄ = grand mean, x̄ⱼ = group mean, k = number of groups, nⱼ = number of participants in group j, N = total number of participants.

Classic ANOVA Table. Total variability = between variability + within variability. Sums of squares: SS_total = SS_between + SS_within, where SS_between = Σⱼ nⱼ(x̄ⱼ − x̄)², SS_within = Σⱼ Σᵢ (xᵢⱼ − x̄ⱼ)², and SS_total = Σⱼ Σᵢ (xᵢⱼ − x̄)².

Computation. Degrees of freedom: df_between = k − 1, df_within = N − k. Mean squares: MS_between = SS_between / df_between, MS_within = SS_within / df_within. F = MS_between / MS_within.
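The computation above can be sketched as a short pure-Python function (the data in the example are hypothetical, not the ones from the slides):

```python
# One-way ANOVA from the formulas above:
#   SS_between = sum_j n_j * (mean_j - grand_mean)^2, df = k - 1
#   SS_within  = sum_j sum_i (x_ij - mean_j)^2,       df = N - k
#   F = MS_between / MS_within

def one_way_anova(groups):
    """groups: list of lists of scores, one inner list per group.
    Returns (F, df_between, df_within)."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical example with 3 groups of 3 participants
F, df1, df2 = one_way_anova([[2, 3, 4], [5, 6, 7], [8, 9, 10]])  # F = 27.0
```

The resulting F is then compared against the critical value of the F distribution with (df_between, df_within) degrees of freedom.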

ANOVA

Analysis of Variance (ANOVA). From ANOVA to GLM. If the independent variables are continuous and the dependent variable is also continuous, then we perform a multiple regression. If the independent variables are discrete and the dependent variable is still continuous, we perform an ANOVA. In GLM form: xᵢⱼ = μ + τⱼ + eᵢⱼ, where μ = grand mean, τ = treatment effect, e = error.

Analysis of Variance (ANOVA) Example

Analysis of Variance (ANOVA). Using the GLM approach through a coding matrix. Logic: if, for example, there are 3 groups and we know that participant number 12 is part of neither the first nor the second group, then we know that this participant is necessarily part of group 3.

Analysis of Variance (ANOVA). Performing ANOVA using GLM through a coding matrix. Logic: in other words, there are only 2 degrees of freedom in group assignment. Therefore, the third group column is eliminated. A value of 1 is assigned to the participants of group i and a value of 0 to the other groups, whereas the participants of the last group receive a value of −1 in every column (to balance things).

Analysis of Variance (ANOVA). Then, for each subject we associate its corresponding group coding; these codes form the independent variables (X).
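The coding scheme described above (1 for group i, 0 elsewhere, −1 for the last group) can be sketched as follows; the group labels in the example are hypothetical:

```python
def effect_coding(group_labels, k):
    """Build the (k - 1)-column effect-coding matrix described above.
    group_labels: 0-based group index for each participant."""
    rows = []
    for g in group_labels:
        if g == k - 1:                 # last group: -1 in every column
            rows.append([-1] * (k - 1))
        else:                          # group i: 1 in column i, 0 elsewhere
            rows.append([1 if j == g else 0 for j in range(k - 1)])
    return rows

# Six hypothetical participants, two per group, three groups
X = effect_coding([0, 0, 1, 1, 2, 2], k=3)
# X = [[1, 0], [1, 0], [0, 1], [0, 1], [-1, -1], [-1, -1]]
```

Note that X has only k − 1 columns, matching the k − 1 between-group degrees of freedom.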

Analysis of Variance (ANOVA). In matrix form the model is X·A = y, where X is the coding matrix, A the vector of coefficients, and y the dependent variable. The SSCP (sums of squares and cross-products) matrix is computed from X and y.

R-Square. R² is obtained from the SSCP matrix, where N = number of participants and k = number of predictors (independent variables).

ANOVA Table. The null hypothesis is that the R-Square between the predictors and the criterion is null. Since F(3,32) = 5.86938, p < 0.05, we reject H₀ and accept H₁: there is at least one group that is different from the others.

ANOVA Terminology Coefficient of determination (proportion of explained variation)

ANOVA. Now you know it! ANOVA is a special case of regression. The same logic can be applied to the t-test, factorial ANOVA, ANCOVA, simple effects, and post hoc tests (Tukey, Bonferroni, LSD, etc.).

Principal Component Analysis (PCA)

PCA. Why: to discover or to reduce the dimensionality of the data set, and to identify new meaningful underlying variables. Assumptions: sample size of about 300 (in general), normality, linearity, absence of outliers among cases.

PCA Illustration First principal component Second principal component

PCA Preliminary. The raw data matrix X is standardized into Z-scores, Z.

PCA Preliminary. From the Z-scores, the SSCP matrix yields the correlation matrix: M = ZᵀZ / (N − 1).

PCA Eigenvalues and eigenvectors. Let's define a random vector v(0) = [1, 1]ᵀ. Now, if we compute the product of the correlation matrix M and v(0), and re-multiply the result by M, again, again, and again, what will the result be after k iterations?


PCA Eigenvalues and eigenvectors. After convergence: 1- The direction of the stable vector = eigenvector (v). 2- The stable vector's lengthening factor = eigenvalue (λ).

PCA Eigenvalues and eigenvectors. Once the first eigenvector (and associated eigenvalue) has been identified, we remove its contribution from the matrix (deflation), and we repeat the process until all the eigenvectors and eigenvalues have been extracted.
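The repeated-multiplication idea above is the power iteration method; deflation then removes the found component. A minimal pure-Python sketch for a 2×2 correlation matrix (the matrix values are hypothetical):

```python
def power_iteration_2x2(M, v0=(1.0, 1.0), iters=200):
    """Power iteration: repeatedly multiply v by M, keeping only the
    direction. Returns (eigenvalue, eigenvector) of the dominant component."""
    v = [float(v0[0]), float(v0[1])]
    for _ in range(iters):
        w = [M[0][0] * v[0] + M[0][1] * v[1],
             M[1][0] * v[0] + M[1][1] * v[1]]
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = [w[0] / norm, w[1] / norm]        # normalize: keep the direction
    # lengthening factor along the stable direction = eigenvalue
    Mv = [M[0][0] * v[0] + M[0][1] * v[1],
          M[1][0] * v[0] + M[1][1] * v[1]]
    lam = v[0] * Mv[0] + v[1] * Mv[1]
    return lam, v

def deflate(M, lam, v):
    """Remove the found component: M - lambda * v v^T."""
    return [[M[i][j] - lam * v[i] * v[j] for j in range(2)] for i in range(2)]

M = [[1.0, 0.6], [0.6, 1.0]]                  # hypothetical correlation matrix
lam1, v1 = power_iteration_2x2(M)             # lam1 = 1.6
# start the second run from a different vector, since [1, 1] happens to be
# orthogonal to the remaining eigenvector of this particular matrix
lam2, v2 = power_iteration_2x2(deflate(M, lam1, v1), v0=(1.0, 0.0))  # lam2 = 0.4
```

Note that lam1 + lam2 = 2, the trace of M: the eigenvalues of a correlation matrix sum to the number of variables.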

PCA Eigenvalues and eigenvectors. There will be as many eigenvector/eigenvalue pairs as there are variables. Each eigenvector will be orthogonal to the others.


PCA Eigenvalues and eigenvectors. How many are important? Method 1: plot the eigenvalues (scree plot); if the points on the graph tend to level out (show an "elbow"), the eigenvalues past the elbow are usually close enough to zero that they can be ignored. Method 2: keep enough components to account for a chosen proportion of the variance (e.g., 90%).
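Method 2 above can be sketched directly from a list of eigenvalues (the values in the example are hypothetical):

```python
def components_for_variance(eigenvalues, target=0.90):
    """Smallest number of components whose eigenvalues account for at
    least `target` of the total variance (Method 2 above)."""
    total = sum(eigenvalues)
    cum = 0.0
    for i, lam in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cum += lam
        if cum / total >= target:
            return i
    return len(eigenvalues)

# Hypothetical eigenvalues of a 4-variable correlation matrix (sum = 4)
n = components_for_variance([2.4, 1.1, 0.3, 0.2], target=0.90)  # n = 3
```

Because the eigenvalues of a correlation matrix sum to the number of variables, the ratio cum/total is the proportion of variance accounted for.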

PCA Eigenvalues and eigenvectors. Illustration of the data (x1, x2, x3, x4) and the selected eigenvectors (λ₁, λ₂).

PCA VARIMAX Rotation. Why? To improve readability. The VARIMAX rotation aims at finding a solution where each original variable loads highly on one particular factor and as low as possible on the other factors. The rotation is defined by a rotation matrix parameterized by an angle θ: R = [[cos θ, −sin θ], [sin θ, cos θ]].

PCA VARIMAX Rotation. The algorithm maximizes the VARIMAX index V, the sum over factors of the variances of the squared component loadings. V is a long equation that contains the variable θ; an optimization technique is then used to find the value of θ that maximizes V.
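For two components the rotation has a single angle θ, so the optimization can be sketched as a brute-force scan over θ. The loading matrix below is hypothetical, and V is taken, as above, as the sum over factors of the variance of the squared loadings:

```python
import math

def varimax_index(loadings):
    """V = sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    V = 0.0
    for j in range(len(loadings[0])):
        sq = [row[j] ** 2 for row in loadings]
        mean = sum(sq) / p
        V += sum((s - mean) ** 2 for s in sq) / p
    return V

def rotate(loadings, theta):
    """Post-multiply the loading rows by R = [[cos, -sin], [sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

def varimax_2d(loadings, steps=3600):
    """Scan theta over [0, pi/2) and keep the angle maximizing V."""
    best_theta = max((k * math.pi / (2 * steps) for k in range(steps)),
                     key=lambda t: varimax_index(rotate(loadings, t)))
    return rotate(loadings, best_theta), best_theta

# Hypothetical loadings: 4 variables on 2 components, no simple structure
A = [[0.7, 0.7], [0.6, 0.6], [0.7, -0.7], [0.6, -0.6]]
rotated, theta = varimax_2d(A)   # theta is pi/4 for this symmetric example
```

After rotation each variable loads essentially on one component only, which is exactly the "readability" the slides describe; real implementations optimize θ analytically rather than by grid search.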

PCA VARIMAX Rotation. The VARIMAX index V is computed from the rotated loading matrix U, the sum of the variances of its squared loadings.

PCA VARIMAX Rotation. Illustration: loadings of x1, x2, x3, x4 on the components λ₁ and λ₂, before and after rotation.

