MANOVA Multivariate Analysis of Variance. One way Analysis of Variance (ANOVA) Comparing k Populations.


1 MANOVA Multivariate Analysis of Variance

2 One way Analysis of Variance (ANOVA) Comparing k Populations

3 The F test – for comparing k means. Situation: We have k normal populations. Let $\mu_i$ and $\sigma$ denote the mean and standard deviation of population i, i = 1, 2, 3, …, k. Note: we assume that the standard deviation for each population is the same: $\sigma_1 = \sigma_2 = \dots = \sigma_k = \sigma$.

4 We want to test $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ against $H_A: \mu_i \ne \mu_j$ for at least one pair i, j.

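5 The F statistic: $F = \dfrac{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2 / (k-1)}{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2 / (N-k)}$, where $x_{ij}$ = the j-th observation in the i-th sample, $n_i$ = the size of the i-th sample, $\bar{x}_i$ = the mean of the i-th sample, $\bar{x}$ = the overall mean of all observations, and $N = \sum_{i=1}^{k} n_i$.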

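6 The ANOVA table

| Source | S.S. | d.f. | M.S. | F |
|---|---|---|---|---|
| Between | SS Between | k - 1 | MS Between | MS Between / MS Within |
| Within | SS Within | N - k | MS Within | |

The ANOVA table is a tool for displaying the computations for the F test. It is very important when the Between Sample variability is due to two or more factors.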

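7 Computing Formulae: Compute
1) $T_i = \sum_{j=1}^{n_i} x_{ij}$ = the total for sample i
2) $G = \sum_{i=1}^{k} T_i$ = the grand total
3) $N = \sum_{i=1}^{k} n_i$ = the total sample size
4) $\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2$
5) $\sum_{i=1}^{k} T_i^2 / n_i$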

8 The data. Assume we have collected data from each of k populations. Let $x_{i1}, x_{i2}, x_{i3}, \dots, x_{in_i}$ denote the $n_i$ observations from population i, i = 1, 2, 3, …, k.

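9 Then
1) $SS_{Total} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - G^2/N$
2) $SS_{Between} = \sum_{i=1}^{k} T_i^2/n_i - G^2/N$
3) $SS_{Within} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k} T_i^2/n_i$
where $T_i$ is the total for sample i, $G$ is the grand total and $N$ is the total sample size.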

10 Anova Table

| Source | d.f. | Sum of Squares | Mean Square | F-ratio |
|---|---|---|---|---|
| Between | k - 1 | SS Between | MS Between | MS B / MS W |
| Within | N - k | SS Within | MS Within | |
| Total | N - 1 | SS Total | | |

11 Example. In the following example we are comparing weight gains resulting from the following six diets:
1. Diet 1 - High protein, Beef
2. Diet 2 - High protein, Cereal
3. Diet 3 - High protein, Pork
4. Diet 4 - Low protein, Beef
5. Diet 5 - Low protein, Cereal
6. Diet 6 - Low protein, Pork

12

13 Thus $F = MS_{Between}/MS_{Within} = 922.587/214.556 = 4.3$. Since $F > F_{0.05}(5, 54) = 2.386$, we reject $H_0$.

14 Anova Table

| Source | d.f. | Sum of Squares | Mean Square | F-ratio |
|---|---|---|---|---|
| Between | 5 | 4612.933 | 922.587 | 4.3 ** (p = 0.0023) |
| Within | 54 | 11586.000 | 214.556 | |
| Total | 59 | 16198.933 | | |

* - Significant at 0.05 (not 0.01)
** - Significant at 0.01
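The entries of this table can be checked directly from the raw weight gains tabulated on slide 27. The following is a minimal sketch in plain Python; the function name `one_way_anova` and the dictionary layout are illustrative choices, not part of the original slides.

```python
# Weight gains (grams) for the six diets, copied from the table on slide 27.
gains = {
    "Diet 1": [73, 102, 118, 104, 81, 107, 100, 87, 117, 111],
    "Diet 2": [98, 74, 56, 111, 95, 88, 82, 77, 86, 92],
    "Diet 3": [94, 79, 96, 98, 102, 102, 108, 91, 120, 105],
    "Diet 4": [90, 76, 90, 64, 86, 51, 72, 90, 95, 78],
    "Diet 5": [107, 95, 97, 80, 98, 74, 74, 67, 89, 58],
    "Diet 6": [49, 82, 73, 86, 81, 97, 106, 70, 61, 82],
}


def one_way_anova(samples):
    """Return (SS_between, SS_within, df_between, df_within, F)."""
    k = len(samples)
    N = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / N
    means = [sum(s) / len(s) for s in samples]
    # Between: squared deviations of sample means from the grand mean,
    # weighted by sample size.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within: squared deviations of observations from their own sample mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_b, df_w = k - 1, N - k
    f_ratio = (ss_between / df_b) / (ss_within / df_w)
    return ss_between, ss_within, df_b, df_w, f_ratio


ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(list(gains.values()))
print(f"Between: SS = {ss_b:.3f}, d.f. = {df_b}, MS = {ss_b / df_b:.3f}")
print(f"Within:  SS = {ss_w:.3f}, d.f. = {df_w}, MS = {ss_w / df_w:.3f}")
print(f"F = {f_ratio:.2f}")
```

The sums of squares agree with the table (4612.933 and 11586.000), and the F-ratio rounds to 4.30. The p-value is not computed here because it needs the F distribution, which the standard library does not provide.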

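15 Equivalence of the F-test and the t-test when k = 2. The t-test: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{Pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, where $s_{Pooled}^2 = \dfrac{\sum_{j}(x_{1j} - \bar{x}_1)^2 + \sum_{j}(x_{2j} - \bar{x}_2)^2}{n_1 + n_2 - 2}$.

16 The F-test: with k = 2 the numerator has k - 1 = 1 degree of freedom and the denominator is $s_{Pooled}^2$, so $F = \dfrac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2}{s_{Pooled}^2}$.

17 Since $\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$, we have $\bar{x}_1 - \bar{x} = \dfrac{n_2(\bar{x}_1 - \bar{x}_2)}{n_1 + n_2}$ and $\bar{x}_2 - \bar{x} = -\dfrac{n_1(\bar{x}_1 - \bar{x}_2)}{n_1 + n_2}$, so that $n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 = \dfrac{n_1 n_2}{n_1 + n_2}(\bar{x}_1 - \bar{x}_2)^2 = \dfrac{(\bar{x}_1 - \bar{x}_2)^2}{\frac{1}{n_1} + \frac{1}{n_2}}$.

18 Hence $F = t^2$.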
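The equivalence for k = 2 can be checked numerically. The sketch below computes the pooled two-sample t statistic and the one-way F statistic for two of the diets from slide 27 (Diet 1 and Diet 4, chosen here only for illustration) and compares F with the square of t.

```python
from math import sqrt

# Two samples: Diet 1 (High protein, Beef) and Diet 4 (Low protein, Beef), slide 27.
x1 = [73, 102, 118, 104, 81, 107, 100, 87, 117, 111]
x2 = [90, 76, 90, 64, 86, 51, 72, 90, 95, 78]


def mean(values):
    return sum(values) / len(values)


def sum_sq_dev(values):
    """Sum of squared deviations from the sample mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


n1, n2 = len(x1), len(x2)

# Pooled variance, which is also MS_Within when k = 2.
pooled_var = (sum_sq_dev(x1) + sum_sq_dev(x2)) / (n1 + n2 - 2)

# Two-sample t statistic with pooled standard deviation.
t_stat = (mean(x1) - mean(x2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA F statistic: MS_Between has k - 1 = 1 degree of freedom.
grand_mean = (sum(x1) + sum(x2)) / (n1 + n2)
ms_between = n1 * (mean(x1) - grand_mean) ** 2 + n2 * (mean(x2) - grand_mean) ** 2
f_stat = ms_between / pooled_var

print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

The two printed values of t squared and F coincide up to floating-point rounding, which is what the algebra for k = 2 predicts.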

19 Factorial Experiments Analysis of Variance

20 Dependent variable Y; k categorical independent variables A, B, C, … (the Factors). Let
– a = the number of categories of A
– b = the number of categories of B
– c = the number of categories of C
– etc.

21 The Completely Randomized Design. We form the set of all treatment combinations, that is, the set of all combinations of the levels of the k factors.
– Total number of treatment combinations: t = abc….
In the completely randomized design, n experimental units (test animals, test plots, etc.) are randomly assigned to each treatment combination.
– Total number of experimental units: N = nt = nabc….

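22 The treatment combinations can be thought of as arranged in a k-dimensional rectangular block. [Figure: two-factor case, a grid with rows for levels 1, 2, …, a of A and columns for levels 1, 2, …, b of B.]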

23 [Figure: three-factor case, a rectangular block with axes A, B and C.]

24 The Completely Randomized Design as described, with the same number n of observations for every treatment combination, is called balanced. If the number of observations per treatment combination is unequal, the design is called unbalanced (resulting in a mathematically more complex analysis and computations). If for some of the treatment combinations there are no observations, the design is called incomplete. (In this case it may happen that some of the parameters, main effects and interactions, cannot be estimated.)
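The three cases can be told apart from the number of observations in each treatment combination. A minimal sketch follows; the function `classify_design` and the unbalanced and incomplete example counts are hypothetical, while the balanced example uses the design of the diet study (10 rats per Level-Source combination).

```python
def classify_design(counts):
    """Classify a design from a mapping: treatment combination -> number of observations."""
    sizes = list(counts.values())
    if any(n == 0 for n in sizes):
        return "incomplete"   # at least one treatment combination has no observations
    if len(set(sizes)) == 1:
        return "balanced"     # every treatment combination has the same n
    return "unbalanced"       # all cells observed, but with unequal n


levels = ["High", "Low"]
sources = ["Beef", "Cereal", "Pork"]

# Diet study: 10 rats in each of the 6 Level-Source combinations.
diet_counts = {(lv, src): 10 for lv in levels for src in sources}

# Hypothetical variants, for illustration only.
unbalanced_counts = dict(diet_counts)
unbalanced_counts[("Low", "Pork")] = 8
incomplete_counts = dict(diet_counts)
incomplete_counts[("Low", "Pork")] = 0

print(classify_design(diet_counts))
print(classify_design(unbalanced_counts))
print(classify_design(incomplete_counts))
```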

25 Example. In this example we are examining the effect of the level of protein A (High or Low) and the source of protein B (Beef, Cereal, or Pork) on weight gains (grams) in rats. We have n = 10 test animals randomly assigned to each of k = 6 diets.

26 The k = 6 diets are the 6 = 2×3 Level-Source combinations:
1. High - Beef
2. High - Cereal
3. High - Pork
4. Low - Beef
5. Low - Cereal
6. Low - Pork

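27 Table: Gains in weight (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork)

| Level of Protein | High | High | High | Low | Low | Low |
|---|---|---|---|---|---|---|
| Source of Protein | Beef | Cereal | Pork | Beef | Cereal | Pork |
| Diet | 1 | 2 | 3 | 4 | 5 | 6 |
| | 73 | 98 | 94 | 90 | 107 | 49 |
| | 102 | 74 | 79 | 76 | 95 | 82 |
| | 118 | 56 | 96 | 90 | 97 | 73 |
| | 104 | 111 | 98 | 64 | 80 | 86 |
| | 81 | 95 | 102 | 86 | 98 | 81 |
| | 107 | 88 | 102 | 51 | 74 | 97 |
| | 100 | 82 | 108 | 72 | 74 | 106 |
| | 87 | 77 | 91 | 90 | 67 | 70 |
| | 117 | 86 | 120 | 95 | 89 | 61 |
| | 111 | 92 | 105 | 78 | 58 | 82 |
| Mean | 100.0 | 85.9 | 99.5 | 79.2 | 83.9 | 78.7 |
| Std. Dev. | 15.14 | 15.02 | 10.92 | 13.89 | 15.71 | 16.55 |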

28 Treatment combinations

| Level of Protein | Source: Beef | Source: Cereal | Source: Pork |
|---|---|---|---|
| High | Diet 1 | Diet 2 | Diet 3 |
| Low | Diet 4 | Diet 5 | Diet 6 |

29 Summary Table of Means

| Level of Protein | Source: Beef | Source: Cereal | Source: Pork | Overall |
|---|---|---|---|---|
| Low | 79.20 | 83.90 | 78.70 | 80.60 |
| High | 100.00 | 85.90 | 99.50 | 95.13 |
| Overall | 89.60 | 84.90 | 89.10 | 87.87 |

30 Profiles of the response relative to a factor: a graphical representation of the effect of a factor on a response variable (dependent variable).

31 Profile of Y for A. [Figure: the response Y plotted against the levels 1, 2, 3, …, a of A.] The profile could be for an individual case or averaged over a group of cases. It could be for a specific level of another factor or averaged over the levels of another factor.

32 Profiles of Weight Gain for Source and Level of Protein

33

34 Example – Four factor experiment. Four factors are studied for their effect on Y (luster of paint film). The four factors are:
1) Film thickness (1 or 2 mils)
2) Drying conditions (Regular or Special)
3) Length of wash (20, 30, 40 or 60 minutes), and
4) Temperature of wash (92 °C or 100 °C)
Two observations of film luster (Y) are taken for each treatment combination.

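35 The data are tabulated below (two observations per treatment combination):

| Thickness | Minutes | Regular Dry, 92 °C | Regular Dry, 100 °C | Special Dry, 92 °C | Special Dry, 100 °C |
|---|---|---|---|---|---|
| 1 mil | 20 | 3.4, 3.4 | 19.6, 14.5 | 2.1, 3.8 | 17.2, 13.4 |
| 1 mil | 30 | 4.1, 4.1 | 17.5, 17.0 | 4.0, 4.6 | 13.5, 14.3 |
| 1 mil | 40 | 4.9, 4.2 | 17.6, 15.2 | 5.1, 3.3 | 16.0, 17.8 |
| 1 mil | 60 | 5.0, 4.9 | 20.9, 17.1 | 8.3, 4.3 | 17.5, 13.9 |
| 2 mil | 20 | 5.5, 3.7 | 26.6, 29.5 | 4.5, 4.5 | 25.6, 22.5 |
| 2 mil | 30 | 5.7, 6.1 | 31.6, 30.2 | 5.9, 5.9 | 29.2, 29.8 |
| 2 mil | 40 | 5.5, 5.6 | 30.5, 30.2 | 5.5, 5.8 | 32.6, 27.4 |
| 2 mil | 60 | 7.2, 6.0 | 31.4, 29.6 | 8.0, 9.9 | 33.5, 29.5 |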
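The set of treatment combinations in a factorial experiment is the Cartesian product of the factor levels. As a small sketch, the following enumerates the combinations for the paint film example, using the levels that appear in the data table above; the dictionary keys are illustrative names.

```python
from itertools import product

# Factor levels as they appear in the paint film data table.
factors = {
    "thickness_mils": [1, 2],
    "drying": ["Regular", "Special"],
    "wash_minutes": [20, 30, 40, 60],
    "wash_temp_C": [92, 100],
}

# Every treatment combination is one choice of level per factor.
combinations = list(product(*factors.values()))

t = len(combinations)  # t = a * b * c * d
n = 2                  # observations per treatment combination
N = n * t              # total number of observations

print(f"t = {t} treatment combinations, N = {N} observations")
print(combinations[0])
```

This gives t = 2 × 2 × 4 × 2 = 32 treatment combinations and N = 64 observations, matching the 8 rows of 8 values in the table.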

36 Definition: A factor is said to not affect the response if the profile of the factor is horizontal for all combinations of levels of the other factors, that is, there is no change in the response when you change the levels of the factor (true for all combinations of levels of the other factors). Otherwise the factor is said to affect the response.

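37 Profile of Y for A: A affects the response. [Figure: Y plotted against the levels 1, 2, 3, …, a of A, with one profile for each level of B.]

38 Profile of Y for A: no effect on the response. [Figure: Y plotted against the levels 1, 2, 3, …, a of A, with one profile for each level of B.]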

39 Definition: Two (or more) factors are said to interact if changes in the response when you change the level of one factor depend on the level(s) of the other factor(s). In this case the profiles of the factor for different levels of the other factor(s) are not parallel. Otherwise the factors are said to be additive, and the profiles of the factor for different levels of the other factor(s) are parallel.
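Parallel profiles mean that the difference between two levels of one factor is the same at every level of the other factor. A rough numerical check on the cell means from the summary table on slide 29 is sketched below. It works with sample means only, so unequal differences merely suggest an interaction; whether the interaction is real has to be settled by a formal test.

```python
# Cell means (grams) from the summary table of means on slide 29.
cell_means = {
    "High": {"Beef": 100.0, "Cereal": 85.9, "Pork": 99.5},
    "Low": {"Beef": 79.2, "Cereal": 83.9, "Pork": 78.7},
}
sources = ["Beef", "Cereal", "Pork"]

# Vertical gap between the High and Low profiles at each source of protein.
gap = {s: round(cell_means["High"][s] - cell_means["Low"][s], 1) for s in sources}

# If the sample profiles were exactly parallel, every gap would be equal
# and the spread would be zero.
spread = max(gap.values()) - min(gap.values())

for s in sources:
    print(f"{s}: High - Low = {gap[s]}")
print(f"spread of the gaps = {spread:.1f}")
```

Here the gap is 20.8 grams for Beef and Pork but only 2.0 grams for Cereal, so the sample profiles are far from parallel.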

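40 Interacting factors A and B. [Figure: Y plotted against the levels 1, 2, 3, …, a of A, with one profile for each level of B; the profiles are not parallel.]

41 Additive factors A and B. [Figure: Y plotted against the levels 1, 2, 3, …, a of A, with one profile for each level of B; the profiles are parallel.]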

42 If two (or more) factors interact, each factor affects the response. If two (or more) factors are additive, it still remains to be determined whether the factors affect the response. In factorial experiments we are interested in determining
– which factors affect the response, and
– which groups of factors interact.

43 The testing in factorial experiments
1. Test first the higher order interactions.
2. If an interaction is present there is no need to test lower order interactions or main effects involving those factors. All factors in the interaction affect the response and they interact.
3. The testing continues with lower order interactions and main effects for factors which have not yet been determined to affect the response.

44 Models for factorial Experiments

45 The Single Factor Experiment. Situation: We have t = a treatment combinations. Let $\mu_i$ and $\sigma$ denote the mean and standard deviation of observations from treatment i, i = 1, 2, 3, …, a. Note: we assume that the standard deviation for each population is the same: $\sigma_1 = \sigma_2 = \dots = \sigma_a = \sigma$.

46 The data Assume we have collected data for each of the a treatments Let y i1, y i2, y i3, …, y in denote the n observations for treatment i. i = 1, 2, 3, … a.

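47 The model: $y_{ij} = \mu_i + \varepsilon_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $\varepsilon_{ij}$ has N(0, $\sigma^2$) distribution, $\mu = \frac{1}{a}\sum_{i=1}^{a}\mu_i$ (overall mean effect) and $\alpha_i = \mu_i - \mu$ (Effect of Factor A). Note: $\sum_{i=1}^{a}\alpha_i = 0$ by their definition.

48 Model 1: $y_{ij}$ (i = 1, …, a; j = 1, …, n) are independent Normal with mean $\mu_i$ and variance $\sigma^2$. Model 2: $y_{ij} = \mu_i + \varepsilon_{ij}$, where $\varepsilon_{ij}$ (i = 1, …, a; j = 1, …, n) are independent Normal with mean 0 and variance $\sigma^2$. Model 3: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $\varepsilon_{ij}$ (i = 1, …, a; j = 1, …, n) are independent Normal with mean 0 and variance $\sigma^2$ and $\sum_{i=1}^{a}\alpha_i = 0$.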

49 The Two Factor Experiment. Situation: We have t = ab treatment combinations. Let $\mu_{ij}$ and $\sigma$ denote the mean and standard deviation of observations from the treatment combination when A = i and B = j, i = 1, 2, 3, …, a, j = 1, 2, 3, …, b.

50 The data Assume we have collected data (n observations) for each of the t = ab treatment combinations. Let y ij1, y ij2, y ij3, …, y ijn denote the n observations for treatment combination - A = i, B = j. i = 1, 2, 3, … a, j = 1, 2, 3, … b.

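51 The model: $y_{ijk} = \mu_{ij} + \varepsilon_{ijk}$, where $\varepsilon_{ijk}$ has N(0, $\sigma^2$) distribution and $\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$, with $\mu = \frac{1}{ab}\sum_{i}\sum_{j}\mu_{ij}$, $\alpha_i = \bar{\mu}_{i\cdot} - \mu$, $\beta_j = \bar{\mu}_{\cdot j} - \mu$ and $(\alpha\beta)_{ij} = \mu_{ij} - \bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot j} + \mu$. Here $\bar{\mu}_{i\cdot} = \frac{1}{b}\sum_{j}\mu_{ij}$ and $\bar{\mu}_{\cdot j} = \frac{1}{a}\sum_{i}\mu_{ij}$.

52 The model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, where $\varepsilon_{ijk}$ has N(0, $\sigma^2$) distribution. Note: $\sum_{i}\alpha_i = 0$, $\sum_{j}\beta_j = 0$, $\sum_{i}(\alpha\beta)_{ij} = 0$ for each j and $\sum_{j}(\alpha\beta)_{ij} = 0$ for each i, by their definition.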

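53 Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, in which $\mu$ is the Mean, $\alpha_i$ and $\beta_j$ are the Main effects, $(\alpha\beta)_{ij}$ is the Interaction Effect and $\varepsilon_{ijk}$ is the Error. Here $\varepsilon_{ijk}$ (i = 1, …, a; j = 1, …, b; k = 1, …, n) are independent Normal with mean 0 and variance $\sigma^2$, and $\sum_{i}\alpha_i = 0$, $\sum_{j}\beta_j = 0$, $\sum_{i}(\alpha\beta)_{ij} = 0$, $\sum_{j}(\alpha\beta)_{ij} = 0$.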

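54 Maximum Likelihood Estimates for the model $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, where $\varepsilon_{ijk}$ (i = 1, …, a; j = 1, …, b; k = 1, …, n) are independent Normal with mean 0 and variance $\sigma^2$: $\hat{\mu} = \bar{y}_{\cdot\cdot\cdot}$, $\hat{\alpha}_i = \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}$, $\hat{\beta}_j = \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}$, $\widehat{(\alpha\beta)}_{ij} = \bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}$ and $\hat{\sigma}^2 = \frac{1}{abn}\sum_{i}\sum_{j}\sum_{k}(y_{ijk} - \bar{y}_{ij\cdot})^2$.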
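In a balanced design the usual estimates of the overall mean, main effects and interaction effects are simple differences of cell means, row means, column means and the grand mean. The sketch below computes them for the diet example from the cell means on slide 29, with A = level of protein and B = source of protein; variable names such as `alpha_hat` are illustrative.

```python
# Cell means (grams) from the summary table of means on slide 29.
cell_means = {
    ("High", "Beef"): 100.0, ("High", "Cereal"): 85.9, ("High", "Pork"): 99.5,
    ("Low", "Beef"): 79.2, ("Low", "Cereal"): 83.9, ("Low", "Pork"): 78.7,
}
levels = ["High", "Low"]             # factor A
sources = ["Beef", "Cereal", "Pork"]  # factor B
a, b = len(levels), len(sources)

# With equal n per cell, marginal means are plain averages of cell means.
mu_hat = sum(cell_means.values()) / (a * b)
mean_A = {i: sum(cell_means[(i, j)] for j in sources) / b for i in levels}
mean_B = {j: sum(cell_means[(i, j)] for i in levels) / a for j in sources}

alpha_hat = {i: mean_A[i] - mu_hat for i in levels}
beta_hat = {j: mean_B[j] - mu_hat for j in sources}
ab_hat = {
    (i, j): cell_means[(i, j)] - mean_A[i] - mean_B[j] + mu_hat
    for i in levels
    for j in sources
}

print(f"mu_hat = {mu_hat:.2f}")
for i in levels:
    print(f"alpha_hat[{i}] = {alpha_hat[i]:+.2f}")
for j in sources:
    print(f"beta_hat[{j}] = {beta_hat[j]:+.2f}")
for key, value in ab_hat.items():
    print(f"ab_hat{key} = {value:+.2f}")
```

The estimated main effects sum to zero over their index, and the interaction estimates sum to zero over each index separately, mirroring the side conditions on the parameters.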

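55 The maximum likelihood estimator of $\sigma^2$ is not an unbiased estimator of $\sigma^2$ (as is usually the case when estimating a variance by maximum likelihood). The unbiased estimator results when we divide by ab(n - 1) instead of abn.

56 The unbiased estimator of $\sigma^2$ is $s^2 = \dfrac{SS_{Error}}{ab(n-1)} = MS_{Error}$, where $SS_{Error} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{ij\cdot})^2$.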

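57 Testing for Interaction: We want to test $H_0: (\alpha\beta)_{ij} = 0$ for all i and j, against $H_A: (\alpha\beta)_{ij} \ne 0$ for at least one i and j. The test statistic is $F = \dfrac{MS_{AB}}{MS_{Error}}$, where $MS_{AB} = \dfrac{SS_{AB}}{(a-1)(b-1)}$ and $SS_{AB} = n\sum_{i}\sum_{j}(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2$.

58 We reject $H_0: (\alpha\beta)_{ij} = 0$ for all i and j, if $F > F_{\alpha}\big((a-1)(b-1),\ ab(n-1)\big)$.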

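59 Testing for the Main Effect of A: We want to test $H_0: \alpha_i = 0$ for all i, against $H_A: \alpha_i \ne 0$ for at least one i. The test statistic is $F = \dfrac{MS_A}{MS_{Error}}$, where $MS_A = \dfrac{SS_A}{a-1}$ and $SS_A = bn\sum_{i}(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$.

60 We reject $H_0: \alpha_i = 0$ for all i, if $F > F_{\alpha}\big(a-1,\ ab(n-1)\big)$.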

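61 Testing for the Main Effect of B: We want to test $H_0: \beta_j = 0$ for all j, against $H_A: \beta_j \ne 0$ for at least one j. The test statistic is $F = \dfrac{MS_B}{MS_{Error}}$, where $MS_B = \dfrac{SS_B}{b-1}$ and $SS_B = an\sum_{j}(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$.

62 We reject $H_0: \beta_j = 0$ for all j, if $F > F_{\alpha}\big(b-1,\ ab(n-1)\big)$.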

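63 The ANOVA Table

| Source | S.S. | d.f. | MS = SS/df | F |
|---|---|---|---|---|
| A | SS A | a - 1 | MS A | MS A / MS Error |
| B | SS B | b - 1 | MS B | MS B / MS Error |
| AB | SS AB | (a - 1)(b - 1) | MS AB | MS AB / MS Error |
| Error | SS Error | ab(n - 1) | MS Error | |
| Total | SS Total | abn - 1 | | |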
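As an illustration of how this table is filled in, the sketch below applies the definitions of the sums of squares to the diet data from slide 27, with A = level of protein (a = 2), B = source of protein (b = 3) and n = 10 rats per cell. It is plain Python written for this example, not the computing formulae of the original slides.

```python
# Weight gains (grams) by (level of protein, source of protein), from slide 27.
data = {
    ("High", "Beef"): [73, 102, 118, 104, 81, 107, 100, 87, 117, 111],
    ("High", "Cereal"): [98, 74, 56, 111, 95, 88, 82, 77, 86, 92],
    ("High", "Pork"): [94, 79, 96, 98, 102, 102, 108, 91, 120, 105],
    ("Low", "Beef"): [90, 76, 90, 64, 86, 51, 72, 90, 95, 78],
    ("Low", "Cereal"): [107, 95, 97, 80, 98, 74, 74, 67, 89, 58],
    ("Low", "Pork"): [49, 82, 73, 86, 81, 97, 106, 70, 61, 82],
}
levels = ["High", "Low"]
sources = ["Beef", "Cereal", "Pork"]
a, b, n = len(levels), len(sources), 10


def mean(values):
    return sum(values) / len(values)


grand = mean([y for cell in data.values() for y in cell])
mean_A = {i: mean([y for j in sources for y in data[(i, j)]]) for i in levels}
mean_B = {j: mean([y for i in levels for y in data[(i, j)]]) for j in sources}
mean_AB = {key: mean(cell) for key, cell in data.items()}

ss_A = b * n * sum((mean_A[i] - grand) ** 2 for i in levels)
ss_B = a * n * sum((mean_B[j] - grand) ** 2 for j in sources)
ss_AB = n * sum(
    (mean_AB[(i, j)] - mean_A[i] - mean_B[j] + grand) ** 2
    for i in levels
    for j in sources
)
ss_error = sum((y - mean_AB[key]) ** 2 for key, cell in data.items() for y in cell)

ms_error = ss_error / (a * b * (n - 1))
rows = [
    ("A (Level)", ss_A, a - 1),
    ("B (Source)", ss_B, b - 1),
    ("AB", ss_AB, (a - 1) * (b - 1)),
]
for name, ss, df in rows:
    print(f"{name:<11} SS = {ss:9.3f}  d.f. = {df}  F = {(ss / df) / ms_error:.2f}")
print(f"{'Error':<11} SS = {ss_error:9.3f}  d.f. = {a * b * (n - 1)}")
```

The three factorial sums of squares add up to the Between sum of squares of the one-way analysis of the six diets (4612.933), and the Error line equals the Within line of that analysis.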

64 Computing Formulae

65

66 MANOVA Multivariate Analysis of Variance

67 One way Multivariate Analysis of Variance (MANOVA) Comparing k p-variate Normal Populations

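68 The F test – for comparing k means. Situation: We have k p-variate normal populations. Let $\boldsymbol{\mu}_i$ and $\Sigma$ denote the mean vector and covariance matrix of population i, i = 1, 2, 3, …, k. Note: we assume that the covariance matrix for each population is the same.

69 We want to test $H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \dots = \boldsymbol{\mu}_k$ against $H_A: \boldsymbol{\mu}_i \ne \boldsymbol{\mu}_j$ for at least one pair i, j.

70 The data. Assume we have collected data from each of k populations. Let $\mathbf{x}_{i1}, \mathbf{x}_{i2}, \dots, \mathbf{x}_{in}$ denote the n observations (p-dimensional vectors) from population i, i = 1, 2, 3, …, k.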

71 Computing Formulae: Compute 1) 2) 3)

72 4) 5)

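73 Let $H = n\sum_{i=1}^{k}(\bar{\mathbf{x}}_i - \bar{\mathbf{x}})(\bar{\mathbf{x}}_i - \bar{\mathbf{x}})'$ = the Between SS and SP matrix, where $\bar{\mathbf{x}}_i$ is the mean vector of sample i and $\bar{\mathbf{x}}$ is the overall mean vector.

74 Let $E = \sum_{i=1}^{k}\sum_{j=1}^{n}(\mathbf{x}_{ij} - \bar{\mathbf{x}}_i)(\mathbf{x}_{ij} - \bar{\mathbf{x}}_i)'$ = the Within SS and SP matrix.

75 The Manova Table

| Source | SS and SP matrix |
|---|---|
| Between | H |
| Within | E |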

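76 There are several test statistics for testing $H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \dots = \boldsymbol{\mu}_k$ against $H_A: \boldsymbol{\mu}_i \ne \boldsymbol{\mu}_j$ for at least one pair i, j. In what follows H denotes the Between SS and SP matrix and E the Within SS and SP matrix.

77 1. Roy's largest root: the largest eigenvalue of $HE^{-1}$. This test statistic is derived using Roy's union-intersection principle. 2. Wilks' lambda: $\Lambda = \dfrac{|E|}{|H + E|}$. This test statistic is derived using the generalized likelihood ratio principle.

78 3. Lawley-Hotelling trace statistic: $\mathrm{tr}(HE^{-1})$. 4. Pillai trace statistic: $V = \mathrm{tr}\big(H(H + E)^{-1}\big)$.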
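All four statistics are functions of the Between and Within SS and SP matrices, written H and E in this sketch, through the eigenvalues of H E^{-1}. The example below uses a small made-up data set (k = 3 groups, p = 2 variables, n = 4 per group), purely for illustration, and plain Python 2×2 matrix helpers so that it runs without any libraries. Roy's largest root is taken here as the largest eigenvalue of H E^{-1}; some texts report λ/(1 + λ) instead.

```python
from math import sqrt

# Hypothetical bivariate observations for three groups (not from the slides).
groups = [
    [(2, 3), (3, 4), (4, 6), (3, 5)],
    [(5, 5), (6, 7), (7, 6), (6, 8)],
    [(8, 4), (9, 6), (7, 5), (9, 7)],
]


def mean_vec(rows):
    return [sum(r[c] for r in rows) / len(rows) for c in range(2)]


def outer(u):
    """Outer product u u' of a 2-vector with itself."""
    return [[u[i] * u[j] for j in range(2)] for i in range(2)]


def add(A, B, scale=1.0):
    """Return A + scale * B for 2x2 matrices."""
    return [[A[i][j] + scale * B[i][j] for j in range(2)] for i in range(2)]


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inv(A):
    d = det(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]


def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)] for i in range(2)]


def trace(A):
    return A[0][0] + A[1][1]


grand = mean_vec([r for g in groups for r in g])
H = [[0.0, 0.0], [0.0, 0.0]]  # Between SS and SP matrix
E = [[0.0, 0.0], [0.0, 0.0]]  # Within SS and SP matrix
for g in groups:
    m = mean_vec(g)
    H = add(H, outer([m[0] - grand[0], m[1] - grand[1]]), scale=len(g))
    for r in g:
        E = add(E, outer([r[0] - m[0], r[1] - m[1]]))

# Eigenvalues of the 2x2 matrix H E^{-1} from its trace and determinant.
HEinv = matmul(H, inv(E))
disc = sqrt(max(trace(HEinv) ** 2 - 4 * det(HEinv), 0.0))
lam1 = (trace(HEinv) + disc) / 2
lam2 = (trace(HEinv) - disc) / 2

roy = lam1
wilks = det(E) / det(add(H, E))
lawley_hotelling = trace(HEinv)
pillai = trace(matmul(H, inv(add(H, E))))

print(f"Roy's largest root     = {roy:.4f}")
print(f"Wilks' lambda          = {wilks:.4f}")
print(f"Lawley-Hotelling trace = {lawley_hotelling:.4f}")
print(f"Pillai trace           = {pillai:.4f}")
```

In terms of the eigenvalues λ of H E^{-1}, Wilks' lambda is the product of 1/(1 + λ), the Lawley-Hotelling trace is the sum of λ, and the Pillai trace is the sum of λ/(1 + λ), which gives a quick consistency check on the four values.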

79 Example. In the following study, n = 15 first year university students from three different school regions (A, B and C), each taking the following four courses (Math, Biology, English and Sociology), were observed. The marks in these courses are tabulated on the following slide:

80 The data

81 Summary Statistics

82 Computations : 1) 2) 3)

83 4) =

84 = 5)

85 Now = the Between SS and SP matrix

86 Let = the Within SS and SP matrix

87 Using SPSS to perform MANOVA

88 Selecting the variables and the Factors

89 The output

90 Univariate Tests

