SJS SDI_81 Design of Statistical Investigations
Stephen Senn
8 Factorial Designs
SJS SDI_82 Introduction
So far we have been looking at complications in the blocking structure.
However, we now introduce complications in the treatment structure.
We now look at factorial designs.
These are designs in which there are two or more dimensions to the treatments.
SJS SDI_83 Exp_8 (From Clarke and Kempson)
The yield of a chemical reaction is presumed to depend on two things:
A: The amount (low or high) of a certain chemical in the mixture
B: The presence or absence of a catalyst
An experiment is run to determine the importance of these in affecting yield.
SJS SDI_85 Terminology
A and B are factors
low and high are levels of the factor A
absence and presence are levels of the factor B
An experiment studying combinations of factors is called a factorial experiment
If all four combinations are studied, then this is a 2 x 2 or 2^2 factorial.
SJS SDI_86 Usual Notation for 2^2 Factorials
A and B are the factors
a and b are the higher levels
ab = the combination of both factors at higher level
a = A at higher level, B at lower level
b = A at lower level, B at higher level
(1) = both factors at lower level
SJS SDI_87 Main Effects and Interactions
The main effect of a factor is the average response (averaged over all levels of the other factors) to a change in the level of that factor.
Thus the main effect of A is the average of the difference between a and (1) and the difference between ab and b.
The interaction between two factors A and B is the difference between the effect of A at the higher level of B (ab - b) and the effect of A at the lower level of B (a - (1)).
Sometimes, by convention, this double difference is divided by 2.
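As a minimal sketch of these definitions, the Python function below computes both main effects and the interaction of a 2^2 factorial from the four treatment means. The function name effects_2x2 and the yields are hypothetical, chosen only for illustration, and the interaction uses the divide-by-2 convention.

```python
def effects_2x2(y1, ya, yb, yab):
    """Main effects and interaction for a 2^2 factorial.

    y1, ya, yb, yab are the mean responses for (1), a, b and ab.
    The interaction is divided by 2 (one common convention).
    """
    effect_a_at_low_b = ya - y1      # a - (1)
    effect_a_at_high_b = yab - yb    # ab - b
    effect_b_at_low_a = yb - y1      # b - (1)
    effect_b_at_high_a = yab - ya    # ab - a

    main_a = (effect_a_at_low_b + effect_a_at_high_b) / 2
    main_b = (effect_b_at_low_a + effect_b_at_high_a) / 2
    interaction_ab = (effect_a_at_high_b - effect_a_at_low_b) / 2
    return main_a, main_b, interaction_ab


# Hypothetical mean yields, not taken from Exp_8.
main_a, main_b, interaction_ab = effects_2x2(y1=10.0, ya=14.0, yb=12.0, yab=20.0)
print(main_a, main_b, interaction_ab)  # 6.0 4.0 2.0
```

With these made-up yields the effect of A is 4 at the lower level of B and 8 at the higher level, so the main effect is 6 and the halved double difference is 2.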
SJS SDI_89 Exp_9 (Clarke and Kempson)
Factor S: source of supply of a particular material
–Two sources; s is written when the first is used
Factor M: the speed of running a machine
–Two speeds; m is written whenever the higher is used
Experiment run on five days
Response: average number of defectives per batch
SJS SDI_810 Exp_9 (Clarke and Kempson)
The days determine the block structure of the experiment
The treatment structure is that of a 2^2 factorial
–Factors: S, M
–Treatment combinations: (1), s, m, sm
SJS SDI_814 Treatment Structure
The above analysis uses a one-dimensional treatment structure
–Single factor with four unordered levels
We wish, however, to distinguish between the constituent factors
This can be done as follows
SJS SDI_818 Exp_9 SPlus: Treatment as Factor with 4 Levels
> fit1 <- aov(Y ~ Block + Treat)
> summary(fit1)
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
    Block  4     0.655   0.1637   0.3350 0.8491633
    Treat  3  1165.750 388.5833 795.0554 0.0000000
Residuals 12     5.865   0.4888
SJS SDI_819 Exp_9 SPlus: Two equivalent statements using two factors with interactions
> fit2 <- aov(Y ~ Block + Supply * Machine)
> summary(fit2)
               Df Sum of Sq  Mean Sq  F Value     Pr(F)
         Block  4     0.655   0.1637    0.335 0.8491633
        Supply  1   966.050 966.0500 1976.573 0.0000000
       Machine  1   198.450 198.4500  406.036 0.0000000
Supply:Machine  1     1.250   1.2500    2.558 0.1357512
     Residuals 12     5.865   0.4888
> fit3 <- aov(Y ~ Block + Supply + Machine + Supply:Machine)
> summary(fit3)
               Df Sum of Sq  Mean Sq  F Value     Pr(F)
         Block  4     0.655   0.1637    0.335 0.8491633
        Supply  1   966.050 966.0500 1976.573 0.0000000
       Machine  1   198.450 198.4500  406.036 0.0000000
Supply:Machine  1     1.250   1.2500    2.558 0.1357512
     Residuals 12     5.865   0.4888
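The link between the one-factor and the factorial analyses can be checked by hand. The sketch below is in Python rather than SPlus and uses only figures printed in the ANOVA tables: the three single-degree-of-freedom terms should add up to the 3-df Treat sum of squares, and each F value should be the term's mean square divided by the residual mean square. The variable names are illustrative.

```python
# Figures copied from the fit1 and fit2 ANOVA tables.
ss_treat = 1165.750
ss_terms = {"Supply": 966.050, "Machine": 198.450, "Supply:Machine": 1.250}
ss_resid, df_resid = 5.865, 12

# The three 1-df factorial terms partition the 3-df Treat term.
ss_factorial_total = sum(ss_terms.values())
difference = ss_factorial_total - ss_treat

# Each term has 1 df, so its mean square equals its sum of squares;
# F is that mean square over the residual mean square.
ms_resid = ss_resid / df_resid
f_values = {term: ss / ms_resid for term, ss in ss_terms.items()}

print(f"Sum of factorial SS = {ss_factorial_total:.3f}, Treat SS = {ss_treat:.3f}")
for term, f in f_values.items():
    print(f"{term:15s} F = {f:.3f}")
```

The partition is exact here because the design is balanced and the factorial terms are mutually orthogonal.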
SJS SDI_820 Exp_9 SPlus: Two equivalent statements using two factors without interactions
> fit4 <- aov(Y ~ Block + Supply + Machine)
> summary(fit4)
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
    Block  4     0.655   0.1637    0.299 0.8732822
   Supply  1   966.050 966.0500 1765.095 0.0000000
  Machine  1   198.450 198.4500  362.593 0.0000000
Residuals 13     7.115   0.5473
> fit5 <- aov(Y ~ Block + Supply * Machine - Supply:Machine)
> summary(fit5)
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
    Block  4     0.655   0.1637    0.299 0.8732822
   Supply  1   966.050 966.0500 1765.095 0.0000000
  Machine  1   198.450 198.4500  362.593 0.0000000
Residuals 13     7.115   0.5473
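Dropping the interaction term does not change the main-effect sums of squares; it only moves the interaction's sum of squares and degree of freedom into the residual. A short Python sketch of that pooling, again using only figures from the printed tables (names are illustrative):

```python
# Figures copied from the tables with and without the interaction term.
ss_resid_full, df_resid_full = 5.865, 12
ss_interaction, df_interaction = 1.250, 1

# Removing Supply:Machine from the model pools it with the residual.
ss_resid_pooled = ss_resid_full + ss_interaction
df_resid_pooled = df_resid_full + df_interaction
ms_resid_pooled = ss_resid_pooled / df_resid_pooled

# Main-effect F values are recomputed against the pooled mean square.
f_supply = 966.050 / ms_resid_pooled
f_machine = 198.450 / ms_resid_pooled

print(round(ss_resid_pooled, 3), df_resid_pooled, round(ms_resid_pooled, 4))
print(round(f_supply, 3), round(f_machine, 3))
```

The pooled residual has 13 degrees of freedom and a slightly larger mean square, which is why the F values for Supply and Machine are smaller than in the model with the interaction.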
SJS SDI_821 Wilkinson and Rogers Notation
This is a common notation
A = main effect of factor A, B = main effect of factor B
A:B = interaction of A and B, A:B:C = three-factor interaction of A, B and C
The + sign is used to add effects, the - sign to subtract them
A*B = A + B + A:B = main effects of A and B and their interaction
A*B*C = A + B + C + A:B + A:C + B:C + A:B:C
NB In their original paper (Applied Statistics, 1973, 22, 392-399), W&R used a dot (.) instead of the : used in SPlus
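The expansion of the * operator follows a simple rule: every non-empty subset of the crossed factors contributes one term. The Python sketch below illustrates that rule only; expand_crossing is a hypothetical helper, not part of SPlus, and it ignores the other operators of the notation.

```python
from itertools import combinations


def expand_crossing(factors):
    """Expand A*B*... into main effects and all interactions.

    Terms are listed by increasing order and joined with ':' as in SPlus.
    """
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(":".join(combo))
    return terms


print(" + ".join(expand_crossing(["A", "B"])))
print(" + ".join(expand_crossing(["A", "B", "C"])))
```

The two printed lines reproduce the expansions of A*B and A*B*C given in the slide.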
SJS SDI_822 Exp_9 Design Notes
The two factors and their interaction are orthogonal
–A consequence of the treatment combinations chosen
They are also orthogonal to the blocks
–This is a consequence of how they were applied
Each combination is used on each day of the week
–This increases efficiency
Effectively, treatments are compared within blocks
SJS SDI_823 Exp_10 (Senn Example 7.1)
Cross-over comparing two formulations at two doses
–Solution and Suspension
–12 μg and 24 μg per puff
Four periods
Four sequences in a Latin Square used
16 patients allocated at random
–4 to each sequence
SJS SDI_825 Exp_10 Design Notes
Two-dimensional block structure
–16 patients x 4 periods
Treatment structure factorial
–Formulations x doses
Treatments allocated in a way that is orthogonal to the block structure
Latin square (replicated 4 times)
–Actually the patient changes
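The balance of such a design can be illustrated with a small Python sketch. The square actually used in the trial is not given in the slides, so the cyclic Latin square below is purely hypothetical, and the random allocation of patients to sequences is replaced by a fixed assignment.

```python
from collections import Counter

treatments = ["sol12", "sol24", "sus12", "sus24"]

# Hypothetical cyclic 4 x 4 Latin square: row = sequence, column = period.
# The square actually used in the trial is not given in the slides.
square = [[treatments[(row + col) % 4] for col in range(4)] for row in range(4)]

# 16 patients, 4 on each sequence (random allocation omitted in this sketch).
patient_sequences = [square[patient % 4] for patient in range(16)]

# Every patient receives each treatment exactly once ...
each_patient_complete = all(
    sorted(seq) == sorted(treatments) for seq in patient_sequences
)

# ... and every treatment is given 4 times in each period.
period_counts = [
    Counter(seq[period] for seq in patient_sequences) for period in range(4)
]

print(each_patient_complete)
print(period_counts[0])
```

Because each treatment occurs equally often with every patient and in every period, treatment comparisons are free of both patient and period effects.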
SJS SDI_827 Exp_10 SPlus Analysis
# fit treat as a factor
fit1 <- aov(fev1 ~ patient + period + treat)
summary(fit1)
model.tables(fit1, type="effects", se=T, cterms="treat")
# use the factorial approach with dose and form
fit2 <- aov(fev1 ~ patient + period + form*dose)
summary(fit2)
model.tables(fit2, type="effects", se=T, cterms=c("form","dose","form:dose"))
SJS SDI_828 SPlus Results 1
summary(fit1)
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
  patient 15  22.27234 1.484823 14.46822 0.0000000
   period  3   0.08547 0.028490  0.27760 0.8412298
    treat  3   0.36172 0.120573  1.17487 0.3307357
Residuals 42   4.31031 0.102626

Tables of effects
 treat
   sol12   sol24    sus12    sus24
 0.00156 0.12031 -0.07969 -0.04219

Standard errors of effects
             treat
          0.080088
replic.  16.000000
SJS SDI_829 SPlus Results 2
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
  patient 15  22.27234 1.484823 14.46822 0.0000000
   period  3   0.08547 0.028490  0.27760 0.8412298
     form  1   0.23766 0.237656  2.31574 0.1355644
     dose  1   0.09766 0.097656  0.95157 0.3349054
form:dose  1   0.02641 0.026406  0.25730 0.6146314
Residuals 42   4.31031 0.102626
SJS SDI_830
 form
      sol       sus
 0.060938 -0.060938

 dose
        12       24
 -0.039063 0.039063

 form:dose
Dim 1 : form
Dim 2 : dose
           12        24
sol -0.020313  0.020313
sus  0.020313 -0.020313

Standard errors of effects
             form      dose form:dose
         0.056631  0.056631  0.080088
replic. 32.000000 32.000000 16.000000
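The factorial tables of effects can be recovered from the four treatment effects of the one-factor analysis. The Python sketch below uses only figures printed in the SPlus output; it assumes that the standard error of an effect is the square root of the residual mean square divided by the replication, which agrees with the printed values.

```python
import math

# Treatment effects and residual mean square copied from the treat analysis.
treat = {("sol", 12): 0.00156, ("sol", 24): 0.12031,
         ("sus", 12): -0.07969, ("sus", 24): -0.04219}
ms_resid = 0.102626

forms, doses = ("sol", "sus"), (12, 24)

# Main effects: average the treatment effects over the other factor.
form_effect = {f: sum(treat[f, d] for d in doses) / 2 for f in forms}
dose_effect = {d: sum(treat[f, d] for f in forms) / 2 for d in doses}

# Interaction: what is left after removing both main effects.
interaction = {(f, d): treat[f, d] - form_effect[f] - dose_effect[d]
               for f in forms for d in doses}

# Assumed formula: standard error = sqrt(residual MS / replication).
se_main = math.sqrt(ms_resid / 32)   # form and dose: 32 observations per level
se_cell = math.sqrt(ms_resid / 16)   # treat and form:dose: 16 per combination

print(form_effect)
print(dose_effect)
print(interaction)
print(round(se_main, 6), round(se_cell, 6))
```

The results agree with the printed tables up to rounding of the input effects. Main effects are estimated more precisely than individual treatment effects because each level of a factor is replicated 32 times rather than 16.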
SJS SDI_831 Questions
To what extent do you think that the model for analysis is appropriate?
What sort of distribution might the number of defectives have?
How else might one analyse the data
–If one knew the batch sizes?
–If one did not?
What further problems might there be?
According to C&K, in Exp_9 the response is the mean number of faulty items per batch, based on ten batches.