SJS SDI_91 Design of Statistical Investigations Stephen Senn 9 Unbalanced Designs.

1 SJS SDI_91 Design of Statistical Investigations Stephen Senn 9 Unbalanced Designs

2 SJS SDI_92 Lack of Orthogonality
So far we have been considering balanced designs
–for example, every treatment appears equally frequently in every block
Sometimes we do not have such balance
–by accident (missing observations)
–by design

3 SJS SDI_93 Consequences
Some loss of efficiency
–compared to some theoretical optimum
CAUTION: this optimum may not be obtainable in practice, which may be why an unbalanced design has been chosen
Complications in analysis
–Sums of squares may depend on what other terms have been fitted
–In the balanced designs considered so far, only the residual sum of squares has had this property

4 SJS SDI_94 Exp_11 Senn 2002 Example 5.1 Cross-over trial in asthma Comparison of salbutamol, formoterol, placebo Trial run in six sequences Unequal numbers of patients per sequence

5 SJS SDI_95 Exp_11 Sequences and Periods: Number of Observations
Sequence   I   II   III
FSP        5   5    5
SPF        3   3    3
PFS        6   6    6
FPS        6   6    6
SFP        5   5    5
PSF        5   5    5
Patients by Sequence
FSP   SPF   PFS   FPS   SFP   PSF
5     3     6     6     5     5
Note that although there are no missing data due to patients not having completed a sequence, the numbers of patients are unbalanced by sequence

7 SJS SDI_97 Exp_11 Data
Patient  Sequence  I     II    III
1        FSP       3500  3200  2900
10       FSP       3400  2800  2200
17       FSP       2300  2200  1700
21       FSP       2300  1300  1400
23       FSP       3000  2400  1800
4        SPF       2200  1100  2600
8        SPF       2800  2000  2800
16       SPF       2400  1700  3400
6        PFS       2200  2500  2400
9        PFS       2200  3200  3300
13       PFS       800   1400  1000
20       PFS       950   1320  1480
26       PFS       1700  2600  2400
31       PFS       1400  2500  2200
2        FPS       3100  1800  2400
11       FPS       2800  1600  2200
14       FPS       3100  1600  1400
19       FPS       2300  1500  2200
25       FPS       3000  1700  2600
28       FPS       3100  2100  2800
3        SFP       2100  3200  1000
12       SFP       1600  2300  1600
18       SFP       1600  1400  800
24       SFP       3100  3200  1000
27       SFP       2800  3100  2000
5        PSF       900   1900  2900
7        PSF       1500  2600  2000
15       PSF       1200  2200  2700
22       PSF       2400  2600  3800
30       PSF       1900  2700  2800

8 SJS SDI_98 Exp_11 Not Fitting Period
> fit1 <- lm(fev1 ~ patient + treat)
> summary(fit1, corr = F)
Coefficients:
             Value  Std. Error   t value  Pr(>|t|)
…...
treatS   -424.6667     87.3127   -4.8637    0.0000
treatP  -1099.0000     87.3127  -12.5869    0.0000
Residual standard error: 338.2 on 58 degrees of freedom
Multiple R-Squared: 0.8569

9 SJS SDI_99 Exp_11 Fitting Period
> fit2 <- update(fit1, . ~ . + period)
> summary(fit2, corr = F)
Call: lm(formula = fev1 ~ patient + treat + period)
Coefficients:
                Value  Std. Error   t value  Pr(>|t|)
…...
treatS      -422.6220     88.2647   -4.7881    0.0000
treatP     -1103.4638     87.8208  -12.5649    0.0000
periodII    -109.7228     87.8208   -1.2494    0.2167
periodIII    -42.7659     88.2647   -0.4845    0.6299
Residual standard error: 339.4 on 56 degrees of freedom
Multiple R-Squared: 0.8608

10 SJS SDI_910 Exp_11 ANOVA
> aov.1 <- aov(fev1 ~ patient + treat)
> summary(aov.1)
           Df  Sum of Sq    Mean Sq   F Value          Pr(F)
patient    29   21279472     733775   6.41677  1.065573e-009
treat       2   18428682    9214341  80.57832  0.000000e+000
Residuals  58    6632451     114353
> aov.2 <- aov(fev1 ~ patient + period)
> summary(aov.2)
           Df  Sum of Sq    Mean Sq   F Value      Pr(F)
patient    29   21279472   733774.9  1.703663  0.0424500
period      2      80282    40141.1  0.093199  0.9111486
Residuals  58   24980851   430704.3

11 SJS SDI_911 Exp_11 ANOVA
> aov.3 <- aov(fev1 ~ patient + period + treat)
> summary(aov.3)
           Df  Sum of Sq  Mean Sq   F Value      Pr(F)
patient    29   21279472   733775   6.37115  0.0000000
period      2      80282    40141   0.34853  0.7072422
treat       2   18531248  9265624  80.45067  0.0000000
Residuals  56    6449603   115171
> aov.4 <- aov(fev1 ~ patient + treat + period)
> summary(aov.4)
           Df  Sum of Sq  Mean Sq   F Value      Pr(F)
patient    29   21279472   733775   6.37115  0.0000000
treat       2   18428682  9214341  80.00540  0.0000000
period      2     182848    91424   0.79381  0.4571415
Residuals  56    6449603   115171

12 SJS SDI_912 Exp_11 ANOVA
> ssType3(aov.3)
Type III Sum of Squares
           Df  Sum of Sq  Mean Sq   F Value      Pr(F)
patient    29   21279472   733775   6.37115  0.0000000
period      2     182848    91424   0.79381  0.4571415
treat       2   18531248  9265624  80.45067  0.0000000
Residuals  56    6449603   115171
> ssType3(aov.4)
Type III Sum of Squares
           Df  Sum of Sq  Mean Sq   F Value      Pr(F)
patient    29   21279472   733775   6.37115  0.0000000
treat       2   18531248  9265624  80.45067  0.0000000
period      2     182848    91424   0.79381  0.4571415
Residuals  56    6449603   115171

13 SJS SDI_913 Exp_11 Standard Errors Period effect not fitted Period effect fitted
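When period is not fitted, every patient in Exp_11 still receives all three treatments, so a treatment contrast is a difference between two means of 30 observations each, with patient effects removed. As a rough check (a sketch only, assuming the standard error is then sqrt(2 × MSE / n), with the residual mean square and the patient count read off the slides; the variable names are illustrative), the following reproduces the 87.31 reported for fit1:

```python
import math

# Values read from the slides (aov.1 output and the patients-by-sequence table).
mse_no_period = 114353.0         # residual mean square when period is not fitted
n_patients = 5 + 3 + 6 + 6 + 5 + 5  # 30 patients, each receiving all three treatments

# Assumption: each treatment mean rests on n_patients observations, so the
# variance of a difference between two treatment means is 2 * MSE / n.
se_no_period = math.sqrt(2.0 * mse_no_period / n_patients)

print(round(se_no_period, 2))  # prints 87.31
```

Once period is fitted, the unequal sequence sizes break this symmetry, which is consistent with fit2 reporting two slightly different standard errors (88.26 and 87.82) for the two treatment contrasts.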

14 SJS SDI_914 Incomplete Blocks
These designs arise when the number of treatments exceeds the number of units in a typical block
Not possible to have every treatment in every block
Each block receives a subset of the treatments
These are to be chosen in a sensible manner

15 SJS SDI_915 Exp_12 Senn 2002 Example 7.2
Placebo (P) controlled cross-over design to compare two doses of formoterol
–F12: 12 µg in a single puff
–F24: 24 µg in a single puff
Patients could only be treated in two periods
Incomplete blocks design
24 patients to be allocated in equal numbers to each of six sequences

16 SJS SDI_916 EXP_12 Sequences used
Period 1   Period 2
P          F12
F12        P
P          F24
F24        P
F12        F24
F24        F12
The basic design is said to be that of balanced incomplete blocks. In this context balance has a special meaning: each pair of treatments appears together in the same number of blocks.
Because this is a cross-over design and we are worried about period effects, the design is also balanced by period (order), but that is another matter.

17 SJS SDI_917 EXP_12 The sad reality
Two incorrect packs were picked up.
–One was for the correct sequence
–One was not
Numbers of Observations
Sequence   Period 1   Period 2
F12F24     3          3
F12P       5          5
F24F12     4          4
F24P       4          4
PF12       4          4
PF24       4          4
F12 F24 has one fewer patient; F12 P has one more

18 SJS SDI_918 EXP_12 The Data
Patient  Sequence  Period 1  Period 2
6        F12F24    2.500     2.450
10       F12F24    1.750     1.725
15       F12F24    1.370     1.120
4        F12P      3.400     2.500
11       F12P      2.250     1.925
14       F12P      1.460     1.260
21       F12P      1.480     0.880
23       F12P      2.050     2.100
2        F24F12    2.700     2.250
12       F24F12    0.900     0.925
13       F24F12    1.270     1.010
24       F24F12    2.150     2.100
3        F24P      1.750     1.350
7        F24P      2.525     2.150
18       F24P      1.080     0.840
22       F24P      3.120     2.310
5        PF12      2.500     3.500
9        PF12      1.600     2.650
16       PF12      1.750     2.190
19       PF12      0.640     0.840
1        PF24      2.100     3.100
8        PF24      2.300     2.700
17       PF24      1.030     1.870
20       PF24      0.810     0.940

20 SJS SDI_920 Exp_12 Analysis 1
> fit1 <- lm(FEV1 ~ patient + period + treat)
> summary(fit1, corr = F)
Call: lm(formula = FEV1 ~ patient + period + treat)
...
Coefficients:
               Value  Std. Error  t value  Pr(>|t|)
(Intercept)   2.8164      0.1854  15.1874    0.0000
patient2     -0.3770      0.2350  -1.6042    0.1236
...
patient24    -0.7270      0.2350  -3.0933    0.0055
period        0.0310      0.0667   0.4652    0.6466
treatF24      0.0402      0.0973   0.4134    0.6835
treatP       -0.5041      0.0914  -5.5148    0.0000

21 SJS SDI_921 Exp_12 Analysis 2
> aov1 <- aov(FEV1 ~ patient + period + treat)
> summary(aov1)
           Df  Sum of Sq   Mean Sq   F Value      Pr(F)
patient    23   22.46280  0.976643  18.37451  0.0000000
period      1    0.00083  0.000833   0.01568  0.9015459
treat       2    2.32792  1.163962  21.89871  0.0000073
Residuals  21    1.11619  0.053152
> ssType3(aov1)
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value      Pr(F)
patient    23   23.64324  1.027967  19.34011  0.0000000
period      1    0.01150  0.011501   0.21638  0.6466003
treat       2    2.32792  1.163962  21.89871  0.0000073
Residuals  21    1.11619  0.053152

22 SJS SDI_922 Exp_12 Analysis 3
> aov2 <- aov(FEV1 ~ patient + treat + period)
> summary(aov2)
           Df  Sum of Sq   Mean Sq   F Value      Pr(F)
patient    23   22.46280  0.976643  18.37451  0.0000000
treat       2    2.31726  1.158628  21.79836  0.0000075
period      1    0.01150  0.011501   0.21638  0.6466003
Residuals  21    1.11619  0.053152
> ssType3(aov2)
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value      Pr(F)
patient    23   23.64324  1.027967  19.34011  0.0000000
treat       2    2.32792  1.163962  21.89871  0.0000073
period      1    0.01150  0.011501   0.21638  0.6466003
Residuals  21    1.11619  0.053152
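The dependence of the sequential sums of squares on fitting order in Exp_12 can be checked without a statistics package. The following is a minimal Python sketch, not the S-PLUS analysis itself: it assumes the data as transcribed on the data slide, removes patient effects by centring each variable within patient (which gives the same residual sums of squares as fitting patient as a factor), and uses small helper functions (`solve`, `rss`, `centred`) that are purely illustrative.

```python
# Exp_12 data as transcribed from the slides:
# (patient, treatment in period 1, treatment in period 2, FEV1 period 1, FEV1 period 2)
DATA = [
    (6, "F12", "F24", 2.500, 2.450), (10, "F12", "F24", 1.750, 1.725),
    (15, "F12", "F24", 1.370, 1.120), (4, "F12", "P", 3.400, 2.500),
    (11, "F12", "P", 2.250, 1.925), (14, "F12", "P", 1.460, 1.260),
    (21, "F12", "P", 1.480, 0.880), (23, "F12", "P", 2.050, 2.100),
    (2, "F24", "F12", 2.700, 2.250), (12, "F24", "F12", 0.900, 0.925),
    (13, "F24", "F12", 1.270, 1.010), (24, "F24", "F12", 2.150, 2.100),
    (3, "F24", "P", 1.750, 1.350), (7, "F24", "P", 2.525, 2.150),
    (18, "F24", "P", 1.080, 0.840), (22, "F24", "P", 3.120, 2.310),
    (5, "P", "F12", 2.500, 3.500), (9, "P", "F12", 1.600, 2.650),
    (16, "P", "F12", 1.750, 2.190), (19, "P", "F12", 0.640, 0.840),
    (1, "P", "F24", 2.100, 3.100), (8, "P", "F24", 2.300, 2.700),
    (17, "P", "F24", 1.030, 1.870), (20, "P", "F24", 0.810, 0.940),
]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def rss(y, cols):
    """Residual sum of squares from regressing y on cols (no intercept)."""
    if not cols:
        return sum(v * v for v in y)
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(a * v for a, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(len(y))]
    return sum((v - f) ** 2 for v, f in zip(y, fitted))


def centred(first, second):
    """Centre a patient's two values about their mean (sweeps out the patient effect)."""
    mean = (first + second) / 2.0
    return [first - mean, second - mean]


y, period, treat_f24, treat_p = [], [], [], []
for _, t1, t2, y1, y2 in DATA:
    y += centred(y1, y2)
    period += centred(0.0, 1.0)
    treat_f24 += centred(float(t1 == "F24"), float(t2 == "F24"))
    treat_p += centred(float(t1 == "P"), float(t2 == "P"))

rss_patient = rss(y, [])
rss_period = rss(y, [period])
rss_treat = rss(y, [treat_f24, treat_p])
rss_full = rss(y, [period, treat_f24, treat_p])

ss_period_first = rss_patient - rss_period       # period fitted before treat
ss_treat_after_period = rss_period - rss_full
ss_treat_first = rss_patient - rss_treat         # treat fitted before period
ss_period_after_treat = rss_treat - rss_full

print("period first:", round(ss_period_first, 5), "then treat:", round(ss_treat_after_period, 5))
print("treat first: ", round(ss_treat_first, 5), "then period:", round(ss_period_after_treat, 5))
print("residual:    ", round(rss_full, 5))
```

If the transcription is accurate, the two orderings should agree with the sequential tables for aov1 and aov2 to rounding: the sums of squares attributed to period and treat differ by order, while their total and the residual sum of squares do not.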

23 SJS SDI_923 Standard Errors Consider the standard error of the contrast F24 versus F12 This is given as 0.0973 How could this be calculated? There are two sequences in which these drugs could be compared –F12F24 with 3 patients –F24F12 with 4 patients

24 SJS SDI_924 However Thus the standard error we have from fitting the regression model is actually lower than that produced by a naïve argument.
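The calculation behind this comparison is not preserved in the transcript. One plausible version of the naïve argument can be sketched as follows, under stated assumptions: only the seven patients in sequences F12F24 and F24F12 are used, the within-patient differences F24 minus F12 are averaged within each sequence, the two sequence means are then averaged so that the period effect cancels, and the pooled residual mean square 0.053152 stands in for the variance. The variable names are illustrative.

```python
import math

mse = 0.053152  # pooled residual mean square from the Exp_12 ANOVA (21 df)

# Within-patient differences F24 - F12, taken from the Exp_12 data.
diff_f12f24 = [2.450 - 2.500, 1.725 - 1.750, 1.120 - 1.370]                 # F24 in period 2
diff_f24f12 = [2.700 - 2.250, 0.900 - 0.925, 1.270 - 1.010, 2.150 - 2.100]  # F24 in period 1


def mean(values):
    return sum(values) / len(values)


# Assumption: averaging the two sequence means removes the period effect.
naive_estimate = (mean(diff_f12f24) + mean(diff_f24f12)) / 2.0

# Each within-patient difference has variance 2 * sigma^2, so each sequence
# mean has variance 2 * sigma^2 / n, and half their sum has a quarter of the total.
naive_variance = (2.0 * mse / len(diff_f12f24) + 2.0 * mse / len(diff_f24f12)) / 4.0
naive_se = math.sqrt(naive_variance)

print(round(naive_estimate, 4), round(naive_se, 4))
```

Under these assumptions the naïve standard error comes out at roughly 0.125, compared with the 0.0973 reported by the regression fit.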

25 SJS SDI_925 Questions Exp_12 Why is the SE produced by the regression analysis lower than that produced by using the pooled MSE and the direct comparison of the means? What would the treatment estimate be if this naïve approach was used? How does it compare to that produced? What further information is the regression approach taking into account?

26 SJS SDI_926 Block Size and Comparisons
Suppose that the block size is k (there are k units per block) and that there are b blocks, giving bk units in total.
Suppose that we have v treatments and r replicates. There must then also be rv units in total.
Hence rv = bk = N.
Each block permits k(k-1)/2 pair-wise comparisons, so there are bk(k-1)/2 in total.
However, there are only v(v-1)/2 possible pair-wise comparisons of treatments.

27 SJS SDI_927 Block Size and Comparisons
Let λ be the average number of repetitions of the pair-wise comparisons in the design.
Hence λ = [bk(k-1)/2] / [v(v-1)/2] = bk(k-1) / [v(v-1)] = r(k-1)/(v-1), using bk = rv.
Obviously, unless this is an integer, it will not be possible to balance the blocks.
If v-1 is a multiple of k-1 then it becomes particularly easy to balance the blocks.
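The counting argument can be turned into a quick check. The sketch below is illustrative only: the function names are hypothetical, `lam` denotes the average number of times each pair of treatments shares a block (the total bk(k-1)/2 comparisons divided by the v(v-1)/2 possible pairs), and the integer test is a necessary condition for balance, not a guarantee that a balanced design exists.

```python
from fractions import Fraction


def pair_repetitions(v, k, b):
    """Return (r, lam) for v treatments in b blocks of size k.

    r   = bk / v            replicates per treatment (from rv = bk)
    lam = r(k-1) / (v-1)    average repetitions of each pair-wise comparison
    """
    r = Fraction(b * k, v)
    lam = r * (k - 1) / (v - 1)
    return r, lam


def may_be_balanced(v, k, b):
    """Necessary condition only: both r and lam must be whole numbers."""
    r, lam = pair_repetitions(v, k, b)
    return r.denominator == 1 and lam.denominator == 1


# Three treatments in blocks of two, six blocks (one patient per Exp_12 sequence).
print(pair_repetitions(3, 2, 6), may_be_balanced(3, 2, 6))
# Seven treatments in blocks of four, seven blocks.
print(pair_repetitions(7, 4, 7), may_be_balanced(7, 4, 7))
# A made-up case where lam is not an integer: five treatments, five blocks of three.
print(pair_repetitions(5, 3, 5), may_be_balanced(5, 3, 5))
```

Exact fractions are used rather than floating point so that the integer test is not affected by rounding.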

28 SJS SDI_928 Exp_13
It was desired to compare three doses each of two formulations of formoterol to placebo
–ISF6, ISF12, ISF24
–MTA6, MTA12, MTA24
–Placebo
There are thus seven treatments
Maximum number of acceptable periods was deemed to be five

29 SJS SDI_929 Exp_13 Possible solution
Since 7-1 = 6 is twice 4-1 = 3, use a design in 4 periods
If seven sequences are used it will also be possible to make the treatments uniform on the periods
There are (7 × 6)/2 = 21 possible pair-wise comparisons of treatments
Each patient provides (4 × 3)/2 = 6 possible comparisons
There are 7 × 6 = 42 = 2 × 21 such comparisons per set of seven sequences

30 SJS SDI_930 A Balanced Design Uniform on the Periods for 7 treatments in 4 periods
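The design shown on this slide is not preserved in the transcript. One way to build a design with the stated properties, which is not necessarily the one on the slide, is to develop a starting sequence cyclically modulo 7. The sketch below assumes the starting sequence (0, 3, 5, 6) and labels the seven treatments 0 to 6; both choices are illustrative. It then checks that every pair of treatments occurs together in exactly two sequences and that every treatment occurs once in each period.

```python
from collections import Counter
from itertools import combinations

V = 7                  # number of treatments, labelled 0..6 (hypothetical labels)
BASE = (0, 3, 5, 6)    # assumed starting sequence, one treatment per period

# Develop the starting sequence cyclically: add 0, 1, ..., 6 modulo 7.
sequences = [tuple((t + shift) % V for t in BASE) for shift in range(V)]

# Count how often each unordered pair of treatments shares a sequence.
pair_counts = Counter(
    frozenset(pair) for seq in sequences for pair in combinations(seq, 2)
)
balanced = len(pair_counts) == V * (V - 1) // 2 and set(pair_counts.values()) == {2}

# Uniform on the periods: each period (column) contains every treatment once.
uniform_on_periods = all(
    sorted(seq[j] for seq in sequences) == list(range(V)) for j in range(len(BASE))
)

for seq in sequences:
    print(" ".join(str(t) for t in seq))
print("balanced:", balanced, "uniform on periods:", uniform_on_periods)
```

The cyclic construction makes uniformity on the periods automatic, since each column is a shifted copy of 0 to 6; balance depends on the choice of starting sequence and is why it is checked explicitly.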

31 SJS SDI_931 Questions Exp_13 Exp_13 was in fact run using five periods and 21 sequences Check that such a design can be balanced An alternative considered was to use five periods and seven sequences Show that such a design cannot be balanced Why might it be preferable to the design in four periods and seven sequences?

