
Published by Gabriel Radley. Modified about 1 year ago.

1
Experimental Statistics - Week 5
Chapters 8, 9: Miscellaneous topics
Chapter 14: Experimental design concepts
Chapter 15: Randomized Complete Block Design (15.3)

2
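1-Factor ANOVA Model
$y_{ij} = \mu_i + \varepsilon_{ij}$ or $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
- $y_{ij}$: observed data
- $\mu_i = \mu + \alpha_i$: mean for the $i$th treatment
- $\varepsilon_{ij}$: unexplained part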

3
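The sums of squares were rewritten as:
$\sum_i \sum_j (y_{ij} - \bar{y}_{..})^2 = \sum_i n_i (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_{i.})^2$, i.e. TSS = SSB + SSW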

4
In words:
- TSS (total SS) = total sample variability among the $y_{ij}$ values
- SSB (SS "between") = variability explained by differences in group means
- SSW (SS "within") = unexplained variability (within groups)

5
5 Analysis of Variance Table Note: unequal sample sizes allowed

6
Extracted from Ex. 8.2, pages 390-391
3 Methods for Reducing Hostility
12 students displaying similar hostility were randomly assigned to 3 treatment methods. Scores (HLT) at the end of the study were recorded.
Method 1: 96 79 91 85
Method 2: 77 76 74 73
Method 3: 66 73 69 66
Test: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: at least two of the method means differ

7
ANOVA Table Output - extracted hostility data (calculations done in class)
Source            SS       df   MS       F       p-value
Between samples   767.17    2   383.58   16.78   <.001
Within samples    205.75    9    22.86
Totals            972.92   11
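The in-class calculation can be retraced with a few lines of Python. This is only a sketch using the standard library; the variable names (`groups`, `ssb`, `ssw`, `tss`) are illustrative choices, not part of the course material. It should reproduce the table entries up to rounding.

```python
# One-way ANOVA sums of squares for the extracted hostility data (slide 6).
groups = {
    1: [96, 79, 91, 85],
    2: [77, 76, 74, 73],
    3: [66, 73, 69, 66],
}

all_scores = [y for scores in groups.values() for y in scores]
n_total = len(all_scores)
grand_mean = sum(all_scores) / n_total
group_means = {g: sum(s) / len(s) for g, s in groups.items()}

# SSB: variability explained by differences in group means
ssb = sum(len(s) * (group_means[g] - grand_mean) ** 2 for g, s in groups.items())
# SSW: unexplained variability within groups
ssw = sum((y - group_means[g]) ** 2 for g, s in groups.items() for y in s)
# TSS: total variability about the grand mean
tss = sum((y - grand_mean) ** 2 for y in all_scores)

df_between = len(groups) - 1
df_within = n_total - len(groups)
msb = ssb / df_between
msw = ssw / df_within
f_stat = msb / msw

print(f"SSB={ssb:.2f}  SSW={ssw:.2f}  TSS={tss:.2f}")
print(f"MSB={msb:.2f}  MSW={msw:.2f}  F={f_stat:.2f}")
```

The identity TSS = SSB + SSW holds here by construction, which is a useful check on hand calculations.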

8
Fisher's Least Significant Difference (LSD)
Protected LSD: preceded by an F-test for overall significance.
Unprotected LSD: not preceded by an F-test (like individual t-tests).
Only use the LSD if F is significant.

9
Hostility Data - Completely Randomized Design
The GLM Procedure
t Tests (LSD) for score
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha                          0.05
Error Degrees of Freedom       9
Error Mean Square              22.86111
Critical Value of t            2.26216
Least Significant Difference   7.6482
Means with the same letter are not significantly different.
t Grouping   Mean     N   method
A            87.750   4   1
B            75.000   4   2
B            68.500   4   3
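The LSD value in this output follows from the usual formula, LSD = t x sqrt(MSE x 2/n), for equal group sizes. The sketch below plugs in the critical value and error mean square printed by SAS rather than computing them; the dictionary and variable names are illustrative only.

```python
import math

# Values copied from the SAS output above
t_crit = 2.26216   # critical value of t (alpha = 0.05, 9 error df)
mse = 22.86111     # error mean square
n = 4              # observations per method

# Least significant difference for equal sample sizes
lsd = t_crit * math.sqrt(mse * 2 / n)

means = {1: 87.75, 2: 75.0, 3: 68.5}
pairs = [(1, 2), (1, 3), (2, 3)]
# A pair differs significantly when its mean difference exceeds the LSD
significant = {(a, b): abs(means[a] - means[b]) > lsd for a, b in pairs}

print(f"LSD = {lsd:.4f}")
for pair, flag in significant.items():
    print(pair, "significant" if flag else "not significant")
```

Methods 2 and 3 differ by 6.5, which is below the LSD, so they share the letter B in the grouping.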

10
Ex. 8.2, pages 390-391
3 Methods for Reducing Hostility
24 students displaying similar hostility were randomly assigned to 3 treatment methods. Scores (HLT) at the end of the study were recorded.
Method 1: 96 79 91 85 83 91 82 87
Method 2: 77 76 74 73 78 71 80
Method 3: 66 73 69 66 77 73 71 70 74
Test: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: at least two of the method means differ
Notice the unequal sample sizes.

11
ANOVA Table Output - full hostility data
Source            SS       df   MS      F       p-value
Between samples   1090.6    2   545.3   29.57   <.0001
Within samples     387.2   21    18.4
Totals            1477.8   23

12
The GLM Procedure
t Tests (LSD) for score
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha                      0.05
Error Degrees of Freedom   21
Error Mean Square          18.43878
Critical Value of t        2.07961
Comparisons significant at the 0.05 level are indicated by ***.
method       Difference
Comparison   Between Means   95% Confidence Limits
1 - 2         11.179           6.557    15.800   ***
1 - 3         15.750          11.411    20.089   ***
2 - 1        -11.179         -15.800    -6.557   ***
2 - 3          4.571           0.071     9.072   ***
3 - 1        -15.750         -20.089   -11.411   ***
3 - 2         -4.571          -9.072    -0.071   ***
Notice the different format: with unequal sample sizes there is no single LSD value with which to make all pairwise comparisons.
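With unequal sample sizes, each pair gets its own margin, t x sqrt(MSE x (1/n_a + 1/n_b)). The following sketch recomputes the confidence limits from the error mean square and critical value in the SAS output and from the group means of the full data; `lsd_interval` is a hypothetical helper name introduced here for illustration.

```python
import math

# Values copied from the SAS output above
t_crit = 2.07961   # critical value of t (alpha = 0.05, 21 error df)
mse = 18.43878     # error mean square

n = {1: 8, 2: 7, 3: 9}                       # group sizes
means = {1: 694 / 8, 2: 529 / 7, 3: 639 / 9}  # group totals / sizes

def lsd_interval(a, b):
    """Return (difference, lower, upper) for methods a and b."""
    diff = means[a] - means[b]
    half_width = t_crit * math.sqrt(mse * (1 / n[a] + 1 / n[b]))
    return diff, diff - half_width, diff + half_width

for a, b in [(1, 2), (1, 3), (2, 3)]:
    diff, lo, hi = lsd_interval(a, b)
    flag = "***" if lo > 0 or hi < 0 else ""
    print(f"{a} - {b}: {diff:7.3f}  ({lo:7.3f}, {hi:7.3f}) {flag}")
```

An interval that excludes zero corresponds to a `***` line in the listing.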

13
Duncan's Multiple Range Test for score
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha                         0.05
Error Degrees of Freedom      21
Error Mean Square             18.43878
Harmonic Mean of Cell Sizes   7.91623
NOTE: Cell sizes are not equal.
Number of Means   2       3
Critical Range    4.489   4.712
Means with the same letter are not significantly different.
Duncan Grouping   Mean     N   method
A                 86.750   8   1
B                 75.571   7   2
C                 71.000   9   3
Note: Duncan's test (another multiple comparison test) avoids the issue of different sample sizes by using the harmonic mean of the $n_i$'s.
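The harmonic mean of the cell sizes reported above can be checked directly. A minimal sketch, with `sizes` as an illustrative name:

```python
# Group sizes for methods 1, 2 and 3 in the full hostility data
sizes = [8, 7, 9]

# Harmonic mean: number of groups divided by the sum of reciprocals
harmonic_mean = len(sizes) / sum(1 / n for n in sizes)

print(f"Harmonic mean of cell sizes = {harmonic_mean:.5f}")
```

The result is slightly below the arithmetic mean of 8, as a harmonic mean always is when the sizes differ.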

14
Some Multiple Comparison Techniques in SAS
- FISHER'S LSD (LSD)
- BONFERRONI (BON)
- DUNCAN
- STUDENT-NEWMAN-KEULS (SNK)
- DUNNETT
- RYAN-EINOT-GABRIEL-WELSCH (REGWQ)
- SCHEFFE
- TUKEY

15
Balloon Data
Col. 1-2 - observation number
Col. 3 - color (1=pink, 2=yellow, 3=orange, 4=blue)
Col. 4-7 - inflation time in seconds
 1122.4  2324.6  3120.3  4419.8  5324.3  6222.2  7228.5  8225.7
 9320.2 10119.6 11228.8 12424.0 13417.1 14419.3 15324.2 16115.8
17218.3 18117.5 19418.7 20322.9 21116.3 22414.0 23416.6 24218.1
25218.9 26416.0 27220.1 28322.5 29316.0 30119.3 31115.9 32320.3
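To make the fixed-column layout concrete, here is a sketch that splits each record into observation number, color and time, then averages time by color. Because single-digit observation numbers may appear without their leading blank, the sketch parses each record from the right. The names `records`, `parse_record` and `color_means` are illustrative and not from the course files.

```python
records = """
1122.4 2324.6 3120.3 4419.8 5324.3 6222.2 7228.5 8225.7
9320.2 10119.6 11228.8 12424.0 13417.1 14419.3 15324.2 16115.8
17218.3 18117.5 19418.7 20322.9 21116.3 22414.0 23416.6 24218.1
25218.9 26416.0 27220.1 28322.5 29316.0 30119.3 31115.9 32320.3
""".split()

def parse_record(rec):
    """Split one record: last 4 chars = time, char before = color, rest = obs."""
    time = float(rec[-4:])
    color = int(rec[-5])
    obs = int(rec[:-5])
    return obs, color, time

parsed = [parse_record(r) for r in records]

# Collect inflation times by color code
times_by_color = {}
for obs, color, time in parsed:
    times_by_color.setdefault(color, []).append(time)

color_means = {c: sum(t) / len(t) for c, t in sorted(times_by_color.items())}
names = {1: "pink", 2: "yellow", 3: "orange", 4: "blue"}
for c, m in color_means.items():
    print(f"{names[c]:7s} n={len(times_by_color[c])} mean={m:.3f}")
```

Each color should come out with 8 balloons, which is what makes this a balanced one-factor design.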


17
ANOVA --- Balloon Data
General Linear Models Procedure
Dependent Variable: TIME
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3   126.15125000     42.05041667   3.85      0.0200
Error             28   305.64750000     10.91598214
Corrected Total   31   431.79875000
R-Square   C.V.       Root MSE    TIME Mean
0.292153   16.31069   3.3039343   20.256250
Source   DF   Type I SS      Mean Square   F Value   Pr > F
Color     3   126.15125000   42.05041667   3.85      0.0200
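The summary statistics in this listing are simple functions of the ANOVA table entries. The sketch below derives them from the printed sums of squares and the TIME mean; it assumes the usual definitions (R-Square = SS Model / SS Total, C.V. = 100 x Root MSE / mean), and the variable names are illustrative.

```python
import math

# Entries copied from the SAS ANOVA table above
ss_model, df_model = 126.15125, 3
ss_error, df_error = 305.64750, 28
ss_total = 431.79875
time_mean = 20.25625

ms_model = ss_model / df_model
ms_error = ss_error / df_error

f_value = ms_model / ms_error        # F statistic for color
r_square = ss_model / ss_total       # share of variability explained by color
root_mse = math.sqrt(ms_error)       # estimate of the error standard deviation
cv = 100 * root_mse / time_mean      # coefficient of variation, in percent

print(f"F = {f_value:.2f}, R-Square = {r_square:.6f}")
print(f"Root MSE = {root_mse:.7f}, C.V. = {cv:.5f}")
```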

18
ANOVA --- Balloon Data
The GLM Procedure
t Tests (LSD) for time
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha                          0.05
Error Degrees of Freedom       28
Error Mean Square              10.91598
Critical Value of t            2.04841
Least Significant Difference   3.3839
Means with the same letter are not significantly different.
t Grouping   Mean     N   color
A            22.575   8   2
A            21.875   8   3
B            18.388   8   1
B            18.188   8   4

19
Experimental Design: Concepts and Terminology
Designed Experiment
- an investigation in which a specified framework is used to compare groups or treatments
Factors
- any feature of the experiment that can be varied from trial to trial
- up to this point we've only looked at experiments with a single factor

20
Treatments
- conditions constructed from the factors (levels of the factor considered, etc.)
Experimental Units
- subjects, material, etc. to which treatment factors are randomly assigned
- there is inherent variability among these units irrespective of the treatment imposed
Replication
- we usually assign each treatment to several experimental units
- these are called replicates

21
Examples: Car Data, Hostility Data, Balloon Data
For each, identify:
1. factor
2. treatments
3. experimental units
4. replicates

22
Balloon Data
Col. 1-2 - observation number (run order)
Col. 3 - color (1=pink, 2=yellow, 3=orange, 4=blue)
Col. 4-7 - inflation time in seconds
 1122.4  2324.6  3120.3  4419.8  5324.3  6222.2  7228.5  8225.7
 9320.2 10119.6 11228.8 12424.0 13417.1 14419.3 15324.2 16115.8
17218.3 18117.5 19418.7 20322.9 21116.3 22414.0 23416.6 24218.1
25218.9 26416.0 27220.1 28322.5 29316.0 30119.3 31115.9 32320.3
Question: Why randomize run order? i.e. why not blow up all the pink balloons first, the blue balloons next, etc.?

23
Scatterplot Using GPLOT
[Figure: scatterplot of Time (vertical axis) vs. Run Order (horizontal axis)]
What do we learn from this plot?

24
RECALL: 1-Factor ANOVA Model
- the random errors follow a Normal (N) distribution, are independently distributed (ID), and have zero mean and constant variance, i.e. variability does not change from group to group

25
Model Assumptions:
- equal variances
- normality
Checking Validity of Assumptions: Equal Variances
1. F-test similar to the 2-sample case
   - Hartley's test (p.366 text)
   - not recommended
2. Graphical
   - side-by-side box plots

26
26 Graphical Assessment of Equal Variance Assumption

27
Note: Optional approaches if the equal variance assumption is violated:
1. Use the Kruskal-Wallis nonparametric procedure - Section 8.6
2. Transform the data to induce more nearly equal variances - Section 8.5
   -- log
   -- square root
Note: These transformations may also help induce normality.

28
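Assessing Normality of Errors
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
$\varepsilon_{ij} = y_{ij} - (\mu + \alpha_i)$, so $\varepsilon_{ij}$ is estimated by $e_{ij} = y_{ij} - \bar{y}_{i.}$
The $e_{ij}$'s are called residuals.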
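As a small illustration of the residual definition, the sketch below computes each observation minus its own group mean for the extracted hostility data from slide 6. That data set is used here only because it is short; the names `groups` and `residuals` are illustrative.

```python
# Extracted hostility data (slide 6): method -> scores
groups = {
    1: [96, 79, 91, 85],
    2: [77, 76, 74, 73],
    3: [66, 73, 69, 66],
}

group_means = {g: sum(s) / len(s) for g, s in groups.items()}

# Residual = observed value minus the mean of its own treatment group
residuals = {g: [y - group_means[g] for y in s] for g, s in groups.items()}

for g, res in residuals.items():
    print(f"method {g}: mean={group_means[g]:.2f} residuals={res}")
```

Within each group the residuals sum to zero, so normality checks look at their overall shape (histogram, normal probability plot) rather than their average.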

29
SAS Code for Balloon Data
proc glm;
  class color;
  model time=color;
  title 'ANOVA --- Balloon Data';
  output out=new r=resball;   /* save residuals as resball in data set new */
  means color/lsd;
run;
proc sort;
  by color;                   /* proc boxplot needs the data sorted by group */
run;
proc boxplot;
  plot time*color;
  title 'Side-by-Side Box Plots for Balloon Data';
run;
proc univariate;
  var resball;
  histogram resball/normal;
  title 'Histogram of Residuals -- Balloon Data';
run;
proc univariate normal plot;
  var resball;
  title 'Normal Probability Plot for Residuals - Balloon Data';
run;
proc gplot;
  plot time*id;
  title 'Scatterplot of Time vs ID (Run Order)';
run;

30
Normal Probability Plot
[Figure: line-printer normal probability plot of the residuals; vertical axis runs from -5.5 to 6.5, horizontal axis (normal quantiles) from -2 to +2]

31
31 Caution: Chapter 15 introduces some new notation - i.e. changes notation already defined

32
Recall: Sum-of-Squares Identity
1-Factor ANOVA
In words: Total SS = SS between samples + SS within samples

33
33 Recall: Sum-of-Squares Identity 1-Factor ANOVA - new notation for Chapter 15


35
Recall: Sum-of-Squares Identity
1-Factor ANOVA - new notation for Chapter 15
In words: Total SS = SS for "treatments" + SS for "error"

36
Revised ANOVA Table for 1-Factor ANOVA (Ch. 15 terminology - p.857)
Source       SS    df      MS                  F
Treatments   SST   t - 1   MST = SST/(t - 1)   MST/MSE
Error        SSE   N - t   MSE = SSE/(N - t)
Total        TSS   N - 1

37
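Recall 1-factor ANOVA (CRD) Model for Gasoline Octane Data
$y_{ij} = \mu_i + \varepsilon_{ij}$ or $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
- $y_{ij}$: observed octane
- $\mu_i = \mu + \alpha_i$: mean for the $i$th gasoline
- $\varepsilon_{ij}$: unexplained part
  -- car-to-car differences
  -- temperature
  -- etc.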

38
38 Gasoline Octane Data Question: What if car differences are obscuring gasoline differences? Similar to diet t-test example: Recall: person-to-person differences obscured effect of diet

39
39 Possible Alternative Design for Octane Study: Test all 5 gasolines on the same car - in essence we test the gasoline effect directly and remove effect of car-to-car variation Question: How would you randomize an experiment with 4 cars?

40
Blocking an Experiment
- dividing the observations into groups (called blocks) where the observations in each block are collected under relatively similar conditions
- comparisons can often be made more precisely this way

41
41 Terminology is based on Agricultural Experiments Consider the problem of testing fertilizers on a crop - t fertilizers - n observations on each

42
Completely Randomized Design
t = 3 fertilizers, n = 5 replications
- randomly select 15 plots
- randomly assign fertilizers to the 15 plots
Plot assignments: A A B B C C B A C C B A A B C

43
Randomized Complete Block Strategy
t = 3 fertilizers
- select 5 “blocks”
- randomly assign the 3 treatments to each block
B | A | C
A | C | B
C | A | B
A | B | C
C | B | A
Note: The 3 “plots” within each block are similar - similar soil type, sun, water, etc.

44
Randomized Complete Block Design
Randomly assign each treatment once to every block
Car Example
- Car 1: randomly assign each gas to this car
- Car 2: ... etc.
Agricultural Example
- Randomly assign each fertilizer to one of the 3 plots within each block
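The two randomization schemes can be contrasted in a short sketch. It uses Python's standard `random` module; the function names `randomize_crd` and `randomize_rcb` and the `seed` argument are hypothetical, chosen for illustration.

```python
import random

def randomize_crd(treatments, n_reps, seed=None):
    """Completely randomized design: shuffle all treatment labels over all plots."""
    rng = random.Random(seed)
    plots = [t for t in treatments for _ in range(n_reps)]
    rng.shuffle(plots)
    return plots

def randomize_rcb(treatments, n_blocks, seed=None):
    """Randomized complete block design: a separate shuffle within every block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)   # each treatment appears exactly once per block
        layout.append(order)
    return layout

fertilizers = ["A", "B", "C"]
print("CRD:", " ".join(randomize_crd(fertilizers, n_reps=5, seed=1)))
for block in randomize_rcb(fertilizers, n_blocks=5, seed=1):
    print("RCB block:", " | ".join(block))
```

In the CRD a fertilizer can land on several neighboring plots by chance, whereas the RCB forces every block to contain each fertilizer once.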

45
Model for Randomized Complete Block (RCB) Design
$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$
- $\alpha_i$: effect of the $i$th treatment (gasoline)
- $\beta_j$: effect of the $j$th block (car)
- $\varepsilon_{ij}$: unexplained error
  -- temperature
  -- etc.

46
46 Previous Data Table from Chapter 8 for 1-factor ANOVA column averages don’t make any sense

47
Back to Octane data: Suppose that instead of 20 cars, there were only 4 cars, and we tested each gasoline on each car.
Old Data Format
Gas
A     91.7   91.2   90.9   90.6
B     91.7   91.9   90.9   90.9
C     92.4   91.2   91.6   91.0
D     91.8   92.2   92.0   91.4
E     93.1   92.9   92.4   92.4
“Restructured” Data
             Car
Gas   1      2      3      4
A     91.7   91.2   90.9   90.6
B     91.7   91.9   90.9   90.9
C     92.4   91.2   91.6   91.0
D     91.8   92.2   92.0   91.4
E     93.1   92.9   92.4   92.4

48
Recall: Sum-of-Squares Identity
1-Factor ANOVA - using new notation for Chapter 15
In words: Total SS = SS for "treatments" + SS for "error"

49
A New Sum-of-Squares Identity
In words: Total SS = SS for treatments + SS for blocks + SS for error

50
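Hypotheses:
To test for treatment effects (i.e. gas differences) we test $H_0$: all treatment effects $\alpha_i$ are zero vs. $H_a$: at least one treatment effect is nonzero
To test for block effects (i.e. car differences; not usually the research hypothesis) we test $H_0$: all block effects $\beta_j$ are zero vs. $H_a$: at least one block effect is nonzero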

51
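Randomized Complete Block Design ANOVA Table
Source       SS    df               MS                            F
Treatments   SST   t - 1            MST = SST/(t - 1)             MST/MSE
Blocks       SSB   b - 1            MSB = SSB/(b - 1)             MSB/MSE
Error        SSE   (b - 1)(t - 1)   MSE = SSE/[(b - 1)(t - 1)]
Total        TSS   bt - 1
See page 866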

52
52 Test for Treatment Effects Note:

53
53 Test for Block Effects

54
“Restructured” CAR Data - SAS Format
The first variable (A - E) indicates gas as it did with the Completely Randomized Design. The second variable (B1 - B4) indicates car.
A B1 91.7
A B2 91.2
A B3 90.9
A B4 90.6
B B1 91.7
B B2 91.9
B B3 90.9
B B4 90.9
C B1 92.4
C B2 91.2
C B3 91.6
C B4 91.0
D B1 91.8
D B2 92.2
D B3 92.0
D B4 91.4
E B1 93.1
E B2 92.9
E B3 92.4
E B4 92.4

55
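SAS file - Randomized Complete Block Design for CAR Data
DATA car;
  INPUT gas $ block $ octane @@;   /* @@ reads several records per line */
  DATALINES;
A B1 91.7 A B2 91.2 A B3 90.9 A B4 90.6
B B1 91.7 B B2 91.9 B B3 90.9 B B4 90.9
C B1 92.4 C B2 91.2 C B3 91.6 C B4 91.0
D B1 91.8 D B2 92.2 D B3 92.0 D B4 91.4
E B1 93.1 E B2 92.9 E B3 92.4 E B4 92.4
;
PROC GLM DATA=car;
  CLASS gas block;
  MODEL octane=gas block;
  TITLE 'Gasoline Example - Randomized Complete Block Design';
  MEANS gas/LSD;
RUN;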

56
1-Factor ANOVA Table Output - octane data
Source             SS      df   MS      F      p-value
Gas (treatments)   6.108    4   1.527   6.80   0.0025
Error              3.370   15   0.225
Totals             9.478   19

57
RCB ANOVA Table Output - car data
Source             SS      df   MS      F       p-value
Gas (treatments)   6.108    4   1.527   15.58   0.0001
Cars (blocks)      2.194    3   0.731    7.46   0.0044
Error              1.176   12   0.098
Totals             9.478   19
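The effect of blocking on the error term can be traced numerically. This sketch recomputes the sums of squares from the gas-by-car octane table, then compares the F statistic for gas with and without the car (block) term. It is an illustration using only the standard library; the variable names are assumptions made here.

```python
# Octane readings: gas -> values for cars 1-4
octane = {
    "A": [91.7, 91.2, 90.9, 90.6],
    "B": [91.7, 91.9, 90.9, 90.9],
    "C": [92.4, 91.2, 91.6, 91.0],
    "D": [91.8, 92.2, 92.0, 91.4],
    "E": [93.1, 92.9, 92.4, 92.4],
}
t = len(octane)   # number of treatments (gasolines)
b = 4             # number of blocks (cars)

all_obs = [y for row in octane.values() for y in row]
grand = sum(all_obs) / (t * b)
trt_means = {g: sum(row) / b for g, row in octane.items()}
blk_means = [sum(octane[g][j] for g in octane) / t for j in range(b)]

tss = sum((y - grand) ** 2 for y in all_obs)
sst = b * sum((m - grand) ** 2 for m in trt_means.values())   # treatments
ssb = t * sum((m - grand) ** 2 for m in blk_means)            # blocks
sse = tss - sst - ssb                                         # RCB error

mse_rcb = sse / ((t - 1) * (b - 1))
f_trt = (sst / (t - 1)) / mse_rcb
f_blk = (ssb / (b - 1)) / mse_rcb

# Ignoring cars (CRD analysis): block variability stays in the error term
mse_crd = (tss - sst) / (t * b - t)
f_crd = (sst / (t - 1)) / mse_crd

print(f"SST={sst:.3f} SSB={ssb:.3f} SSE={sse:.3f} TSS={tss:.3f}")
print(f"F(gas) RCB={f_trt:.2f}  F(cars)={f_blk:.2f}  F(gas) CRD={f_crd:.2f}")
```

The treatment sum of squares is the same in both analyses; only the error term changes, because the car-to-car variability is moved out of it.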

58
SAS Output -- RCB CAR Data
Dependent Variable: OCTANE
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              7   8.30200000       1.18600000    12.10     0.0001
Error             12   1.17600000       0.09800000
Corrected Total   19   9.47800000
R-Square   C.V.       Root MSE    OCTANE Mean
0.875923   0.341347   0.3130495   91.710000
Source   DF   Anova SS     Mean Square   F Value   Pr > F
GAS       4   6.10800000   1.52700000    15.58     0.0001
BLOCK     3   2.19400000   0.73133333     7.46     0.0044

59
Multiple Comparisons in RCB Analysis

60
CAR Data -- LSD Results
CRD Analysis
t Grouping   Mean      N   gas
A            92.7000   4   E
B            91.8500   4   D
B C          91.5500   4   C
B C          91.3500   4   B
  C          91.1000   4   A
RCB Analysis
t Grouping   Mean      N   gas
A            92.7000   4   E
B            91.8500   4   D
B C          91.5500   4   C
  C          91.3500   4   B
  C          91.1000   4   A
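The difference between the two groupings comes from the size of the LSD under each analysis. The sketch below computes both, using the error mean squares from the two ANOVA tables. The critical values are standard two-sided 5% t-table values for 15 and 12 degrees of freedom, entered by hand as an assumption rather than taken from the SAS output.

```python
import math

b = 4  # observations per gasoline mean

# Error mean squares from the two ANOVA tables
mse_crd = 3.370 / 15   # CRD analysis, 15 error df
mse_rcb = 1.176 / 12   # RCB analysis, 12 error df

# Assumed t-table values (two-sided, alpha = 0.05)
t_crd = 2.13145        # 15 df
t_rcb = 2.17881        # 12 df

lsd_crd = t_crd * math.sqrt(2 * mse_crd / b)
lsd_rcb = t_rcb * math.sqrt(2 * mse_rcb / b)

means = {"E": 92.70, "D": 91.85, "C": 91.55, "B": 91.35, "A": 91.10}
diff_d_b = means["D"] - means["B"]

print(f"LSD (CRD) = {lsd_crd:.3f}, LSD (RCB) = {lsd_rcb:.3f}")
print(f"D - B = {diff_d_b:.2f}: "
      f"CRD {'sig.' if diff_d_b > lsd_crd else 'n.s.'}, "
      f"RCB {'sig.' if diff_d_b > lsd_rcb else 'n.s.'}")
```

Gasolines D and B differ by 0.50, which falls between the two LSD values, so the pair is separated only in the RCB analysis.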

61
CAR Data -- Bonferroni Results
CRD Analysis
Bon Grouping   Mean      N   gas
A              92.7000   4   E
A B            91.8500   4   D
  B            91.5500   4   C
  B            91.3500   4   B
  B            91.1000   4   A
RCB Analysis
Bon Grouping   Mean      N   gas
A              92.7000   4   E
B              91.8500   4   D
B              91.5500   4   C
B              91.3500   4   B
B              91.1000   4   A
