Published by Gabriel Radley. Modified about 1 year ago.

1
Experimental Statistics - Week 5
Chapters 8, 9: Miscellaneous topics
Chapter 14: Experimental design concepts
Chapter 15: Randomized Complete Block Design (15.3)

2
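1-Factor ANOVA Model
$y_{ij} = \mu_i + \epsilon_{ij}$ or $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$
where $y_{ij}$ is the observed data, $\mu_i = \mu + \alpha_i$ is the mean for the $i$th treatment, and $\epsilon_{ij}$ is the unexplained part.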

3
3 were rewritten as:

4
In words:
TSS (total SS) = total sample variability among the $y_{ij}$ values
SSB (SS “between”) = variability explained by differences in group means
SSW (SS “within”) = unexplained variability (within groups)
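The decomposition described in words above can be checked numerically. The sketch below uses made-up scores for three groups (illustrative values only, not the textbook hostility data) and a hypothetical helper named `one_way_ss`. It computes TSS, SSB and SSW directly from their definitions and confirms that TSS = SSB + SSW, even with unequal group sizes.

```python
# Hypothetical scores for three groups of unequal size (NOT the textbook data).
groups = {
    "method1": [10.0, 12.0, 11.0, 13.0],
    "method2": [8.0, 9.0, 7.0],
    "method3": [5.0, 6.0, 4.0, 5.0, 5.0],
}

def one_way_ss(groups):
    """Return (TSS, SSB, SSW) for a one-factor layout."""
    all_y = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_y) / len(all_y)
    # Total variability of every observation around the grand mean.
    tss = sum((y - grand_mean) ** 2 for y in all_y)
    # Variability of group means around the grand mean, weighted by group size.
    ssb = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
              for ys in groups.values())
    # Variability of observations around their own group mean.
    ssw = sum((y - sum(ys) / len(ys)) ** 2
              for ys in groups.values() for y in ys)
    return tss, ssb, ssw

tss, ssb, ssw = one_way_ss(groups)
n_groups = len(groups)
n_total = sum(len(ys) for ys in groups.values())
msb = ssb / (n_groups - 1)        # mean square between
msw = ssw / (n_total - n_groups)  # mean square within
f_stat = msb / msw
print(f"TSS={tss:.3f}  SSB={ssb:.3f}  SSW={ssw:.3f}  F={f_stat:.3f}")
```

The F statistic is the ratio of the two mean squares; a p-value would additionally require an F distribution table or a statistics package.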

5
Analysis of Variance Table
Note: unequal sample sizes are allowed.

6
Extracted from Ex. 8.2, page
Methods for Reducing Hostility
12 students displaying similar hostility were randomly assigned to 3 treatment methods. Scores (HLT) at the end of the study were recorded.
Test: are the mean scores for the three methods equal?

7
ANOVA Table Output – extracted hostility data (calculations done in class)
Source            SS    df    MS    F    p-value
Between samples                          <.001
Within samples
Totals

8
Fisher’s Least Significant Difference (LSD)
Protected LSD: preceded by an F-test for overall significance. Only use the LSD if F is significant.
Unprotected LSD: not preceded by an F-test (like individual t-tests).
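For reference, here is a sketch of the quantity behind the procedure, written in the usual textbook form. The symbols are assumed notation, since the slide's own formula did not survive: $t_{\alpha/2}$ is the t critical value with the error degrees of freedom, MSW is the within-samples mean square, and $n_i$, $n_j$ are the two group sizes.

```latex
\[
\mathrm{LSD}_{ij} \;=\; t_{\alpha/2,\;\mathrm{df}_{\mathrm{error}}}
\sqrt{\mathrm{MSW}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)},
\qquad
\text{declare } \mu_i \neq \mu_j \text{ if }
\left|\bar{y}_{i\cdot} - \bar{y}_{j\cdot}\right| \;\ge\; \mathrm{LSD}_{ij}.
\]
```

With equal group sizes $n$, this reduces to the single value $t_{\alpha/2}\sqrt{2\,\mathrm{MSW}/n}$, which is why one LSD can serve for every pair in a balanced design.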

9
Hostility Data - Completely Randomized Design
The GLM Procedure
t Tests (LSD) for score
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha 0.05
Error Degrees of Freedom 9
Error Mean Square
Critical Value of t
Least Significant Difference
Means with the same letter are not significantly different.
t Grouping   Mean   N   method
A B B B

10
Ex. 8.2, page
Methods for Reducing Hostility
24 students displaying similar hostility were randomly assigned to 3 treatment methods. Scores (HLT) at the end of the study were recorded.
Test: are the mean scores for the three methods equal?
Notice the unequal sample sizes.

11
ANOVA Table Output – full hostility data
Source            SS    df    MS    F    p-value
Between samples                          <.0001
Within samples
Totals

12
The GLM Procedure
t Tests (LSD) for score
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha 0.05
Error Degrees of Freedom 21
Error Mean Square
Critical Value of t
Comparisons significant at the 0.05 level are indicated by ***.
method Comparison   Difference Between Means   95% Confidence Limits
All six listed comparisons are marked ***.
Notice the different format: with unequal sample sizes there is no single LSD value with which to make all pairwise comparisons.
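To illustrate why the output switches to one confidence interval per pair, the sketch below computes the interval for a single pair of groups with unequal sizes. The function name `lsd_interval` and all numeric inputs (means, sizes, error mean square) are hypothetical; the only tabled value is 2.080, the two-sided 5% t critical value for 21 error degrees of freedom.

```python
import math

def lsd_interval(mean_i, mean_j, n_i, n_j, mse, t_crit):
    """Return (difference, lower, upper) for the difference of two group means."""
    diff = mean_i - mean_j
    # The half-width depends on both sample sizes, so it changes from pair to pair.
    half_width = t_crit * math.sqrt(mse * (1.0 / n_i + 1.0 / n_j))
    return diff, diff - half_width, diff + half_width

# Hypothetical inputs: means, sizes and MSE are made up for illustration.
diff, lower, upper = lsd_interval(86.0, 71.0, n_i=8, n_j=7, mse=25.0, t_crit=2.080)

# A pair is flagged (*** in the SAS listing) when the interval excludes zero.
significant = lower > 0 or upper < 0
print(f"difference={diff:.2f}  limits=({lower:.2f}, {upper:.2f})  significant={significant}")
```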

13
Duncan's Multiple Range Test for score
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha 0.05
Error Degrees of Freedom 21
Error Mean Square
Harmonic Mean of Cell Sizes
NOTE: Cell sizes are not equal.
Number of Means   2   3
Critical Range
Means with the same letter are not significantly different.
Duncan Grouping   Mean   N   method
A B C
Note: Duncan’s test (another multiple comparison test) avoids the issue of different sample sizes by using the harmonic mean of the $n_i$’s.
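As a small illustration of the harmonic mean mentioned in the note, the sketch below compares it with the ordinary average. The group sizes are hypothetical: they sum to 24 students, but the actual per-method counts are not given in the text above.

```python
from statistics import harmonic_mean, mean

# Hypothetical unequal group sizes (assumption: 7 + 8 + 9 = 24 students).
group_sizes = [7, 8, 9]

# Harmonic mean = (number of groups) / (sum of reciprocals of the sizes).
n_harmonic = harmonic_mean(group_sizes)
n_arithmetic = mean(group_sizes)

print(f"harmonic mean of sizes   = {n_harmonic:.4f}")
print(f"arithmetic mean of sizes = {n_arithmetic:.4f}")
```

The harmonic mean is never larger than the arithmetic mean, so using it as a common "cell size" leans toward the smaller groups.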

14
Some Multiple Comparison Techniques in SAS
- FISHER’S LSD (LSD)
- BONFERRONI (BON)
- DUNCAN
- STUDENT-NEWMAN-KEULS (SNK)
- DUNNETT
- RYAN-EINOT-GABRIEL-WELSCH (REGWQ)
- SCHEFFE
- TUKEY
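Of the techniques listed, Bonferroni is the simplest to show by hand: it divides the overall error rate by the number of pairwise comparisons. The sketch below is a minimal illustration; the function name `bonferroni_alpha` is hypothetical, and the example assumes all pairwise comparisons among 5 treatments are of interest.

```python
from math import comb

def bonferroni_alpha(alpha, n_treatments):
    """Per-comparison significance level when all pairs are compared."""
    n_comparisons = comb(n_treatments, 2)  # number of distinct pairs
    return alpha / n_comparisons

# Example: 5 treatments give 10 pairs, so each test is run at 0.05 / 10.
per_test = bonferroni_alpha(0.05, 5)
print(f"pairs = {comb(5, 2)}, per-comparison alpha = {per_test:.4f}")
```

Each individual comparison is held to a stricter level, which is how the experimentwise error rate is kept at or below the nominal value, at the cost of some power.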

15
Balloon Data
Col observation number
Col. 3 - color (1=pink, 2=yellow, 3=orange, 4=blue)
Col inflation time in seconds

17
ANOVA --- Balloon Data
General Linear Models Procedure
Dependent Variable: TIME
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model
Error
Corrected Total
R-Square   C.V.   Root MSE   TIME Mean
Source   DF   Type I SS   Mean Square   F Value   Pr > F
Color

18
ANOVA --- Balloon Data
The GLM Procedure
t Tests (LSD) for time
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha 0.05
Error Degrees of Freedom 28
Error Mean Square
Critical Value of t
Least Significant Difference
Means with the same letter are not significantly different.
t Grouping   Mean   N   color
A A A B B B

19
Experimental Design: Concepts and Terminology
Designed Experiment
- an investigation in which a specified framework is used to compare groups or treatments
Factors
- any feature of the experiment that can be varied from trial to trial
- up to this point we’ve only looked at experiments with a single factor

20
Experimental Units
- subjects, material, etc. to which treatment factors are randomly assigned
- there is inherent variability among these units irrespective of the treatment imposed
Treatments
- conditions constructed from the factors (levels of the factor considered, etc.)
Replication
- we usually assign each treatment to several experimental units
- these are called replicates

21
Examples: Car Data, Hostility Data, Balloon Data
For each, identify:
1. factor
2. treatments
3. experimental units
4. replicates

22
Balloon Data
Col observation number (run order)
Col. 3 - color (1=pink, 2=yellow, 3=orange, 4=blue)
Col inflation time in seconds
Question: Why randomize run order? i.e., why not blow up all the pink balloons first, blue balloons next, etc.?
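One way to produce a randomized run order is sketched below. It assumes a balanced design with 8 balloons per color (32 in total), which is consistent with the 28 error degrees of freedom reported for four colors but is not stated explicitly; the function name and the seed are hypothetical.

```python
import random

COLORS = ["pink", "yellow", "orange", "blue"]
BALLOONS_PER_COLOR = 8  # assumption: balanced design, 32 balloons in total

def randomized_run_order(colors, per_color, seed=None):
    """Return a list of (run_number, color) pairs in random order."""
    balloons = [color for color in colors for _ in range(per_color)]
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    rng.shuffle(balloons)
    return list(enumerate(balloons, start=1))

run_order = randomized_run_order(COLORS, BALLOONS_PER_COLOR, seed=5)
for run_number, color in run_order[:5]:
    print(run_number, color)
```

Shuffling spreads each color across the whole session, so any drift over time (for example, the person inflating getting tired) is not confounded with color.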

23
Scatterplot Using GPLOT
[Figure: scatterplot of Time versus Run Order]
What do we learn from this plot?

24
RECALL: 1-Factor ANOVA Model
$y_{ij} = \mu + \alpha_i + \epsilon_{ij}$, with $\epsilon_{ij} \sim \mathrm{NID}(0, \sigma^2)$
- random errors follow a Normal (N) distribution, are independently distributed (ID), and have zero mean and constant variance
-- i.e. variability does not change from group to group

25
Model Assumptions:
- equal variances
- normality
Checking Validity of Assumptions: Equal Variances
1. F-test similar to 2-sample case
   - Hartley’s test (p.366 text)
   - not recommended
2. Graphical
   - side-by-side box plots

26
26 Graphical Assessment of Equal Variance Assumption

27
Note: Optional approaches if the equal variance assumption is violated:
1. Use the Kruskal-Wallis nonparametric procedure – Section
2. Transform the data to induce more nearly equal variances – Section
   -- log
   -- square root
Note: These transformations may also help induce normality.
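The effect of the two transformations can be seen on a small made-up example in which the spread grows with the group mean. The data below are hypothetical and deliberately extreme (each group is ten times the previous one), chosen so that the log transform equalizes the variances exactly.

```python
import math
from statistics import variance

# Hypothetical data: spread increases in proportion to the mean.
groups = {
    "low": [1.0, 2.0, 4.0],
    "mid": [10.0, 20.0, 40.0],
    "high": [100.0, 200.0, 400.0],
}

# Sample variance of each group on the raw, square-root and log scales.
raw_var = {name: variance(ys) for name, ys in groups.items()}
sqrt_var = {name: variance([math.sqrt(y) for y in ys]) for name, ys in groups.items()}
log_var = {name: variance([math.log(y) for y in ys]) for name, ys in groups.items()}

for name in groups:
    print(f"{name:>4}: raw={raw_var[name]:.3f}  sqrt={sqrt_var[name]:.3f}  log={log_var[name]:.3f}")
```

In this example the square root narrows the gap between group variances and the log removes it entirely. Real data will rarely behave this cleanly, so the side-by-side box plots should be redrawn after transforming.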

28
Assessing Normality of Errors
$y_{ij} = \mu + \alpha_i + \epsilon_{ij}$
$\epsilon_{ij} = y_{ij} - (\mu + \alpha_i)$
so $\epsilon_{ij}$ is estimated by $e_{ij} = y_{ij} - \bar{y}_{i\cdot}$
The $e_{ij}$’s are called residuals.
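The residual calculation amounts to subtracting each group's sample mean from its observations. The sketch below shows this on hypothetical inflation times (the values and the function name `residuals_by_group` are made up for illustration).

```python
# Hypothetical inflation times in seconds for two colors (not the class data).
times = {
    "pink": [18.0, 20.0, 22.0, 20.0],
    "blue": [25.0, 27.0, 23.0],
}

def residuals_by_group(groups):
    """Return e_ij = y_ij - (group mean) for every observation."""
    result = {}
    for label, ys in groups.items():
        group_mean = sum(ys) / len(ys)
        result[label] = [y - group_mean for y in ys]
    return result

resid = residuals_by_group(times)
for label, es in resid.items():
    print(label, [round(e, 3) for e in es])
```

Residuals within each group always sum to zero; it is their pooled shape (histogram, normal probability plot) that is used to judge normality.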

29
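SAS Code for Balloon Data

/* One-factor ANOVA of time by color; residuals are saved in data set NEW */
proc glm;
  class color;
  model time=color;
  title 'ANOVA --- Balloon Data';
  output out=new r=resball;
  means color/lsd;
run;

/* Sort by color so the box plots can be drawn by group */
proc sort;
  by color;
run;

proc boxplot;
  plot time*color;
  title 'Side-by-Side Box Plots for Balloon Data';
run;

/* Histogram of the residuals with a fitted normal curve */
proc univariate;
  var resball;
  histogram resball/normal;
  title 'Histogram of Residuals -- Balloon Data';
run;

/* Normality tests and normal probability plot of the residuals */
proc univariate normal plot;
  var resball;
  title 'Normal Probability Plot for Residuals - Balloon Data';
run;

/* Inflation time plotted against run order */
proc gplot;
  plot time*id;
  title 'Scatterplot of Time vs ID (Run Order)';
run;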

30
Normal Probability Plot
[Figure: normal probability plot of the residuals]

31
31 Caution: Chapter 15 introduces some new notation - i.e. changes notation already defined

32
Recall: Sum-of-Squares Identity, 1-Factor ANOVA
TSS = SSB + SSW
In words: Total SS = SS between samples + SS within samples

33
33 Recall: Sum-of-Squares Identity 1-Factor ANOVA - new notation for Chapter 15

35
Recall: Sum-of-Squares Identity, 1-Factor ANOVA - new notation for Chapter 15
TSS = SST + SSE
In words: Total SS = SS for “treatments” + SS for “error”

36
Revised ANOVA Table for 1-Factor ANOVA (Ch. 15 terminology - p.857)
Source       SS    df      MS                  F
Treatments   SST   t - 1   MST = SST/(t - 1)   MST/MSE
Error        SSE   N - t   MSE = SSE/(N - t)
Total        TSS   N - 1

37
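Recall 1-factor ANOVA (CRD) Model for Gasoline Octane Data
$y_{ij} = \mu_i + \epsilon_{ij}$ or $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$
where $y_{ij}$ is the observed octane, $\mu_i = \mu + \alpha_i$ is the mean for the $i$th gasoline, and $\epsilon_{ij}$ is the unexplained part:
-- car-to-car differences
-- temperature
-- etc.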

38
Gasoline Octane Data
Question: What if car differences are obscuring gasoline differences?
Similar to the diet t-test example. Recall: person-to-person differences obscured the effect of diet.

39
Possible Alternative Design for Octane Study:
Test all 5 gasolines on the same car
- in essence, we test the gasoline effect directly and remove the effect of car-to-car variation
Question: How would you randomize an experiment with 4 cars?

40
Blocking an Experiment
- dividing the observations into groups (called blocks) where the observations in each block are collected under relatively similar conditions
- comparisons can often be made more precisely this way

41
Terminology is based on Agricultural Experiments
Consider the problem of testing fertilizers on a crop
- t fertilizers
- n observations on each

42
Completely Randomized Design
Plot assignments: A A B B C C B A C C B A A B C
t = 3 fertilizers, n = 5 replications
- randomly select 15 plots
- randomly assign fertilizers to the 15 plots

43
Randomized Complete Block Strategy
B | A | C
A | C | B
C | A | B
A | B | C
C | B | A
t = 3 fertilizers
- select 5 “blocks”
- randomly assign the 3 treatments to each block
Note: The 3 “plots” within each block are similar - similar soil type, sun, water, etc.

44
Randomized Complete Block Design
Randomly assign each treatment once to every block
Car Example
Car 1: randomly assign each gas to this car
Car 2: .... etc.
Agricultural Example
Randomly assign each fertilizer to one of the 3 plots within each block
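The randomization described above, with a separate shuffle inside every block, can be sketched as follows. The function name `randomize_rcb`, the car labels and the seed are hypothetical; the example uses the 5 gasolines and 4 cars of the octane study.

```python
import random

def randomize_rcb(treatments, blocks, seed=None):
    """Return {block: treatments in random order}, one independent shuffle per block."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    layout = {}
    for block in blocks:
        order = list(treatments)  # every treatment appears exactly once per block
        rng.shuffle(order)
        layout[block] = order
    return layout

gases = ["A", "B", "C", "D", "E"]
cars = ["Car 1", "Car 2", "Car 3", "Car 4"]
layout = randomize_rcb(gases, cars, seed=15)
for car, order in layout.items():
    print(car, "test order:", " ".join(order))
```

In contrast to a completely randomized design, the shuffle never moves a treatment across blocks, which is what keeps the design "complete": each car sees all five gasolines.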

45
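Model For Randomized Complete Block (RCB) Design
$y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$
where $\alpha_i$ is the effect of the $i$th treatment (gasoline), $\beta_j$ is the effect of the $j$th block (car), and $\epsilon_{ij}$ is the unexplained error:
-- temperature
-- etc.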

46
Previous Data Table from Chapter 8 for 1-factor ANOVA
Note: column averages don’t make any sense

47
Back to Octane data: Suppose that instead of 20 cars, there were only 4 cars, and we tested each gasoline on each car.
Old Data Format (by gas): A B C D E
“Restructured” Data (car by gas): A B C D E

48
Recall: Sum-of-Squares Identity, 1-Factor ANOVA - using new notation for Chapter 15
TSS = SST + SSE
In words: Total SS = SS for “treatments” + SS for “error”

49
A New Sum-of-Squares Identity
TSS = SST + SSB + SSE
In words: Total SS = SS for treatments + SS for blocks + SS for error
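The three-part identity can be verified numerically. The sketch below uses hypothetical octane readings for 5 gasolines (rows) tested on 4 cars (columns); the values and the function name `rcb_ss` are made up for illustration, and SSE is computed directly from the residuals rather than by subtraction so that the identity is a genuine check.

```python
# Hypothetical octane readings: one row per gasoline, one column per car.
octane = {
    "A": [91.0, 91.5, 90.5, 92.0],
    "B": [91.5, 92.5, 91.0, 93.0],
    "C": [92.0, 93.0, 91.5, 93.5],
    "D": [93.0, 93.5, 92.0, 94.5],
    "E": [94.0, 95.5, 93.5, 96.0],
}

def rcb_ss(table):
    """Return (TSS, SST, SSB, SSE) for a randomized complete block layout."""
    rows = list(table.values())
    t, b = len(rows), len(rows[0])  # t treatments, b blocks
    grand = sum(sum(r) for r in rows) / (t * b)
    trt_means = [sum(r) / b for r in rows]
    blk_means = [sum(r[j] for r in rows) / t for j in range(b)]
    tss = sum((y - grand) ** 2 for r in rows for y in r)
    sst = b * sum((m - grand) ** 2 for m in trt_means)
    ssb = t * sum((m - grand) ** 2 for m in blk_means)
    # Residual: observation minus treatment mean minus block mean plus grand mean.
    sse = sum((rows[i][j] - trt_means[i] - blk_means[j] + grand) ** 2
              for i in range(t) for j in range(b))
    return tss, sst, ssb, sse

tss, sst, ssb, sse = rcb_ss(octane)
print(f"TSS={tss:.4f}  SST={sst:.4f}  SSB={ssb:.4f}  SSE={sse:.4f}")
print(f"SST + SSB + SSE = {sst + ssb + sse:.4f}")
```

Whatever variability is attributed to blocks (SSB) is removed from the error term, which is the source of the extra precision blocking can provide.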

50
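Hypotheses:
To test for treatment effects - i.e. gas differences - we test
$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$ versus $H_a$: at least one $\alpha_i \neq 0$ (where $\alpha_i$ is the effect of the $i$th treatment)
To test for block effects - i.e. car differences (not usually the research hypothesis) - we test
$H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0$ versus $H_a$: at least one $\beta_j \neq 0$ (where $\beta_j$ is the effect of the $j$th block)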

51
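Randomized Complete Block Design ANOVA Table
Source       SS    df               MS                           F
Treatments   SST   t - 1            MST = SST/(t - 1)            MST/MSE
Blocks       SSB   b - 1            MSB = SSB/(b - 1)            MSB/MSE
Error        SSE   (b - 1)(t - 1)   MSE = SSE/[(b - 1)(t - 1)]
Total        TSS   bt - 1
See page 866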

52
52 Test for Treatment Effects Note:

53
53 Test for Block Effects
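The two tests can be summarized in the standard form below. This is a sketch in assumed notation (MST, MSB and MSE for the treatment, block and error mean squares, $t$ treatments, $b$ blocks), since the formulas on these two slides did not survive.

```latex
\[
F_{\mathrm{trt}} = \frac{\mathrm{MST}}{\mathrm{MSE}}
= \frac{\mathrm{SST}/(t-1)}{\mathrm{SSE}/[(b-1)(t-1)]},
\qquad
\text{reject } H_0 \text{ (no treatment effects) if }
F_{\mathrm{trt}} > F_{\alpha,\; t-1,\; (b-1)(t-1)}
\]
\[
F_{\mathrm{blk}} = \frac{\mathrm{MSB}}{\mathrm{MSE}}
= \frac{\mathrm{SSB}/(b-1)}{\mathrm{SSE}/[(b-1)(t-1)]},
\qquad
\text{reject } H_0 \text{ (no block effects) if }
F_{\mathrm{blk}} > F_{\alpha,\; b-1,\; (b-1)(t-1)}
\]
```

Both statistics share the same denominator, so a large block effect shrinks MSE and thereby sharpens the test for treatments.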

54
“Restructured” CAR Data - SAS Format
The first variable (A - E) indicates gas, as it did with the Completely Randomized Design. The second variable (B1 - B4) indicates car.
Each data row has the form: gas, block, octane (one row for each of the 5 gasolines on each of the 4 cars).

55
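SAS file - Randomized Complete Block Design for CAR Data

/* gas and block are read as character variables, octane as numeric */
INPUT gas$ block$ octane;

/* Both gas (treatment) and block (car) enter the model as class variables */
PROC GLM;
  CLASS gas block;
  MODEL octane=gas block;
  TITLE 'Gasoline Example -Randomized Complete Block Design';
  MEANS gas/LSD;
RUN;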

56
1-Factor ANOVA Table Output - octane data
Source             SS    df    MS    F    p-value
Gas (treatments)
Error
Totals

57
1-Factor ANOVA Table Output - car data
Source             SS    df    MS    F    p-value
Gas (treatments)
Cars (blocks)
Error
Totals

58
SAS Output -- RCB CAR Data
Dependent Variable: OCTANE
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model
Error
Corrected Total
R-Square   C.V.   Root MSE   OCTANE Mean
Source   DF   Anova SS   Mean Square   F Value   Pr > F
GAS
BLOCK

59
Multiple Comparisons in RCB Analysis

60
CAR Data -- LSD Results

CRD Analysis
t Grouping   Mean   N   gas
A                       E
B                       D
B C                     C
B C                     B
C                       A

RCB Analysis
t Grouping   Mean   N   gas
A                       E
B                       D
B C                     C
C                       B
C                       A

61
CAR Data -- Bonferroni Results

CRD Analysis
Bon Grouping   Mean   N   gas
A                         E
A B                       D
B                         C
B                         B
B                         A

RCB Analysis
Bon Grouping   Mean   N   gas
A                         E
B                         D
B                         C
B                         B
B                         A
