G89.2247 Lecture 5
- Example fixed
- Repeated measures as clustered data
- Clusters as random effects
- Intraclass correlation
- ANOVA approach
- PROC MIXED approach

Whoops! A Mistake Fixed
- Checking data is important.
- Preparing charts for today alerted us to the fact that POMS in the Comparison Group was scored on a 1-5 scale, while the Bar Exam Group was scored on a 0-4 scale.

Repeated Measures as Clustered Data
- There are many examples in which data are collected in clusters:
  - Siblings within a family
  - Children within a classroom
  - Households within a Primary Sampling Unit
- Repeated measures are a special case of clustered data:
  - Times within a person
  - But ... measurements are ordered by time

Thinking generally about clusters
- Suppose we sampled study groups of size four.
  - Each group has four measurements (persons in this case).
  - Measurements are not usually ordered.
- Observations within a cluster may be more similar than observations sampled across clusters.
- If we have 135 study groups, is it fair to analyze the 135*4 = 540 persons as though they were independent observations?

Clusters as Random Effects
- Sampled clusters are often considered to be Random Effects:
  - Clusters are informative about the overall population.
  - The actual choice of a specific cluster is due to chance.
  - Cluster effects are best thought of in variance terms.
- Snijders and Bosker call the clusters Macro level units.
  - Elements within the cluster are called Micro level units.

A One-way Random Effects Model
- According to S&B, the ith observation in the jth cluster (macro level) is
    $Y_{ij} = \mu + U_j + R_{ij}$
  where
  - $\mu$ is the overall mean of the population,
  - $U_j$ is the effect of randomly selected macro-unit j,
  - $R_{ij}$ is the effect of randomly selected micro-unit i in randomly selected macro-unit j.
- Define $\mathrm{Var}(U_j) = \tau^2$ and $\mathrm{Var}(R_{ij}) = \sigma^2$, so that $\mathrm{Var}(Y_{ij}) = \tau^2 + \sigma^2$ (assuming $\mathrm{Corr}(U, R) = 0$).
  - This is Bryk and Raudenbush notation (sort of).
- A randomly chosen observation varies as a function of cluster variation and within-cluster variation.

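The variance decomposition on this slide can be checked with a small simulation. The sketch below is illustrative only: the function names, the normal distributions, and the parameter values (overall mean 1.0, between-cluster variance 0.6, within-cluster variance 0.2) are assumptions chosen for the demonstration, not values from the lecture data.

```python
import random
import statistics


def simulate_clusters(n_clusters, cluster_size, mu, tau2, sigma2, seed=0):
    """Draw Y_ij = mu + U_j + R_ij with independent normal U_j and R_ij."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        u_j = rng.gauss(0.0, tau2 ** 0.5)  # macro-level (cluster) effect
        clusters.append(
            [mu + u_j + rng.gauss(0.0, sigma2 ** 0.5) for _ in range(cluster_size)]
        )
    return clusters


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical parameter values, chosen only for illustration.
clusters = simulate_clusters(20000, 2, mu=1.0, tau2=0.6, sigma2=0.2, seed=1)

all_y = [y for cluster in clusters for y in cluster]
total_var = statistics.pvariance(all_y)  # should be near 0.6 + 0.2
within_corr = pearson(
    [c[0] for c in clusters], [c[1] for c in clusters]
)  # should be near 0.6 / (0.6 + 0.2)

print(f"Var(Y) ~ {total_var:.3f}, within-cluster correlation ~ {within_corr:.3f}")
```

With many clusters, the variance of a randomly chosen observation should land close to the sum of the two components, and the correlation between two members of the same cluster should land close to the between-cluster share of the total.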
One-way random effects interpreted
    $Y_{ij} = \mu + U_j + R_{ij}$
- Suppose clusters are monozygotic twins and Y is a measure of eye color.
  - All of the variation in Y would be due to between-twin-pair effects (macro-unit U).
  - R would reflect measurement error only.
- Suppose clusters were pairings of persons who report for subject pool studies.
  - There might be some cluster effects due to subtle personality differences in when people volunteer, but most variation in Y would be due to micro-unit R.

How much of Var(Y) is due to Macro-level variation?
- The Intraclass Correlation (ICC) is used to quantify how much of Var(Y) is due to Var(U).
- Assume we can get estimates of $\mathrm{Var}(U) = \tau^2$ and $\mathrm{Var}(R) = \sigma^2$.
  - These will come from either ANOVA or special software.
- $\mathrm{ICC} = \rho = \tau^2 / (\tau^2 + \sigma^2)$

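ICC interpreted as a correlation
- The ICC is the correlation between any two observations within a cluster.
- [Path diagram: the standardized cluster effect $U^*_j$ points to the two standardized observations $Y^*_{1j}$ and $Y^*_{2j}$.]
- $\mathrm{Corr}(Y_{1j}, Y_{2j}) = \rho$
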
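A short derivation may help show why the ICC is a correlation. It is a sketch under the one-way random effects model, with the added assumption (standard for this model, but not spelled out on the slide) that the micro-level effects of two different observations are uncorrelated with each other as well as with the cluster effect.

```latex
\begin{align*}
\mathrm{Cov}(Y_{1j}, Y_{2j})
  &= \mathrm{Cov}(\mu + U_j + R_{1j},\; \mu + U_j + R_{2j}) \\
  &= \mathrm{Var}(U_j) + \mathrm{Cov}(U_j, R_{2j})
     + \mathrm{Cov}(R_{1j}, U_j) + \mathrm{Cov}(R_{1j}, R_{2j}) \\
  &= \tau^2 \\[4pt]
\mathrm{Corr}(Y_{1j}, Y_{2j})
  &= \frac{\mathrm{Cov}(Y_{1j}, Y_{2j})}
          {\sqrt{\mathrm{Var}(Y_{1j})\,\mathrm{Var}(Y_{2j})}}
   = \frac{\tau^2}{\tau^2 + \sigma^2}
   = \rho
\end{align*}
```

The only thing two observations in the same cluster share is the cluster effect, so their covariance is the between-cluster variance, and dividing by the total variance gives the ICC.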
Example of ICC from ANOVA
- Suppose we consider the 135 persons from the examinee and comparison groups to be clusters with four replications.
  - Ignore the ordering of replications.
  - Let's think of the replications as random effects.
- SPSS Reliability can give us the estimate of the Intraclass Correlation.

SPSS example

    RELIABILITY
      /VARIABLES=week1 week2 week3 week4
      /SCALE(persons)=ALL
      /MODEL=ALPHA
      /STATISTICS=DESCRIPTIVE SCALE ANOVA
      /ICC=MODEL(ONEWAY) CIN=95 TESTVAL=0 .

Analysis of Variance Table

    Source of Variation    Sum of Sq.    DF    Mean Square        F     Prob.
    Between People           357.0247   134         2.6644
    Within People             80.0973   405          .1978
      Between Measures        11.0192     3         3.6731  21.3754     .0000
      Residual                69.0781   402          .1718
    Total                    437.1220   539          .8110

    Grand Mean  1.0291

Intraclass Correlation Coefficient
One-way random effect model: People Effect Random

    Single Measure Intraclass Correlation = .7572
    95.00% C.I.: Lower = .6995   Upper = .8091
    F = 13.4720   DF = (134, 405.0)   Sig. = .0000 (Test Value = .0000)

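Where Are the Variance Estimates?
- The ANOVA table shows where to get the estimate of $\mathrm{Var}(R) = \sigma^2$.
  - We use the "Within People" Mean Square, which is MSW = .1978 (S&B Eq. 3.10).
- Getting $\mathrm{Var}(U) = \tau^2$ is a bit more work, because the "Between People" Mean Square (MSB) reflects both variance components. With four measurements per person:
  - $E(\mathrm{MSB}) = 4\tau^2 + \sigma^2$
  - $\hat{\tau}^2 = (\mathrm{MSB} - \mathrm{MSW})/4 = (2.6644 - .1978)/4 = .6166$
- $\mathrm{ICC} = .6166/(.6166 + .1978) = .7572$
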
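The arithmetic on this slide can be reproduced in a few lines. The sketch below starts from the sums of squares and degrees of freedom in the SPSS ANOVA table rather than the rounded mean squares, so that rounding does not affect the fourth decimal place; the function name and dictionary keys are illustrative choices, not part of any package.

```python
def variance_components(ss_between, df_between, ss_within, df_within, cluster_size):
    """Method-of-moments estimates for the one-way random effects model."""
    msb = ss_between / df_between          # Between People mean square
    msw = ss_within / df_within            # Within People mean square
    sigma2 = msw                           # estimate of Var(R)
    tau2 = (msb - msw) / cluster_size      # from E(MSB) = n*tau^2 + sigma^2
    icc = tau2 / (tau2 + sigma2)
    return {"msb": msb, "msw": msw, "sigma2": sigma2, "tau2": tau2,
            "icc": icc, "F": msb / msw}


# Values taken from the ANOVA table: 135 persons, 4 measurements each.
result = variance_components(357.0247, 134, 80.0973, 405, cluster_size=4)
for name, value in result.items():
    print(f"{name:>6}: {value:.4f}")
```

The F ratio returned here is the MSB/MSW test of the between-person variance that SPSS reports alongside the ICC.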
Interpreting ICC
- In this example, 76% of the variance of the anxiety scores is due to macro-unit (between-person) differences.
- Some of the macro-unit variation may be due to examinee/comparison differences.
  - Within the comparison group the ICC is still .75.
    - But the estimates of $\tau^2$ and $\sigma^2$ are smaller than overall.
  - Within the examinee group the ICC is .59.
    - The within-macro-unit variance is relatively large in this case.

Studying Random Effects using SAS PROC MIXED
- The ANOVA procedure may be familiar, but it is not the easiest way to study the one-way random effects model.

    DATA anxgrps;
      INFILE 'bothanxst.dat';
      INPUT id 1-4 week 5-7 group 8-10 anx 11-15;
      id = id + 100*group;   *assign unique IDs to subjects;
      week = week - 2.5;     *center week at week 2.5;
    PROC SORT;
      BY id;
    PROC MIXED COVTEST NOCLPRINT;
      CLASS id;
      MODEL anx = / S;
      RANDOM Intercept / SUBJECT=id G;
    RUN;

The Mixed Procedure

    Dimensions
    Covariance Parameters       2
    Subjects                  135
    Max Obs Per Subject         4
    Observations Used         540
    Observations Not Used       0
    Total Observations        540

    Covariance Parameter Estimates
    Cov Parm    Subject   Estimate   Standard Error   Z Value     Pr Z
    Intercept   id          0.6163          0.08141      7.57   <.0001
    Residual                0.1977          0.01389     14.23   <.0001

    Solution for Fixed Effects
    Effect      Estimate   Standard Error    DF   t Value   Pr > |t|
    Intercept     1.0294          0.07023   134     14.66     <.0001

Comparison of Random Effects and Means

Extension 1: Fixed Cluster Effects
- We can carry out the equivalent of a two-sample t test.
- Group is a Fixed Effect.

    PROC MIXED COVTEST NOCLPRINT;
      CLASS id;
      MODEL anx = group / S;
      RANDOM Intercept / SUBJECT=id G;
    RUN;

PROC MIXED Two Group Results

    Covariance Parameter Estimates
    Cov Parm    Subject   Estimate   Standard Error   Z Value     Pr Z
    Intercept   id          0.3597          0.05028      7.15   <.0001
    Residual                0.1977          0.01389     14.23   <.0001

    Solution for Fixed Effects
    Effect      Estimate   Standard Error    DF   t Value   Pr > |t|
    Intercept     1.5335          0.07756   133     19.77     <.0001
    group        -1.0155           0.1101   405     -9.22     <.0001

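One way to read the two sets of PROC MIXED covariance estimates side by side is to compute an ICC from each and see how much of the between-person variance drops out once group is in the model. The sketch below uses only the estimates printed in the two outputs; the "residual ICC" and "share explained by group" are interpretive summaries added here, not quantities reported in the lecture, and the pooled residual ICC is not the same thing as the separate within-group ICCs quoted earlier.

```python
def icc(tau2, sigma2):
    """Intraclass correlation from between- and within-cluster variances."""
    return tau2 / (tau2 + sigma2)


# Covariance parameter estimates copied from the two PROC MIXED outputs.
TAU2_EMPTY, SIGMA2_EMPTY = 0.6163, 0.1977   # model with intercept only
TAU2_GROUP, SIGMA2_GROUP = 0.3597, 0.1977   # model with group as fixed effect

icc_empty = icc(TAU2_EMPTY, SIGMA2_EMPTY)
icc_given_group = icc(TAU2_GROUP, SIGMA2_GROUP)

# Proportional reduction in between-person variance after adding group.
share_explained = (TAU2_EMPTY - TAU2_GROUP) / TAU2_EMPTY

print(f"ICC, intercept-only model:      {icc_empty:.4f}")
print(f"Residual ICC, group in model:   {icc_given_group:.4f}")
print(f"Between-person variance explained by group: {share_explained:.1%}")
```

The residual variance is the same in both models because group is constant within a person, so adding it can only account for between-person variance.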
Extension 2: Time, Group and Time-by-Group Considered

    PROC MIXED COVTEST;
      CLASS id;
      MODEL anx = week group group*week / S;
      RANDOM Intercept week / SUBJECT=id TYPE=UN G GCORR;
    RUN;

PROC MIXED Results

    Estimated G Correlation Matrix
    Row   Effect      id     Col1     Col2
      1   Intercept    1   1.0000   0.4222
      2   week         1   0.4222   1.0000

    Covariance Parameter Estimates
    Cov Parm   Subject   Estimate   Standard Error   Z Value     Pr Z
    UN(1,1)    id          0.3828          0.05022      7.62   <.0001
    UN(2,1)    id          0.0361          0.01154      3.13   0.0018
    UN(2,2)    id          0.0191         0.005236      3.65   0.0001
    Residual               0.1049         0.009032     11.62   <.0001

    Solution for Fixed Effects
    Effect       Estimate   Standard Error    DF   t Value   Pr > |t|
    Intercept      1.5335          0.07756   133     19.77     <.0001
    week           0.2706          0.02428   133     11.14     <.0001
    group         -1.0155           0.1101   270     -9.22     <.0001
    week*group    -0.2942          0.03446   270     -8.54     <.0001

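The G correlation of .4222 can be recovered from the unstructured covariance estimates, and the fixed effects imply a separate week slope for each group. The sketch below shows both calculations. It assumes that group is coded 0/1, which the slides do not state, so the group-specific slopes should be read as conditional on that coding; the function names are illustrative.

```python
import math


def g_correlation(var_intercept, cov_intercept_slope, var_slope):
    """Correlation between random intercepts and random slopes."""
    return cov_intercept_slope / math.sqrt(var_intercept * var_slope)


def week_slope(group, b_week, b_interaction):
    """Implied week slope for a given group code (assumes 0/1 coding)."""
    return b_week + b_interaction * group


# UN(1,1), UN(2,1), UN(2,2) from the covariance parameter estimates.
r_intercept_slope = g_correlation(0.3828, 0.0361, 0.0191)

# Fixed-effect estimates for week and week*group.
slopes = {g: week_slope(g, 0.2706, -0.2942) for g in (0, 1)}

print(f"Intercept-slope correlation: {r_intercept_slope:.4f}")
for g, slope in slopes.items():
    print(f"Implied week slope when group = {g}: {slope:.4f}")
```

Because week is centered at 2.5, the intercept variance and the intercept-slope correlation refer to a person's level at the midpoint of the study rather than at the first week.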
Creating a Data File with One Line Per Observation in SPSS

    write outfile='bothanxst.dat' records=4
      /1 id 1-4 ' 1 ' sample 9-10 week1 (f5.2)
      /2 id 1-4 ' 2 ' sample 9-10 week2 (f5.2)
      /3 id 1-4 ' 3 ' sample 9-10 week3 (f5.2)
      /4 id 1-4 ' 4 ' sample 9-10 week4 (f5.2).
    execute.

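Reading a Data File with Four Observations Per Line in SAS

    data new;
      infile 'G2247_1.dat';
      * the trailing @@ holds the input line so each time point is read from the same record;
      time=1; input id 1-4 group 5-6 supp 13-14 @@; output;
      time=2; input id 1-4 group 5-6 supp 15-19 @@; output;
      time=3; input id 1-4 group 5-6 supp 20-24 @@; output;
      * the last input has no @@, so SAS moves on to the next record;
      time=4; input id 1-4 group 5-6 supp 25-29; output;
    data new2;
      set new;
      week=time-2.5;       * center week at 2.5;
      id=id+100*group;     * assign unique IDs to subjects;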
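The same wide-to-long restructuring can be sketched outside SAS and SPSS. The Python below is a minimal illustration of the logic only: the record layout (dictionaries with `id`, `group`, and a list of four `scores`) and the example values are hypothetical and do not describe the actual data files, but the two recodes (week centered at 2.5, IDs made unique by adding 100 times the group code) follow the SAS steps.

```python
def wide_to_long(records):
    """Turn one record per person into one row per person-by-week."""
    long_rows = []
    for rec in records:
        unique_id = rec["id"] + 100 * rec["group"]   # unique IDs across groups
        for time, score in enumerate(rec["scores"], start=1):
            long_rows.append({
                "id": unique_id,
                "group": rec["group"],
                "week": time - 2.5,                  # center week at 2.5
                "score": score,
            })
    return long_rows


# Hypothetical wide-format records, invented for illustration only.
wide = [
    {"id": 1, "group": 0, "scores": [1.0, 1.2, 1.5, 1.1]},
    {"id": 1, "group": 1, "scores": [0.4, 0.5, 0.3, 0.6]},
]

long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

Two persons who share an ID in different groups end up with distinct IDs after the recode, which is what lets the person identifier serve as the cluster (subject) variable.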