Statistical Analysis Overview I Session 2 Peg Burchinal Frank Porter Graham Child Development Institute, University of North Carolina-Chapel Hill.


Overview: Statistical analysis overview I-b
Nesting and intraclass correlation
Hierarchical Linear Models
  – 2 level models
  – 3 level models

Nesting
Nesting implies violation of the linear model assumptions of independence of observations
Ignoring this dependency in the data results in inflated test statistics when observations are positively correlated
  – CAN DRAW INCORRECT CONCLUSIONS
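The inflation described on this slide can be seen in a small simulation. The sketch below is only an illustration under assumed settings (10 clusters per arm, 20 individuals per cluster, total variance 1, a normal-approximation test); none of these numbers or function names come from the slides. There is no true group difference, so every rejection is a false positive.

```python
import math
import random


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


def naive_false_positive_rate(icc, clusters_per_arm=10, cluster_size=20,
                              n_sims=400, seed=1):
    """Share of null simulations rejected by a test that ignores clustering.

    Total variance is 1: `icc` lies between clusters, 1 - icc within them.
    All settings are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    sd_between = math.sqrt(icc)
    sd_within = math.sqrt(1.0 - icc)
    rejections = 0
    for _ in range(n_sims):
        arms = []
        for _arm in range(2):
            scores = []
            for _cluster in range(clusters_per_arm):
                u = rng.gauss(0.0, sd_between)  # effect shared by the cluster
                scores.extend(u + rng.gauss(0.0, sd_within)
                              for _ in range(cluster_size))
            arms.append(scores)
        mean_a, var_a = mean_and_variance(arms[0])
        mean_b, var_b = mean_and_variance(arms[1])
        # Naive standard error: treats every individual as independent.
        se = math.sqrt(var_a / len(arms[0]) + var_b / len(arms[1]))
        if abs((mean_a - mean_b) / se) > 1.96:  # nominal 5% two-sided test
            rejections += 1
    return rejections / n_sims


print("ICC = 0.0:", naive_false_positive_rate(0.0))
print("ICC = 0.2:", naive_false_positive_rate(0.2))
```

With independent observations the rejection rate should sit near the nominal 5%; with positively correlated observations within clusters it should be clearly higher.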

Nesting and Design
Educational data often collected in schools, classrooms, or special treatment groups
  – Lack of independence among individuals -> reduction in variability
      Pre-existing similarities (i.e., students within the cluster are more similar than students who would be randomly selected)
      Shared instructional environment (i.e., variability in instruction is greater across classrooms than within a classroom)
Educational treatments often assigned to schools or classrooms
  – Advantage: avoids contamination and makes the study more acceptable (simple random assignment is often not possible)
  – Disadvantage: analysis must take dependencies or relatedness of responses within clusters into account

Intraclass Correlation (ICC)
For models with clustering of individuals
  – “Cluster effect”: proportion of variance in the outcomes that is between clusters (compares within-cluster variance to between-cluster variance)
  – Example: clustering of children in classrooms. ICC describes the proportion of variance associated with differences between classrooms

Intraclass Correlation
Intraclass correlation (ICC): measure of relatedness or dependence of clustered data
  – Proportion of variance that is between clusters
  – ICC or $\rho = \sigma^2_b / (\sigma^2_b + \sigma^2_w)$
  – ICC = 0: no correlation among individuals within a cluster
  – ICC = 1: all responses within the clusters are identical
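As a rough sketch of how the ICC could be computed, the code below gives the ratio of between-cluster variance to total variance, plus a one-way ANOVA estimate from raw data. The ANOVA estimator, the equal-cluster-size restriction, and truncating a negative between-cluster estimate at zero are assumptions of this example, not something stated on the slides.

```python
def icc_from_components(var_between, var_within):
    """ICC = between-cluster variance / (between + within variance)."""
    return var_between / (var_between + var_within)


def icc_anova(clusters):
    """One-way ANOVA estimate of the ICC for equal-sized clusters.

    `clusters` is a list of lists, one inner list of outcomes per cluster.
    """
    k = len(clusters)
    m = len(clusters[0])
    if any(len(c) != m for c in clusters):
        raise ValueError("this sketch assumes equal cluster sizes")
    cluster_means = [sum(c) / m for c in clusters]
    grand_mean = sum(cluster_means) / k
    # Mean square between clusters and mean square within clusters.
    ms_between = m * sum((cm - grand_mean) ** 2 for cm in cluster_means) / (k - 1)
    ms_within = sum((y - cm) ** 2
                    for c, cm in zip(clusters, cluster_means)
                    for y in c) / (k * (m - 1))
    # Method-of-moments estimate; negative values are truncated at zero.
    var_between = max((ms_between - ms_within) / m, 0.0)
    if var_between + ms_within == 0.0:
        raise ValueError("no variability in the data")
    return icc_from_components(var_between, ms_within)


# Identical responses within each cluster give ICC = 1.
print(icc_anova([[1, 1], [3, 3]]))
# Clusters with the same mean give ICC = 0.
print(icc_anova([[1, 3], [1, 3]]))
```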

Nesting, Design, and ICC
Taking ICC into account results in less power for a given sample size
  – less independent information
Design effect = $1 + \rho(m-1)$; effective sample size = $mk / (1 + \rho(m-1))$
  – $m$ = number of individuals per cluster
  – $k$ = number of clusters
  – $\rho$ = ICC
Effective sample size is the number of clusters ($k$) when ICC = 1 and the number of individuals ($mk$) when ICC = 0
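A short worked example may help here. The quantity $mk / (1 + \rho(m-1))$ is the effective sample size: the total sample size $mk$ divided by the design effect $1 + \rho(m-1)$. The cluster counts and ICC values below are hypothetical and only show how quickly clustering erodes the independent information.

```python
def design_effect(m, icc):
    """Variance inflation from clustering: 1 + icc * (m - 1)."""
    return 1.0 + icc * (m - 1)


def effective_sample_size(m, k, icc):
    """Total sample size m * k deflated by the design effect."""
    return m * k / design_effect(m, icc)


# Hypothetical design: k = 10 clusters of m = 20 individuals each.
for icc in (0.0, 0.05, 0.2, 1.0):
    print(icc, round(effective_sample_size(m=20, k=10, icc=icc), 1))
```

At ICC = 0 the result is the number of individuals, and at ICC = 1 it is the number of clusters, matching the two limiting cases on the slide.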

ICC and Hierarchical Linear Models
Hierarchical linear models (HLM) implicitly take nesting into account
  – Clustering of data is explicitly specified by the model
  – ICC is considered when estimating standard errors, test statistics, and p-values

2 level HLM
One level of nesting
  – Longitudinal: Repeated measures of individual over time
      Typically: random intercepts and slopes to describe individual patterns of change over time
  – Clusters: Nesting of individuals within classes, families, therapy groups, etc.
      Typically: random intercept to describe cluster effect

2 level HLM Random-intercepts models
Corresponds to one-way ANOVA with random effects (mixed model ANOVA)
Example: Classrooms randomly assigned to treatment or control conditions
  – All study children within a classroom are in the same condition
  – Post-treatment outcome per child (can use pre-treatment score as covariate to increase power)
  – Level 1 = children in classroom
  – Level 2 = classroom
ICC reflects the degree of similarity among students within the classroom.

2 Level HLM Random Intercept Model
Level 1: individual students within the classroom
  – Unconditional model: $Y_{ij} = B_{0j} + r_{ij}$
  – Conditional model: $Y_{ij} = B_{0j} + B_1 X_{ij} + r_{ij}$
$Y_{ij}$ = outcome for $i$th student in $j$th class
$B_{0j}$ = intercept (e.g., mean) for $j$th class
$B_1$ = coefficient for individual-level covariate, $X_{ij}$
$r_{ij}$ = random error term for $i$th student in $j$th class, $E(r_{ij}) = 0$, $\mathrm{var}(r_{ij}) = \sigma^2$

2 Level HLM Random Intercept Model
Level 2: classrooms
  – Unconditional model: $B_{0j} = \gamma_{00} + u_{0j}$
  – Conditional model: $B_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + \gamma_{02} W_{j2} + u_{0j}$
$B_{0j}$ = intercept (e.g., mean) for $j$th class
$\gamma_{00}$ = grand mean in population
$\gamma_{01}$ = treatment effect for $W_{j1}$, dummy variable indicating treatment status (-.5 if control; .5 if treatment)
$\gamma_{02}$ = coefficient for $W_{j2}$, class-level covariate
$u_{0j}$ = random effect associated with $j$th classroom, $E(u_{0j}) = 0$, $\mathrm{var}(u_{0j}) = \tau_{00}$

2 Level HLM Random Intercept Model
Combined (unconditional)
  – $Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
      $Y_{ij} = B_{0j} + r_{ij}$
      $B_{0j} = \gamma_{00} + u_{0j}$
Combined (conditional)
  – $Y_{ij} = \gamma_{00} + \gamma_{01} W_{j1} + \gamma_{02} W_{j2} + B_1 X_{ij} + u_{0j} + r_{ij}$
      $Y_{ij} = B_{0j} + B_1 X_{ij} + r_{ij}$
      $B_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + \gamma_{02} W_{j2} + u_{0j}$
$\mathrm{Var}(Y_{ij}) = \mathrm{Var}(u_{0j} + r_{ij}) = \tau_{00} + \sigma^2$
ICC = $\rho = \tau_{00} / (\tau_{00} + \sigma^2)$
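To make the combined model concrete, the sketch below simulates data from a random-intercept model with a single treatment indicator coded -.5/.5 and no other covariates. Every parameter value, the alternating (non-random) assignment of classes to arms, and the function name are assumptions for illustration. It then checks that the arm difference recovers the treatment coefficient and that the variance of the composite residual (classroom effect plus student error) is close to the sum of the two variance components.

```python
import math
import random


def simulate_random_intercept(n_classes, class_size, gamma00, gamma01,
                              tau00, sigma2, seed=0):
    """Simulate Y_ij = gamma00 + gamma01 * W_j + u_0j + r_ij.

    Returns a list of (class_id, w, y) tuples. W_j is .5 for treatment and
    -.5 for control; classes alternate between arms to keep the sketch balanced.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_classes):
        w = 0.5 if j % 2 == 0 else -0.5
        u = rng.gauss(0.0, math.sqrt(tau00))        # classroom random effect
        for _ in range(class_size):
            r = rng.gauss(0.0, math.sqrt(sigma2))   # student-level error
            rows.append((j, w, gamma00 + gamma01 * w + u + r))
    return rows


# Hypothetical parameter values, not estimates from any study.
GAMMA00, GAMMA01, TAU00, SIGMA2 = 5.0, 0.5, 1.0, 3.0
rows = simulate_random_intercept(2000, 10, GAMMA00, GAMMA01, TAU00, SIGMA2)

treated = [y for _, w, y in rows if w > 0]
control = [y for _, w, y in rows if w < 0]
treatment_diff = sum(treated) / len(treated) - sum(control) / len(control)

# Composite residual u_0j + r_ij: subtract the fixed part of the model.
residuals = [y - (GAMMA00 + GAMMA01 * w) for _, w, y in rows]
composite_variance = sum(e * e for e in residuals) / len(residuals)

print("treatment difference:", round(treatment_diff, 2))
print("composite variance:", round(composite_variance, 2))
print("implied ICC:", TAU00 / (TAU00 + SIGMA2))
```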

Example 2 level HLM Random Intercepts
Purdue Curriculum Study (Powell & Diamond)
  – Onsite or Remote coaching
  – 27 Head Start classes randomly assigned to onsite coaching and 25 to remote coaching
  – Post-test scores on writing
  – Onsite: n=196, M=6.70, SD=1.54
  – Remote: n=171, M=7.05, SD=1.64

Example 2 level HLM Random Intercepts
Level 1: $\text{Writing}_{ij} = B_{0j} + B_1 \text{Writing-pre}_{ij} + r_{ij}$
  $B_1$ = .56, se = .05, p < .001
  $E(r_{ij}) = 0$, $\mathrm{var}(r_{ij}) = 1.67$
Level 2: $B_{0j} = \gamma_{00} + \gamma_{01} \text{Onsite}_j + u_{0j}$
  $\gamma_{00}$ (intercept: remote group adjusted mean) = 3.74, se = .31
  $\gamma_{01}$ (Onsite-Remote difference) = -.37, se = .17, p = .03
  $E(u_{0j}) = 0$, $\mathrm{var}(u_{0j}) = \tau_{00}$
  ICC = $\tau_{00} / (\tau_{00} + \sigma^2)$
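The reported p-values can be roughly checked from each estimate and its standard error. The sketch below uses a two-sided normal (Wald) approximation, which is an assumption of this example; the original analysis may well have used a t reference distribution with model-based degrees of freedom, so small discrepancies would be expected.

```python
import math


def wald_test(estimate, se):
    """Return (z, two-sided p) under a standard normal reference distribution."""
    z = estimate / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Estimates and standard errors as reported on the slide.
z_diff, p_diff = wald_test(-0.37, 0.17)   # Onsite-Remote difference
z_pre, p_pre = wald_test(0.56, 0.05)      # pre-test writing coefficient

print("Onsite-Remote: z = %.2f, p = %.3f" % (z_diff, p_diff))
print("Writing-pre:   z = %.2f, p = %.2g" % (z_pre, p_pre))
```

Under this approximation the Onsite-Remote difference gives a p-value that rounds to .03, and the pre-test coefficient is far below .001, in line with the slide.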

2 Level HLM - Longitudinal (random-slopes and -intercepts models)
Does NOT correspond to one-way ANOVA with random effects
Example: Longitudinal assessment of children’s literacy skills during Pre-K years
  – Level 1 = individual growth curve
  – Level 2 = group growth curve

Level 1 - Longitudinal HLM
Level 1: individual growth curve
  – Unconditional model: $Y_{ij} = B_{0j} + B_{1j} \text{Age}_{ij} + r_{ij}$
  – Conditional model: $Y_{ij} = B_{0j} + B_{1j} \text{Age}_{ij} + B_2 X_{ij} + r_{ij}$
$Y_{ij}$ = outcome for $j$th student on the $i$th occasion
$\text{Age}_{ij}$ = age at assessment for $j$th student on the $i$th occasion
$B_{0j}$ = intercept for $j$th student
$B_{1j}$ = slope for Age for $j$th student
$B_2$ = coefficient for time-varying covariate, $X_{ij}$
$r_{ij}$ = random error term for $j$th student on the $i$th occasion, $E(r_{ij}) = 0$, $\mathrm{var}(r_{ij}) = \sigma^2$

Level 2 – Longitudinal HLM
Level 2: predicting individual trajectories
  – Unconditional model:
      $B_{0j} = \gamma_{00} + u_{0j}$
      $B_{1j} = \gamma_{10} + u_{1j}$
  – Conditional model:
      $B_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + \gamma_{02} W_{j2} + u_{0j}$
      $B_{1j} = \gamma_{10} + \gamma_{11} W_{j1} + \gamma_{12} W_{j2} + u_{1j}$
$B_{0j}$ = intercept for $j$th student
$B_{1j}$ = slope for Age for $j$th student
$\gamma_{00}$ = intercept in population
$\gamma_{10}$ = slope in population
$\gamma_{01}$ = treatment effect on intercept for $W_{j1}$, student-level covariate
$\gamma_{11}$ = treatment effect on slope for $W_{j1}$, student-level covariate

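Level 2 – Longitudinal HLM
Level 2: predicting individual trajectories
  – Unconditional model:
      $B_{0j} = \gamma_{00} + u_{0j}$
      $B_{1j} = \gamma_{10} + u_{1j}$
  – Conditional model:
      $B_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + u_{0j}$
      $B_{1j} = \gamma_{10} + \gamma_{11} W_{j1} + u_{1j}$
$u_{0j}$ = random effect for individual intercept
$u_{1j}$ = random effect for individual slope
$E(u_{0j}) = 0$, $\mathrm{var}(u_{0j}) = \tau_{00}$
$E(u_{1j}) = 0$, $\mathrm{var}(u_{1j}) = \tau_{11}$
$\mathrm{cov}(u_{0j}, u_{1j}) = \tau_{01}$
$\mathrm{var}(u_{0j}, u_{1j}) = T = \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}$
Level 1 and 2 error terms are independent: $\mathrm{cov}(r_{ij}, u_{0j}) = \mathrm{cov}(r_{ij}, u_{1j}) = 0$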
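The role of the intercept-slope covariance can be illustrated by drawing correlated random effects and building individual growth lines from them. This is only a sketch: the variance and covariance values are hypothetical, and the Cholesky-style construction of correlated normal draws is one common approach chosen for this example, not something taken from the slides.

```python
import math
import random


def draw_random_effects(tau00, tau11, tau01, rng):
    """Draw (u0, u1) with var(u0)=tau00, var(u1)=tau11, cov(u0, u1)=tau01."""
    z0 = rng.gauss(0.0, 1.0)
    z1 = rng.gauss(0.0, 1.0)
    # Lower-triangular (Cholesky) factor of the 2x2 covariance matrix.
    l00 = math.sqrt(tau00)
    l10 = tau01 / l00
    l11 = math.sqrt(tau11 - l10 ** 2)
    return l00 * z0, l10 * z0 + l11 * z1


def trajectory(gamma00, gamma10, u0, u1, ages):
    """Expected outcome at each age: (gamma00 + u0) + (gamma10 + u1) * age."""
    return [(gamma00 + u0) + (gamma10 + u1) * age for age in ages]


# Hypothetical variance components (intercept, slope, covariance).
rng = random.Random(42)
draws = [draw_random_effects(10.0, 4.0, -1.5, rng) for _ in range(20000)]
mean_u0 = sum(u0 for u0, _ in draws) / len(draws)
mean_u1 = sum(u1 for _, u1 in draws) / len(draws)
sample_cov = sum((u0 - mean_u0) * (u1 - mean_u1)
                 for u0, u1 in draws) / (len(draws) - 1)
print("sample covariance:", round(sample_cov, 2))

# One student's growth line, with age centered so that 0 is the first occasion.
print(trajectory(gamma00=10.0, gamma10=2.0, u0=1.0, u1=-0.5, ages=[0, 1, 2]))
```

A negative covariance, as in these made-up values, means students who start higher tend to grow more slowly.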

Example – Longitudinal HLM
Purdue Curriculum Study (Powell & Diamond)
  – Level 1: estimating individual growth curves for children in one treatment condition (Remote)
  – Level 2: estimating population growth curves for Remote condition

Blending    Pre       Post      Follow-up
N
M (sd)      (5.34)    (4.57)    (4.60)

Example
Level 1: $\text{blending}_{ij} = B_{0j} + B_{1j} \text{Age}_{ij} + r_{ij}$, estimated $\sigma^2$
Level 2:
  $B_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + u_{0j}$
  $B_{1j} = \gamma_{10} + u_{1j}$
Estimated results
  Intercept: $\gamma_{00}$ = (se = .48), $\tau_{00}$ = 10.03**
  Season: $\gamma_{01}$ = 2.43* (se = .70)
  Slope: $\gamma_{10}$ = 1.51* (se = .60), $\tau_{11}$ = 4.24**, $\tau_{10}$ = -1.45**

3 level HLM
2 levels of nesting
Examples
  – Longitudinal assessments of children in randomly assigned classrooms
      Level 1: child level data
      Level 2: child’s growth curve
      Level 3: classroom level data
  – Two levels of nesting, such as children nested in classrooms that are nested in schools
      Level 1: child level data
      Level 2: classroom level data
      Level 3: school level data

3 level Model-Random Intercepts
Children nested in classrooms, classrooms nested in schools
  – Level 1 child-level model: $Y_{ijk} = \pi_{0jk} + e_{ijk}$
      $Y_{ijk}$ is achievement of child $i$ in class $j$ in school $k$
      $\pi_{0jk}$ is mean score of class $j$ in school $k$
      $e_{ijk}$ is random “child effect”
  – Classroom level model: $\pi_{0jk} = \beta_{00k} + r_{0jk}$
      $\beta_{00k}$ is mean score for school $k$
      $r_{0jk}$ is random “class effect”
  – School level model: $\beta_{00k} = \gamma_{000} + u_{00k}$
      $\gamma_{000}$ is grand mean score
      $u_{00k}$ is random “school effect”

3 level Model-Random Intercepts
Children nested in classrooms, classrooms nested in schools
  – Level 1 child-level model: $Y_{ijk} = \pi_{0jk} + e_{ijk}$
      $e_{ijk}$ is random “child effect”, $E(e_{ijk}) = 0$, $\mathrm{var}(e_{ijk}) = \sigma^2$
  – Within classroom level model: $\pi_{0jk} = \beta_{00k} + r_{0jk}$
      $r_{0jk}$ is random “class effect”, $E(r_{0jk}) = 0$, $\mathrm{var}(r_{0jk}) = \tau_\pi$
      Assume variance among classes within school is the same
  – Between classroom (school): $\beta_{00k} = \gamma_{000} + \gamma_{001} \text{trt} + u_{00k}$
      $E(u_{00k}) = 0$, $\mathrm{var}(u_{00k}) = \tau_\beta$

Partitioning variance
Proportion of variance within classroom: $\sigma^2 / (\sigma^2 + \tau_\pi + \tau_\beta)$
Proportion of variance among classrooms within schools: $\tau_\pi / (\sigma^2 + \tau_\pi + \tau_\beta)$
Proportion of variance among schools: $\tau_\beta / (\sigma^2 + \tau_\pi + \tau_\beta)$
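The three proportions are each variance component divided by the total. A minimal sketch follows, writing the child-level variance as `sigma2`, the classroom-level variance as `tau_pi`, and the school-level variance as `tau_beta`. These names and the numeric values are assumptions for illustration only.

```python
def partition_variance(sigma2, tau_pi, tau_beta):
    """Split total variance in a 3-level random-intercept model into shares."""
    total = sigma2 + tau_pi + tau_beta
    return {
        "within_classrooms": sigma2 / total,
        "among_classrooms_within_schools": tau_pi / total,
        "among_schools": tau_beta / total,
    }


# Hypothetical variance components.
shares = partition_variance(sigma2=6.0, tau_pi=3.0, tau_beta=1.0)
for level, share in shares.items():
    print(level, round(share, 2))
```

The three shares always sum to 1, so a large within-classroom share leaves little variance for classroom-level or school-level predictors to explain.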

3 Level HLM – level 2 longitudinal and level 3 random intercepts
Typically: treatment randomly assigned at classroom level, children followed longitudinally (e.g., Purdue Curriculum Study)
  – (within child) Level 1: $Y_{ijk} = \pi_{0jk} + \pi_{1jk} \text{Age}_{ijk} + e_{ijk}$
      $E(e_{ijk}) = 0$, $\mathrm{var}(e_{ijk}) = \sigma^2$
  – (between child) Level 2: $\pi_{0jk} = \beta_{00k} + r_{0jk}$; $\pi_{1jk} = \beta_{10k} + r_{1jk}$
      $E(r_{0jk}) = 0$, $\mathrm{var}(r_{0jk}) = \tau_{\pi 00}$
      $E(r_{1jk}) = 0$, $\mathrm{var}(r_{1jk}) = \tau_{\pi 11}$
  – (between classes) Level 3: $\beta_{00k} = \gamma_{000} + u_{00k}$; $\beta_{10k} = \gamma_{100} + u_{10k}$
      $E(u_{00k}) = 0$, $\mathrm{var}(u_{00k}) = \tau_{\beta 00}$
      $E(u_{10k}) = 0$, $\mathrm{var}(u_{10k}) = \tau_{\beta 11}$

Example
Purdue Curriculum Study
Level 1: individual growth curve
Level 2: classroom growth curve
Level 3: treatment differences in classroom growth curves

Writing          Pre            Post           Follow-up
Onsite M (se)    N=   (1.49)    N=   (1.54)    N=   (1.74)
Remote M (se)    N=   (1.55)    N=   (1.64)    N=   (1.62)

Purdue Curriculum Study

Threats
Homogeneity of variance at each level
  – Nonnormal data with heavy tails
  – Bad data
  – Differences in variability among groups
Normality assumption
  – Examine residuals
  – Robust standard error (large n)
Inferences with small samples

3 Level HLM
Longitudinal assessments of individuals in clustered settings