Published by Anita Levey. Modified over 2 years ago.

1
Evaluating and Improving Value-Added Modeling
Douglas N. Harris
University of Wisconsin at Madison

2
Background
- IES Teacher Quality Grant; Harris and Sass
- 2006 IES conference
- November mini-conference at UW-Madison
- Caveat: Multidisciplinary group but “econ-centric” presentation

3
Summary
- Purposes of value-added modeling (VAM)
- Criteria for evaluating VAM
- Some problematic results
- Methodological issues
- A research agenda and upcoming conference

4
Different Purposes
- There are two main purposes of value-added models:
  (1) VAM for program evaluation (VAM-P)
  (2) VAM for accountability (VAM-A)
- In both cases, the goal is arguably to mimic random-assignment experiments

5
Criteria for Evaluating VAM
- Different purposes, different criteria for evaluation:
  - Criteria for VAM-P: validity and reliability of the program/policy effect parameter
  - Criteria for VAM-A: validity and reliability of individual personnel effects
- Meeting the criteria appears more difficult with VAM-A, which involves hundreds or thousands of parameters

6
Tentative, But Problematic, Findings
- In some VAM-A models, teacher effects are unstable for individual teachers over time
- When comparing teacher effects estimated from the same data but different VAM-A models, the results are weakly correlated
- VAM-A teacher effects are imprecise, making it difficult to distinguish teacher effectiveness with the usual degree of confidence
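
The instability and imprecision findings can be illustrated with a small simulation. This is a minimal sketch, not any model from the talk: it assumes a teacher's estimated effect is simply the mean gain of one class, and the class size and standard deviations are hypothetical values chosen only for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def expected_reliability(class_size, sd_teacher, sd_student):
    """Share of variance in a class-mean estimate that is true teacher effect."""
    signal = sd_teacher ** 2
    noise = sd_student ** 2 / class_size
    return signal / (signal + noise)


def simulate_year_to_year(n_teachers=2000, class_size=20,
                          sd_teacher=0.15, sd_student=0.8, seed=0):
    """Correlation of estimated effects across two years with fixed true effects.

    All parameter values are assumptions for illustration only.
    """
    rng = random.Random(seed)
    true_effects = [rng.gauss(0, sd_teacher) for _ in range(n_teachers)]

    def estimate(effect):
        # Estimated effect = mean gain of one class of students.
        gains = [effect + rng.gauss(0, sd_student) for _ in range(class_size)]
        return sum(gains) / class_size

    year1 = [estimate(t) for t in true_effects]
    year2 = [estimate(t) for t in true_effects]
    return pearson(year1, year2)


if __name__ == "__main__":
    print("simulated correlation:", round(simulate_year_to_year(), 3))
    print("expected reliability: ", round(expected_reliability(20, 0.15, 0.8), 3))
```

Under these assumed values the true teacher effects never change, yet the year-to-year correlation of the estimates is well below 1, because sampling noise in a single class accounts for more than half of the variance in each estimate.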

7
Methodological Issues
- Assumptions about student test scores
- Assumptions about teaching and learning
- Others: amount of information, complexity of computation, missing data
- Significance of methodological issues varies by purpose (VAM-P vs. VAM-A)

8
Assumptions about Test Scores
- VAM assumes that test scores are on an interval scale
  - In other words, a one-point increase means the same thing no matter where we start
  - Put another way, vertical scaling works
- Some (many?) psychometricians believe that, despite best efforts, test scores are not really interval scale
- Ad hoc adjustments may not solve the problem
  - non-linear term on right-hand side
  - grade-by-year fixed effects
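
A toy example shows why the interval-scale assumption matters. The sketch below uses made-up scores and a hypothetical `mean_gain` helper; it compares two teachers' mean gains on the raw scale and again after an order-preserving rescaling (squaring). Nothing here comes from the talk's data.

```python
def mean_gain(pre, post, rescale=lambda s: s):
    """Average gain across students, optionally after rescaling each score."""
    return sum(rescale(b) - rescale(a) for a, b in zip(pre, post)) / len(pre)


def square(score):
    """A monotone rescaling for positive scores: it preserves every ranking."""
    return score * score


# Made-up data: teacher A has low-scoring students, teacher B high-scoring ones.
teacher_a = {"pre": [10, 12, 14], "post": [20, 22, 24]}
teacher_b = {"pre": [60, 62, 64], "post": [68, 70, 72]}

raw_a = mean_gain(teacher_a["pre"], teacher_a["post"])
raw_b = mean_gain(teacher_b["pre"], teacher_b["post"])
sq_a = mean_gain(teacher_a["pre"], teacher_a["post"], rescale=square)
sq_b = mean_gain(teacher_b["pre"], teacher_b["post"], rescale=square)

if __name__ == "__main__":
    print("raw scale:     A =", raw_a, " B =", raw_b)
    print("squared scale: A =", sq_a, " B =", sq_b)
```

On the raw scale teacher A shows the larger mean gain (10 vs. 8); on the squared scale teacher B does (1056 vs. 340). Both scales order the students identically, so unless one of them is known to be interval, the comparison of gains is not well defined.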

9
Assumptions about Learning
- Value-added models make assumptions about the decay of past learning/inputs
- All value-added models assume that nothing happens between the test administration and the beginning of the subsequent school year
  - summer learning loss
- Value-added models do NOT assume, however, that students learn “smoothly”
  - some express concern that students learn in spurts in ways that are independent of instructional quality
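
One common way to make the decay assumption explicit is a lagged-score specification. The notation below is assumed for illustration and does not appear on the slides: $A_{it}$ is the achievement of student $i$ in year $t$, $X_{it}$ is a vector of student and classroom covariates, $\tau_{j(i,t)}$ is the effect of the teacher assigned to student $i$ in year $t$, and $\lambda$ is the share of prior achievement that persists.

```latex
\[
  A_{it} = \lambda A_{i,t-1} + X_{it}\beta + \tau_{j(i,t)} + \varepsilon_{it}
\]
% Setting \lambda = 1 (no decay) and moving A_{i,t-1} to the left-hand side
% gives the gain-score form:
\[
  A_{it} - A_{i,t-1} = X_{it}\beta + \tau_{j(i,t)} + \varepsilon_{it}
\]
```

In this sketch, $\lambda = 1$ assumes past learning persists fully, while $0 < \lambda < 1$ allows it to fade. Either way, the change between consecutive tests is attributed to the year-$t$ inputs, which is where the assumption about summer learning loss enters.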

10
Assumptions about Teaching
- VAM-A assumes that the mediating factors influencing student achievement influence the effectiveness of all teachers in the same way
  - e.g., class size
- A specific and important example is the assumption that teachers are equally effective with all types of students

11
Lots of Assumptions & Problems, But...
- Even with modest validity and reliability, VAM-A could improve education:
  - The education system already uses student test scores, and uses them badly
  - Violations of assumptions per se do not invalidate VAM-A
- Little question that VAM-P should be pursued

12
Short-Term Research Agenda
- Follow-up on earlier “problematic” findings
  - in progress: testing robustness of teacher effects across VAM-A models
- Clarify assumptions being made in each type of VAM model
- Test sensitivity of VAM results to test scaling (and test type)
- Test whether teachers have different levels of effectiveness with different types of students (e.g., different initial test scores)
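
The last agenda item can be sketched descriptively. The example below is an illustration only, not a value-added model: it takes made-up `(teacher, prior_score, gain)` records, splits students at the median prior score, and reports each teacher's mean gain in the two groups. The function name and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def effect_by_prior_group(records):
    """Mean gain per teacher, split by whether prior score is below the median.

    `records` is an iterable of (teacher, prior_score, gain) tuples.
    """
    records = list(records)
    cutoff = median(prior for _, prior, _ in records)
    groups = defaultdict(lambda: {"low": [], "high": []})
    for teacher, prior, gain in records:
        key = "low" if prior < cutoff else "high"
        groups[teacher][key].append(gain)
    # A group with no students is reported as None rather than a mean.
    return {
        teacher: {key: (mean(vals) if vals else None) for key, vals in by_group.items()}
        for teacher, by_group in groups.items()
    }


# Made-up records for two teachers.
toy = [
    ("T1", 30, 9), ("T1", 35, 11), ("T1", 70, 3), ("T1", 75, 5),
    ("T2", 32, 6), ("T2", 38, 6), ("T2", 72, 6), ("T2", 78, 6),
]
result = effect_by_prior_group(toy)

if __name__ == "__main__":
    for teacher, by_group in sorted(result.items()):
        print(teacher, by_group)
```

In this toy data the two teachers have similar overall mean gains (7 and 6), but T1's gains are concentrated among students with low initial scores while T2's are uniform. A single teacher effect per teacher would hide that difference.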

13
Long-Term Research Agenda
- Test VAM with experiments
- Study the effects of VAM-A on school decision-making
  - Does VAM-A (w/o high stakes) appear to yield better decisions about, for example, the allocation of school resources?
  - Does VAM-A w/ merit pay result in higher student test scores? (i.e., use VAM-P to evaluate VAM-A)
  - Do these changes in scores reflect real improvements in learning or gaming the system?
  - Studies in progress

14
For All Future VAM Work...
- Be explicit about assumptions and their potential implications
- Test the assumptions
- Where assumptions fail, compare different models to test for robustness

15
Steps Down the Path
- A larger national conference in Madison, WI in spring 2008
- Co-Chairs: Harris, Gamoran, Raudenbush
- Program Committee members: Braun, Lockwood, Meyer, Sass
- Interdisciplinary
- 10 commissioned papers, plus policy discussions

16
Final Thoughts
- There is considerable interest in VAM and policymakers are eager for direction
- There is (or should be) near consensus that VAM-P is an important advance
  - policymakers should push forward in collecting student-level data with unique student identifiers
- VAM-A is worth cautious experimentation and further study, but not yet widespread adoption with high stakes
