
1 MDE / OEAA 1 Growing Pains: The State of the Art in Value-Added Modeling
Presentation on March 2, 2005 to the Michigan School Testing Conference
By Joseph A. Martineau, Psychometrician
Office of Educational Assessment & Accountability
Michigan Department of Education

2 Why Value Added?
Value Added measures of achievement are being discussed as a possible addition to the regulations of No Child Left Behind (NCLB).
– Various ways of implementing Value Added in NCLB are possible
– One likely implementation of Value Added is as another way to make safe harbor if the percent proficiency targets are not met

3 What is Value Added?
In accountability, Value Added is a term that describes the part of achievement (or change in achievement) that is attributable to the effectiveness of a unit (teacher or school).
Positive estimates indicate units that are above average; negative estimates indicate units that are below average.
Defining what is attributable to the effectiveness of a unit is a matter of philosophical debate.

4 The Logic of Value Added
Holding educators accountable for student performance has many pitfalls
– Educators cannot control their students’ incoming achievement
– Educators cannot control the effectiveness of their students’ previous teachers/schools
– Educators cannot control the effects of non-instructional student characteristics such as…
    Poverty
    Parental education
    Mobility
    Home environment
    Etcetera…

5 The Logic of Value Added, Continued…
Value Added Models (VAM) attempt to obtain pure estimates of the contribution of educators to student achievement and/or growth in achievement
– The promise of VAM is that educators are held accountable only for their impact on student learning
– The idea is not rocket science (Sanders), but the implementation is (Reckase)

6 The Idea Is Not Rocket Science
For each school…
Estimate the expected average achievement or gain score
Calculate the observed average achievement or gain score
Subtract the expected from the observed average score
Define the resulting difference between expected and observed scores as the value added by the school
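The four steps on this slide can be sketched in a few lines of Python. This is only a minimal illustration: the school names and gain scores are hypothetical, and the expected gain is assumed to be the overall mean gain across all students, which is the simplest possible expectation and not the expectation used by any of the models named in this presentation.

```python
from statistics import mean

def value_added(gains_by_school):
    """Return observed-minus-expected mean gain for each school.

    Assumption: the expected gain for every school is the overall mean
    gain across all students.
    """
    all_gains = [g for gains in gains_by_school.values() for g in gains]
    expected = mean(all_gains)  # same expectation for every school
    return {school: mean(gains) - expected
            for school, gains in gains_by_school.items()}

# Hypothetical gain scores (scale-score points) for three schools.
gains = {
    "A": [12, 14, 10, 16],
    "B": [8, 9, 7, 8],
    "C": [11, 10, 12, 11],
}
estimates = value_added(gains)
```

A positive entry in `estimates` marks a school whose observed average gain is above the expectation, and a negative entry marks one below it.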

7 The Idea Is Not Rocket Science: Adjusting Achievement Targets to be More Fair to Educators

8 The Idea Is Not Rocket Science: Adjusting Gain Targets to be More Fair to Educators (Tennessee Model)

9 The Idea Is Not Rocket Science: Adjusting Gain Targets to be More Fair to Educators (Dallas Model)

10 The Idea Is Getting Closer to Rocket Science: Adjusting Yearly Gain Targets to Meet a Final Achievement Goal (Thum Model)

11 The Implementation IS Rocket Science
In a Growth-Based VAM, For Each School You Must…
1. Specify a Mixed Model (a sophisticated statistical procedure that accounts for the structure of data coming from multiple occasions for each student, and multiple students per unit)
2. Estimate an overall average gain for each school year, and for the entire set of students and schools
3. Estimate a unique expected average gain for each school year and school
4. Estimate the difference between the school’s actual average trajectory and the expected average trajectory for each school year and school
5. Keep track of previous schools’ effects so that they don’t get counted toward later schools
6. Estimate a unique expected gain for each school year, student, and school
7. Estimate the difference between the expected gain and the actual gain for each school year, student, and school
8. Keep track of all differences across years so that a student’s high growth in one year is not counted toward all subsequent years
9. Estimate all of these expected and actual gains together so that they are unbiased and reliable
10. Do this all using a sparse data matrix, which causes ordinary software to choke
11. So, you write your own software, and develop new applications of statistical theory to make your idea work
12. Communicate the results in an understandable fashion to stakeholders
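To make step 1 concrete, a mixed model for growth can be written in a generic three-level form. The equation below is a textbook-style sketch, not the specification of the Tennessee, Dallas, or Thum models, and all of its notation is assumed here for illustration: $y_{tij}$ is the score at occasion $t$ for student $i$ in school $j$, the $u$ terms are student-level random effects, and the $v$ terms are school-level random effects.

```latex
% Illustrative three-level growth model; all notation is assumed, not from the slides.
\[
  y_{tij} = \beta_0 + \beta_1 t
          + u_{0i} + u_{1i}\, t
          + v_{0j} + v_{1j}\, t
          + \varepsilon_{tij}
\]
```

In this sketch, $\beta_1$ plays the role of the overall average gain, and $v_{1j}$ is the school's departure from that average growth rate, which is the quantity a growth-based VAM would read as value added. The sketch assumes each student stays in one school, so it does not handle the bookkeeping in steps 5 and 8.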

12 The Problem with Rocket Science
And with rocket science, many things can cause large distortions in the results of VAM, including
– Small problems with the scales of measurement
– Small programming errors
– Small errors in assumptions needed for the statistical models to work appropriately

13 Statistical Issues in VAM
50 years ago, researchers despaired of ever being able to measure growth validly, because the statistical issues seemed insurmountable
Most of the statistical issues have been solved by the introduction of Statistical Mixed Models

14 Statistical Issues in VAM, Continued…
For VAM, one very significant statistical issue remains
– The parts of the statistical models that produce estimates of Value Added were originally included in statistical models with the purpose of accounting for sources of error so that other effects were easier to identify. Therefore…
    Estimates of Value Added can also be classified as error terms
    Estimates of Value Added are technically the portion of achievement or gains that cannot be explained by anything else included in the model
    In effect, the implementation of a Value-Added Model says “whatever portion of achievement and/or growth we do not know how to explain is to be attributed to schools”

15 Statistical Issues in VAM, Continued…
Philosophical, ethical, and political considerations of attributing to schools all achievement/gains that cannot be explained any other way
– Do we have to remove differences explained by ethnicity before we can attribute the rest to schools?
– Do we have to remove differences explained by poverty before we can attribute the rest to schools?
– Etcetera…
– Is it possible to ever satisfy the majority of stakeholders that what’s left over is pure enough to hold schools accountable for?
No matter how we answer these questions, the answer raises additional philosophical, ethical, and political concerns.
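The point that value added is "whatever is left over" can be illustrated with a small sketch in which the same data give different school estimates depending on whether poverty is removed first. Everything here is hypothetical: the two schools, the gain scores, and the use of a single 0/1 poverty indicator with an ordinary least squares adjustment are assumptions chosen to keep the arithmetic visible, not a description of any implemented model.

```python
from statistics import mean

def school_effects(schools, adjust_for_poverty):
    """Mean residual gain per school, with or without a poverty adjustment.

    schools maps a school name to a list of (poverty, gain) pairs,
    where poverty is 1 for a student in poverty and 0 otherwise.
    """
    rows = [(p, g) for pairs in schools.values() for p, g in pairs]
    p_bar = mean(p for p, _ in rows)
    g_bar = mean(g for _, g in rows)
    if adjust_for_poverty:
        # Ordinary least squares slope of gain on the poverty indicator.
        sxy = sum((p - p_bar) * (g - g_bar) for p, g in rows)
        sxx = sum((p - p_bar) ** 2 for p, _ in rows)
        slope = sxy / sxx
    else:
        slope = 0.0  # expected gain is the overall mean for everyone
    return {
        name: mean(g - (g_bar + slope * (p - p_bar)) for p, g in pairs)
        for name, pairs in schools.items()
    }

# Hypothetical data: (poverty indicator, gain score) for each student.
schools = {
    "High-poverty school": [(1, 8), (1, 9), (1, 7), (0, 12)],
    "Low-poverty school": [(0, 12), (0, 11), (0, 13), (1, 8)],
}
unadjusted = school_effects(schools, adjust_for_poverty=False)
adjusted = school_effects(schools, adjust_for_poverty=True)
```

With these made-up numbers the unadjusted estimates are -1 and +1, while the adjusted estimates are both 0. Which pair is the "right" measure of school effectiveness is exactly the philosophical question this slide raises.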

16 Ethical Issues in VAM, Continued…
VAMs as Currently Implemented
– Focus lies squarely on being fair to educators
In TN and OH…
– All educators are expected to produce the same average gains in their students
– The achievement gap is expected to remain as it was because educators of lower-achieving groups of students are not expected to help their students catch up
In Dallas…
– All educators are expected to produce gains in their students that are equivalent to the average gains achieved by similar groups of students
– The achievement gap may be expected to widen because lower performing groups of students may achieve lower average gains than other groups of students
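The arithmetic behind these two expectations is simple enough to sketch. The starting scores and yearly gains below are hypothetical, and the function only projects group averages forward; it is not an implementation of the TN, OH, or Dallas systems.

```python
def project_gap(start_low, start_high, gain_low, gain_high, years):
    """Return the achievement gap (high minus low) after each year."""
    gaps = []
    low, high = start_low, start_high
    for _ in range(years):
        low += gain_low
        high += gain_high
        gaps.append(high - low)
    return gaps

# Hypothetical scale scores: the two groups start 20 points apart.
# Same expected gain for every group (the TN/OH-style expectation).
same_gain_gaps = project_gap(300, 320, gain_low=10, gain_high=10, years=3)
# Expected gain set by what similar groups achieve (the Dallas-style expectation).
group_gain_gaps = project_gap(300, 320, gain_low=8, gain_high=10, years=3)
```

Under equal expected gains the gap stays at 20 points every year; when the lower-performing group is expected to gain 8 points to the other group's 10, the gap grows to 22, 24, and 26 points.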

17 Ethical Issues in VAM, Continued…
Where does VAM take into account fairness for low-performing students?
– Currently implemented VAMs say basically, “I need to see one year’s growth for one year of instruction” where (as in the Dallas model), one year’s worth of growth can be less for some groups of students than for others
– Because of concerns about being fair to educators, groups of students that start out behind are left behind by the same amount (or even more)
– The Thum model is a compromise that expects a modest amount more of educators serving low-achieving students, with the expectation that the gap will be closed over many grades
    Not really a VAM
    A mixture of status and growth
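One very simple reading of "yearly gain targets that meet a final achievement goal" is to spread the remaining distance to the goal evenly over the remaining years. The sketch below does only that; it is an assumption made for illustration, with hypothetical scores, and should not be taken as Thum's actual model.

```python
def yearly_gain_target(current_score, final_goal, years_remaining):
    """Equal yearly gain needed to reach final_goal in years_remaining."""
    if years_remaining <= 0:
        raise ValueError("years_remaining must be positive")
    return (final_goal - current_score) / years_remaining

# Hypothetical students aiming at the same final goal of 400 in five years.
low_start = yearly_gain_target(300, 400, years_remaining=5)   # 20.0
high_start = yearly_gain_target(350, 400, years_remaining=5)  # 10.0
```

The student who starts lower needs a larger gain each year, which is why this kind of target asks more of educators serving low-achieving students.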

18 Political Issues in VAM
Complexity
– Rocket Science is a political liability
– As more of the statistical and ethical issues of VAM are addressed, VAMs are likely to become even more inaccessible to the lay audience
– VAM requires an extraordinary amount of trust in those who implement the system
Ethical issues will be decided by a political process that does not necessarily account for the best interest of students and educators, e.g.…
– Dallas: Focus on best interests of educators at the possible price of increasing achievement gaps
– TN, OH: Focus on best interests of educators at the possible price of leaving achievement gaps as they are
– Thum: Focus on best interests of low-performing groups at the possible expense of (1) high-performing groups of students, and (2) making low-achieving schools less attractive to qualified teachers
– The state of the art in VAM is incapable of providing for both high achievement for all students and fairness in evaluating educators of lower-performing students

19 Measurement Issues in VAM
With most of the statistical issues in VAM solved, the measurement issues have been forgotten in the excitement

20 Measurement Issues in VAM, Continued…
VAM assumes that the same thing is being measured at every grade level of the test
– Presents a dilemma
    In order to measure validly, we have to measure what is being taught, which changes over grade levels
    In order to calculate growth, gains, and value-added, we have to measure the same thing every time we measure
– Value added models are being applied to “construct-shifting” scales as if the scales were interval-level measures of student achievement on unchanging content

21 Cautions in using Vertical Scales
Scholars have been warning against the use of construct-shifting scales to measure growth for 50+ years
However, the use of vertical scales in growth models has become increasingly prevalent in scholarly literature with the advent of recent statistical developments (HLM and SEM)
So am I just straining at gnats?
– Can’t I just use vertical scales to measure growth?
– What harm can it do?
– How big is the effect of changing content on growth models and growth-based value-added models?

22 Hypothetical Example
A vertically scaled mathematics test
– Grades 3-8
– Composed of only two constructs
    Basic Computation (BC)
    Problem Solving (PS)
    BC is heavily represented in early grades
    PS is heavily represented in later grades
– Only the single, combined math score is available (BC and PS are just in the background)

23 Hypothetical Example

24 Hypothetical Example

25 Hypothetical Example

26 The Effects of Construct Shift
Construct shift affects the estimation of educational effectiveness (the results of Value-Added Models). A model applied to a construct-shifting scale
– Does not accurately identify effectiveness if student achievement is outside the range measured well by the grade-level test
– Attributes effectiveness of prior teachers/schools to current teachers/schools (violates the promise of Value-Added Models)
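A small sketch shows how a shifting mix of content can produce apparent gains with no learning at all. It uses the hypothetical two-construct math test (Basic Computation and Problem Solving) from the earlier slide, but the grade-by-grade weights and the student scores are assumptions invented here; they are not the numbers behind the presenter's figures.

```python
def combined_score(bc, ps, bc_weight):
    """Single reported math score as a weighted mix of BC and PS."""
    return bc_weight * bc + (1 - bc_weight) * ps

# Assumed BC weights by grade: the test shifts from mostly BC to mostly PS.
bc_weights = {3: 0.8, 4: 0.7, 5: 0.6, 6: 0.4, 7: 0.3, 8: 0.2}

def apparent_gains(bc, ps):
    """Year-to-year change in the combined score for a student whose
    true BC and PS levels never change."""
    grades = sorted(bc_weights)
    scores = [combined_score(bc, ps, bc_weights[g]) for g in grades]
    return [round(later - earlier, 6)
            for earlier, later in zip(scores, scores[1:])]

strong_in_ps = apparent_gains(bc=40, ps=60)  # gains with no learning
strong_in_bc = apparent_gains(bc=60, ps=40)  # losses with no learning
```

In this sketch a student who is already strong in PS appears to gain every year, and a student strong in BC appears to lose ground, even though neither student's underlying skills change. Any such apparent gain would be credited to the current teacher or school, although the skill was built earlier.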

28 Reliability
Ratio of construct-related variance to total variance (construct-related plus non-construct-related variance)
Extend to Value-Added Models
– Ratio of variance in true value added to total variance (true value-added variance plus variance of distortions)
How important is this distortion, especially when the constructs are correlated?
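The ratio described on this slide can be written as a one-line function. The variance values in the example are hypothetical and serve only to show the calculation; this is the plain definition of the ratio, not the upper bound derived by the presenter.

```python
def reliability(true_variance, distortion_variance):
    """Share of total variance that reflects true value added."""
    total = true_variance + distortion_variance
    if total <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total

# Hypothetical variances: four parts true value added, one part distortion.
r = reliability(true_variance=4.0, distortion_variance=1.0)  # 0.8
```

As the variance of the distortions grows relative to the variance in true value added, the ratio falls toward zero.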

29 Reliability
Martineau (in press) derived an upper bound on the reliability of VAM
Affected by content balance (more balanced means lower reliability)
Affected by correlation in value added (higher correlation means higher reliability)
Affected by grade level (later grades have lower reliability)
Affected by magnitude of changes in content across grades (larger changes mean lower reliability)

30 Reliability of VAM Results

31 Reliability
Only in extraordinary circumstances are the results reliable enough for high-stakes use
For research use, the results may be reliable enough in some limited circumstances

32 Alleviating low reliability of value-added analyses
Twice a year testing
– Not politically viable
– Completely eliminates low reliability
Once yearly testing, new equating design
– Embed the entire set of below-grade items on the current grade test by including a small portion of the set on each of multiple test forms
– Calibrate a separate vertical scale for each adjacent pair of grades (e.g. 3/4, 4/5, 5/6…)
– Concurrent calibration of grade 3 and 4 items together, 4 and 5 items together, 5 and 6 items together…
– Should markedly reduce the amount of construct shift, and increase the reliability to an acceptable degree
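The bookkeeping side of the proposed equating design can be sketched as follows. The item identifiers, the number of forms, and the round-robin assignment are all assumptions for illustration; the slide does not say how items would be allocated to forms, and the calibration itself is not shown.

```python
def assign_linking_items(below_grade_items, num_forms):
    """Spread below-grade items across forms in round-robin order."""
    if num_forms <= 0:
        raise ValueError("num_forms must be positive")
    forms = [[] for _ in range(num_forms)]
    for index, item in enumerate(below_grade_items):
        forms[index % num_forms].append(item)
    return forms

# Hypothetical grade 3 item identifiers embedded on four grade 4 forms.
grade3_items = [f"G3-{n:02d}" for n in range(1, 11)]
forms = assign_linking_items(grade3_items, num_forms=4)

# One vertical scale per adjacent pair of grades: 3/4, 4/5, 5/6, ...
grades = [3, 4, 5, 6, 7, 8]
scale_pairs = list(zip(grades, grades[1:]))
```

Each form carries only a small share of the below-grade items, yet every item appears on exactly one form, so the full set is covered across forms.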

33 Contact Information
Joseph Martineau
Office of Educational Assessment & Accountability
Michigan Department of Education
P.O. Box 30008
Lansing, MI 48909
(517) 241-4710
martineauj@michigan.gov

