
Slide 2: Bayesian Inference for Some Value-Added Productivity Indicators
Yeow Meng Thum
Measurement & Quantitative Methods, Counseling, Educational Psychology, & Special Education, College of Education, Michigan State University
Conference on Longitudinal Modeling of Student Achievement, University of Maryland, November 2005

Slide 3: Overview and Conclusions
- Recent thinking has raised doubts about the validity of the so-called "teacher effects" or "school effects" captured in applications. We may have the models, but we do not have suitable data to claim causal agency (a question of design). This places doubt on whether the empirical evidence for "accountability," on the basis of which a teacher, a program, or a school may be identified as responsible for improvement or for failure, is so directly accessible.
- Purpose: to suggest that descriptive measures of productivity and improvement for accounting units, teachers or schools, are still valid given the accountability data. This is where we begin, so:
- Focus: measurement, leaving aside structural relationships (until we have better data). Employ a well-defined database (evidence base). Build productivity indicators that address value-added hypotheses about growth and change. Design Bayesian procedures for their inference.

Slide 4: Start by Defining and Measuring Value-Added Performance (Thum, 2003a)
1. Make the accountability data block explicit.
2. The value-added notion is keyed on our ability to measure change; begin with a model for learning change in the student: multivariate multi-cohort growth modeling (Thum, 2003b).
3. To measure change, estimate gains.
4. Multiple outcomes help.
5. Employ the standard error of measurement (SEM) of the score.
6. The metric matters for measuring change.
7. Require model-based aggregation and inference.
8. Keep the "black box" open.

Slide 5: Longitudinal Student Data Is the Key Evidence Base
[Diagrams: grade (3-8) by year (2001-2008) data blocks, one tracing longitudinal student cohorts along the diagonal and one showing the quasi-longitudinal view.]
Quasi-longitudinal: longitudinal at the school-grade level.
The definition of the data block must be integral to any accountability criteria/system. The point? A "constant ballast": standardize the evidence base to stabilize comparisons.

Slide 6: Multivariate Multi-Cohort Mixed-Effects Model
Within each school j (example): [model equations shown on slide].
Between schools: a Bayesian multivariate meta-analysis [model equations shown on slide].
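The model equations on this slide were images and did not survive transcription. As a stand-in, here is a minimal sketch of what a multivariate multi-cohort mixed-effects growth model of this general kind can look like; the linear-growth form and all symbols below are assumptions, not the slide's actual equations.

```latex
% Hypothetical sketch only; the slide's actual equations are not in the transcript.
% Within school j: score for student i of cohort c at occasion t,
% with student-specific initial status and growth rate.
\begin{align*}
Y_{tijc} &= \pi_{0ijc} + \pi_{1ijc}\,\mathrm{time}_{t} + e_{tijc},
  \qquad e_{tijc} \sim N(0, \sigma^{2}),\\
\pi_{0ijc} &= \beta_{0jc} + r_{0ijc}, \qquad
\pi_{1ijc} = \beta_{1jc} + r_{1ijc},
  \qquad \mathbf{r}_{ijc} \sim N(\mathbf{0}, \mathbf{T}).\\
\intertext{Between schools, the vector of cohort growth parameters is pooled
in a Bayesian multivariate meta-analysis:}
\boldsymbol{\beta}_{j} &= \boldsymbol{\gamma} + \mathbf{u}_{j},
  \qquad \mathbf{u}_{j} \sim N(\mathbf{0}, \boldsymbol{\Omega}).
\end{align*}
```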

Slide 7: Why Focus on Gains?
Thum (2003) offered a summary of some reasons. The gain score:
- is not inherently unreliable (Rogosa, among others);
- is not always predictable from knowledge of initial status;
- is conceptually congruent with, and is an unbiased estimator of, the true gain;
- does not sum to zero by construction for the group;
- places the pre-test AND the post-test on equal footing as outcomes, and thus generalizes directly to growth modeling.
In contrast, the residual gain score:
- ranks only "relative progress," allowing for "adjusted comparisons," but is by no means "corrected" for anything in particular;
- makes an individual's gain dependent on who else is included in, or excluded from, the regression, and as such makes gain measurement subject to manipulation;
- sums to zero for a group, which severely limits its utility for representing overall change;
- violates the regression requirement that pre-tests be error-free;
- does not generalize easily to longer time series.
Additionally, expanding on the conceptual congruence of the gain score with true gain, note that the gain score is ALSO the ideal for supporting causal claims under the widely considered Rubin-Holland counterfactual framework: with the gain score, we do not need to guess the result in the unobserved "counterfactual" condition.
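To make the "unbiased estimator of the true gain" claim concrete, a one-line derivation under classical test-theory assumptions (the notation is mine, not the slide's):

```latex
% Classical test theory: observed score = true score + mean-zero error.
%   Y_1 = \theta_1 + e_1, \qquad Y_2 = \theta_2 + e_2, \qquad E(e_1) = E(e_2) = 0.
% The observed gain then estimates the true gain without bias:
D = Y_2 - Y_1 = (\theta_2 - \theta_1) + (e_2 - e_1),
\qquad E(D) = \theta_2 - \theta_1 .
```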

Slide 8: Overall Strategy
Obtain a good-fitting (measurement) model for each school (a surface, for math), then construct and evaluate relevant value-added hypotheses for the school.
[3-D plot: Outcome by Grade and Year.] Note: the surface need not be flat.

Slide 9: So, There Are ONLY Value-Added Hypotheses, NOT Value-Added Models!
- It is up to us to define: what progress are we talking about?
- How to: get the best data available, smooth it for irregularities with the most reasonable model, and construct from the "signal" statistics that address your hypotheses directly.

Slide 10: Basic "Progress Hypotheses"
[Diagram: grades 1-5 by years 1998-2002 grid, marking three kinds of comparisons.]
(Q1) Cohorts
(Q2) Grade-level means
(Q3) Grade-level PACs

Slide 11: Criterion AND Norm Referencing
Dual reporting formats for two questions about your achievement.
[Diagram: Grade 3 scale-score analysis/reporting under criterion referencing, with performance levels L1 Basic, L2 Proficient, and L3 Advanced, cut scores C1 and C2, and their norm-referenced counterparts C'1 and C'2, alongside norm-referenced NRT (NCE) reporting.]

Slide 12: Some Value-Added Hypotheses (Examples with School as the Accounting Unit)
Based on Q1:
- Change over time in cohort growth rates
- Estimated cohort growth rates
- Inter-cohort contrasts of cohort growth rates
- Comparison to a proposed external or system standard*
Based on Q2:
- Estimated grade-level growth over time
- Predicted grade-level means
- Inter-grade contrasts of grade-level growth rates
- Value added over projected status
- Total output over time: combining initial status and growth rate**
- Comparison to a proposed external or system standard
Based on Q3:
- Change over time in grade-level PACs
- Predicted grade-level PACs
- Inter-grade, between-year contrasts of grade-level PACs
- Comparison to a proposed external or system standard
* An example is the 100%-proficient standard under NCLB. Other examples may compare schools with each other, or with "similar" schools determined by ranking on a selected covariate set (à la California), etc.
** Thum & Chinen (in preparation)

Slide 13: Standards of Progress
- To judge progress fully, we rely on standards, or benchmarks, absolute and contextualized, whenever these are available.
- Within EACH school (over time), we might consider the progress of: different subjects, or their composites; different grades, or their aggregates (lower primary, etc.); different student cohorts, or their comparisons; different sub-groups.
- All of the above may be compared, individually or in groups of school-grades, with the district average, with schools-like-mine, etc., or with fixed district goals.

Slide 14: Comparing Cohort Slopes: Improvement (Q1)
[Two panels of cohort regression lines, score by year (1998-2002): one showing decreasing productivity, one showing increasing productivity.]

Slide 15: Is School 201 Improving? Cohort Regressions (Q1)
[Plot of estimated cohort regression lines for School 201.]

Slide 16: Is School 201 Getting More Effective? (Q1)
This compares present with past performance. We can also compare School 201's latest growth rate with the district average, or with the average of schools "similar" to School 201.

Slide 17: What Is Adequate Yearly Progress? An Example, via an Empirical Definition (Q2)
- AYP must take into account where you start, where you should end up (mandated), and the time between the present, t, and the mandated time frame to reach proficiency (T = 12).
- Thus, AYP may be defined as the growth rate that will place you on the target given where you are presently, such as (Y_T - Y_t) / (T - t), or some more refined version, where Y_T is the cut score for "proficiency" and Y_t is the present score.
- This does NOT mean the analysis needs to be performed on categorical data.
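To make the arithmetic concrete, a minimal sketch in Python; the school's score, the cut score, and the time frame below are invented for the example:

```python
def required_growth_rate(y_now: float, y_cut: float, t: int, T: int) -> float:
    """AYP as the growth rate that puts you on target: (Y_T - Y_t) / (T - t)."""
    return (y_cut - y_now) / (T - t)

# Hypothetical school: scale score 640 in year t = 3, proficiency cut
# score 700 to be reached by the mandated year T = 12.
print(required_growth_rate(640.0, 700.0, t=3, T=12))  # about 6.67 points per year
```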

Slide 18: Predicted Grade-Year Means (Q2)
[Plot: predicted scores (500-800) by year (1-5) for grades 1-6.] Is the Grade 1 predicted average increasing? Is the Grade 4 predicted average increasing?
Based on a model for the information contained in the data block …

Slide 19: Improvement by Grade (Q2)

Slide 20: Assessing AYP-NCLB
[Plot: mean (Y) by time (X) against performance levels L1 Basic, L2 Proficient, L3 Advanced, marking the target time T.]
Objects of inference: C_L, the lower bound, and C_U, the upper bound, of the school's AYP for Time = 4.

Slide 21: Defining AYP-NCLB
Question: Given where you are at this point in time, are you improving at a pace that will put you on the specified target in the remaining time frame?
Implication: AYP depends on the performance of the school, so it changes over time. Classification errors are directly assessed.
Answer: If you are growing at some estimated rate at time t, the minimum growth rate needed to reach the target follows from the definition on slide 17, and you make AYP-NCLB if your rate meets that minimum with sufficiently high probability.
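The slide's symbols were images lost in transcription, so here is one way the posterior check might be computed from MCMC output; the function name, the simulated draws, and the 0.70 threshold are all illustrative assumptions:

```python
import numpy as np

def prob_makes_ayp(growth_draws: np.ndarray, required_rate: float) -> float:
    """Posterior probability that the school's growth rate meets or exceeds
    the minimum rate needed to reach the target in the remaining time."""
    return float(np.mean(growth_draws >= required_rate))

# Hypothetical MCMC draws of one school's annual growth rate.
rng = np.random.default_rng(0)
draws = rng.normal(loc=7.1, scale=1.5, size=4000)

p = prob_makes_ayp(draws, required_rate=6.67)
print(f"Pr(growth >= required) = {p:.2f}")  # e.g., make AYP if this exceeds 0.70
```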

Slide 22: Making AYP-NCLB (Q2)

Slide 23: Trend in Percent Proficient (Q2)
[Plot: score (Y) by time (X), with cut score C_L separating Not Proficient from Proficient and the target time T marked.]
Object of inference: SAFE HARBOR. The school makes AYP if the percent proficient increased by 10%.

Slide 24: Value-Added over Projected Status (Q2)
[Plot: mean (Y) by time (X) against levels L1 Basic, L2 Proficient, L3 Advanced, with bounds C_L and C_U at target time T.]
Object of inference: value added over the projected status. Some standards: for the school, the district, schools-like-mine.

Slide 25: Total Output: School Excellence as a Value-Added Hypothesis (Q2)
Object of inference: areas under the predicted curves, f(x).
[Plot: predicted growth curves for schools A, B, C, and D over times 1 to T.]
Comparing fourth-grade growth for schools j = A, B, C, and D in a way that combines growth AND final status.
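In symbols, the "total output" indicator is an area under each school's predicted growth curve; a sketch, with the interval [1, T] taken from the slide's axis:

```latex
% Total output for school j over times 1 to T: the area under its
% predicted mean curve f_j(x), so growth AND final status both count.
A_j = \int_{1}^{T} f_j(x)\, dx, \qquad j \in \{A, B, C, D\},
% with schools compared on the posterior distributions of the A_j.
```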

Slide 26: Why Bayesian Inference
- Basic components: see O'Hagan (1994) for a summary. Basically formulated here as an enhancement of likelihood inference.
- Highlighted advantages: conceptually, a credibility interval, as the likely range of the true parameter, is more natural than a Neyman-Pearson confidence interval; analytically, it is less demanding, using statistics to do statistics via Markov chain Monte Carlo (MCMC), and inference for ratios is straightforward.
- Disadvantages: where do we get our priors? Not a problem (for long, anyway) with longitudinal data. Computationally intensive, and in the normally large accountability applications we need to proceed carefully.

Slide 27: Ratios and Productivity Profiles (Thum, 2003b)
[Plots: posterior distributions of a value-added indicator for three schools, and their productivity profiles.]
Result: a measure of how much was achieved (a percent) and at what level of precision (a probability), so the comparison is (relatively) scale-free.
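Operationally, a productivity profile pairs each candidate standard (a percent of the goal) with the posterior probability of meeting it. A minimal sketch from MCMC draws; every number below is hypothetical:

```python
import numpy as np

def productivity_profile(indicator_draws: np.ndarray,
                         standards: np.ndarray) -> np.ndarray:
    """For each standard (proportion of the goal), return the posterior
    probability that the value-added indicator meets or exceeds it."""
    return np.array([np.mean(indicator_draws >= s) for s in standards])

# Hypothetical posterior draws of one school's indicator.
rng = np.random.default_rng(1)
draws = rng.normal(loc=0.05, scale=0.02, size=4000)

standards = np.array([0.02, 0.04, 0.06, 0.08])
for s, p in zip(standards, productivity_profile(draws, standards)):
    print(f"Pr(indicator >= {s:.0%}) = {p:.2f}")
```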

Slide 28: Sample Teacher Productivity Profiles 1
[Chart: confidence (%) in meeting each standard, plotted against the proportion of the standard (%).] We are only confident, at the 70% level, that 3 teachers reached 4%.

Slide 29: Sample Teacher Productivity Profiles 2
[Charts at the 70% and 80% confidence levels.] The models differ in their adjustments for different classroom characteristics.

Slide 30: Sample Teacher Productivity Profiles 3
Different models produce different conclusions (Thum, 2003b). [Charts for Model 0, Model 1, and Model 4, each at the 70% and 80% confidence levels.]

Slide 31: Standing Issues re Inputs: Validity and Quality of Outcome Measures
1. We assume that we have an outcome of student learning that the user believes to be a valid/useful measure of the intended construct.
2. The outcome measure possesses the necessary psychometric (scale) properties supporting its use.
3. To the degree that the construct validity of the measure, or its scale type (interval), or both, are only approximate in practice, we submit that the validity of interpretations using this outcome needs to be tempered accordingly.
4. Faced with this complex of nearly unsolvable issues, I find myself resting some of my choices on the "satisficing principle" (Simon, 1956).

Slide 32: Selected References
Thum, Y. M. (2002). Measuring Student and School Progress with the California API. CSE Technical Report 578. Los Angeles: Center for Research on Evaluation, Standards, and Student Testing, UCLA.
Thum, Y. M. (2003a). No Child Left Behind: Methodological Challenges and Recommendations for Measuring Adequate Yearly Progress. CSE Technical Report 590. Los Angeles: Center for Research on Evaluation, Standards, and Student Testing, UCLA.
Thum, Y. M. (2003b). Measuring Progress towards a Goal: Estimating Teacher Productivity using a Multivariate Multilevel Model for Value-Added Analysis. Sociological Methods & Research, 32(2), 153-207.
Acknowledgements
The analyses presented here are drawn from a larger comparative analysis study organized and supported by the New American Schools. Additional illustrations concerning the API draw support from CRESST and the Los Angeles Unified School District. Many of the ideas were first tested in an evaluation sponsored by the Milken Family Foundation. Portions of this presentation were part of an invited presentation at AERA 2005, Montreal.
Y. M. Thum, thum@msu.edu

Slide 33: Final Caveat
"Too much trouble," "too expensive," or "who will know the difference" are death knells to good food. (Julia Child, 1961)
In this work, the procedures are complex only to the degree that they meet the demands of the task at hand: nothing more, nothing less. We have clearly come a long way from naively comparing cross-sectional means.

