
1
The Effective Use of Some School-level Indicators of Student Learning Growth: NWEA's Learning Productivity Measurement (LPM) System. Y. M. Thum, Ph.D., Sr. Research Fellow, Northwest Evaluation Association, Portland, Oregon. 12th Annual Maryland Assessment Conference, Riggs Alumni Center, University of Maryland, College Park, MD 20742, October 18-19, 2012

2
Some Important Cross-cutting Concerns
o What should the focus be, i.e., the Goals?
– Attention from Educational Quality to Learning Progress
– De-emphasizing Achievement to favor Learning Growth, and BACK
– Characterizing: Effectiveness, Improvement, or Productivity?
o Given the above, what data are available in support?
– Do the available data support the measurement of the quantities of interest?
o The model needs to match the purpose of the information.
– Monitoring for Accounting vs. Accountability: Is a trait or a state assumption being implied – how much reliability/stability should one anticipate?
– Analysis as Description vs. Causal (value-added): Is the leap from how your students performed to how you have helped your students perform too often made?
– Realize that an indicator/ranking/metric does not, by itself, produce change – unlike incentive systems (e.g., the tax code)
[Diagram: Goal + Data – "Now you have a shot!"]
10/9/2012 Y M Thum - Modeling Growth

3
Outline of the Presentation
Specific Goals of the Presentation
o Comments on some persistent issues for measuring growth
– Interval scale and its impact on rankings, etc.
– Dimensionality, construct shift
– How growth is defined, implicitly, by the choice of growth model
– Residualized gains, growth, vs. value tables
– What result, or pattern of results, is hypothesized?
o An example of how growth is being defined and measured
– NWEA's Measures of Academic Progress (MAP), K-12
– Achievement and growth norms, conditional on a student's instruction time
o An example of school productivity measurement
– Analyses focus on the historical data for a school and address specified questions/hypotheses about growth and productivity patterns based on the data.
– Examples are offered regarding its use for district and school monitoring and data-dialogue activities.

4
How to Classify Growth Models
Difference over TIME seems key to Growth – 3 species
Residualized gains model
– Anytime the outcome sits alone on the LHS
– Time is implicit, sometimes absent when variation in time is zero (e.g., a cross-sectional post-on-pre regression)
– Growth is defined as a residual against an estimated baseline or base-plane
– Colorado Growth Model, Rob Meyer, etc.
Value tables
– Thum (2002) pointed out that this is an index-scoring approach
– Nominating worth scores disregards the scale
– Properties not generally clear
(Conventional) growth curve model
– Outcomes are all on the LHS, all treated as stochastic; heteroscedasticity is treated easily
– Regression of observed gains is a simple (and crude) example
– Sanders & Horn (1997), Bryk et al. (2003), Thum (2002-2006)
– Growth is intra-individual, repeated measures; variation between students, schools, etc.
The above growth models are core, and amenable to the introduction of covariates – i.e., models for value-added analyses.
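The contrast between the first and third species above can be sketched in a few lines. This is an illustrative toy on synthetic data, not any operational model: a residualized gain is a student's departure from the regression of post on pre, while the crudest growth-curve analogue is the observed gain itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic pre/post scores for 200 students (illustrative only),
# with a true average gain of about 5 scale points.
pre = rng.normal(200.0, 10.0, size=200)
post = pre + rng.normal(5.0, 4.0, size=200)

# Residualized gain: regress post on pre, take each student's residual.
# "Growth" is a departure from the estimated baseline expectation.
slope, intercept = np.polyfit(pre, post, 1)
resid_gain = post - (intercept + slope * pre)

# Simple (and crude) growth-curve analogue: the observed gain itself,
# which retains the scale's metric.
obs_gain = post - pre

# OLS residuals average exactly zero by construction, so the
# residualized-gain "growth" of the group as a whole is zero,
# while the observed gains keep their ~5-point average.
```

The point the code makes concrete: residualized gains are zero-sum relative to the fitted baseline, so they rank students but say nothing about how much the group learned on the scale.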

5
Vertical Scales, Growth Models, Value-added
Some sprinkling of results ...
Briggs & Weeks (2009)
– Constructed 8 vertical scales based on real item-response data, varying the IRT model, calibration, and estimation approaches
– Applied Sanders' layered model for school effects to the scores
– Found that the ordering of school-effect estimates is insensitive to the underlying vertical scale; only the precision of such value-added estimates seems to be sensitive to the combination of choices made in creating the scale.
Schafer et al. (2012) compared many growth models
– No real differences; see also Tekwe et al. (2004).
Lockwood et al. (2007, JEM)
– Teacher value-added estimates are more sensitive to the choice of a student achievement measure than to differences in the value-added model for the same achievement measure.

6
Uni-dimensionality and Measurement
Multi-dimensionality
– No one disagrees: nothing is likely to be 1-dimensional in reality
– A wonderful quest, and we have seen how it may be more challenging in MEASUREMENT than in ANALYSIS
– However, uni-dimensionality IS part of the core proposition in purposive scaling (shirt-sizes example)
– Construct shift (Martineau, 2006): Are we mistaking changes in content for shifts in construct? Which is also understandable, but that is not what measuring aims to do, I believe. Not an argument against our multi-dimensional world; just being focused on the problem at hand.
– Separate the scaling of variables FROM their analyses

7
A Curious Problem with Value Tables
Transition matrix, value tables
– A committee nominates/imposes values to express the worth of different types of progress observed
– Ignores the scale; arbitrary, even absolute; disregards any potential for measurement/classification errors
– Can be problematic!
Value tables DEFINE growth rather than EVALUATE it. See the example from Schafer et al. (2012) next. Thus NOT model-free, but may still be amenable to serious analysis, e.g., Thum's (2002a) work on the California Academic Performance Index (API).
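The index-scoring character of a value table is easy to see in code. The worth scores below are entirely hypothetical, not from any actual state system; the point is that the table itself, not the test scale, defines what counts as growth.

```python
# A toy value table: a committee assigns "worth" points to each
# observed transition between performance levels. The levels and
# point values here are hypothetical, for illustration only.
VALUE_TABLE = {
    ("Below", "Below"): 0,
    ("Below", "Basic"): 150,
    ("Basic", "Basic"): 100,
    ("Basic", "Proficient"): 200,
    ("Proficient", "Proficient"): 150,
    ("Proficient", "Advanced"): 225,
}

def value_table_score(transitions):
    """Average worth points over students' (year1, year2) level pairs.

    Note what never enters: the underlying test scale, the size of
    any score change, or measurement/classification error. The table
    DEFINES growth rather than evaluates it."""
    return sum(VALUE_TABLE[t] for t in transitions) / len(transitions)

students = [("Below", "Basic"), ("Basic", "Basic"), ("Basic", "Proficient")]
print(value_table_score(students))  # 150.0
```

Changing one committee-chosen number in the table changes every school's "growth" with no new information about students, which is the arbitrariness the slide flags.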

8
Value Tables DEFINE, not EVALUATE
The behavior of TProg is hailed as having enough reliability and stability for high-stakes teacher assessment applications, but it reflects Year 2 status more than growth (and draws on nothing from Year 1, contrary to what is believed).

9
NWEA's Growth Norms: How Growth is Defined and Measured

10
Test and Scale Requirements
– A stable and reliable vertical scale across the relevant (narrow) grade-range
– Students are measured uniformly well across the range – the variance of SEMs is relatively low over theta
– Precision of scores is provided, and will be used
– Scores are criterion-referenced, with performance norms, to provide clear interpretation
NWEA's MAP possesses all the above qualities. Fall and Spring testing additionally offers a huge design advantage over once-a-year assessment programs.

11
Measure Status, Infer Growth
Growth is observed; it is an inference about a change on a scale for status over time, not an independent metric (Thum & Hauser)
Methodological Approach
– Multilevel polynomial growth model
– Latent-variable regression of the (latent linear) Gain coefficient on the (latent) Initial Status coefficient
– Make an inference about an Observed Gain with respect to the appropriate Conditional Gain Distribution
Result
– Conditional Growth Index (CGI)
– Conditional Growth Percentile (CGP)
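The CGI/CGP idea can be sketched as a standardization of the observed gain against its conditional gain distribution. This is a minimal reading of the description above, not NWEA's actual computation: in practice the conditional mean and SD come from the multilevel norming model, whereas here they are simply supplied as inputs, and the numbers are made up.

```python
from math import erf, sqrt

def cgi(observed_gain, expected_gain, gain_sd):
    """Conditional Growth Index: an observed gain standardized against
    the gain distribution of a student's academic peers (same test,
    same amount of instruction, same baseline performance). The
    conditional mean and SD would come from the norming model; here
    they are illustrative inputs."""
    return (observed_gain - expected_gain) / gain_sd

def cgp(index):
    """Conditional Growth Percentile via the standard normal CDF."""
    return 100.0 * 0.5 * (1.0 + erf(index / sqrt(2.0)))

# A student who gained 8 RIT points where academic peers average
# 5 points with SD 4 (all numbers invented for illustration):
idx = cgi(8.0, 5.0, 4.0)
print(round(idx, 2))       # 0.75
print(round(cgp(idx), 1))  # 77.3
```

A CGI of 0 always maps to the 50th conditional growth percentile, which is why the CGI aggregates cleanly over students, subjects, and grades: it is normative, not tied to any one grade's scale metric.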

12
[Figure: Achievement over Time – RIT scores plotted against centered instructional days (weeks), from Spring Grade 2 through Fall Grade 4]

13
[Figure: Conditional Achievement Norms, Spring Grade 2 through Fall Grade 4]

14
Trajectory of Status over Time
1. Achievement norms are specific to the instructional day!
2. Growth norms are conditional on
– how much instruction students have had between assessments
– how well they performed at baseline
Academic peers – same test, same amount of instruction, same baseline performance!

15
Observed and Inferred Growth

16
Conditional Growth Norms [Figure panels: Achievement Norms; Growth Norms]

17
Additional Facts/Features
– Short-term gain is measured on a within-grade scale
– Instructional time within a grade is the time dimension
– Scores are weighted for precision, using their standard errors of measurement
– Post-stratification weighting reflects a nationally representative student body
In this procedure, we measure achievement status, not growth. Theoretically, growth is observed, and assessing its true magnitude requires an inference – a result based solely on an acceptable descriptive model of change in achievement status over time.
The CGI is neither absolute nor relative but a normative measure of change; it facilitates aggregation over students, subjects, grades, etc.

18
NWEA's Learning Productivity Measurement (LPM) System: Assessing a School's Academic Health

19
Learning Productivity Measurement
The LPM System provides a set of perspectives on the learning growth of the students in your school, based on their individual MAP performance histories. See Thum (2003, 2006).
Describes and compares the trends of both
– achievement status, as well as
– achievement growth.
Presents results as evidence summaries to facilitate data-dialogue/client discussions about
– how student learning in your school has changed over time
– whether there was discernible change; if so, where (i.e., which grade level?) and when (i.e., is the change recent?)

20
Multilevel Multi-Cohort Growth Model
Overview: LPM System Flow-chart
School j data block → estimated growth rates → contrasts of age-cohort growth rates; predicted grade-year means → contrasts of predicted grade-year means → growth reports.
The system tests age-cohort growth patterns; tests grade-cohort growth patterns; supplies growth-pattern norms to compare schools (to similar schools, and across districts); predicts status and gains for the following year; and validates those predictions.

21
Assumptions re Inputs: Validity & Quality of Outcome Measures
o We assume that we have an outcome of student learning which the user believes to be a valid/useful measure of the intended construct.
o The outcome measure possesses the necessary psychometric (scale) properties supporting its use.
o To the degree that either, or both, the construct validity of the measure and its scale type (interval) are approximate in practice, we submit that the validity of interpretations using this outcome needs to be tempered accordingly.
o Faced with this complex of nearly unsolvable issues, I rest some of my choices on a satisficing principle (Simon, 1956).

22
School A Mathematics Data (Fall and Spring testing, 2004-2009; per-year cells separated by "|", Fall and Spring values within a year by commas)

Cohort 5 (Grade 4 in 2004 through Grade 9 in 2009):
  N: 105, 103 | 122, 123 | 119, 123 | 110, 111 | 141, 143 | 119, 125
  Mean RIT: 214.21, 219.19 | 216.56, 223.24 | 219.14, 222.71 | 219.79, 225.14 | 214.4, 222.59 | 221.19, 231.21
  Fall %ile: 54 | 57 | 64 | 65 | 54 | 67
  N (gains): 101 | 119 | 117 | 106 | 138 | 116
  Gain: 4.77 | 6.83 | 3.86 | 5 | 8.03 | 10.79
  Growth Idx: -1.41 | 0.69 | -2.23 | -1.07 | 1.85 | 4.74
  CGI: -0.23 | 0.11 | -0.36 | -0.17 | 0.3 | 0.77
  Gain %ile: 43 | 53 | 41 | 46 | 57 | 71

Cohort 4 (Grade 5 in 2004 through Grade 10 in 2009):
  N: 114, 109 | 122, 123 | 103, 106 | 118, 126 | 123, 122 | 138, 142
  Mean RIT: 206.03, 216.68 | 207.36, 218.12 | 209.38, 217.12 | 205.94, 213.61 | 206.46, 220.31 | 207.81, 221.52
  Fall %ile: 57 | 60 | 64 | 56 | 59 | 60
  N (gains): 104 | 118 | 99 | 117 | 114 | 134
  Gain: 10.12 | 10.71 | 8.17 | 7.71 | 13.5 | 14.39
  Growth Idx: 3 | 3.65 | 1.2 | 0.56 | 6.43 | 7.34
  CGI: 0.52 | 0.63 | 0.21 | 0.1 | 1.11 | 1.26
  Gain %ile: 63 | 65 | 55 | 52 | 76 | 77

Cohort 3 (Grade 6 in 2004 through Grade 11 in 2009):
  N (some cells blank in source): 109, 110, 107, 118, 117, 125, 128, 124, 122
  Mean RIT: 196.81, 205.28 | 197.8, 208.17 | 195.47, 203.36 | 195.53, 204.58 | 195, 205.9 | 193.96, 204.4
  Fall %ile (some cells blank in source): 62, 64, 56, 54
  N (gains, some cells blank in source): 103, 115, 113, 121, 118
  Gain: 8.01 | 10.16 | 7.68 | 9.01 | 11.02 | 10.08
  Growth Idx: -1.42 | 0.81 | -1.83 | -0.52 | 1.47 | 0.49
  CGI: -0.24 | 0.14 | -0.31 | -0.09 | 0.25 | 0.08
  Gain %ile: 42 | 54 | 41 | 46 | 55 | 51

Cohort 2 (Grade 7 in 2004 through Grade 12 in 2009):
  N (some cells blank in source): 100, 102, 129, 126, 115, 116, 118, 117, 120, 122, 124
  Mean RIT: 183, 196.69 | 180.05, 195.3 | 183.19, 196.48 | 181.98, 195.7 | 176.99, 194.93 | 179.49, 199.84
  Fall %ile: 61 | 54 | 61 | 57 | 43 | 50
  N (gains): 98 | 120 | 111 | 113 | 111 | 115
  Gain: 13.67 | 14.94 | 13.21 | 13.69 | 17.93 | 21.05
  Growth Idx: 2.85 | 3.89 | 2.41 | 2.78 | 6.59 | 9.91
  CGI: 0.48 | 0.65 | 0.4 | 0.47 | 1.11 | 1.67
  Gain %ile: 62 | 66 | 59 | 61 | 74 | 85

Overlap in data IS natural in learning. Information ballast imparts stability, a standard for evidence. Highly specific questions are being asked, and addressed. School-by-school analysis – very much an Age-Period-Cohort analysis.

23
Your School's Performance: A Holistic Take
[Figure: RIT by grade (2-5), Fall and Spring, school years 2006-2009]
Age-cohort: students in the same age-cohort share a relatively homogeneous school experience in terms of their peers and teacher group.
Grade-level: groups of students attending the same grade level are exposed to the same teachers and materials.

24
A Pictorial Guide to How the LPM System Works

25
[Figure: Cohorts of Longitudinal Student Data – scores (500-800) by grade (1-6) across years 1-5; Cohort 5 highlighted]

26
[Figure: Cohort Regressions – individual growth curves of students in Cohort 5, scores (500-800) by grade (1-6) across years 1-5]

27
[Figure: Observed Grade-Year Means, scores (500-800) by grade (1-6) across years 1-5]

28
Overall Strategy: obtain a good-fitting (measurement) model for each school (a surface for math), then construct and evaluate relevant value-added hypotheses for the school.
[Figure: outcome surface over Grade x Year] Note: the surface need not be flat.
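A grade-by-year surface of this kind can be fitted with ordinary least squares. The design below (intercept, grade, year, and a grade-by-year interaction) is an illustrative stand-in for the school-specific model, with fabricated, noiseless grade-year means; the interaction term is what lets the surface bend rather than stay flat.

```python
import numpy as np

# Fabricated grade-year means on a 4x4 data block (grades 2-5,
# years 1-4), generated from known coefficients so the fit can be
# checked. Real LPM inputs would be a school's MAP means.
grades = np.array([2.0, 3.0, 4.0, 5.0] * 4)
years = np.repeat([1.0, 2.0, 3.0, 4.0], 4)
means = 180.0 + 10.0 * grades + 1.5 * years + 0.4 * grades * years

# Design matrix: intercept, grade, year, grade x year interaction.
X = np.column_stack([np.ones_like(grades), grades, years, grades * years])
beta, *_ = np.linalg.lstsq(X, means, rcond=None)

# With noiseless data the known coefficients are recovered:
# approximately [180, 10, 1.5, 0.4]. A nonzero interaction means
# growth differs across grades over time -- the surface is not flat.
```

Evaluating a value-added hypothesis then reduces to a linear contrast of such fitted coefficients (or of the predicted grade-year means), as the following slides illustrate.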

29
Comparing Age-cohort Growth Rates
[Figure: growth trends for Age-Cohorts A, B, and C by grade (2-5), Fall and Spring, school years 2006-2009]
A sample question about the pattern of age-cohort growth: the Fall achievement growth rate for Age-cohort A is stronger than the average of the Fall achievement growth rates of Age-cohorts B and C. Note: Age-cohort A is the most recent age-cohort in this analysis.
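A question like this is a linear contrast of estimated growth rates: A - (B + C)/2. The sketch below uses invented slope estimates and, for simplicity, assumes the estimates are uncorrelated; the actual LPM tests would use the full covariance matrix from the multilevel model.

```python
from math import erf, sqrt

def contrast_z(slopes, variances, contrast):
    """z statistic for a linear contrast of estimated growth rates,
    assuming uncorrelated estimates with the given variances
    (a simplification of the full multilevel covariance matrix)."""
    estimate = sum(c * b for c, b in zip(contrast, slopes))
    var = sum(c * c * v for c, v in zip(contrast, variances))
    return estimate / sqrt(var)

# Is Age-cohort A's Fall growth rate stronger than the average of
# cohorts B and C? Contrast weights: A - (B + C)/2. All numbers
# are hypothetical.
z = contrast_z(slopes=[6.0, 4.5, 4.0],
               variances=[0.25, 0.25, 0.25],
               contrast=[1.0, -0.5, -0.5])
p_one_sided = 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))
print(round(z, 2))  # 2.86
```

With these made-up numbers the one-sided test favors the hypothesis that the most recent cohort is growing faster, which is exactly the "increasingly productive" pattern defined on the next slides.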

30
What Being More Productive over Time Means
[Figures: age-cohort slopes by grade (2-5), Fall and Spring, school years 2006-2009]
Evidence: more recent age-cohort slopes are stronger. Conclusion: the school is increasingly more productive.
Evidence: more recent age-cohort slopes are weaker. Conclusion: the school is becoming less productive.

31
Patterns of Fall Achievement Trends
[Legend: Your School; Other Schools; Your District; Schools like yours. Reports in progress: e.g., the Band-Aid graph]
Pattern of growth rates (RIT by year):
(a) If observed: slopes increase over time (stronger). Conclude: increasing productivity.
(b) If observed: slopes decrease over time (weaker). Conclude: decreasing productivity.

32
Predicted Grade-Year Means
Based on a model for the information contained in the data block ...
[Figure: predicted means on the score scale (500-800) by grade (1-6) across years 1-5] Is the Grade 1 predicted average increasing? Is the Grade 4 predicted average increasing?

33
Pattern/Trend Questions Considered
[Table: Fall Achievement and Fall-Spring Gains questions – C22, C41, C42, GG1-GG5 – each marked Evidence Supports / Does Not Support]

34
Corresponding Hypothesis Tests
[Table: hypothesis tests for the Fall Achievement and Fall-Spring Gains questions (C22, C41, C42, GG1-GG5), each marked Evidence Supports / Does Not Support]

35
Sample Results – Peterson ISD
The School Challenge Index: a new measure of how schools in a state compare in terms of the challenges and opportunities they operate under, as reflected by an array of factors they do not control. This school measure generally taps the collective economic circumstance of its students, but it also offers a broader view of the economic strain they experience, as seen through a relevant set of socio-demographic, organizational, and educational policy/programming factors.
[Panels: age-cohort pattern of Fall achievement slopes; age-cohort pattern of Fall-Spring gain slopes; age-cohort average of Fall-Spring gain slopes. Age-cohort C22, C41, and C42 patterns appear to be stable.]
District and school names are completely fictitious. Any resemblance to actual districts or schools, healthy or otherwise, is purely coincidental.

36
More Recent Fall Achievement Slopes are Stronger
[Figure: age-cohort pattern of Fall achievement slopes – percentage of NWEA schools in your state with Stronger, Same, or Weaker growth rates of Fall Mathematics scores for their more recent age-cohorts: 2007, 2008, and 2009]

37
(Figure-only slide.)

38
Summing Up
We noted, and asked questions about, several persistent issues with growth modeling and accountability. I hope to hear your objections, in order to better assess whether I have been adequately provocative.
In presenting the NWEA norming procedure, we have presented a definition of growth that is novel – that of someone who runs around with a rule he is happy with.
In presenting the LPM System, we have returned the focus of accountability to an internal dialogue about how a school has been improving, in effectiveness and in productivity – which is where I think data, rendered well, will be most helpful to the improvement of schools and learning.
Thank you!

39
References
1. Betebenner, D. W. (2008). Toward a normative understanding of student growth. In K. E. Ryan & L. A. Shepard (Eds.), The Future of Test-Based Educational Accountability (pp. 155-170). New York: Taylor & Francis.
2. Briggs, D. C., & Weeks, J. P. (2009). The sensitivity of value-added modeling to the creation of a vertical score scale. Education Finance and Policy, 4(4), 384-414.
3. Martineau, J. A. (2006). Distorting value added: The use of longitudinal, vertically scaled student achievement data for growth-based value-added accountability. Journal of Educational and Behavioral Statistics, 31(1), 35-62.
4. Schafer, W. D., Lissitz, R. W., Zhu, X., Zhang, Y., Hou, X., & Li, Y. (2012). Using student growth models for evaluating teachers and schools (Technical report). College Park, MD: MARCES.
5. Simon, H. A. (1957). Models of man: Social and rational. New York: Wiley.
6. Thum, Y. M. (2002a). Measuring student and school progress with the California API (CSE Technical Report No. 578). Los Angeles, CA: National Center for Research on Evaluation, Standards, and Student Testing, UCLA.
7. Thum, Y. M. (2002b). No Child Left Behind: Methodological challenges & recommendations for measuring adequate yearly progress (CSE Technical Report No. 590). Los Angeles, CA: Center for Research on Evaluation, Standards, and Student Testing, UCLA.
8. Thum, Y. M. (2003). Measuring progress towards a goal: Estimating teacher productivity using a multivariate multilevel model for value-added analysis. Sociological Methods & Research, 32(2), 153-207.
9. Thum, Y. M. (2006). Designing gross productivity indicators: A proposal for connecting accountability goals, data, and analysis. In R. Lissitz (Ed.), Longitudinal and Value-Added Models of Student Performance (pp. 436-479). Maple Grove, MN: JAM Press.
10. Thum, Y. M., & Bowe, B. (2009). An effect-size indicator for measuring the productivity of a teacher based on the learning growth of her students with applications. Unpublished report, National Heritage Academies.
