Unpublished Work © 2005 by Educational Testing Service Growth Options for California County and District Evaluators’ Meetings May 10 and 19, 2005.


1 Unpublished Work © 2005 by Educational Testing Service Growth Options for California County and District Evaluators’ Meetings May 10 and 19, 2005

2 Californians Want to Measure Student Growth. CST scales are separate by grade. Each grade has its own Basic (300) and Proficient (350) standards. Connections do not presently exist between grades.

3 “Measuring growth” can mean different things to different users. “Vertical scaling” is a catch-all phrase used by a variety of people to represent growth measures, and it is also a technical term for one particular statistical procedure. It may or may not be the most useful and cost-effective growth measure needed by CA. Today we will explain options for measuring growth and get your input.

4 Progress Toward Determining the Best Growth Measure(s) for CA: exploratory study of vertical scaling of CSTs; Technical Advisory Group; interviews of CA school district staff about what growth measures would be useful; Growth Options Task Force; evaluators’ meetings; Growth Options Task Force follow-up.

5 Vertical Scaling (technical definition). Connect the scales across grades by having students take “linking” items from adjacent-grade tests. These links place the items (and scores) across grades on a common scale. Scale scores might range from 200 (grade 2) up to 800 (grade 11).

6 Vertical Scaling. Ideal goals: scale scores increase by grade; scale scores can be compared across grades; a 500 “means the same thing” whether it comes from a grade 4 test or a grade 5 test; “growth” of 10 units “means the same thing” in low grades as in high grades. The ideal is approximated in real life but never exactly met. The vast majority of vertical scales have been developed with published norm-referenced tests. Few vertical scales exist for state standards-referenced tests.

7 Exploratory Vertical Scaling Study for California. ELA grades 2-11; Math grades 2-7. Linking was embedded in 2004 operational CST testing, with no incremental testing or cost to the state. Linking items measured standards that were common across adjacent grades and were placed in “field test buckets”.

8 Design. N = 3,000 to 5,000 per linking item. ELA: 17-25 linking items per grade pair. Math: 18-24 linking items per grade pair. Grade 2 students took some grade 3 items and grade 3 students took some grade 2 items, etc. Scales linked sequentially: 2<3<4<5<6<7<8<9<10<11.
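The slides do not say which statistical method was used to link the scales. As a rough illustration only, the sequential chain can be sketched with a mean/mean shift, assuming a Rasch-type model in which each grade is calibrated separately and the linking items have a difficulty estimate on both adjacent-grade scales. The function names and all difficulty values below are hypothetical.

```python
from statistics import mean


def shift_constant(b_lower_scale, b_upper_scale):
    """Mean/mean shift that moves upper-grade values onto the lower-grade scale.

    Both lists hold difficulty estimates for the SAME linking items,
    in the same order, from the two separate grade calibrations.
    """
    return mean(b_lower_scale) - mean(b_upper_scale)


def chain_shifts(links):
    """Accumulate adjacent-grade shifts onto the base (lowest) grade scale.

    links: one (b_lower_scale, b_upper_scale) pair per adjacent grade pair,
    in grade order. Returns one cumulative shift per grade, starting with
    0.0 for the base grade.
    """
    total = 0.0
    shifts = [0.0]
    for b_lower, b_upper in links:
        total += shift_constant(b_lower, b_upper)
        shifts.append(total)
    return shifts


# Hypothetical difficulties for the linking items (not data from the study).
links = [
    ([0.5, 1.0, 1.5], [-0.5, 0.0, 0.5]),  # grades 2 and 3
    ([0.2, 0.8], [-0.3, 0.3]),            # grades 3 and 4
]
shifts = chain_shifts(links)  # shift for grade 2, grade 3, grade 4
```

Adding a grade's cumulative shift to a value on that grade's own scale expresses it on the base-grade scale. Because each shift builds on the previous one, error in any single link carries forward to every higher grade.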

9 Evaluation of Links. Evidence that supports the validity of vertical scaling is the growth of student scores: better performance of higher-grade students than lower-grade students on common items; scale score distributions that increase as grade increases.

10 Findings: Higher-grade students consistently did better than lower-grade students on common items that came from the higher-grade operational test. Higher-grade students did not necessarily do better than lower-grade students when common items were from the lower-grade operational test. Position effects were evident: items became more difficult when they appeared later in a test.

11 Findings (cont.): Scale scores generally increased by grade, with two exceptions. ELA: minimal growth in grades 9, 10, and 11. Math: essentially no growth in grades 6 and 7.

12 Conclusions of exploratory study. Concerns: ELA, minimal growth in grades 9, 10, and 11; Math, minimal growth in grades 6 and 7. Possible factors affecting vertical links: item position effects; grade x curriculum interactions; changes in populations. It is not clear whether vertical scaling will work for CSTs at all grades.

13 Phone Interviews, March/April 2005. 15 respondents from CA counties and districts. Asked 5 questions.

14 Are you currently using STAR data to make any longitudinal comparisons, and if so, what are you doing with that data? Used NRT or CST. Aware of inappropriateness of using current CST scale scores for growth.

15 Who are the most important potential users in your district of longitudinal information? Full range: teachers to superintendents. Parents. School boards. Administrators: instructional planning. Teachers: expected student performance.

16 If we were able to improve the psychometric underpinnings for making comparisons across grades using CSTs, would that be of benefit to your district? How would you plan to use that information? Overwhelming enthusiasm for a legitimate method of making longitudinal comparisons. Should provide a legitimate procedure so users don’t “hurt themselves”. Concern about over-burdening the CSTs by addition of one more purpose.

17 Longitudinal comparisons do have their limitations and can be misinterpreted, so we’d like to get your input on what interpretive materials would be most useful to you. Current post-test workshops and guides should cover this. Few saw need for special efforts. Largest districts have resources to address this. Teacher-specific interpretive materials would be helpful.

18 One of the options we are considering is a vertical scale. If we used a vertical scale, there would be some changes, and we would need to have an in-grade scale that differed from an “across-grade” scale. Would that be a problem in your district? Two diametrically opposed opinions: the acquired meaning of 300 and 350 is too important to do away with; the meaning of 300 and 350 could be easily supplanted. Use of both in-grade and across-grade scales seen as complicated and potentially confusing.

19 Growth Options Task Force: Tom Barrett, Riverside USD; Paula Carroll, Lodi USD; J.T. Lawrence, San Diego COE; Phil Morse, LAUSD; Jim Parker, Paramount USD; Jim Stack, SFUSD; Mary Tribbey, Butte COE; Mao Vang, Sacramento City USD.

20 Major Options for Tracking Growth: Vertical Scales; Norms; Tables of Expected Growth.

21 Vertical Scales: Advantages. Scale scores comparable across grades. Useful if tracking students across many grades. Suitable for statistical analyses.

22 Vertical Scales: Disadvantages. The assumption of hierarchical growth may not be met; scores may not grow between grades. Across-grade scale different from within-grade scale. Can highlight inconsistencies (if they exist) of within-grade standards. Scale scores have no intrinsic meaning. Need caution in comparing growth in different parts of the scale. Special data collection needed.

23 Norms. CA percentiles, NCEs, or Z-scores, by grade and by content area. “Typical” growth defined to be what is seen cross-sectionally in the state from grade to grade. Types: static (using a base year such as 2003); rolling (using current year).
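As a minimal sketch of the three norm-referenced metrics named on this slide, the code below computes a z-score, a percentile rank, and an NCE against a norm group for one grade and content area. The norm-group scores are made up, and the function names and the midpoint handling of tied scores are assumptions; the NCE conversion uses the standard 50 + 21.06 z definition, limited here to the 1-99 range.

```python
from statistics import NormalDist, mean, pstdev


def z_score(score, norm_scores):
    """Distance from the norm-group mean in norm-group standard deviations."""
    return (score - mean(norm_scores)) / pstdev(norm_scores)


def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring below, counting half of any ties."""
    below = sum(s < score for s in norm_scores)
    ties = sum(s == score for s in norm_scores)
    return 100.0 * (below + 0.5 * ties) / len(norm_scores)


def nce(percentile):
    """Normal curve equivalent for a percentile rank, clamped to 1-99."""
    p = min(max(percentile, 1.0), 99.0)
    return 50.0 + 21.06 * NormalDist().inv_cdf(p / 100.0)


# Hypothetical norm group for one grade and content area.
norm_scores = [300, 320, 340, 360, 380]
z = z_score(340, norm_scores)
pr = percentile_rank(340, norm_scores)
n = nce(pr)
```

For a static norm, `norm_scores` would come from the base year (such as 2003); for a rolling norm, from the current year. A student whose percentile or NCE stays the same from one grade to the next has shown the “typical” cross-sectional growth described above.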

24 Norms: Advantages. Fairly easy to understand. Allow comparisons of relative standing and growth relative to norm group. Minimal assumptions are required. Comparisons can be made across content areas. No special data collection needed.

25 Norms: Disadvantages. Need to keep clear the relative nature of the comparison (static vs. rolling norm). No continuous growth scale. Growth expectations are based on cross-sectional, not longitudinal, data. “Typical” growth does not necessarily mean student is progressing sufficiently toward Proficiency.

26 Tables of Expected Growth. Use longitudinal CA data (e.g., grade 3 and 4 performance for the same students). Determine statistical expectation of grade 4 scores typically seen for students with each possible grade 3 score. Calculate standard error along with expectation. Standardized deviations from expectations can be compared across grades and content areas.
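One way such a table could be built is sketched below, assuming matched (grade 3, grade 4) score pairs and a simple conditional mean for each grade 3 score. The slide does not specify which standard error is intended, so this sketch reports the spread of grade 4 scores around each expectation (the sample standard deviation); that choice, the `min_n` threshold, and the scores themselves are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def expected_growth_table(pairs, min_n=2):
    """Build {grade3_score: (expected_grade4, spread, n)} from matched pairs.

    pairs: iterable of (grade3_score, grade4_score) for the same students.
    Grade 3 scores with fewer than min_n matched students are left out,
    because a spread cannot be estimated from a single student.
    """
    by_prior = defaultdict(list)
    for prior, current in pairs:
        by_prior[prior].append(current)

    table = {}
    for prior in sorted(by_prior):
        currents = by_prior[prior]
        if len(currents) < min_n:
            continue
        table[prior] = (mean(currents), stdev(currents), len(currents))
    return table


# Hypothetical matched scores (not CA data).
pairs = [(300, 310), (300, 330), (350, 360), (350, 380), (350, 370), (400, 420)]
table = expected_growth_table(pairs)
```

With real data, sparse score points would likely need smoothing or a regression model rather than raw conditional means.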

27 Tables of Expected Growth: Advantages. Fairly easy to understand. Allow comparisons of growth relative to norm group. Minimal assumptions are required; could be done for high school courses. Comparisons can be made across content areas. Based on actual student growth.

28 Tables of Expected Growth: Disadvantages. Tables of expectations may need to be recalculated each year. No continuous growth scale. “Typical” growth does not necessarily mean student is progressing sufficiently toward Proficiency. Matching student data over years required. Expectations would not include students who have been in CA < 1 year or who cannot be tracked.

29 Growth Options Task Force. Discussed options in detail for a day. Norms may be most easily understood. Growth Expectations may be most useful for administrators and program evaluation. Classification may be useful: growth is average/above average/below average. Standardized growth measures that could be pooled over grades could be useful: (Observed score – Expected score)/SE. Will work with CDE and ETS to pilot test some options.
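The standardized growth measure and the three-way classification mentioned on this slide can be sketched as follows. The formula is the one given on the slide; the cutoff of one standard error for “above” and “below” average and the use of a simple mean for pooling are assumptions, since the slides do not define either.

```python
def standardized_growth(observed, expected, se):
    """(Observed score - Expected score) / SE, as written on the slide."""
    if se <= 0:
        raise ValueError("se must be positive")
    return (observed - expected) / se


def classify_growth(z, cutoff=1.0):
    """Label growth using a hypothetical cutoff of +/- 1 standard error."""
    if z > cutoff:
        return "above average"
    if z < -cutoff:
        return "below average"
    return "average"


def pooled_growth(z_values):
    """Pool standardized growth over grades or content areas with a mean."""
    return sum(z_values) / len(z_values)


# Hypothetical students: (observed, expected, SE).
z_values = [
    standardized_growth(385, 370, 10),  # 1.5
    standardized_growth(315, 320, 10),  # -0.5
    standardized_growth(350, 370, 10),  # -2.0
]
labels = [classify_growth(z) for z in z_values]
school_mean = pooled_growth(z_values)
```

Because each value is expressed in standard-error units, results from different grades and content areas sit on a comparable footing before pooling.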
