Growth, Value-Added and Teacher Effectiveness Measures Philip R. Fletcher Senior Research Scientist Pearson
Teacher opinion A recent international survey of teachers shows: --That the vast majority of teachers welcome appraisal and feedback on their work. --That it improves their job satisfaction and effectiveness as teachers. --But too many teachers do not receive any feedback on their work at all. --Moreover, evaluation is perceived to be an instrument of compliance rather than development.
Teacher ratings Most school districts use pass-fail ratings where nearly all teachers pass. 99% of teachers in districts using binary ratings are rated satisfactory. 94% of teachers in districts using multi-point ratings are in the top two categories. As Arne Duncan noted, "Ninety-nine percent of our teachers are above average."
Teacher salaries Teacher compensation is very predictable. Based on the teacher's highest degree and years of seniority. Almost completely unrelated to variations in teacher effectiveness.
Effectiveness varies Anecdotal and empirical evidence suggests that teachers differ dramatically in effectiveness. An effective teacher will raise student test scores by ten percentiles per year. Three years of effective teachers raise test scores by thirty percentiles. Traditional teacher evaluation systems fail to recognize these differences.
Teacher recognition The need to recognize teachers who make magnificent contributions to student learning. The need to motivate people to gain expertise. And the need to leverage expert teachers and reward them for their efforts. To ensure that students are taught successfully, there is a need to differentiate teacher effectiveness in terms of their impact on student learning.
Status, growth and effectiveness Student achievement is the status of accumulated subject matter knowledge at one point in time, a lagging indicator. Student learning is growth in subject matter knowledge over time, a leading indicator. It is student learning, not student achievement, that is most relevant in defining and assessing teaching effectiveness.
Status, growth and effectiveness Achievement provides evidence of the status of student knowledge and understanding at one point in time. Learning is demonstrated by growth in student achievement from one point in time to another point in time, not by status at either point in time alone. Effectiveness is demonstrated by above-average student learning and growth.
Status, growth and effectiveness Schematically: --Status = Achievement --Growth = Learning --Relative Growth = Effectiveness
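The three quantities in the schematic can be illustrated with a small sketch. The classroom names and scale scores below are hypothetical, and "relative growth" is taken here simply as a classroom's average gain minus the average gain across all students, which is one plausible reading of "above-average growth" rather than a definition given in the slides.

```python
from statistics import mean

# Hypothetical scale scores per classroom: (prior year, current year) per student.
classrooms = {
    "Room 1": [(400, 430), (450, 480)],
    "Room 2": [(520, 530), (500, 510)],
}

def status(students):
    """Status: average achievement at the current point in time."""
    return mean(current for _, current in students)

def growth(students):
    """Growth: average change in achievement between the two time points."""
    return mean(current - prior for prior, current in students)

# Average growth over every student, used as the comparison point.
all_students = [s for students in classrooms.values() for s in students]
average_growth = growth(all_students)

summary = {
    room: {
        "status": status(students),
        "growth": growth(students),
        "relative_growth": growth(students) - average_growth,
    }
    for room, students in classrooms.items()
}
```

In this made-up data, Room 2 has the higher status but the lower growth, which is the distinction the schematic draws between achievement and learning.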
Why growth? Growth reflects learning, and we care about student learning. Because the principal role of teachers is to enhance student learning. Teacher effectiveness should be reflected in how much their students learn.
Official incentives Teacher Incentive Fund (TIF) grants require school districts to evaluate teachers. Race to the Top (RttT) funds require a state commitment to measuring teacher effectiveness. No Child Left Behind (NCLB) required testing of all students in reading and mathematics, leading to the development of longitudinal data systems linked to individual teachers.
Student testing Most states have test data linked to specific schools and teachers that can be used to track student growth. Many assessment systems are based on student test score growth over time: --Value-added models --Student growth percentiles Both address effectiveness in terms of learning rather than status.
Value-added assessment Value-added models are designed to assess school and teacher contributions to student growth. A value-added assessment model is designed to demonstrate the impact of individual schools and teachers. It is designed to distinguish between teacher effects and other outside influences.
Value-added assessment Value-added captures the growth that classes of students achieve during a single year of schooling. To estimate classroom effects, student data include only the students enrolled in a particular class.
Value-added assessment Key idea is to statistically isolate the contribution of individual teachers from all other sources of influence. Value-added analyses attempt to determine the amount of student growth that can be attributed to an individual teacher. Value-added models quantify teacher effectiveness, that is, the teacher's contribution to student learning and growth.
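As a rough illustration of the key idea, the sketch below uses a deliberately simplified covariate-adjustment approach: predict each student's current score from the prior score with one least-squares line fitted over all students, then average the residuals (actual minus expected) by teacher. This is an assumption made for illustration, not the model used by any particular state or vendor; operational value-added models typically add student characteristics, several years of data and statistical shrinkage. The teacher labels and scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def value_added(records):
    """Mean residual per teacher.

    records: list of (teacher, prior_score, current_score) tuples.
    """
    priors = [prior for _, prior, _ in records]
    currents = [current for _, _, current in records]
    intercept, slope = fit_line(priors, currents)

    residuals = defaultdict(list)
    for teacher, prior, current in records:
        expected = intercept + slope * prior  # expected score given prior score
        residuals[teacher].append(current - expected)
    return {teacher: mean(res) for teacher, res in residuals.items()}

# Hypothetical data: two teachers, two students each.
records = [
    ("T1", 400, 440), ("T1", 500, 535),
    ("T2", 400, 420), ("T2", 500, 525),
]
estimates = value_added(records)
```

With this data, students of T1 score on average 7.5 points above expectation and students of T2 score 7.5 points below it. Whether such a difference can be attributed causally to the teacher is exactly the question the following slides raise.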
Value-added assessment Value-added attributes causality to the teacher. Teachers are responsible for the learning and growth of their students. Under conditions of high stakes accountability, student growth has been directed toward cause and responsibility.
Value-added assessment Some statisticians would argue that value-added is unsuited for drawing causal inferences that a given teacher is responsible for the increase in student test scores. "We do not think that their analyses are estimating causal quantities, except under extreme and unrealistic assumptions." (Rubin, Stuart, and Zanutto, 2004). "…it does not appear possible to separate teacher and school effects using currently available accountability data." (Raudenbush, 2004).
Value-added assessment Policymakers and school administrators generally express no such reservations and offer strong support for value-added. If quality instruction is essential for student learning, then student learning should tell us something about the quality of instruction.
Descriptive accountability Accountability system results may have value without making causal inferences. From this perspective, accountability results should not be used to sanction teachers in schools. Instead, they should be used to make sound judgments about quality and needed improvements. They provide descriptive information and identify schools, teachers, and students that may require further attention.
Describing student growth The Colorado Growth Model was designed to describe student growth and learning. Quantile regression is used to model the complete distribution of student achievement over time. The model quantifies distance = growth rate × time, probabilistically. Growth percentiles describe the rarity of a student's current growth, given their prior achievement.
Examining growth with achievement sheds new light on school performance. Median growth above the 50th percentile identifies best practices and sources that can offer support. Median growth below the 50th percentile identifies greatest needs and targets that need to receive support. A gap-closing strategy is built around a consensus of school improvement.
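The idea of a growth percentile can be sketched without quantile regression. The simplified stand-in below ranks a student's current score among "academic peers", defined here as students whose prior score falls within a fixed band of the student's own. The function name, the `band` parameter and the data are hypothetical, and this is not the Colorado Growth Model's actual estimation procedure, only an illustration of what "rarity of current growth, given prior achievement" means.

```python
def growth_percentile(student, peers, band=10):
    """Percentile rank of the student's current score among academic peers.

    student: (prior_score, current_score)
    peers:   list of (prior_score, current_score) for the comparison group
    band:    peers count as academic peers if their prior score is within
             this many points of the student's prior score (an assumption)
    """
    prior, current = student
    similar = [c for p, c in peers if abs(p - prior) <= band]
    if not similar:
        return None  # no academic peers to compare against
    below = sum(1 for c in similar if c < current)
    ties = sum(1 for c in similar if c == current)
    # Mid-rank percentile: tied scores count as half below.
    return 100 * (below + 0.5 * ties) / len(similar)

# Hypothetical comparison group; the last student has a very different
# prior score and is therefore excluded from the peer group.
peers = [(400, 410), (405, 420), (395, 430), (402, 440), (500, 520)]
sgp = growth_percentile((400, 435), peers)
```

Here the student outscores three of four academic peers, giving a growth percentile of 75. A school-level summary would then take the median of its students' growth percentiles.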
Common yardstick Most states have administrative data that can be used as a common yardstick to identify the 25% most effective teachers. Supervisor ratings and classroom observations provide no such common yardstick. Local implementation of these other measures varies in 1600 school districts nationwide. More importantly, they do not directly reflect student learning.
Value-added and growth limitations Value-added and growth percentiles are only available for teachers in certain subject matter areas. Value-added and growth percentiles are available for only a small subset of teachers. Value-added and growth percentiles are limited by the test. Growth metrics are too narrow to provide information about how teachers can improve.
Value-added and growth shortcomings Value-added metrics and growth percentiles for individual teachers fluctuate from year to year. They can be influenced by factors beyond the teacher's control. They are imperfect measures with a relatively large error component.
Concern How well does value-added predict the top 25% from year to year? How well do alternative measures of teacher effectiveness predict the same top 25% from year to year? Classroom observations? Principals' ratings? Student surveys?
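One way to make the question concrete is to measure how many teachers in the top 25% in one year remain in the top 25% the next year. The sketch below is a minimal illustration under assumptions: the function names, the tie-breaking rule and the scores are hypothetical, and the same calculation could be applied to any measure (value-added, observations, ratings, surveys).

```python
def top_quartile(scores):
    """Set of teachers in the top 25% by score (ties broken by name)."""
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    k = max(1, len(ranked) // 4)
    return set(ranked[:k])

def persistence(year1, year2):
    """Share of year-1 top-quartile teachers who are top-quartile in year 2."""
    top1, top2 = top_quartile(year1), top_quartile(year2)
    return len(top1 & top2) / len(top1)

# Hypothetical effectiveness scores for eight teachers in two years.
year1 = {"T1": 9, "T2": 8, "T3": 5, "T4": 4, "T5": 3, "T6": 2, "T7": 1, "T8": 0}
year2 = {"T1": 7, "T2": 2, "T3": 8, "T4": 5, "T5": 4, "T6": 3, "T7": 1, "T8": 0}

share = persistence(year1, year2)
```

In this made-up example, one of the two top-quartile teachers from the first year stays in the top quartile, so the persistence is 0.5.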
Value-added and growth compare favorably Value-added metrics and growth percentiles compare favorably with performance measures in other fields. The correlation between SAT test scores and freshman success in college is 0.35. The correlation in batting averages between years in professional baseball is 0.36. The correlation between value-added estimates this year and next lies between 0.20 and 0.60. Most value-added estimates correlate between 0.30 and 0.40 from one year to the next.
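The year-to-year figures quoted above are Pearson correlations. For readers who want to compute one from their own data, a minimal sketch follows; the estimates for the five teachers are hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical value-added estimates for the same five teachers in two years.
this_year = [1, 2, 3, 4, 5]
next_year = [4, 1, 3, 2, 5]

r = pearson(this_year, next_year)
```

For this data the correlation works out to 0.3, at the low end of the range the slide reports for most value-added estimates.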
Value-added and growth prognosis Recommend the use of value-added measures and growth percentiles, principally because they are related to student learning and growth. Be mindful of their limitations and imperfections. Strive to continually improve these growth measures.
Suggestion Use multiple measures, not only value-added metrics and growth percentiles. Alternate measures should meaningfully supplement state test score data and increase prediction. Alternate measures should be applicable to a broader range of teachers. Provide direct information and feedback suggesting how teachers can improve teaching.
Suggestion Use core and non-core measures to validate the full range of teacher effectiveness for a broader range of teachers. Where growth measures benchmark the reliability of other teacher effectiveness measures. Key idea is to predict benchmark growth measures. Weight different measures based on their power to predict student learning and growth.
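The weighting idea can be sketched as follows. This is only one possible scheme, assumed here for illustration: each measure is weighted in proportion to its correlation with the benchmark growth measure (negative correlations are floored at zero), and the composite is the weighted sum of standardized scores. The slides do not specify a weighting method; a regression of the benchmark on all measures would be another option. The measure names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def predictive_weights(measures, benchmark):
    """Weights proportional to each measure's correlation with the benchmark."""
    raw = {name: max(0.0, pearson(vals, benchmark)) for name, vals in measures.items()}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no measure is positively related to the benchmark")
    return {name: value / total for name, value in raw.items()}

def composite(measures, weights):
    """Weighted sum of z-scored measures, one composite score per teacher."""
    z = {
        name: [(v - mean(vals)) / pstdev(vals) for v in vals]
        for name, vals in measures.items()
    }
    n = len(next(iter(measures.values())))
    return [sum(weights[name] * z[name][i] for name in measures) for i in range(n)]

# Hypothetical data for five teachers.
benchmark_growth = [1, 2, 3, 4, 5]           # benchmark growth measure
measures = {
    "observation": [2, 1, 4, 3, 5],          # classroom observation rating
    "student_survey": [4, 1, 3, 2, 5],       # student survey score
}

weights = predictive_weights(measures, benchmark_growth)
scores = composite(measures, weights)
```

In this example the observation rating tracks the benchmark more closely than the survey does, so it receives the larger weight in the composite.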
Observational measures What is needed is not so much an accounting of teacher time or a rating of teacher performance, but rather higher level inferences about the teacher's ultimate purposes and effects. Making holistic judgments requires higher levels of inference. In short, we need a method to obtain holistic rankings reliably and validly. Procedures must minimize rater effects and coding errors.
Classroom Interactions A complex situation, difficult to characterize unassisted. Teacher practice and student-teacher interactions from the participants' point of view. How do students and teachers interact in a practical and personal sort of way? How do they approach and solve problems together? Are there different classroom profiles?
Concourse of meaning The first challenge is to figure out what makes great teaching. This is difficult and controversial from an educational perspective. Yet relatively straightforward from a managerial perspective. Find the best educators and give them an opportunity to debate and create the best pedagogy and teaching practice.
Danielson Framework Charlotte Danielson's Framework serves as a source of statements about teacher effectiveness. The Framework is divided into: --4 Domains --23 Components --76 Elements --304 Items
Danielson Framework The 4 Domains include: --Planning and Preparation --The Classroom Environment --Instruction --Professional Responsibilities
Danielson Framework The 2 Domains that students actually see: --The Classroom Environment --Instruction
Danielson Framework Scoring rubrics, Danielson rating with its New York State equivalent: --Unsatisfactory: Ineffective --Basic: Developing --Proficient: Effective --Distinguished: Highly Effective
Danielson Framework Items, by rubric level: --Unsatisfactory: Students not working with the teacher are disruptive to the class. --Basic: Small groups are only partially engaged while not working directly with the teacher. --Proficient: The students are productively engaged during small group work. --Distinguished: Students take the initiative with their classmates to ensure that their time is used productively.
Danielson Framework The Danielson Framework is prescriptive. Unsatisfactory and basic performance are often just the negation of proficient and distinguished performance. No guide to what teachers do when under stress. Good behavior follows rules. Lacks insight from control theory and negative feedback. Students help set high standards.
Danielson Framework A good basis for a limited number of items. These items can be readily supplemented with items from other sources, by other authors. Use these sources and create new items to fully cover what students and teachers actually do.
Growth, value-added and teacher effectiveness measures
Features of Student Growth:
--Focus: Student
--Questions addressed: 1. How much did this student grow? 2. Is the student on track?
--Input variables: Student scores only
--Output: 1. Student achievement percentile 2. Student growth percentile
Features of Value-Added:
--Focus: Teacher/Educator
--Questions addressed: 1. How does teacher-classroom growth compare to expected growth? 2. How does teacher-classroom growth compare to that of other teacher-classrooms?
--Input variables: 1. Student scores and their characteristics 2. Teacher characteristics
--Output: 1. Teacher value-added metric 2. Teacher growth percentile
Features of Teacher-Effectiveness:
--Focus: Teacher/Educator
--Questions addressed: To what extent is the teacher/educator effective?
--Input variables: 1. Multiple measures 2. Multiple methods
--Output: 1. Effectiveness scores on individual measures 2. Composite score on multiple measures 3. Predicted comparable value-added metric 4. Predicted comparable growth percentile