
1 Using Student Growth in Teacher Evaluation and Development
Laura Goe, Ph.D., Research Scientist, ETS, and Principal Investigator for the National Comprehensive Center for Teacher Quality
Utah Educator Evaluation Summit: Improving Instructional Quality, Educator Effectiveness Project
Tuesday, October 4, 2011, Salt Lake City, UT

2 Today’s presentation available online
To download a copy of this presentation or view it on your iPad, smartphone, or laptop, go to www.lauragoe.com
- Go to the Publications and Presentations page
- Today’s presentation is at the bottom of the page

3 The goal of teacher evaluation

4 Measures: The right choice depends on what you want to measure

5 Measures that help teachers grow
- Measures that motivate teachers to examine their own practice against specific standards
- Measures that allow teachers to participate in or co-construct the evaluation (such as “evidence binders”)
- Measures that give teachers opportunities to discuss the results with evaluators, administrators, colleagues, teacher learning communities, mentors, coaches, etc.
- Measures that are directly and explicitly aligned with teaching standards
- Measures that are aligned with professional development offerings
- Measures that include protocols and processes teachers can examine and comprehend

6 Considerations for choosing and implementing measures
- Consider whether human resources and capacity are sufficient to ensure fidelity of implementation
- Conserve resources by encouraging districts to join forces with other districts or regional groups
- Establish a plan to evaluate measures to determine if they can effectively differentiate among teacher performance
- Examine correlations among measures
- Evaluate processes and data each year and make needed adjustments

7 Most popular growth models: Value-added and Colorado Growth Model
Value-added (e.g., EVAAS)
- EVAAS uses prior test scores to predict the next score for a student
- A teacher’s value-added is the difference between actual and predicted scores for a set of students
- http://www.sas.com/govedu/edu/k12/evaas/index.html
Colorado Growth Model
- Betebenner 2008: Focus on “growth to proficiency”
- Measures students against “academic peers”
- www.nciea.org
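The core value-added idea on this slide can be sketched in a few lines: a teacher's value-added is the average gap between students' actual scores and the scores predicted from their prior test history. This is a toy illustration with made-up numbers, not the proprietary EVAAS model, which uses a far more elaborate statistical specification.

```python
# Toy sketch of the value-added idea: average gap between actual and
# predicted student scores. NOT the actual EVAAS model.

def value_added(actual_scores, predicted_scores):
    """Mean difference between actual and predicted scores for a roster."""
    if len(actual_scores) != len(predicted_scores):
        raise ValueError("score lists must be the same length")
    gaps = [a - p for a, p in zip(actual_scores, predicted_scores)]
    return sum(gaps) / len(gaps)

# Hypothetical roster of five students
actual = [78, 85, 62, 90, 74]       # end-of-year scores
predicted = [75, 80, 65, 88, 70]    # predicted from prior test history
print(value_added(actual, predicted))  # 2.2
```

A positive result means the roster, on average, scored above prediction; a negative one, below. The technical challenges on the following slides concern how trustworthy such a number is, not how it is computed.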

8 Linking student learning results to professional growth opportunities
Slide courtesy of Damian Betebenner at www.nciea.org

9 Value-Added: Student effects
“A teacher who teaches less advantaged students in a given course or year typically receives lower-effectiveness ratings than the same teacher teaching more advantaged students in a different course or year.”
“Models that fail to take student demographics into account further disadvantage teachers serving large numbers of low-income, limited English proficient, or lower-tracked students.” (Newton et al., 2010, p. 2)

10 Value-Added: Error rates and stability
“Type I and II error rates for comparing a teacher’s performance to the average are likely to be about 25 percent with three years of data and 35 percent with one year of data.”
“Any practical application of value-added measures should make use of confidence intervals in order to avoid false precision, and should include multiple years of value-added data in combination with other sources of information to increase reliability and validity.” (Schochet & Chiang, 2010, abstract)
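Schochet and Chiang's recommendation to report confidence intervals rather than bare point estimates can be illustrated with a minimal sketch. The numbers are hypothetical and the interval uses a simple normal approximation; real value-added models estimate uncertainty quite differently.

```python
# Minimal sketch of reporting a value-added estimate with a confidence
# interval instead of a bare point estimate. Hypothetical data; simple
# normal approximation (z = 1.96 for an approximate 95% interval).
import math

def mean_with_ci(gaps, z=1.96):
    n = len(gaps)
    mean = sum(gaps) / n
    var = sum((g - mean) ** 2 for g in gaps) / (n - 1)  # sample variance
    se = math.sqrt(var / n)                             # standard error
    return mean, (mean - z * se, mean + z * se)

gaps = [3, 5, -3, 2, 4]  # hypothetical actual-minus-predicted score gaps
est, (low, high) = mean_with_ci(gaps)
```

With these numbers the point estimate is positive (2.2) but the interval spans zero, i.e., the teacher cannot be distinguished from average: exactly the "false precision" a bare point estimate would hide.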

11 Value-Added: Subscales
Teachers’ scores on subscales of a test can yield very different results, which also raises the question of how to weight subscale results (Lockwood et al., 2007)
- Lockwood et al. found substantial variation in teachers’ rankings based on the subscales (“Problem Solving” and “Procedures”)
- More variation within teachers than across teachers
“Our results provide a clear example that caution is needed when interpreting estimated teacher effects because there is the potential for teacher performance to depend on the skills that are measured by the achievement tests” (Lockwood et al., 2007, p. 55)

12 Value-Added: Test content
Polikoff and colleagues (2011) found that:
- About half of standards are tested. If half the standards teachers are teaching are not tested, how can the test accurately reflect teachers’ contribution to student learning?
- About half of test content corresponds with grade/subject standards. If half of test content is material that is not in the standards teachers are supposed to be teaching, is it fair to hold teachers accountable for test results?

13 Value-Added: Multiple teachers
In one study, 21% of teachers in Washington, DC had students who had also been in another math teacher’s class that year (Hock & Isenberg, 2011)
- This covered all situations, including students who had changed classes or schools as well as co-teaching and other cases where students were taught by more than one teacher
- Hock & Isenberg determined that the best estimates were obtained by counting each student multiple times, once for each teacher the student had, rather than trying to account for how much each teacher contributed to students’ scores

14 Value-Added: Responses to technical challenges
- Use multiple years of data to mitigate sorting bias and gain stability in estimates (Koedel & Betts, 2009; McCaffrey et al., 2009; Glazerman et al., 2010)
- Use confidence intervals and other sources of information to improve reliability and validity of teacher effectiveness ratings (Glazerman et al., 2010)
- Have teachers and administrators verify rosters to ensure scores are calculated with students the teachers actually taught
- Consider the importance of subscores in teacher rankings

15 Teacher evaluation models

16 Measuring teachers’ contributions to student learning growth: A summary of current models
- Student learning objectives: Teachers assess students at the beginning of the year and set objectives, then assess again at the end of the year; the principal or a designee works with the teacher to determine success
- Subject- and grade-alike team models (“Ask a Teacher”): Teachers meet in grade-specific and/or subject-specific teams to consider and agree on appropriate measures that they will all use to determine their individual contributions to student learning growth
- Pre- and post-tests model: Identify or create pre- and post-tests for every grade and subject
- School-wide value-added: Teachers in tested subjects and grades receive their own value-added score; all other teachers get the school-wide average

17 Washington DC IMPACT: Instructions for teachers in non-tested subjects/grades
“In the fall, you will meet with your administrator to decide which assessment(s) you will use to evaluate your students’ achievement. If you are using multiple assessments, you will decide how to weight them. Finally, you will also decide on your specific student learning targets for the year. Please note that your administrator must approve your choice of assessments, the weights you assign to them, and your achievement targets. Please also note that your administrator may choose to meet with groups of teachers from similar content areas rather than with each teacher individually.”

18 Validity is a process
- Starts with defining the criteria and standards you want to measure
- Requires judgment about whether the instruments and processes are giving accurate, helpful information about performance
- Verify validity by comparing results on multiple measures, at multiple time points, with multiple raters

19 Figure from Herman, J. L., Heritage, M., & Goldschmidt, P. (2011):
IF standards clearly define learning expectations for the subject area and each grade level,
AND IF the assessment instruments have been designed to yield scores that can accurately and fairly reflect student achievement of the standards and student learning growth over the course of the year,
AND IF there is evidence that the assessment instruments accurately and fairly measure the learning expectations,
AND IF there is evidence that student growth scores accurately and fairly measure student progress over the course of the year,
AND IF there is evidence that assessment scores represent teachers’ contribution to student growth,
THEN interpretation of scores may be appropriately used to inform judgments about teacher effectiveness.

20 Reliability
Test scores for an individual would be approximately the same if nothing changed between two testing periods
- The key is to ensure that all test forms (such as pre- and post-tests) are measuring the same thing
- Differences between Time A and Time B should reflect only the changes in knowledge and skills on the content being assessed
The difference between students’ scores at Time A and Time B could then be potential evidence of the teachers’ contribution to student learning growth

21 Assessments for student learning growth

22 What assessments are teachers and schools going to use?
- Existing measures
  - Curriculum-based assessments (come with packaged curriculum)
  - Classroom-based individual testing (DRA, DIBELS)
  - Formative assessments such as NWEA
  - Progress monitoring tools (for Response to Intervention)
  - National tests, certification tests
- Rigorous new measures (may be teacher-created)
  - The 4 Ps: portfolios, products, performances, projects
- School-wide or team-based growth
- Pro-rated scores in co-teaching situations
- Student learning objectives
- Any measure that demonstrates students’ growth toward proficiency in appropriate standards

23 New Haven assessment examples
- Basic literacy assessments, DRA
- District benchmark assessments
- Connecticut Mastery Test
- LAS Links (English language proficiency for ELLs)
- Unit tests from NHPS-approved textbooks
- Off-the-shelf standardized assessments (aligned to standards)
- Teacher-created assessments (aligned to standards)
- Portfolios of student work (aligned to standards)
- AP and International Baccalaureate exams

24 Assessing Musical Behaviors: The type of assessment must match the knowledge or skill
Four types of musical behaviors:
1. Responding
2. Creating
3. Performing
4. Listening
Types of assessment:
1. Rubrics
2. Playing tests
3. Written tests
4. Practice sheets
5. Teacher observation
6. Portfolios
7. Peer and self-assessment
Slide used with permission of authors Carla Maltas, Ph.D., and Steve Williams, M.Ed. See reference list for details.

25 How to use evidence of student learning growth
- Teacher preparation for measuring student learning growth is limited or non-existent
- Most principals, support providers, instructional managers, and coaches are poorly prepared to make judgments about teachers’ contribution to student learning growth
- They need to know how to:
  - Evaluate the appropriateness of various measures of student learning for use in teacher evaluation
  - Work closely with teachers to select appropriate student growth measures and ensure that they are using them correctly and consistently

26 Using the measures in comparable ways
Even if all teachers are using the same measures in a grade/subject, they may be using them in different ways:
- Giving the assessment at different times of the year
- Allowing students more time to complete the assessment
- Engaging in test prep or coaching students in completing assessments
To ensure that differences in student scores are based on teacher performance, not on how or when the assessment was given, “standardize” assessment processes as much as possible

27 Collect evidence in a standardized way (to the extent possible)
Evidence of student learning growth:
- Locate or develop rubrics with explicit instructions and clear indicators of proficiency for each level of the rubric
- Establish time for teachers to collectively examine student work and come to a consensus on performance at each level; identify “anchor” papers or examples
- Provide training for teachers to determine how and when assessments should be given, and how to record results in specific formats

28 Scoring

29 DC IMPACT: Score comparison for Groups 1-3
- Teacher value-added (based on test scores): Group 1 (tested subjects) 50%; Group 2 (non-tested subjects) 0%; Group 3 (special education) 0%
- Teacher-assessed student achievement (based on non-VAM assessments): Group 1 0%; Group 2 10%; Group 3 10%
- Teaching and Learning Framework (observations): Group 1 35%; Group 2 75%; Group 3 55%
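The weighted components above combine into a composite score by a simple weighted sum. This sketch uses the Group 1 weights from the slide with hypothetical component scores on a 1-4 scale; the weights shown do not sum to 100%, presumably because IMPACT included other components not listed on the slide.

```python
# Sketch of combining weighted evaluation components into a composite
# score, using the Group 1 (tested subjects) weights from the slide.
# Component scores are hypothetical, on an assumed 1-4 scale.

GROUP_1_WEIGHTS = {"value_added": 0.50, "observations": 0.35}

def composite(scores, weights):
    """Weighted sum of component scores (all on the same scale)."""
    return sum(weights[k] * scores[k] for k in weights)

scores = {"value_added": 3.2, "observations": 3.6}  # hypothetical ratings
total = composite(scores, GROUP_1_WEIGHTS)  # 0.5*3.2 + 0.35*3.6 = 2.86
```

The same function handles Groups 2 and 3 by swapping in their weight dictionaries, which is one reason districts typically define the weights as data rather than hard-coding each group's formula.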

30 New York’s scoring system

31 Washington DC IMPACT: Rubric for Determining Success (for teachers in non-tested subjects/grades)

32 Washington DC IMPACT: Rubric for Determining Success (for teachers in non-tested subjects/grades), continued

33 Results inform professional growth opportunities
- Are evaluation results discussed with individual teachers?
- Do teachers collaborate with instructional managers to develop a plan for improvement and/or professional growth? All teachers (even high-scoring ones) have areas where they can grow and learn
- Are effective teachers provided with opportunities to develop their leadership potential?
- Are struggling teachers provided with coaches and given opportunities to observe/be observed?

34 High-quality professional growth opportunities
The ultimate goal of teacher evaluation should be to improve teaching and learning
- Individual coaching/feedback on instruction: trained coaches, not just “good teachers”
- Observing “master teachers”: provides opportunities to discuss specific practices; may be especially helpful at the beginning of the year when master teachers are creating a “learning environment”
- Group PD and learning communities: opportunity to grow together as a cohort

35 Moving forward
- Create (or revisit) a timeline for final decisions and implementation: What do you still need to know to make appropriate recommendations?
- Who needs to be involved in decision-making? Department of Ed? Districts? Teachers? Union? State responsibilities vs. district responsibilities
- How and when will decisions be communicated to stakeholders?
- What resources will be required for training and implementation, and where will they come from?

36 References
Herman, J. L., Heritage, M., & Goldschmidt, P. (2011). Developing and selecting measures of student growth for use in teacher evaluation. Los Angeles, CA: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST). http://www.aacompcenter.org/cs/aacc/view/rs/26719
Linn, R., Bond, L., Darling-Hammond, L., Harris, D., Hess, F., & Shulman, L. (2011). Student learning, student achievement: How do teachers measure up? Arlington, VA: National Board for Professional Teaching Standards. http://www.nbpts.org/index.cfm?t=downloader.cfm&id=1305
Maltas, C., & Williams, S. (2010, January 27). Meaningful assessment in the music classroom. Presented at the Missouri Music Educators Association Conference, Jefferson City, MO. http://dese.mo.gov/divimprove/curriculum/fa/AssessmentintheMusicClassroom.pptx
Race to the Top application. http://www2.ed.gov/programs/racetothetop/resources.html
Rivkin, S. G., Hanushek, E. A., & Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73(2), 417-458. http://www.econ.ucsb.edu/~jon/Econ230C/HanushekRivkin.pdf
Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. Brooklyn, NY: The New Teacher Project. http://widgeteffect.org/downloads/TheWidgetEffect.pdf

37 Questions?

38 Laura Goe, Ph.D.
Phone: 609-734-1076
E-mail: lgoe@ets.org
Website: www.tqsource.org

