What Are We Measuring? The Use of Formative and Summative Assessments Laura Goe, Ph.D. Research Scientist, ETS Principal Investigator for Research and.

What Are We Measuring? The Use of Formative and Summative Assessments Laura Goe, Ph.D. Research Scientist, ETS Principal Investigator for Research and Dissemination, National Comprehensive Center for Teacher Quality 2nd Teacher Leader Institute Sponsored by the Delaware Academy for School Leadership College of Education and Human Development at the University of Delaware Camden, DE  June 19, 2012

2 Laura Goe, Ph.D. Former teacher in rural & urban schools  Special education (7th & 8th grade, Tunica, MS)  Language arts (7th grade, Memphis, TN) Graduate of UC Berkeley’s Policy, Organizations, Measurement & Evaluation doctoral program Principal Investigator for the National Comprehensive Center for Teacher Quality Research Scientist in the Performance Research Group at ETS

3 The goal of teacher evaluation The ultimate goal of all teacher evaluation should be… TO IMPROVE TEACHING AND LEARNING

4 An aligned teacher evaluation system: Part I Teaching standards: high quality state or INTASC standards (taught in teacher prep program, reinforced in schools) Measures of teacher performance aligned with standards Evaluators (principals, consulting teachers, peers) trained to administer measures Instructional leaders (principals, coaches, support providers) to interpret results in terms of teacher development High-quality professional growth opportunities for individuals and groups of teachers with similar growth plans

5 An aligned teacher evaluation system: Part II Results from teacher evaluation inform evaluation of teacher evaluation system (including measures, training, and processes) Results from teacher evaluation inform planning for professional development and growth opportunities Results from teacher evaluation and professional growth are shared (with privacy protection) with teacher preparation programs Results from teacher evaluation and professional growth are used to inform school leadership evaluation and professional growth Results from teacher and leadership evaluation are used for school accountability and district/state improvement planning

6 A well-aligned evaluation system (Goe et al., 2012) In a well-aligned system, evidence of practice as it relates to high-quality teaching standards will:  Form the basis for a professional growth plan  Give structure and consistency to coaching and mentoring by providing the basis for shared expectations and a common language, and possibly suggesting a direction for development  Provide a diagnostic approach to understanding inadequate student learning growth (i.e., determining which standards are not being met and considering how they might relate to student outcomes)  Offer a set of criteria to help principals, consulting teachers, mentors, and others identify areas in which teachers are successful and areas for improvement

7 Six components in an aligned teacher evaluation/professional growth system 1. High-quality standards for instruction 2. Multiple standards-based measures of teacher effectiveness 3. High-quality training on standards, tools, and measures 4. Trained individuals to interpret results and make professional development recommendations 5. High-quality professional growth opportunities for individuals and groups of teachers 6. High-quality standards for professional learning

8 Chicago Study (Sartain et al., 2011) “The work in Chicago and across the country to improve evaluation was motivated by two main factors. First, evaluation systems were failing to give teachers either meaningful feedback on their instructional practices or guidance about what is expected of them in the classroom. Second, traditional teacher evaluation systems were not differentiating among the best teachers, good teachers, and poor teachers.” (p. 1, italics added)

9 Summative vs. Formative Assessment: Students Summative assessment measures students’ knowledge and skills at a particular point in time  Frequently used for accountability purposes Formative assessment takes place before or during instruction  Used for eliciting evidence about students’ knowledge and skills to develop a better understanding of current learning progress and to make adjustments (Black et al., 2003)  Minute-to-minute (a class poll), after a lesson to check understanding (exit tickets), or after a unit to determine what students know prior to moving to the next unit (unit test)

10 Summative vs. Formative Assessment: Teachers Summative assessment combines evidence from multiple measures according to the state or district weighting formula and gives teachers a score at the end of the year  Evidence may be collected throughout the year, and may be used for formative purposes as well Formative assessment takes place throughout teachers’ careers, focusing on classroom practice and/or artifacts of teaching  Considers evidence about teachers’ knowledge and skills that can be used to improve current practice
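A minimal sketch of how a weighting formula might combine several measures into one end-of-year score. The measure names, the weights, and the 1-4 scale are all hypothetical; actual state or district formulas differ and are not given in this presentation.

```python
def composite_score(scores, weights):
    """Weighted average of measure scores (weights are assumed to sum to 1)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same measures")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[m] * weights[m] for m in scores)


# Hypothetical weights and scores on a 1-4 scale (illustration only).
weights = {
    "observation": 0.5,
    "student_growth": 0.3,
    "professional_responsibility": 0.2,
}
scores = {
    "observation": 3.0,
    "student_growth": 2.0,
    "professional_responsibility": 4.0,
}

summative = composite_score(scores, weights)
print(round(summative, 2))
```

The single number is the summative result; the individual measure scores feeding it are the part that can also serve formative purposes during the year.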

11 Can we use both formative and summative assessment to evaluate teachers? Popham (1998): “fixing” (formative) the teacher vs. “firing” (summative) the teacher Milanowski (2005): Study with new teachers found “…no major differences between the groups in terms of openness to discussion of difficulties, reception and acceptance of performance feedback, stress, turnover intentions, actual turnover, or performance improvement.” (p. 153)

12 Multiple measures of teacher effectiveness Evidence of growth in student learning and competency  Standardized tests, pre/post tests in untested subjects  Student performance (art, music, etc.)  Curriculum-based tests given in a standardized manner  Classroom-based tests such as DIBELS Evidence of instructional quality  Classroom observations  Lesson plans, assignments, and student work  Student surveys such as Harvard’s Tripod  Evidence binder (next generation of portfolio) Evidence of professional responsibility  Administrator/supervisor reports, parent surveys  Teacher reflection and self-reports, records of contributions

13 Questions to ask about each measure used How will using this measure in the teacher evaluation system impact teaching and learning in classrooms and schools? How will the use of this measure look different in low-capacity vs. high-capacity schools? How will reporting on results from this measure be done (to provide actionable information to teachers, principals, schools, districts, teacher preparation programs, and the state)? How will we know if this measure is working as we intended?

14 Measures that may contribute to teachers’ professional growth Measures which include protocols and processes that teachers can examine and comprehend Measures that are directly and explicitly aligned with teaching standards Measures that motivate teachers to examine their own practice against specific standards Measures that allow teachers to participate in or co-construct the evaluation (such as portfolios) Measures that give teachers opportunities to discuss the results for formative purposes with evaluators, administrators, teacher learning communities, mentors, coaches, etc. Measures that are aligned with and used to inform professional growth and development offerings

15 Why teachers generally value observations Observations are the traditional measure of teacher performance Teachers feel they have some control over the process and outcomes They report that having a conversation with the observer and receiving constructive feedback after the observation is greatly beneficial Evidence-centered discussions can help teachers improve instruction Peer evaluators often report that they learn new teaching techniques

16 When teachers don’t value observations, it’s because… They do not receive feedback at all The feedback they receive is not specific and actionable The observer suggests actions but is unable to offer the means and resources to carry out those actions  Mentors/coaches, other support personnel  Time for individual growth planning/activities  Protected time for collaboration with others

17 Value-added models Many variations on value-added models  TVAAS (Sanders’ original model) typically uses 3+ years of prior test scores to predict the next score for a student - Used since the 1990s for teachers in Tennessee, but not for high-stakes evaluation purposes  There are other models that use less student data to make predictions  Considerable variation in “controls” used (demographics, school effects)
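A deliberately simplified sketch of the idea behind value-added estimates: predict each student's current score from prior achievement, then average the prediction errors for each teacher's students. This is not TVAAS or any operational model. It assumes a single prior score, no demographic or school controls, and made-up data.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """records: list of (teacher, prior_score, current_score) tuples.

    Fits current = intercept + slope * prior by ordinary least squares over
    all students, then returns each teacher's mean residual (actual minus
    predicted score).
    """
    priors = [p for _, p, _ in records]
    currents = [c for _, _, c in records]
    p_bar, c_bar = mean(priors), mean(currents)
    sxx = sum((p - p_bar) ** 2 for p in priors)
    sxy = sum((p - p_bar) * (c - c_bar) for p, c in zip(priors, currents))
    slope = sxy / sxx
    intercept = c_bar - slope * p_bar

    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (intercept + slope * prior))
    return {t: mean(r) for t, r in residuals.items()}


# Hypothetical scores for two teachers with two students each.
records = [
    ("A", 40.0, 55.0),
    ("A", 60.0, 75.0),
    ("B", 40.0, 45.0),
    ("B", 60.0, 65.0),
]
estimates = value_added(records)
print(estimates)
```

A positive estimate means the teacher's students scored higher than predicted from their prior scores; a negative one means they scored lower. The number says nothing about why.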

18 Growth vs. Proficiency Models [Figure: achievement plotted from start of school year to end of year, with a “Proficient” line. Teacher A: “Success” on achievement levels. Teacher B: “Failure” on achievement levels.] In terms of growth, Teachers A and B are performing equally Slide courtesy of Doug Harris, Ph.D., University of Wisconsin-Madison

19 Growth vs. Proficiency Models (2) [Figure: achievement plotted from start of school year to end of year, with a “Proficient” line, showing Teacher A and Teacher B.] A teacher with low-proficiency students can still be high in terms of GROWTH (and vice versa) Slide courtesy of Doug Harris, Ph.D., University of Wisconsin-Madison
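The contrast between the two views can be sketched with a few lines of arithmetic. The cut score and the class averages below are hypothetical, chosen only to mirror the situation the slides describe: equal growth, different proficiency outcomes.

```python
PROFICIENT = 70  # hypothetical cut score


def summarize(start, end, cut=PROFICIENT):
    """Return both views of the same class: growth and end-of-year proficiency."""
    return {"growth": end - start, "proficient": end >= cut}


# Hypothetical class averages at the start and end of the school year.
teacher_a = summarize(start=65, end=80)
teacher_b = summarize(start=40, end=55)

print(teacher_a)
print(teacher_b)
```

Under a proficiency model Teacher A looks like a success and Teacher B a failure, while a growth model treats the two as equal.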

20 What value-added and growth models cannot tell you Value-added and growth models are really measuring classroom, not teacher, effects Value-added models can’t tell you why a particular teacher’s students are scoring higher than expected  Maybe the teacher is focusing instruction narrowly on test content  Or maybe the teacher is offering a rich, engaging curriculum that fosters deep student learning. How the teacher is achieving results matters!

21 Recommendation from NBPTS Task Force (Linn et al., 2011) Recommendation 2: Employ measures of student learning explicitly aligned with the elements of curriculum for which the teachers are responsible. This recommendation emphasizes the importance of ensuring that teachers are evaluated for what they are teaching.

22 Measuring teachers’ contributions to student learning growth: A summary of current models
Model: Description
Student learning objectives: Teachers assess students at beginning of year and set objectives, then assess again at end of year; principal or designee works with teacher, determines success
Subject & grade alike team models (“Ask a Teacher”): Teachers meet in grade-specific and/or subject-specific teams to consider and agree on appropriate measures that they will all use to determine their individual contributions to student learning growth
Content Collaboratives: Content experts (external) identify measures and groups of content teachers consider the measures from the perspective of classroom use; may not include pre- and post-measures
Pre- and post-tests model: Identify or create pre- and post-tests for every grade and subject
School-wide value-added: Teachers in tested subjects & grades receive their own value-added score; all other teachers get the school-wide average

23 School-wide VAM illustration for middle-school Tested subject teachers receive their own value-added score while non-tested subject teachers receive a school-wide average for their value-added score
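A small sketch of the assignment rule described here. The teacher labels and scores are hypothetical, and the school-wide figure is assumed to be a plain average of the tested teachers' scores; an operational system may compute the school-wide value differently.

```python
from statistics import mean


def assign_scores(own_scores, all_teachers):
    """Tested teachers keep their own score; everyone else gets the average.

    own_scores: value-added scores for teachers in tested subjects and grades.
    all_teachers: every teacher in the school, tested or not.
    """
    school_avg = mean(own_scores.values())  # simplifying assumption
    return {t: own_scores.get(t, school_avg) for t in all_teachers}


# Hypothetical middle school: two tested-subject teachers, two non-tested.
own_scores = {"math_7": 0.4, "ela_7": -0.2}
all_teachers = ["math_7", "ela_7", "art", "pe"]

assigned = assign_scores(own_scores, all_teachers)
print(assigned)
```

The art and PE teachers end up with an identical score that reflects none of their own instruction.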

24 Differentiating among teachers “It is nearly impossible to discover and act on performance differences among teachers when documented records show them all to be the same.” (Glazerman et al., 2011, p. 1)

25 Using student learning outcomes to inform teacher professional growth MOST helpful: Student assessments (including 4Ps) that provide information teachers can use immediately to adjust instructional strategies, such as results from benchmark or interim assessments or essay scores with rubrics LEAST helpful: Student assessments that provide a snapshot of students’ skills at a single point in time after most instruction is complete, such as last year’s state standardized test results

26 Computerized adaptive assessments Unlike end-of-year paper-and-pencil standardized tests, computerized adaptive assessments (CAAs) can be used at the beginning and end of the year to measure growth (and in the middle, too, as a way to establish progress) If used in this way, the results can be useful to teachers in adjusting their instruction  The process of analyzing data from this type of assessment requires connecting student results to specific instructional content and practices

27 Professional growth through examining student learning results Throughout the year teachers have an opportunity to collect evidence on student learning, analyze it, and adjust content or teaching practice  Test results are most informative when combined with other evidence on student knowledge and skills  Tests may cover only half the standards  Multiple choice doesn’t allow students to demonstrate other types of skills (writing, synthesizing, constructing an argument, presenting)

28 The 4 Ps (Projects, Performances, Products, Portfolios) Yes, they can be used to demonstrate teachers’ contributions to student learning growth Here’s the basic approach  Use a high-quality rubric to judge initial knowledge and skills required for mastery of the standard(s)  Use the same rubric to judge knowledge and skills at the end of a specific time period (unit, grading period, semester, year, etc.)
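The basic approach amounts to scoring the same students twice with the same rubric and comparing the results. A minimal sketch, assuming a hypothetical 1-4 rubric scale and made-up student identifiers:

```python
from statistics import mean


def rubric_growth(initial, final):
    """Per-student change in rubric level, using the same rubric both times.

    Students missing from either scoring round are left out.
    """
    return {s: final[s] - initial[s] for s in initial if s in final}


# Hypothetical rubric levels (1-4) at the start and end of a unit.
initial = {"s1": 1, "s2": 2, "s3": 2}
final = {"s1": 3, "s2": 3, "s3": 2}

growth = rubric_growth(initial, final)
average_growth = mean(growth.values())
print(growth, average_growth)
```

The per-student changes matter as much as the average, since they show which students did not move (here, s3) and may need a different instructional approach.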

29 Interpreting results for alignment with teacher professional learning options Different approach; not looking at “absolute gains” Requires ability to determine and/or link student outcomes to what likely happened instructionally Requires ability to “diagnose” instruction and recommend and/or provide appropriate professional growth opportunities  Individual coaching/feedback on instruction  Observing “master teachers”  Group professional development (when several teachers have similar needs)

30 Feedback (Coggshall et al., 2012) “Feedback on whether or not instructional practices are working can come in the form of student learning data, the teachers’ own observations of student engagement, observations from a peer or a coach, a video-taped record of the practice, discussion within a professional learning community, or the results of a formal evaluation.” (p. 6)

31 Memphis professional development system Teaching and Learning Academy began April ‘96 Nationally commended program intended to  “…provide a collegial place for teachers, teacher leaders and administrators to meet, study, and discuss application and implementation of learning…to impact student growth and development” Practitioners propose and develop courses  Responsive to school/district evaluation results  Offerings must be aligned with NSDC standards  ~336 On-line and in-person courses, many topics

32 Questions about implementation 1. What steps will you take to ensure that evaluation feedback given to teachers will impact teaching and learning in classrooms and schools? 2. How will the evaluation/feedback cycle look different in isolated rural schools? Hard-to-staff urban schools? High-performing schools? 3. How will we know if the evaluation system (including the feedback cycle) is working as we intended?

33 Final thoughts The limitations:  There are no perfect measures  There are no perfect models  Changing the culture of evaluation is hard work The opportunities:  Evidence can be used to trigger support for struggling teachers and acknowledge effective ones  Multiple sources of evidence can provide powerful information to improve teaching and learning  Evidence is more valid than “judgment” and provides better information for teachers to improve practice

34 Resources Memphis Professional Development System  Main site: http://www.mcsk12.net/admin/tlapages/academyhome.asp  PD Catalog: http://www.mcsk12.net/aoti/pd/docs/PD%20Catalog%20Spring%202011lr.pdf  Individualized Professional Development Resource Book: http://www.mcsk12.net/aoti/pd/docs/Individualized%20Growth%20Resource%20Book.pdf

35 References
Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003). Assessment for Learning: Putting it into Practice. Buckingham, UK: Open University Press. http://www.amazon.com/Assessment-Learning-Putting-into-Practice/dp/0335212972
Coggshall, J. G., Rasmussen, C., Colton, A., Milton, J., & Jacques, C. (2012). Generating teaching effectiveness: The role of job-embedded professional learning in teacher evaluation. Washington, DC: National Comprehensive Center for Teacher Quality. http://www.tqsource.org/publications/GeneratingTeachingEffectiveness.pdf
Glazerman, S., Goldhaber, D., Loeb, S., Raudenbush, S., Staiger, D. O., & Whitehurst, G. J. (2011). Passing muster: Evaluating evaluation systems. Washington, DC: Brown Center on Education Policy at Brookings. http://www.brookings.edu/reports/2011/0426_evaluating_teachers.aspx#
Goe, L., Biggers, K., & Croft, A. (2012). Linking teacher evaluation to professional development: Focusing on improving teaching and learning. Washington, DC: National Comprehensive Center for Teacher Quality. http://www.tqsource.org/publications/LinkingTeacherEval.pdf
Linn, R., Bond, L., Darling-Hammond, L., Harris, D., Hess, F., & Shulman, L. (2011). Student learning, student achievement: How do teachers measure up? Arlington, VA: National Board for Professional Teaching Standards. http://www.nbpts.org/index.cfm?t=downloader.cfm&id=1305

36 References (cont’d)
Milanowski, A. (2005). Split roles in performance evaluation: A field study involving new teachers. Journal of Personnel Evaluation in Education, 18(3), 153-169. http://www.springerlink.com/content/65817gu562k25482/fulltext.pdf
Polikoff, M. S. (2011). How well aligned are state assessments of student achievement with state content standards? American Educational Research Journal, 48(4), 965-995. http://aer.sagepub.com/content/48/4/965.abstract?rss=1
Popham, W. J. (1998). The dysfunctional marriage of formative and summative teacher evaluation. Journal of Personnel Evaluation in Education, 1, 269-273. https://springerlink3.metapress.com/content/pr2171452m314w21/
Sartain, L., Stoelinga, S. R., & Brown, E. R. (2011). Rethinking teacher evaluation in Chicago: Lessons learned from classroom observations, principal-teacher conferences, and district implementation. Chicago: Consortium on Chicago School Research at the University of Chicago. http://ccsr.uchicago.edu/sites/default/files/publications/Teacher%20Eval%20Report%20FINAL.pdf

37 Laura Goe, Ph.D. 609-619-1648 lgoe@ets.org www.lauragoe.com https://twitter.com/GoeLaura National Comprehensive Center for Teacher Quality 1000 Thomas Jefferson Street, NW Washington, D.C. 20007 www.tqsource.org
