1 Teacher Quality, Quality Teaching, and Student Outcomes: Measuring the Relationships
  Heather C. Hill
  Deborah Ball, Hyman Bass, Merrie Blunk, Katie Brach, Charalambos Charalambous, Carolyn Dean, Séan Delaney, Imani Masters Goffney, Jennifer Lewis, Geoffrey Phelps, Laurie Sleep, Mark Thames, Deborah Zopf
2 Measuring teachers and teaching
  Traditionally done at entry to profession (e.g., PRAXIS) and later ‘informally’ by principals
  Increasing push to measure teachers and teaching for specific purposes:
    Paying bonuses to high-performing teachers
    Letting go of under-performing (pre-tenure) teachers
    Identifying specific teachers for professional development
    Identifying instructional leaders, coaches, etc.
  Not going to give the long history of measuring teachers here, but wanted to give a sense of the history, what’s out there, and what’s coming down the pike.
  In Boston, 97% of teachers received highest rating.
3 Methods for identification
  Value-added scores
    Average of teachers’ students’ performance this year differenced from same group of students’ performance last year
    In a super-fancy statistical model
    Typically used for pay-for-performance schemes
    Problems
  Self-report / teacher-initiated
    Typically used for leadership positions, professional dev.
    However, poor correlation with mathematical knowledge
      R = 0.25
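The basic idea behind the value-added bullet (this year's performance differenced from last year's, averaged over a teacher's students) can be sketched in a few lines. This is only the simple gain-score version, not the statistical model the slide alludes to; the function name, teacher labels, and scores below are hypothetical and purely illustrative.

```python
from statistics import mean


def gain_score_value_added(prior_scores, current_scores):
    """Mean of (this year's score - last year's score) over one teacher's students."""
    if len(prior_scores) != len(current_scores) or not prior_scores:
        raise ValueError("need one prior and one current score per student")
    return mean(c - p for p, c in zip(prior_scores, current_scores))


# Hypothetical scale scores (last year, this year) for two teachers' classes.
classes = {
    "teacher_a": ([480, 510, 495], [500, 525, 520]),
    "teacher_b": ([470, 505, 500], [475, 515, 500]),
}

# Average gain over all students, used here as a stand-in for the district norm.
district_mean_gain = mean(
    c - p for prior, current in classes.values() for p, c in zip(prior, current)
)

for name, (prior, current) in classes.items():
    gain = gain_score_value_added(prior, current)
    # Report each teacher's mean gain and its distance from the overall mean gain.
    print(name, gain, gain - district_mean_gain)
```

Real value-added models typically adjust for student covariates and measurement error, so a raw gain score like this should be read as a starting point only.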
4 Identification: Alternative Methods
  Teacher characteristics
    NCLB’s definition of “highly qualified”
    More direct measures
    Educational production function literature
  Direct measures of instruction
    CLASS (UVA): general pedagogy
    Danielson, Saphier, TFA: ditto
  But what about mathematics-specific practices?
5 Purpose of talk
  To discuss two related efforts at measuring mathematics teachers and mathematics instruction
  To highlight the potential uses of these instruments
    Research
    Policy?
6 Begin With Practice
  Clips from two lessons on the same content: subtracting integers
  What do you notice about the instruction in each mathematics classroom?
  How would you develop a rubric for capturing differences in the instruction?
  What kind of knowledge would a teacher need to deliver this instruction? How would you measure that knowledge?
  Middle school, southwestern district in US
  Want to show you these instruments because it’ll mimic our own process of developing this instrument
7 Bianca
  Teaching material for the first time (Connected Mathematics)
  Began day by solving 5 - 7 with chips
    Red chips are a negative unit; blue chips are positive
  Now moved to 5 - (-7)
  Set up problem, asked students to use chips
  Gave students work time
8 Question
  What seems mathematically salient about this instruction?
  What mathematical knowledge is needed to support this instruction?
9 Mercedes
  Early in teaching career
  Also working on integer subtraction with chips from CMP
  Mercedes started this lesson the previous day, returns to it again
10 Find the missing part for this chip problem
  What would be a number sentence for this problem?
  Start With | Rule       | End With
             | Add 5      |
             | Subtract 3 |
11 Questions
  What seems salient about this instruction?
  What mathematical knowledge is needed to support this instruction?
12 What is the same about the instruction?
  Both teachers can correctly solve the problems with chips
  Both teachers have well-controlled classrooms
  Both teachers ask students to think about problem and try to solve it for themselves
13 What is different?
  Mathematical knowledge
  Instruction
14 Observing practice…
  Led to the genesis of “mathematical knowledge for teaching”
  Led to “mathematical quality of instruction”
16 MKT Items
  Created an item bank for K-8 mathematics in specific areas (see ) (Thanks NSF)
  About 300 items
  Items mainly capture subject matter knowledge side of the egg
  Provide items to field to measure professional growth of teachers
  NOT for hiring, merit pay, etc.
17 MKT Findings
  Cognitive validation, face validity, content validity
  Have successfully shown growth as a result of prof’l development
  Connections to student achievement - SII
    Questionnaire consisting of 30 items (scale reliability .88)
    Model: Student Terra Nova gains predicted by:
      Student descriptors (family SES, absence rate)
      Teacher characteristics (math methods/content, content knowledge)
    Teacher MKT significant
    Small effect (< 1/10 standard deviation): weeks of instruction
    But student SES is also about the same size effect on achievement
    (Hill, Rowan, and Ball, AERJ, 2005)
  What’s connection to mathematical quality of instruction??
18 History of Mathematical Quality of Instruction (MQI)
  Originally designed to validate our mathematical knowledge for teaching (MKT) assessments
  Initial focus: How is teachers’ mathematical knowledge visible in classroom instruction?
  Transitioning to: What constitutes quality in mathematics instruction?
    Disciplinary focus
  Two-year initial development cycle ( )
  Two versions since then
19 MQI: Sample Domains and Codes
  Richness of the mathematics
    e.g., Presence of multiple (linked) representations, explanation, justification, multiple solution methods
  Mathematical errors or imprecisions
    e.g., Computational, misstatement of mathematical ideas, lack of clarity
  Responding to students
    e.g., Able to understand unusual student-generated solution methods; noting and building upon students’ mathematical contributions
  Cognitive level of student work
  Mode of instruction
20 Initial study: Elementary validation
  Questions:
    Do higher MKT scores correspond with higher-quality mathematics in instruction?
    NOT about “reform” vs. “traditional” instruction
    Instead, interested in the mathematics that appears
21 Method
  10 K-6 teachers took our MKT survey
  Videotaped 9 lessons per teacher
    3 lessons each in May, October, May
  Associated post-lesson interviews, clinical interviews, general interviews
22 Elementary validation study
  Coded tapes blind to teacher MKT score
  Coded at each code
    Every 5 minutes
    Two coders per tape
  Also generated an “overall” code for each lesson: low, medium, high knowledge use in teaching
  Also ranked teachers prior to uncovering MKT scores
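With two coders marking each code every 5 minutes, a natural check is how often the two coders agree. The slides do not say which agreement statistic the project used, so the sketch below simply shows two common options, raw percent agreement and Cohen's kappa, on hypothetical present/absent codes for one lesson; the data and function names are illustrative assumptions.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of 5-minute segments on which both coders gave the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must rate the same non-empty set of segments")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of the two coders' marginal proportions per code.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for one 45-minute lesson: 1 = code present, 0 = absent.
coder_a = [1, 1, 0, 0, 1, 0, 1, 1, 0]
coder_b = [1, 0, 0, 0, 1, 0, 1, 1, 1]

print(percent_agreement(coder_a, coder_b))
print(cohens_kappa(coder_a, coder_b))
```

Kappa is usually lower than raw agreement because it discounts matches that would occur even if the coders were guessing from their own base rates.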
23 Projected Versus Actual Rankings of Teachers
  Projected ranking of teachers:
  Actual ranking of teachers (using MKT scores):
  Correlation of .79 (p < .01)
  Hill, H.C. et al. (2008), Cognition and Instruction
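A correlation between two rankings of the same teachers can be computed as a Spearman rank correlation. The slide does not state which coefficient produced the .79, so treat the following as a minimal sketch of one standard choice; the two rankings are hypothetical and are not the study's data.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation for two tie-free rankings of the same n items."""
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2)")
    full = list(range(1, n + 1))
    if sorted(rank_x) != full or sorted(rank_y) != full:
        raise ValueError("each ranking must be a permutation of 1..n (no ties)")
    # Classic shortcut formula, valid only when there are no tied ranks.
    d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical rankings of 10 teachers: projected from video vs. from survey score.
projected = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
actual = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]

print(round(spearman_rho(projected, actual), 3))
```

With only 10 teachers, a handful of swapped positions moves the coefficient noticeably, which is one reason to read any single correlation from a sample this size cautiously.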
24 Correlations of Video Code Constructs to Teacher Survey Scores
  Construct (Scale)       | Correlation to MKT scores
  Responds to students    | 0.65*
  Errors total            | -0.83*
  Richness of mathematics | 0.53
  *significant at the .05 level
  One of the next steps was to correlate the video code scale scores (what Heather earlier referred to as constructs) to teachers’ multiple choice measure scores. Here I’ve listed some of the scales you’ve heard mentioned along with their correlations to the measure scores. Although only one scale listed here is significantly related to the measure scores, all of these correlations are pretty big on the grand scale of educational measurement. All our other scales are of similar magnitude and are described further in my paper. Again, these correlations suggest that the survey measures and the video codes are both assessing mathematical knowledge for teaching.
25 Validation Study II: Middle School
  Recruited 4 schools by value-added scores
    High (2), Medium, Low
  Recruited every math teacher in the school
    All but two participated for a total of 24
  Data collection
    Student scores (“value-added”)
    Teacher MKT/survey
    Interviews
    Six classroom observations
      Four required to generalize MQI; used 6 to be sure
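The slide does not show how the "four required" figure was derived. One common way to reason about how many lessons to observe, assuming lessons behave like parallel measurements of the same teacher, is the Spearman-Brown formula, which projects the reliability of an average over several lessons from the reliability of a single lesson. The single-lesson value of 0.5 below is a hypothetical input, not a result from the study.

```python
def spearman_brown(single_lesson_reliability, n_lessons):
    """Projected reliability of a teacher's mean score over n_lessons observations."""
    r = single_lesson_reliability
    if not 0 <= r <= 1 or n_lessons < 1:
        raise ValueError("reliability must be in [0, 1] and n_lessons at least 1")
    return n_lessons * r / (1 + (n_lessons - 1) * r)


# Hypothetical single-lesson reliability; compare 1, 2, 4, and 6 observed lessons.
for k in (1, 2, 4, 6):
    print(k, round(spearman_brown(0.5, k), 3))
```

The curve flattens as lessons are added, which fits the general logic of observing a few more lessons than the minimum rather than many more.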
26 Validation study II: Coding
  Revised instrument contained many of same constructs
    Rich mathematics
    Errors
    Responding to students
  Lesson-based guess at MKT for each lesson (averaged)
  Overall MQI for each lesson (averaged to teacher)
  G-study reliability: 0.90
27 Validation Study II: Value-added scores
  All district middle school teachers (n=222)
  Used model with random teacher effects, no school effects
    Thus teachers are normed vis-à-vis performance of the average student in the district
    Scores analogous to ranks
  Ran additional models; similar results
  *Our study teachers’ value-added scores extracted from this larger dataset
28 Results
                    | MKT | MQI    | Lesson-based MKT | Value-added score*
  MKT               | 1.0 | 0.53** | 0.72**           | 0.41*
  MQI               |     |        | 0.85**           | 0.45*
  Lesson-based MKT  |     |        |                  | 0.66**
  Value added score |     |        |                  |
  * Significant at p<.05
  ** Significant at p<.01
  Source: Hill, H.C., Umland, K. & Kapitula, L. (in progress). Validating Value-Added Scores: A Comparison with Characteristics of Instruction. Harvard GSE: Authors.
29 Additional Value-Added Notes
  Value-added and average of:
    Connecting classroom work to math: 0.23
    Student cognitive demand: 0.20
    Errors and mathematical imprecision: -0.70**
    Richness: 0.37***
  As you add covariates to the model, most associations decrease
    Probably result of nesting of teachers within schools
  Our results show a very large amount of “error” in value-added scores
31 Proposed Uses of Instrument
  Research
    Determine which factors associate with student outcomes
    Correlate with other instruments (PRAXIS, Danielson)
    Instrument included as part of the National Center for Teacher Effectiveness, Math Solutions DRK-12 and Gates value-added studies (3)
  Practice??
    Pre-tenure reviews, rewards
    Putting best teachers in front of most at-risk kids
    Self or peer observation, professional development
32 Problems
  Instrument still under construction and not finalized
  G-study with master coders indicates we could agree more among ourselves
  Training only done twice, with excellent/needs work results
  Even with strong correlations, significant amount of “error”
  Standards required for any non-research use are high
  KEY: Not yet a teacher evaluation tool
33 Next
  Constructing grade 4-5 student assessment to go with MKT items
  Keep an eye on use and its complications
  Questions?