Rochester City School District 2010 Symposium: Improving Student Achievement While Overcoming Adversity. Kent Gardner, PhD, President, Center for Governmental Research
Inform & Empower CGR Practical Educational Program Evaluation: Challenges & Issues. Examples: 2001 WIN Schools Evaluation; 2005 Rochester Charter Schools; Harvard NYC Charter; Stanford National Charter; Middle College; Hillside Work-Scholarship Connection.
What’s the goal? Middle College: college prep. Hillside Work-Scholarship Connection (HWSC): “Graduation is the Goal.” Who decides? What if the endeavor has multiple goals? Can you monitor progress by measuring intermediate or process goals?
What does success look like? Does the goal have a measurable outcome? Graduation is relatively easy to measure. How do you measure college readiness? Are there intermediate outcomes that are measurable? Attendance; credits accumulated. Which intermediate outcomes contribute most powerfully to the final outcome?
Data Pitfalls. Why were the data collected? Unemployment insurance; NYS’s checkbook; school lunch. If you intend to adapt data to a new use, are they accurate enough for the new purpose?
Data Pitfalls. Bias/Fraud: high-stakes tests (NYSED cut scores); attendance; suspensions. Consistency: elementary grades across classes, schools; coding across years; coding across data systems (attendance can vary depending on how & when it is measured).
Assessing impact. Consider how the program affects outcomes: we really want to compare how the outcomes for individual students would have been different had they not participated. Instead, we compare outcomes for the “experimental” group (HWSC or Middle College participants, for example) to those of students who did not participate. Challenges: What’s the comparison group? All others who might have participated? Can you control for all differences?
Matched Group Comparison. Experimental design is the “platinum standard”: random assignment to either control or experimental group; “double blind” to avoid placebo effect; assignment from homogeneous population. Random assignment is challenging (how do you find a context in which you can randomly select?) and costly (if you want to be sure of drawing from a homogeneous population, you need a big sample).
Fallback from random assignment. When random assignment is infeasible or too costly, revert to “quasi-experimental” design: the “control group” is created by a process of selecting similar students. Two approaches: case control (match one to one based on common characteristics) and propensity score matching.
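The case control idea can be sketched in a few lines of Python. This is only an illustration of one-to-one matching on shared characteristics, not the procedure used in any of the studies; the function name `match_one_to_one`, the field names, and the student records are all hypothetical.

```python
from collections import defaultdict

def match_one_to_one(participants, pool, keys):
    """Pair each participant with one unused non-participant sharing every key."""
    # Index the comparison pool by its combination of matching characteristics.
    buckets = defaultdict(list)
    for student in pool:
        buckets[tuple(student[k] for k in keys)].append(student)

    pairs, unmatched = [], []
    for p in participants:
        candidates = buckets[tuple(p[k] for k in keys)]
        if candidates:
            # Remove the chosen control so it cannot be matched twice.
            pairs.append((p["id"], candidates.pop(0)["id"]))
        else:
            unmatched.append(p["id"])
    return pairs, unmatched

# Invented records, for illustration only.
participants = [
    {"id": "P1", "grade": 9, "sex": "F", "frpl": True},
    {"id": "P2", "grade": 9, "sex": "M", "frpl": True},
    {"id": "P3", "grade": 10, "sex": "F", "frpl": False},
]
pool = [
    {"id": "C1", "grade": 9, "sex": "F", "frpl": True},
    {"id": "C2", "grade": 9, "sex": "F", "frpl": True},
    {"id": "C3", "grade": 9, "sex": "M", "frpl": True},
]
pairs, unmatched = match_one_to_one(participants, pool, ["grade", "sex", "frpl"])
print(pairs)      # [('P1', 'C1'), ('P2', 'C3')]
print(unmatched)  # ['P3']
```

Exact matching leaves a participant unmatched whenever no identical control exists, which is one reason a study might move to propensity score matching.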
Propensity Score Matching. Sophisticated statistical technique: creates a statistical model that predicts group membership according to available characteristics of participants. “Retroactive” selection of control group: can employ large data sets, including demographic characteristics, test scores prior to program participation, etc., & guarantee a control group of a predetermined size. Students “in program” can be matched to multiple students not in program: 1:1, 1:3, 1:5 matching proportions are possible depending on size of comparison population. Still can’t control for unseen factors (family characteristics, motivation, etc.) that may be consistently different in one group over the other.
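A minimal sketch of the technique, using only the Python standard library: fit a logit model of program membership, then greedily match each participant to the nearest non-participants by predicted score. The data are synthetic, the two covariates and the helper names (`propensity`, `fit_logit`, `match`) are assumptions made for illustration, and real studies would normally rely on a statistics package rather than a hand-rolled fit.

```python
import math
import random

def propensity(w, x):
    """Logistic model: predicted probability of being a program participant."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)  # intercept followed by one weight per feature
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - propensity(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def match(scores, treated, ratio=1):
    """Greedy nearest-neighbour matching on the score; each control is used at most once."""
    available = [i for i, t in enumerate(treated) if not t]
    matches = {}
    for i in [i for i, t in enumerate(treated) if t]:
        available.sort(key=lambda c: abs(scores[c] - scores[i]))
        matches[i], available = available[:ratio], available[ratio:]
    return matches

# Synthetic students: two standardized covariates (assumed names).
random.seed(0)
X, treated = [], []
for _ in range(200):
    gpa = random.gauss(0, 1)         # prior-year GPA
    attendance = random.gauss(0, 1)  # prior-year attendance
    p_join = 1 / (1 + math.exp(-(-1.5 + 0.8 * gpa)))
    X.append([gpa, attendance])
    treated.append(1 if random.random() < p_join else 0)

w = fit_logit(X, treated)
scores = [propensity(w, x) for x in X]
matches = match(scores, treated, ratio=3)  # 1:3 matching
print(len(matches), "participants matched")
```

Changing `ratio` to 1 or 5 gives the other matching proportions mentioned on the slide. As the slide cautions, the match can only balance characteristics that appear in the model.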
‘01: Wegman Inner City Voucher (WIN). 98% of enrolled students in 6 inner city Catholic schools supported by WIN vouchers. Case control model matching WIN students with demographically comparable students from RCSD “schools of choice” (15, 20, 57, 58); intended to acknowledge motivational difference between Catholic & public school families. Matched on age, sex, race, F/RPL, mother’s education. Poverty higher at WIN schools. Comparisons? Compared Iowa Test of Basic Skills (ITBS) trend performance against ITBS national norms. Common assessment across schools was 4th grade ELA & Math scores for both WIN and schools of choice. Couldn’t adjust for “starting point” as conversion from Stanford 9 to ITBS was unreliable. Conclusion: WIN students and students from schools of choice performed about the same on 4th grade ELA & Math.
‘05: Rochester Charter Schools. CGR engaged by Gleason Foundation to monitor performance of newly-formed charter schools for first five years (beginning 2000). Expect “selection bias” for charter lottery applicants? Motivation, prior achievement. Solution: follow students not accepted by lottery. RCSD facilitated monitoring of state & local tests for students enrolled in charter schools & for those in the lottery but remaining in traditional schools. Created “value added” achievement using scores from year prior to enrollment for both groups. Findings: attrition in both groups made comparisons difficult, yet findings supported the conclusion that two large charter schools (Edison & National Heritage) underperformed RCSD schools. Both schools were closed by NYS Charter Schools Institute.
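One simple reading of “value added” is a gain score: each student's current score minus the score from the year before enrollment, averaged within each group. The sketch below assumes that reading; the slide does not say how CGR computed the measure, and the scores shown are invented purely for illustration.

```python
from statistics import mean

def mean_gain(students):
    """Average change from the prior-year score to the current-year score."""
    return mean(s["current"] - s["prior"] for s in students)

# Hypothetical scale scores, invented for illustration only.
lotteried_in = [{"prior": 640, "current": 652}, {"prior": 655, "current": 660}]
lotteried_out = [{"prior": 642, "current": 650}, {"prior": 650, "current": 661}]

difference = mean_gain(lotteried_in) - mean_gain(lotteried_out)
print(mean_gain(lotteried_in), mean_gain(lotteried_out), difference)  # 8.5 9.5 -1.0
```

Comparing gains rather than raw scores adjusts for different starting points, but students who leave either group drop out of the calculation, which is the attrition problem the findings mention.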
Harvard School of Ed (Caroline Hoxby): New York City Charter Schools. Adopted same approach used by CGR in 2000: “lotteried in” v. “lotteried out.” All lottery participants more black (64% v. 34%), more poor (F/RPL 92% v. 72%) than all NYC public school students; Hispanic 29%/38%; ELL 4%/14%; SPED 11%/13%. Different in other ways? Findings: “lotteried out” students remained on grade level in traditional NYC public schools, outperforming NYC students similarly disadvantaged; “lotteried in” did better. Key point: studying only students who were part of a lottery “controls” for unseen factors like family motivation, etc.
Stanford CREDO (Macke Raymond): Multistate study. Employed state administrative records to create “pairwise comparison” of individual students in 15 states. Matched on grade level, gender, race/ethnicity, F/RPL, ELL, SPED, prior test score on state achievement tests. Profile: 27% black, 30% Hispanic; 7% ELL, 7% SPED; 49% F/RPL.
Middle College. RCSD/RIT program aimed at “college readiness” for three Franklin high schools. Measurement problematic: How to define college readiness? How to assess college readiness? Agreement on goals and objectives varied across RCSD & RIT faculty. One measurement idea, “before and after” ACCUPLACER scores, proved unrealistic. CGR’s role evolved to be more about process than outcome.
Hillside Work-Scholarship Connection. Focus on critical output indicator: graduation rates. Through , CGR studies based on one-to-one match of HWSC participants to RCSD students. Matching conducted by individuals on Accountability staff. Matched on age, gender, race/ethnicity, F/RPL participation, grade, prior year GPA.
HWSC: Propensity Score Matching. New study for students whose “on time” graduation years were 2007, 2008 and 2009. Relied on very high level of cooperation w/ Accountability. HWSC participants matched to nonparticipants by age, gender, race/ethnicity, poverty status, disability, English language learner status, grade, school quality, prior year GPA, prior year attendance, prior year suspensions, prior year state test scores.
HWSC: Propensity Score Matching. Grouped students in two ways. (1) By entry grade (8th, 9th, or 10th) & on-time graduation year (2007, 2008 or 2009) for NINE groups or “cohorts”: groups are more homogeneous; “graduation” has a consistent definition; BUT the groups are smaller. (2) By enrollment year (02-03 through 06-07) across all grades for THREE cohorts: HWSC enrollment practices more consistent; groups are larger; BUT graduation standards will vary.
Propensity score matching complexity. Considered many variations: matched 1:1, 1:3, & 1:5 RCSD student(s) to each HWSC student; studied on-time, on-time + 1yr graduation; 2 probability distributions (logit v. probit). 108 model “runs” (12 variants by 9 cohorts). 95% confidence interval: the true value will lie within the interval 95% of the time.
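The run count follows from multiplying out the options. The short sketch below assumes the 12 variants are the full cross of the three matching ratios, two graduation windows, and two link functions, and that the 9 cohorts are the entry grade by graduation year groups described earlier; the variable names are illustrative.

```python
from itertools import product

ratios = ["1:1", "1:3", "1:5"]
grad_windows = ["on-time", "on-time + 1yr"]
links = ["logit", "probit"]
entry_grades = [8, 9, 10]
grad_years = [2007, 2008, 2009]

variants = list(product(ratios, grad_windows, links))  # 3 * 2 * 2 = 12
cohorts = list(product(entry_grades, grad_years))      # 3 * 3 = 9
runs = list(product(variants, cohorts))                # 12 * 9 = 108

print(len(variants), len(cohorts), len(runs))  # 12 9 108
```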
Final statistical comments. Statistical significance: How often would this result occur by chance? 95% confidence interval: given the size of the sample and an unbiased sampling procedure, the true “population parameter” will fall within this range 95 times out of 100. 99% confidence interval: true “population parameter” will fall within this range 99 times out of 100. “Effect size” or importance of result.
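As a worked illustration of these ideas, the sketch below computes a normal-approximation confidence interval for the difference between two graduation rates. The counts are hypothetical and the function name `diff_in_proportions_ci` is an assumption; the slides do not state which interval formula the studies used.

```python
from math import sqrt
from statistics import NormalDist

def diff_in_proportions_ci(grads_a, n_a, grads_b, n_b, level=0.95):
    """Normal-approximation (Wald) interval for the difference p_a - p_b."""
    p_a, p_b = grads_a / n_a, grads_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%, 2.58 for 99%
    diff = p_a - p_b
    return diff - z * se, diff + z * se

# Hypothetical counts: 60 of 100 participants graduate v. 50 of 100 matched students.
low95, high95 = diff_in_proportions_ci(60, 100, 50, 100, level=0.95)
low99, high99 = diff_in_proportions_ci(60, 100, 50, 100, level=0.99)
print(round(low95, 3), round(high95, 3))  # -0.037 0.237
print(round(low99, 3), round(high99, 3))
```

In this made-up case the effect size is 10 percentage points, yet the 95% interval includes zero, so the result would not be statistically significant at that level. The 99% interval is wider still.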