Building a Strong Outcome Portfolio


1 Building a Strong Outcome Portfolio
Section 4: What Researchers Actually Do
Jeffrey A. Butts, Ph.D.
Research and Evaluation Center, John Jay College of Criminal Justice, City University of New York
September 2018

2 What it Takes to Build Evidence…
Evaluation strategies:
- Process: Did we do what was planned and intended?
- Outcome: Did we see the changes we hoped to see?
- Impact: Can we claim to have caused those changes?

3 What it Takes to Build Evidence…
Individual-level measurement of:
- Inputs: services, activities, program efforts
- Outputs: service participation, activities completed
- Outcomes: youth behaviors and accomplishments

4 What it Takes to Build Evidence…
[Diagram: a client-by-client matrix illustrating individual-level data, with service provision, service participation, and behaviors/accomplishments (+/–) recorded for each case (X).]

5 Designing Your Evaluation
A design must fit the local context and situation
Experimental designs are preferred, but rarely feasible; instead, choose the most rigorous, realistic design
Key stakeholders should be involved early, to solicit their views and to gain their support for the eventual design
Criticisms should be anticipated and dealt with early

6 Experimental Design – Random Assignment
[Diagram: client referrals are screened for eligibility (determined by evaluators or using guidelines from evaluators) and sent through a random assignment process. The treatment group begins services; the control group receives no services or different services. Data are collected from both groups over time, and outcomes are compared by analyzing differences, effect size, etc.]
Issues: Who is eligible? How to randomize? Is a no-services group acceptable? Is data collection equivalent across groups?
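To make the assignment step concrete, here is a minimal sketch in Python (all names and the eligibility rule are hypothetical, not from the slides): eligible referrals are screened first, then split by coin flip so that neither program staff nor evaluators choose who receives services.

```python
import random

def randomize(referrals, is_eligible, seed=42):
    """Screen referrals for eligibility, then assign each eligible
    client to the treatment or control group by coin flip."""
    rng = random.Random(seed)  # fixed seed keeps the assignment auditable
    treatment, control = [], []
    for client in filter(is_eligible, referrals):
        (treatment if rng.random() < 0.5 else control).append(client)
    return treatment, control

# Hypothetical usage: the eligibility rule comes from the evaluators
referrals = [{"id": i, "age": 13 + i % 6} for i in range(200)]
treatment, control = randomize(referrals, lambda c: c["age"] < 18)
print(len(treatment), len(control))  # roughly equal groups by chance
```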

7 Quasi-Experimental – Matched Comparison
[Diagram: client referrals form the treatment group. A matching process draws a comparison group from a pool of potential comparison cases, matching according to sex, race, age, prior services, scope of problems, etc. Data are collected from both groups over time, and outcomes are compared by analyzing differences, effect size, etc.]
Issues: Where do comparison cases come from? Matched on what? Do comparison cases receive control services? Is data collection equivalent?
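A toy version of the matching step (field names and data are invented for illustration): for each treated youth, take the unused comparison-pool case that matches exactly on sex and race and is closest in age.

```python
def match_comparison(treated, pool):
    """For each treated case, pick the unused pool case with the same
    sex and race and the smallest age difference (no replacement)."""
    available = list(pool)
    matches = {}
    for t in treated:
        candidates = [c for c in available
                      if c["sex"] == t["sex"] and c["race"] == t["race"]]
        if not candidates:
            continue  # no acceptable match; this case drops from the analysis
        best = min(candidates, key=lambda c: abs(c["age"] - t["age"]))
        matches[t["id"]] = best
        available.remove(best)
    return matches
```

Real evaluations usually match on more variables (prior services, scope of problems) or use propensity scores, but the structure is the same.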

8 Quasi-Experimental Design – Staggered Start
[Diagram: client referrals are divided into Groups 1, 2, and 3, each beginning the intervention at a different time; data collection points fall before and after each group's start, and outcomes are measured for each group.]
Issues: Requires more time and program cooperation.

9 Process Evaluation
"Now that this bill is the law of the land, let's hope we can get our government to carry it out." - President John F. Kennedy (quoted in Rossi et al., p. 170)

10 Avoiding the “Black Box” Problem
[Diagram: a car wash depicted as a "black box."]

11 Process Evaluation
Describes how a program is operating at a specific moment (or over a specific time period)
Assesses how well a program performs its intended functions
Identifies the critical components, functions, and relationships necessary for the program to be effective
Does not estimate outcomes or impact

12 Process Evaluation
Taking a program from concept to full operation requires many steps
Keeping a program true to its original design and purposes takes sustained effort
Whether any program is fully carried out as envisioned by its sponsors and managers is always an open question

13 Process Evaluation Serves Multiple Purposes
Establish program credibility
Provide feedback for managerial purposes
Demonstrate accountability to sponsors or decision makers
Provide a freestanding process evaluation
Augment an impact evaluation
Key words: appropriate, adequate, sufficient, satisfactory, reasonable, intended

14 Process Evaluation is Built on Theory
What "should" the program be doing?
Do critical events or service activities take place?
- Often a matter of degree rather than all or none
- Quality and appropriateness count too
Stakeholders should be deeply involved in defining program theory and, therefore, process evaluation goals
(Werner, Alan. A Guide to Implementation Research. 2004)

15 Outcome Evaluation
An outcome is the state of the target population or the social conditions that a program is expected to have changed.
Outcomes are observed characteristics of the target population or social conditions, not of the program; the definition of an outcome makes no direct reference to program action.
Outcomes must relate to the benefits that products and services might have, not simply their receipt.
(Rossi et al.)

16 Outcome Evaluation
The challenge for evaluators is to assess not only outcomes but the degree to which a change in outcomes can be attributed to a program or policy.
Outcome level is the status of an outcome at some point in time (e.g., the amount of smoking among teenagers).
Outcome change is the difference between outcome levels at different points in time or between groups.
Program effect is the portion of a change in outcome that can be attributed to a program or policy rather than to other factors.
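Rendered as arithmetic (a simple difference-in-differences; all numbers invented): if teen smoking falls from 20% to 14% in the program group but also falls from 20% to 18% in a comparison group, only the extra 4-point drop is plausibly the program effect.

```python
# Outcome levels: hypothetical smoking rates (%) before and after
program_before, program_after = 20.0, 14.0
comparison_before, comparison_after = 20.0, 18.0

# Outcome change in each group
change_program = program_after - program_before           # -6.0
change_comparison = comparison_after - comparison_before  # -2.0

# Program effect: the change beyond what the comparison group shows
program_effect = change_program - change_comparison       # -4.0
print(program_effect)
```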

17 Answering the Question, “Did it Work?”
Evaluation is an effort to test the effects of a social program or policy change.
We try to intervene in some problem or condition, and then review our success.
What is the "effect" of the program or policy, and compared to what?
Statistical significance is a limited concept for this purpose.

18 Statistical Significance
Statistical significance is not a direct indicator of the size of a program or policy effect.
Statistical significance is a function of:
- sample size
- effect size
- p level
A study's ability to detect a real difference is its "power."
Even a well-designed study can end up with low power: the program effect may be there, but the study can't see it.
Partially adapted from James Neill (2007), "Why use Effect Sizes instead of Significance Testing in Program Evaluation?"
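A quick way to check power in practice, using statsmodels (the effect size, sample size, and alpha here are illustrative, not from the slides):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power to detect a small effect (Cohen's d = 0.2) with 50 youth per group
power = analysis.solve_power(effect_size=0.2, nobs1=50, alpha=0.05)
print(f"power = {power:.2f}")  # about 0.17: a real effect would usually be missed

# Sample size per group needed to reach the conventional 80% power
n = analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05)
print(f"needed per group = {n:.0f}")  # roughly 394
```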

19 Statistical Significance
[Chart: test scores on a scale from 200 to 1000; the treatment group scores 710 versus 675 for the general population. Is the difference significant? Is it important? Is it worth the cost?]
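The same point in simulated data (means and standard deviation invented to mirror the chart): with large samples the 35-point gap is easily "significant," but the p value alone cannot say whether it is important or worth the cost.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
treatment = rng.normal(710, 150, size=5000)   # simulated treatment-group scores
population = rng.normal(675, 150, size=5000)  # simulated general-population scores

t, p = stats.ttest_ind(treatment, population)
print(f"t = {t:.1f}, p = {p:.2g}")  # p is far below .05, yet significance
                                    # says nothing about cost or importance
```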

21 Statistical Significance
Statistical significance is BINARY: significant or "not significant."
There is no way to say "how significant" a result is, or how much better a treatment was than a control group.
p values are set in advance as a yes/no test.
p < .01 is NOT "more significant" than p < .05.

22 Effect Size A more flexible measure of program effect
Effect size statistics account for the amount of variance in both the treatment group and the control group.
Effect size is often stated in terms of percentages:
- the program accounted for 20% of the change
- treatment reduced drug use 15% more than expected
- probation was responsible for 25% of the improvement
Does not require measures from the general population.
Is more easily applied to policy questions.
(Rossi et al., Chapter 10)
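A minimal sketch of one common effect-size statistic, Cohen's d, the standardized mean difference (data are hypothetical):

```python
import numpy as np

def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    nt, nc = len(treatment), len(control)
    pooled_var = (((nt - 1) * np.var(treatment, ddof=1)
                   + (nc - 1) * np.var(control, ddof=1))
                  / (nt + nc - 2))
    return (np.mean(treatment) - np.mean(control)) / np.sqrt(pooled_var)

# Hypothetical drug-use scores (lower is better)
print(f"d = {cohens_d([4, 3, 5, 2, 4, 3], [6, 5, 7, 4, 6, 5]):.2f}")  # -1.91
```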

23 Planning Process to Create Evidence Base
Create a priority list of topics and questions:
- Open and inclusive brainstorming session
- Identify uncertainties and unknowns
- Group topics into areas (cost, impact, equity, etc.)
Sort the list by difficulty:
- Easiest: administrative data already exist, no client contact required, outcomes measurable using one agency's data system alone
- Hardest: have to create new data, data only available with client contact, follow-up period required (e.g., recidivism), data required from multiple agencies

24 Contact Jeffrey A. Butts, Ph.D.
Director, Research and Evaluation Center
John Jay College of Criminal Justice
City University of New York
524 W. 59th Street, Suite BMW605
New York, NY 10019

