Program Evaluation
Evaluation Defined – Green and Kreuter (1999), broad definition: “comparison of an object of interest against a standard of acceptability” – Weiss (1998), more targeted: “systematic assessment of the operation and/or the outcomes of a program or a policy, compared to a set of explicit or implicit standards, as a means of contributing to the improvement of the program or policy”.
Fournier (2005) “Evaluation is an applied inquiry process for collecting and synthesizing evidence that culminates in conclusions about the state of affairs, value, merit, worth, significance, or quality of a program, product, person, policy, proposal, or plan.”
Program evaluation: a tool for using science as a basis for: – Ensuring programs are rational and evidence-based (needs assessed, theory-driven, research-based) – Ensuring programs are outcome-oriented (forces goals and objectives at the outset) – Ascertaining whether goals and objectives are being achieved (performance measures established at outset)
Program evaluation (continued): a tool for using science as a basis for: – Informing program management about program processes (adjusted, improved), program quality (effectiveness; see goals and objectives), and program relevance – Decision-making and action, e.g. policy development based on program evaluations – Transparency and accountability to funders, participants, and other stakeholders.
Program evaluation – Not done consistently in programs – Often not well integrated into the day-to-day management of most programs
The accountability era: What gets measured gets done. If you don’t measure results, you can’t tell success from failure. If you can’t see success, you can’t reward it. If you can’t reward success, you’re probably rewarding failure. If you can’t see success, you can’t learn from it. If you can’t recognize failure, you can’t correct it. If you can demonstrate results, you can win public support. (Reinventing Government, Osborne and Gaebler, 1992.) Source: University of Wisconsin-Extension, Cooperative Extension, Logic Model presentation.
Within an organization, evaluation: – Should be designed at the time of program planning – Should be a part of ongoing service design and policy decisions (evidence that actions conform with strategic directions, community needs, etc.; evidence that money is spent wisely) – Should use a framework with components that are consistent across programs, in addition to indicators and methods tailor-made for specific programs and contexts – Should have an extent related to the original goals and to the complexity of the program
When not to evaluate (Patton, 1997): – There are no questions about the program – Program has no clear direction – Stakeholders can’t agree on program objectives – Insufficient funds to evaluate properly
Basic research vs. evaluation (Patton, 2005)
Aspect | Basic research | Evaluation
Purpose | Discovery of new knowledge | Inform decisions, clarify options, reduce uncertainties, and provide information about programs and policies
 | Adding to an existing body of knowledge | Study of the effectiveness with which existing knowledge is used to inform and guide practical action
 | Conclusions have empirical aspect | Conclusions encompass both an empirical aspect (that something is the case) and a normative aspect (judgment about the value of something)
 | Assesses merit | Assesses both merit (absolute quality) & worth
 | Aimed at truth | Aimed at action
Methods | Theory-driven | Theory-driven, though pragmatic concerns outweigh theoretical considerations when selecting methods
 | Some form of experimental research design is essential | Experimental research design not essential
Merit and Worth Evaluation looks at the merit and worth of an evaluand (the project, program, or other entity being evaluated) Merit is the absolute or relative quality of something, either intrinsically or in regard to a particular criterion Worth is an outcome of an evaluation and refers to the evaluand’s value in a particular context. This is more extrinsic. Worth and merit are not dependent on each other.
Merit and Worth A medication review program has merit if it is proven to reduce known risk for falls – It also has value/worth if it saves the health system money An older driver safety program has merit if it is shown to increase confidence among drivers over 80 years of age – Its value is minimal if it results in more unsafe drivers on the road and increases risk and cost to community at large.
Evaluation vs research In evaluation, politics and science are inherently intertwined. – Evaluations are conducted on the merit and worth of programs in the public domain which are themselves responses to prioritized needs that resulted in political decisions – Program evaluation is intertwined with political power and decision making about societal priorities and directions (Greene, 2000, p. 982).
Formative evaluation Purpose: ensure a successful program. Includes: 1. Developmental evaluation (pre-program) – Needs assessment: match needs with appropriate focus and resources – Program theory evaluation / evaluability assessment: clarity on theory of action, measurability, against what criteria – Logic model: ensures aims, processes and evaluations are linked logically – Community/organization readiness – Identification of intended users and their needs, etc.
Surveillance, Planning and Evaluating for Policy and Action: PRECEDE-PROCEED MODEL* [Figure: model diagram. Phases: Phase 1 Social assessment (quality of life) – Phase 2 Epidemiological assessment (health) – Phase 3 Behavioral & environmental assessment (behavior, environment) – Phase 4 Educational & ecological assessment (predisposing, reinforcing, enabling) – Phase 5 Administrative & policy assessment (health education; policy, regulation, organization) – Phase 6 Implementation – Phase 7 Process evaluation – Phase 8 Impact evaluation – Phase 9 Outcome evaluation. Other labels in the figure: Public Health; Input, Process, Output; short-term impact, longer-term health outcome; short-term social impact, long-term social impact.] *Green & Kreuter, Health Promotion Planning, 4th ed, 2005.
Formative evaluation Purpose: ensure a successful program. 2. Process evaluation: all activities that evaluate the program once running. Program monitoring: – Implemented as designed, or analysing/understanding why not – Efficient operations – Meeting performance targets (outputs in logic model)
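As a rough illustration of the "meeting performance targets" item, the sketch below compares logic-model outputs against targets set at the outset. The indicator names and numbers are hypothetical and invented for this example; they are not from the presentation.

```python
def monitor_outputs(targets, actuals):
    """Compare actual program outputs with their targets.

    Both arguments map an indicator name to a count. Indicators missing
    from `actuals` are treated as zero output so far.
    """
    report = {}
    for indicator, target in targets.items():
        actual = actuals.get(indicator, 0)
        report[indicator] = {
            "target": target,
            "actual": actual,
            "pct_of_target": round(100 * actual / target, 1),
            "met": actual >= target,
        }
    return report


# Hypothetical indicators and values, for illustration only.
targets = {"sessions_delivered": 24, "participants_enrolled": 100}
actuals = {"sessions_delivered": 18, "participants_enrolled": 112}

report = monitor_outputs(targets, actuals)
for indicator, row in report.items():
    print(indicator, row)
```

A report like this only flags where outputs fall short; understanding why the program was not implemented as designed still needs the qualitative follow-up described on this slide.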
Summative evaluation Purpose: determine program success in many different dimensions. Also called: effectiveness evaluation, outcome/impact evaluation. Examples: – Policy evaluation – Replicability/exportability/transferability evaluation – Sustainability evaluation – Cost-effectiveness evaluation
Evaluation Science: social research methods. Match research methods to the particular evaluation questions and the specific situation. Quantitative data collection involves: – identifying which variables to measure – choosing or devising appropriate research instruments that are reliable and valid – administering the instruments in accordance with general methodological guidelines.
Experimental Design in Evaluation
Randomized controlled trial (RCT) – Robust science, internal validity
Notation: R = random assignment, O1 = pre-test observation, O2 = post-test observation, X = program
1. Pre/post-test with equivalent group
   R O1 X O2
   R O1   O2
2. Post-test only with equivalent group
   R X O2
   R   O2
Problems with natural settings: – Randomization – Ethics – Implementation not controlled (staff, situation) – Participant demands – Perceived inequity between groups, etc.
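To make the pre/post-test design with an equivalent group concrete, here is a minimal Python sketch. It assumes a single numeric outcome score per participant and estimates the program effect as the mean change (O2 minus O1) in the program group minus the mean change in the control group. The function names and all scores are hypothetical, and a real evaluation would also report uncertainty (for example a confidence interval), which this sketch leaves out.

```python
import random
from statistics import mean


def randomize(participants, seed=0):
    """Randomly split participants into program and control groups (the 'R')."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_change(pre, post):
    """Average change from pre-test (O1) to post-test (O2) within one group."""
    return mean(after - before for before, after in zip(pre, post))


def program_effect(prog_pre, prog_post, ctrl_pre, ctrl_post):
    """Change in the program group minus change in the control group."""
    return mean_change(prog_pre, prog_post) - mean_change(ctrl_pre, ctrl_post)


# Hypothetical scores, invented for illustration only.
prog_pre, prog_post = [10, 12, 11, 13], [15, 16, 15, 18]
ctrl_pre, ctrl_post = [11, 12, 10, 13], [12, 13, 11, 14]

effect = program_effect(prog_pre, prog_post, ctrl_pre, ctrl_post)
print(effect)  # 3.5
```

Subtracting the control group's change removes change that would have happened without the program, which is why this design supports attribution when randomization holds.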
Experimental Design in Evaluation
Quasi-experimental design – Used when randomization is not possible: ethics, program already underway, no reasonable control group
Notation: N = non-random assignment, O1 = pre-test observation, O2 = post-test observation, X = program
1. One group post-test
   X O2
   Weakest design, so use for exploratory, descriptive case study. Not for attribution.
2. One group pretest-posttest
   O1 X O2
   Can measure change. Can’t attribute it to the program.
3. Pre-post non-equivalent (non-random) groups
   N O1 X O2
   N O1   O2
   A good design, but similar comparators must be constructed by (propensity) matching of individuals or groups.
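The idea of constructing similar comparators can be sketched as follows. This is a deliberately simplified stand-in for propensity matching: instead of estimating a propensity score from several covariates, it greedily pairs each program participant with the unused comparison-pool member whose pretest score is closest, then compares changes within pairs. The record layout and all values are hypothetical.

```python
from statistics import mean


def match_comparators(program, pool):
    """Greedy 1:1 matching without replacement on the pretest score.

    Assumes the pool has at least as many members as the program group.
    """
    available = list(pool)
    pairs = []
    for person in program:
        best = min(available, key=lambda c: abs(c["pre"] - person["pre"]))
        available.remove(best)
        pairs.append((person, best))
    return pairs


def matched_difference_in_change(pairs):
    """Mean of (program change minus matched comparator change)."""
    return mean(
        (p["post"] - p["pre"]) - (c["post"] - c["pre"]) for p, c in pairs
    )


# Hypothetical records, invented for illustration only.
program = [{"pre": 10, "post": 16}, {"pre": 14, "post": 19}]
pool = [
    {"pre": 9, "post": 11},
    {"pre": 20, "post": 21},
    {"pre": 15, "post": 17},
]

pairs = match_comparators(program, pool)
estimate = matched_difference_in_change(pairs)
print(estimate)  # 3.5
```

Matching on one observed variable cannot rule out differences on unobserved ones, so an estimate like this is weaker evidence for attribution than a randomized comparison.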
Evaluation Methods (Clarke and Dawson, 1999) – Strict adherence to a method deemed to be ‘strong’ may result in the wrong problems becoming the focus of the evaluation, purely because the right problems are not amenable to analysis by the preferred method – Rarely is only one method used; a range is required to ensure the depth and detail from which conclusions can be drawn
Experimental Design in Evaluation Criticism of experimental design (ED) in evaluation: – Program is treated as a black box: ED measures causality (positivist) but does not capture the nature of causality (realist) – Internal dynamics of the program are not observed – How does the program work? Theory helps explain – What are the characteristics of those in the program? Participants need to choose to make a program work, and the right conditions are needed to make this possible (Clarke and Dawson, 1999) – What are the unintended outcomes/effects of the program?
Naturalistic Inquiry – Qualitative design – Quantitative (ED) offers little insight into the social processes which actually account for the changes observed – Can use naturalistic methods to supplement quantitative techniques (mixed methods) – Can use a fully naturalistic paradigm (less common)
Naturalistic Inquiry – Interpretive: People mistake their own experiences for those of others. So the emphasis is on understanding the lived experiences of (intended) program recipients – Constructivist: Knowledge is constructed (not discovered by detached scientific observation). So the program can only be understood within its natural context: how it is being experienced by participants, staff, policy makers – Can’t construct the evaluation design ahead of time (“don’t know what you don’t know”) – Theory is constructed from (grounded in) data
Evaluation Data – Quantitative / Qualitative / Mixed – Primary / Secondary – One-off surveys, data pulls / Routine monitoring – Structured / Unstructured (open-ended)
Data Collection for Evaluation Questionnaires – right targets – carefully constructed: capture the needed info, wording, length, appearance, etc. – analysable. Interviews (structured, semi-structured, unstructured) – Individuals – Focus groups. Both are useful at the planning, formative and summative stages of a program.
Data Collection for Evaluation Observation – Systematic: explicit procedures, therefore replicable; collects primary qualitative data – Can provide new insights, drawing attention to actions and behaviour normally taken for granted by those involved in program activities and therefore not commented upon in interviews – Useful in circumstances in which it may not be possible to conduct interviews
Data Collection for Evaluation Documentary – Solicited e.g. journals/diaries – Unsolicited e.g. meeting minutes, emails, reports – Public e.g. organization’s reports, articles in newspaper/letters – Private e.g. emails, journals