Technical Assistance on Evaluating SDGs: Leave No One Behind


1 Module 11 THE PROPOSED EFGR EVALUATION FRAMEWORK PART 2: REVIEW OF METHODOLOGIES
Technical Assistance on Evaluating SDGs: Leave No One Behind
EvalGender+ Network, together with UN Women, UNEG and EvalPartners

2 Presentation developed by Michael Bamberger and Asela Kalugampitiya, based on Chapter 4 of "Evaluating the Sustainable Development Goals within a 'No-one left behind' lens through equity-focused and gender-responsive evaluations"

3 Outline
The main types of evaluation design
Applications to summative (impact) evaluation
Applications to policy evaluation
Applications to program evaluation

4 1. Evaluation: Levels, types and methodologies
Levels of evaluation: Policy, Program, Project
Types of evaluation: Formative, Developmental, Summative

5 Frequency of use of different types of evaluation at different levels
[Matrix slide: rows = Policy, Program, Project; columns = Formative, Developmental, Summative; frequency of use rated from * (rarely used) to ***** (very frequently used)]

6 Main evaluation methodologies
Experimental and quasi-experimental designs
Statistical designs
Theory-based evaluations (including theory of change)
Case study methods
Qualitative and participatory methods
Review and synthesis
Complexity-responsive evaluations

7 Frequency of application of different methodologies in each type of evaluation
[Matrix slide: rows = the seven methodologies above (experimental and quasi-experimental, statistical, theory-based, case study, qualitative and participatory, review and synthesis, complexity-responsive); columns = Policy, Formative, Developmental, Summative; frequency of application rated from * to *****]

8 A. Experimental and quasi-experimental designs
Experimental designs [RCT is the most common]
Assess the extent to which observed changes in an outcome variable can be attributed to the project intervention [statistical significance of the difference]
Random assignment to the project (treatment) and control groups
Work better for simple project designs with few components and few outcomes
An RCT must be set up before the project group is selected
There are ethical and political concerns with RCTs

9 Simple experimental design [RCT]
                 Before [T1]   Intervention   After [T2]
Project group        P1             X             P2
Control group        C1                           C2
Groups are formed by random assignment.
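As a concrete illustration, the following minimal sketch (Python; the outcome values, effect sizes and sample size are invented for illustration) estimates the impact as the difference between the project and control groups' before-after changes and tests whether it is statistically significant:

```python
# Minimal sketch of analysing the simple RCT design above (invented data).
# P1/P2 = project group before/after; C1/C2 = control group before/after.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

p1 = rng.normal(50, 10, 100)        # project group at baseline [T1]
p2 = p1 + rng.normal(8, 5, 100)     # project group after the intervention [T2]
c1 = rng.normal(50, 10, 100)        # control group at baseline [T1]
c2 = c1 + rng.normal(2, 5, 100)     # control group at follow-up [T2]

# Impact estimate: difference between the two groups' change scores,
# with a t-test for the statistical significance of the difference.
impact = np.mean(p2 - p1) - np.mean(c2 - c1)
t_stat, p_value = stats.ttest_ind(p2 - p1, c2 - c1)
print(f"Estimated impact: {impact:.2f} (t = {t_stat:.2f}, p = {p_value:.4f})")
```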

10 Quasi-experimental design [QED]
Similar design to an RCT, but random assignment is not possible and the groups are matched instead
Statistical matching, e.g. propensity score matching [PSM], is stronger
Judgmental matching is weaker
QED designs can be used after the project group has been selected
There is an issue of sample selection bias

11 Quasi-experimental design [QED]
                 Before [T1]   Intervention   After [T2]
Project group        P1             X             P2
Control group        C1                           C2
Samples are MATCHED rather than randomly assigned (statistical matching = stronger; judgmental matching = weaker).
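The following minimal sketch illustrates statistical matching via propensity scores (Python; the covariates and data are hypothetical, and real PSM work would typically use a dedicated package and check covariate balance after matching):

```python
# Minimal sketch of propensity score matching [PSM] with hypothetical data:
# model each unit's probability of being in the project group from its
# covariates, then match each project unit to the control unit with the
# closest propensity score.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))                                 # covariates (e.g. income, age, schooling)
treated = (X[:, 0] + rng.normal(0, 1, n) > 0).astype(int)   # 1 = project, 0 = control

# Step 1: estimate propensity scores P(project group | covariates).
scores = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Step 2: one-to-one nearest-neighbour matching (with replacement).
project_idx = np.where(treated == 1)[0]
control_idx = np.where(treated == 0)[0]
matches = {
    i: control_idx[np.argmin(np.abs(scores[control_idx] - scores[i]))]
    for i in project_idx
}
print(f"Matched {len(matches)} project units to control units")
```

Nearest-neighbour matching is only one option; calipers or kernel matching are common refinements.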

12 B. Statistical designs
Statistical modeling and econometric analysis
Mainly used at the national level to assess the impacts of policies or country-wide programs by comparing experiences with those of other countries
Countries are matched on macro-indicators such as GDP growth, per capita income, investments in different sectors, and the rate of infrastructure construction
Useful for national programs where it is not possible to identify a comparison group (counterfactual) within the country
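A minimal sketch of matching countries on macro-indicators (Python; the countries and all figures are invented): indicators are standardized so each contributes equally, and the nearest neighbours become candidate comparators:

```python
# Minimal sketch of cross-country matching on macro-indicators (invented
# figures): standardize the indicators, then rank the other countries by
# Euclidean distance as candidate comparators for country A.
import numpy as np

countries = ["A", "B", "C", "D", "E"]
# Columns: GDP growth (%), per capita income (USD 1000s), infrastructure index
indicators = np.array([
    [5.1, 2.3, 0.62],
    [4.8, 2.1, 0.58],
    [2.0, 8.5, 0.91],
    [5.3, 2.4, 0.65],
    [1.5, 9.0, 0.88],
])

# Standardize each indicator so no single scale dominates the distance.
z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)

# Distance from country A (row 0) to every country, smallest first.
dist = np.linalg.norm(z - z[0], axis=1)
ranked = [countries[i] for i in np.argsort(dist) if i != 0]
print("Closest comparators for country A:", ranked)
```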

13 C. Theory-based evaluation
Approaches such as theory of change (TOC) are used to explain the steps through which an intervention is intended to achieve its objectives
Identifies contextual, organizational and other factors affecting success
Identifies the key assumptions about design, implementation strategies and potential constraints which must be tested
A good theory of change should also cover the broader social, political, economic, legal and other factors affecting program outcomes
Program impact and effectiveness can be assessed by comparing actual implementation experience with the TOC

14 Theory-based evaluation [continued]
Common weaknesses of theories of change:
TOCs are often too vague to be tested or refuted
They do not include rival hypotheses, so it is not possible to test alternative explanations of the outcomes
The time-line is often not specified
Outcomes are too vague to be measurable
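One way to avoid these weaknesses is to record each causal step of the TOC with an explicit assumption, a measurable indicator and a target date. A minimal sketch follows; the program content is wholly hypothetical:

```python
# Minimal sketch of a theory of change recorded as explicit, testable steps
# (hypothetical micro-credit program). Each step carries its assumption, a
# measurable indicator and a target date, addressing the vagueness, missing
# time-line and unmeasurable outcomes noted above.
from dataclasses import dataclass

@dataclass
class Step:
    description: str
    assumption: str
    indicator: str
    target_date: str

toc = [
    Step("Train loan officers", "Trained staff remain in post",
         "% of officers certified", "2024-06"),
    Step("Disburse micro-credit to women", "Women control how the loans are used",
         "Number of loans to women-headed households", "2024-12"),
    Step("Household incomes rise", "No major economic shocks occur",
         "Median household income vs. baseline", "2025-12"),
]

# Comparing actual implementation experience against each step tests the TOC.
for step in toc:
    print(f"{step.target_date}: {step.description} "
          f"[assumes: {step.assumption}; measured by: {step.indicator}]")
```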

15 D. Case-based approaches
The case is taken as the unit of analysis
A case may be an individual, household, community, organization, state or even a country
A sample of cases is selected either to be representative of a total population or to illustrate particular sub-groups (the most or least successful, outliers, etc.)
The analysis can be qualitative and descriptive or quantitative
Case-based methods can be used as stand-alone evaluations or to illustrate and explore groups identified in surveys

16 Qualitative comparative analysis (QCA)
QCA approaches have become popular in recent years
Usually a sample of cases is selected
A matrix is used to describe the attributes of each case and whether particular outcomes were or were not achieved
In the simplest form the matrix includes dichotomous (yes/no) variables, e.g. household head completed primary school: Yes/No
The analysis identifies:
the set of attributes that are always present when an outcome is achieved
the attributes that are present when an outcome is not achieved

17 QCA [continued]
Important features of QCA analysis:
Identifies configurations (combinations) of factors that affect outcomes
Recognizes that single-variable analysis (e.g. regression analysis) oversimplifies reality
Useful for the analysis of complex programs
Can be used with small samples
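A minimal sketch of crisp-set QCA in its simplest (dichotomous) form follows (Python; the attributes, cases and outcomes are invented): it groups cases by their configuration of attributes and reports the configurations that are always associated with success or failure:

```python
# Minimal sketch of crisp-set QCA with hypothetical dichotomous data.
# Each case is a configuration of yes/no (1/0) attributes plus a yes/no
# outcome; we look for configurations consistently linked to the outcome.
from collections import defaultdict

cases = [
    # (completed_primary, credit_access, female_head) -> outcome achieved?
    ((1, 1, 0), 1),
    ((1, 1, 1), 1),
    ((1, 0, 0), 0),
    ((0, 1, 1), 0),
    ((1, 1, 0), 1),
    ((0, 0, 1), 0),
]

# Group the observed outcomes by attribute configuration.
outcomes_by_config = defaultdict(set)
for config, outcome in cases:
    outcomes_by_config[config].add(outcome)

# Configurations always present when the outcome is achieved, and those
# always present when it is not.
always_success = [c for c, o in outcomes_by_config.items() if o == {1}]
always_failure = [c for c, o in outcomes_by_config.items() if o == {0}]
print("Always successful configurations:", always_success)
print("Never successful configurations:", always_failure)
```

Note how the analysis works on whole configurations rather than isolating single variables, which is why it remains usable with small samples.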

18 E. Participatory and qualitative analysis
Involves a wide range of stakeholders in the design, implementation and interpretation of the evaluation
Can be used for methodological reasons or to support a rights-based/empowerment approach
Provides a deeper understanding of the lived experience of different groups
Gives voice to poor and vulnerable groups
Analysis of processes
Understanding contextual factors and processes of social control

19 Participatory and qualitative analysis [continued]
Uses a mixed-methods approach combining a wide range of tools and techniques for data collection and analysis
Participatory approaches are in line with the human rights and gender equality evaluations proposed by UNEG

20 Examples of participatory and qualitative methods
Participatory consultative methods such as PRA (participatory rural appraisal)
Outcome mapping and outcome harvesting
Most significant change
Participant observation
Key informant interviews
Community consultations and some kinds of focus groups
Longitudinal case studies

21 F. Review and synthesis studies
Identification of all evaluations conducted on a particular topic (e.g. the effects of micro-credit on women's empowerment)
Selection of the studies that satisfy standards of methodological rigor; often only randomized control trials are accepted
May incorporate a theory of change to structure the findings
Summary of the findings and lessons that are statistically sound
Provides a useful starting point for designing programs and indicates what kinds of outcomes can be expected, and what is not likely to be achieved

22 G. Complexity-responsive evaluations [see Session 6]
Not widely used, but important for EFGR, as programs with a gender or equity focus frequently involve dimensions of complexity
Complexity-responsive evaluations can largely use the familiar evaluation tools described in this session, but it is necessary to begin with a complexity diagnostic to understand why and how the program is complex
They conclude by reassembling all of the individual evaluations to understand the big picture
Complexity is important because often each program component earns a positive rating yet the overall program fails to achieve its broader objectives

23 Checklist for assessing the level of complexity of a program
The complexity checklist (Report Chapter 3, Table 2) is a useful tool for identifying the level of complexity of:
The program design and its components
Relations among the different stakeholders and participating organizations (government, donor, civil society and community)
The context within which the program operates
The processes of causality and change
This can help determine whether the program is sufficiently complex to require a complexity-responsive evaluation

24 Examples of how the checklist defines low and high complexity [Chapter 3 Table 2]
Dimension 1: The intervention [1.4] Is the program design well tested and clearly defined?
  Low complexity: Well tested, used many times and clearly defined
  High complexity: Relatively new and untested, and still not clearly defined
Dimension 2: Institutions and stakeholders [2.3] The number and diversity of stakeholders
  Low complexity: Relatively few, and with similar interests
  High complexity: Many stakeholders, and with diverse interests
Dimension 3: The program context [3.1] Context dependency: how much is the program influenced by contextual factors?
  Low complexity: The program is relatively independent of the context
  High complexity: The program is strongly influenced by contextual factors
Dimension 4: The nature of causality [4.1] Causal pathways
  Low complexity: There are very few causal pathways and there is a clear relationship between inputs and outcomes [linear causality]
  High complexity: There are multiple causal pathways and it is not possible to trace direct relationships between inputs and outcomes [non-linear causality]
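To illustrate how such a checklist might be used in practice, here is a minimal sketch (Python) that scores the four dimensions on a hypothetical 1-5 scale and applies an illustrative cut-off; neither the ratings nor the threshold come from the report:

```python
# Minimal sketch of a complexity diagnostic across the four checklist
# dimensions. The 1-5 ratings (1 = low, 5 = high complexity) and the
# cut-off are illustrative assumptions, not taken from Chapter 3 Table 2.
ratings = {
    "1. The intervention":              4,  # relatively new and untested
    "2. Institutions and stakeholders": 5,  # many stakeholders, diverse interests
    "3. The program context":           3,  # moderately context-dependent
    "4. The nature of causality":       4,  # multiple, non-linear causal pathways
}

average = sum(ratings.values()) / len(ratings)
print(f"Average complexity rating: {average:.1f}")
if average >= 3.5:  # illustrative cut-off
    print("A complexity-responsive evaluation is probably warranted")
else:
    print("Standard evaluation designs are probably sufficient")
```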

25 The "unpacking" approach to complexity
Chapter 4 Section D presents a four-step approach for the evaluation of complex programs:
Step 1: Holistic analysis to understand the nature and dimensions of complexity
Step 2: Unpacking the program into its main components
Step 3: Selecting the appropriate evaluation methodology for each component
Step 4: Reassembling the findings of the individual component evaluations to understand how program performance and outcomes are affected by the real-world context within which the program operates

26 2. Applications of the evaluation tools to policy and program evaluation
Most of the evaluation tools described in Section 1 were originally developed for project evaluations, and all can be used at the project level
All of the tools can also be applied at the policy and program evaluation levels, but their application is often more difficult because:
There is usually no comparison group
It is difficult to collect good quantitative data
There is often no clear theory of change
Complete information on design and implementation is often harder to obtain
The implementation period is longer

27 Application of the evaluation tools to policy evaluation
Policy evaluation means assessing the implementation and outcomes of policies which take many years to implement and which often change in response to the changing political environment
A major challenge is that policies can take four or more years to implement before their effects can be assessed, while most evaluations have to be completed in a much shorter period
Many policy evaluations are therefore retrospective and are based on:
Key informant interviews
Focus groups
Review of policy and planning documents
Review of coverage in the media

28 Application of the evaluation tools to policy evaluation [continued]
Many evaluations are based on a theory of change, which often has to be constructed by the evaluator because no such document existed
It is difficult to identify a counterfactual [what the situation would have been if the policy had not been implemented]. Two options:
Compare with other similar countries using qualitative comparisons or statistical (econometric) analysis
Use a pipeline design to compare areas of the country where the policy has been implemented with areas where implementation was delayed [intentionally or due to unforeseen circumstances]

29 Application of the evaluation tools to program evaluations
Programs frequently comprise a number of different components, each of which can be evaluated separately using the standard evaluation tools
The complexity "unpacking" strategy can then be used to reassemble the individual components and see the big picture (see Section 1)
It is usually necessary to use qualitative techniques, such as key informant interviews, to understand issues such as program management and coordination
A counterfactual analysis can sometimes be conducted using the pipeline designs mentioned on the previous slide

30 Evaluation tools for program evaluation
Qualitative and participatory techniques: key informants, focus groups and sometimes participant observation
Case studies; QCA is a potentially powerful tool
A theory of change (or another theory-based method) is recommended to provide a framework for most program evaluations
RCT and QED designs, in cases where a comparison-group design (counterfactual) can be used
Systematic reviews can help identify issues to address in the evaluation
Complexity-responsive designs will often be required

31 3. Principles for integrating EFGR evaluations into the SDGs
Begin with a review of lessons learned from past approaches and evaluations
Build gender and equality into the theory of change and results framework
Develop a checklist of areas where EFGR principles and indicators can be integrated into the evaluation process
Begin with rapid diagnostic studies to help understand the EFGR issues which should be addressed
Integrate gender and equality into ongoing or planned evaluations

