Evaluation for M&E and Program Managers


1 Evaluation for M&E and Program Managers
R&M Module 3

2 Introduction

3 Participant Introductions and Expectations:
Your name, organization, and position? Why did you sign up for this workshop? How do you hope this course will help you in your job? What areas of evaluation are you most interested in learning about?

4 Course Objective To give M&E and program managers the background needed to effectively manage evaluations.

5 Learning Outcomes Understand the principles and importance of evaluation. Learn basic evaluation designs and data collection methods commonly used in field-based evaluation. Learn key considerations in planning and managing evaluations in the field. Construct a learning agenda and evaluation plan for your grant. Learn key steps in ensuring effective communication and utilization of evaluation findings. Develop a terms of reference/evaluation plan.

6 Course Outline
Day 1: Chapter 1: What is Evaluation?; Chapter 2: Evaluation Purpose and Questions
Day 2: Chapter 3: Overview of Evaluation Design and Methods
Day 3: Chapter 4: Data Sources and Collection Methods; Chapter 5: Sampling
Day 4: Chapter 6: Basic Data Analysis; Chapter 7: Using and Communicating the Results
Day 5: Chapter 8: Managing the Evaluation; Presentations of TOR/Evaluation Plans

7 Learning for Pact and Learning Principles
Solicit daily feedback for ongoing course improvement. Asking is learning! Open environment for asking questions throughout the presentation. Let us know if there are areas, items, or considerations missing. Learning by doing. Peer learning through development of a TOR (and presentations on the last day).

8 Session 1 What is Evaluation?

9 Session Learning Objectives:
Define evaluation Explain the difference between monitoring, evaluation, and research Describe why evaluations are conducted Describe different types of evaluations Describe common barriers to evaluations

10 Session 1.1 Defining Evaluation

11 What is Evaluation? Participants should now brainstorm their definition of evaluation, or concepts they think link to evaluation

12 What is Evaluation? “The systematic collection of information about the activities, characteristics, and results of programs to make judgments about the program, improve or further develop program effectiveness, inform decisions about future programming, and/or increase understanding.” -Michael Patton (1997, 23)

13 What is Evaluation? “The systematic and objective assessment of an ongoing or completed project, programme or policy, its design, implementation and results. The aim is to determine the relevance and fulfillment of objectives, development efficiency, effectiveness, impact, and sustainability.” -Organization for Economic Co-operation and Development (OECD; 2002, 21–22)

14 What is Evaluation? “The systematic collection and analysis of information about the characteristics and outcomes of programs and projects as a basis for judgments, to improve effectiveness, and/or inform decisions about programming.” -US Agency for International Development (2011, 2)

15 What is Evaluation? Systematic: grounded in a system, method, or plan
Specific: focused on a project/program
Versatile: can ask many different types of questions
Utility: used to inform current and future programming
Systematic. First, evaluation is systematic; in other words, it is grounded in a system, method, or plan. To arrive at credible conclusions, a high-quality evaluation uses consistent methods that are clearly outlined in the evaluation design. In this way, evaluation is closely related to research, and both use many of the same tools. Specific. Next, evaluation is specific to a program or project. This is what distinguishes evaluation from research. For example, someone might investigate whether children who live near the garbage dump get sick more often than children who live far from the dump. This is research, but it is not evaluation. Another person could study whether children who attend a certain nutrition program get sick less often. Both studies are research, but only the second is specific to a project; thus, only the second is an evaluation. Versatile. The three definitions also show that evaluation can answer many different types of questions, and knowing what questions to ask is always important for an evaluation manager. For instance, an evaluation may ask: Did the program improve the well-being of community residents? Were resources used effectively? What factors were most important to the success of the intervention? Why did the program fail? Utility. Evaluations should be used to inform current or future programming. Evaluation is very much grounded in reality and in how stakeholders will use and apply findings.

16 Evaluation is Action-Oriented
Evaluation seeks to answer a range of questions that might lead to adjustment of project activities, including: Is the program addressing a real problem, and is that problem the right one to address? Is the intervention appropriate? Are additional interventions necessary to achieve the objectives? Is the intervention being implemented as planned? Is the intervention effective and resulting in the desired change at a reasonable cost?

17 Session 1.2 Evaluation Versus Research; Evaluation Versus Monitoring

18 Evaluation vs Research
What is an example of research? What is an example of an evaluation? What are main differences between evaluation and research?

19 Evaluation vs Research
Purpose. Research: to add to knowledge in the field, develop laws and theories. Evaluation: to make judgments, provide information for decision making.
Who sets the agenda or focus? Research: researchers, academic institutions. Evaluation: stakeholders and evaluator jointly.
Generalizability of results. Research: important, to add to theory and contribute to the field. Evaluation: less important; the focus is on the particulars of the program or policy and the context.
Intended use of results. Research: usually publication and knowledge sharing. Evaluation: usually directly affects the project's activities or stakeholders' decisions in developing future projects.
(Fitzpatrick, Sanders, and Worthen, 2011)

20 Monitoring vs Evaluation
What are some key differences? Monitoring is primarily intended to provide information about a project’s operations and outputs. Evaluation generally looks at a project on a broader level, assessing whether it is meeting strategic goals. Sometimes, monitoring data can be useful in evaluations, and some evaluations are primarily oriented toward assessing operations or process. Consequently, monitoring and evaluation are often linked and can be complementary.

21 Differences between Monitoring and Evaluation
Subject. Evaluation: usually focused on strategic aspects. Monitoring: addresses operational management issues.
Character. Evaluation: flexible subject and methods. Monitoring: systematic.
Frequency. Evaluation: periodic. Monitoring: continuous.
Primary client. Evaluation: stakeholders and external audience. Monitoring: program management.
Party conducting. Evaluation: can be external or internal. Monitoring: internal.
Methodology. Evaluation: rigorous research methods, sophisticated tools. Monitoring: rapid appraisal methods.
Primary focus. Evaluation: relevance, outcomes, impact, and sustainability. Monitoring: operational efficiency.
Objectives. Evaluation: to determine outcomes or impact, verify the developmental hypothesis, and document successes and lessons. Monitoring: to identify and resolve implementation problems and assess progress towards objectives.
(adapted from Jaszczolt, Potkański, and Alwasiak 2003)

22 Session 1.3 Why Evaluate?

23 Exercise: Why Evaluate?
Take a few minutes to reflect on what evaluation means to your organization based on your experience. What would you consider the key reasons why your organization should invest in undertaking evaluation? Write down three points you might say to someone else to explain why evaluation is important to your project. Notes: 1st, ask participants to break into groups by organization. 2nd, debrief the activity with reflections from a few participants in a plenary discussion. 3rd, summarize the need for evaluation with the two slides that follow.

24 Why Evaluate? Common reasons to evaluate are:
To measure a program’s value or benefits To get recommendations to improve a project To improve understanding of a project To inform decisions about future projects

25 Session 1.4 Types of Evaluations

26 Types of Evaluation Once you determine your evaluation purpose, you will want to consider what type of evaluation will fulfill the purpose Broad types are: Formative evaluation Summative evaluation Process evaluation Outcome evaluation Impact evaluation

27 Formative Evaluations
Undertaken during program design or early in the implementation phase to assess whether planned activities are appropriate. Examine whether the assumed ‘operational logic’ corresponds with the actual operations and what immediate consequences implementation may produce. Sub-types include: needs assessments, contextual scans, feasibility assessments, and baseline assessments

28 Summative Evaluations
Final assessments done at the end of a project. The results help in making decisions about continuation or termination, or whether there would be value in scaling up or replicating the program. A summative evaluation determines the extent to which anticipated outcomes were produced. This kind of evaluation focuses on the program's effectiveness, assessing the results of the program.

29 Process Evaluations Examines whether a program has been implemented as intended: whether activities are taking place, whom they reach, who is conducting them, and whether inputs have been sufficient. Also referred to as "implementation evaluation."

30 Formative and Summative Evaluations
"When the cook tastes the soup, that's formative; when the guests taste the soup, that's summative" Bob Stake, quoted in Scriven, 1991

31 Outcome Evaluations An outcome evaluation examines a project's short-term, intermediate, and long-term outcomes. While a process evaluation may examine the number of people receiving services and the quality of those services, an outcome evaluation measures the changes that may have occurred in people's attitudes, beliefs, behaviors, and health outcomes. Outcome evaluation may also study changes in the environment, such as policy and regulatory changes.

32 Impact Evaluations Assess the long-term effects of the project on its end goals, e.g., disease prevalence, resilience, or stability. The most rigorous type of outcome evaluation. Uses statistical methods and comparison groups to attribute change to a particular project or intervention.

33 Linking Types of Evaluations to the Program Logic Model
The slide diagram maps evaluation types onto the stages of the logic model (Inputs, Activities, Outputs, Outcomes, Impact): baseline/formative evaluations at the start, process evaluation during implementation, and midterm, endline/summative, outcome, and impact evaluations toward the later stages. Presenter should note that some evaluation types can overlap. For example, an endline/summative evaluation can overlap with a process evaluation. It is also important to note that some endline/summative evaluations can measure outcomes and impacts.

34 Exercise: Choosing Evaluation Type
Identify which evaluation type is the most appropriate in each situation: You are interested in knowing whether the project is being implemented on budget and according to the workplan. You are interested in knowing what the effect of the project has been. You want to determine whether to scale up project activities. You are about to begin project activities, but want to determine whether the proposed activities fit the current context.

35 Internal and External Evaluations
Who conducts the evaluation? Internal evaluations are conducted by people within the implementing organization. External evaluations are conducted by people not affiliated with the implementing organization (although conducted closely with project staff). Hybrid evaluations are conducted with representatives from both.
External: may be more objective; may have more expertise in evaluation methods.
Internal: may be more efficient or nuanced due to internal understanding; may be more directly suited to project needs.
Hybrid: may be useful in sensitive situations; may serve as a learning opportunity.
What else?

36 Session 1.5 Barriers to Evaluations

37 Exercise: Barriers to Evaluation
Many programs are not properly evaluated, and as a result it is very difficult to duplicate or scale up the interventions that are most effective. What are some of the barriers that prevent evaluation of programs? To what extent do you believe that program implementers are open to evaluating their programs and what are some of the underlying reasons for this? Write down five common barriers to evaluation. Instructions: 1st- Break the participants into groups of 6 – 7 participants 2nd- Ask the participants to complete the task as per the questions on the slide 3rd- Advise groups to capture their response on a flipchart for group presentation and to share and discuss common barriers

38 A Few Barriers: Lack of time, knowledge, and skills
Lack of resources or budget
Poor design or planning
Startup activities compete with baselines
Overly complex evaluation design
Fear of negative findings
Resistance to M&E as police work
Resistance to using resources for M&E
Perception that evaluations are not useful
Disinterest in midterm or endline evaluations if there is no baseline
What else? This course can give you tools to overcome some, though perhaps not all, of these barriers: understanding and being able to articulate the importance of evaluation, planning evaluations and resources well, and understanding some basic evaluation concepts.

39 Session 2 Evaluation Purpose and Questions

40 Session Learning Objectives:
Use logic models to explain program theory. Write an evaluation purpose statement. Develop evaluation questions. To begin to build the terms of reference, participants will: Describe the program using a logic model (TOR I-B and II-C). Complete a stakeholder analysis (TOR II-A). Write an evaluation purpose statement (TOR II-B). Develop evaluation questions (TOR III).

41 Terms of Reference (TOR)
Comprehensive, written plan for the evaluation. Articulates the evaluation's specific purposes, the design and data collection needs, the resources available, the roles and responsibilities of different evaluation team members, the timelines, and other fundamental aspects of the evaluation. Facilitates clear communication of evaluation plans to other people. If the evaluation will be external, the TOR helps in communicating expectations to, and then managing, the consultant(s). A TOR template can be found in Appendix 1 (page 89).

42 Evaluation Plan/TOR Roadmap
Background of the evaluation Brief description of the program Purpose of the evaluation Evaluation questions Evaluation methodology Evaluation team Schedule and logistics Reporting and dissemination plan Budget Timeline Ethical considerations Discuss the components of the TOR and the process that will be taken to get there.

43 Session 2.1 Focusing the Evaluation

44 Focusing the Evaluation
Focusing an evaluation means determining what its major purpose is. Why do we have to do this? There is usually a broad range of interests and expectations among stakeholders. Evaluation resources and budgets are usually limited, so not everything can be evaluated. Evaluations must focus on generating useful information; plenty of interesting information is not particularly useful. In an evaluation report, the results must come across clearly so they can be actionable. What may happen if we don't do this?

45 Key Steps in Focusing the Evaluation
Use a logic model to understand and document the program logic. Document key assumptions underlying the program logic. Engage stakeholders to determine what they need to know from the evaluation. Write a purpose statement for the evaluation. Develop a realistic set of questions that will be answered by the evaluation.

46 Logic Model Visually describes the program's hypothesis of how project activities will create impact. Useful in distilling the program logic into its key components and relationships. Results frameworks, logframes, theories of change, and conceptual models also facilitate visualization of program logic. Start by describing the project's long-term goals, and work towards the left.

47 Logic Model Program logic—also called the program theory—is the reasoning underlying the program design. The logic can often be expressed with if–then statements By making explicit the assumptions behind a program, it becomes easier to develop good evaluation questions.

48 Logic Model If we give people bednets, then they will use them over their beds. Or If we educate 60% of adults about mosquito prevention, then the mosquito population will decline.

49 Logic Model Inputs Activities Outputs Outcomes Impacts
A format commonly used in international development highlights five components that break the change expected from the program into typical stages (Figure 2). Inputs: resources needed for the program (e.g., personnel, participants, money, supplies, and relationships). Activities: processes or actions that turn inputs into outputs; in other words, what the staff does on the job (e.g., attend trainings, seminars, and meetings, and undertake renovations and construction). Outputs: immediate results of the activities, often measured by their quantity and quality (e.g., the number of condoms distributed, the number of people reached through a campaign, the number of counseling sessions provided, and patient satisfaction with those sessions). Outcomes: the intermediate results of the program; changes in community behavior and attitudes could be among the outcomes of an outreach campaign, for example. Impacts: the program's long-term effects, usually achieved over several years of program implementation.

50 Key Assumptions Every program has assumptions
Using a logic model can help make those assumptions clear. For example: a logic model could show an expected output of administering vaccinations to 2,000 children and an outcome of fewer children getting sick as a result. What are the underlying assumptions here? Many such assumptions comprise the logic model's linkages. For example, we assume that if people attend a training, they will have greater knowledge of a subject and change their behavior. However, some assumptions exist outside the logic model's main theory of change. We may make assumptions, for example, about environmental conditions, and a change in those conditions changes the outcome, as when war breaks out during implementation of a program that was designed for a time of peace. The assumptions can be noted in a side box on the logic model or in an associated document.

51 TOR Exercise: Program Background
If you are working with a group, please break out into your group. Describe the program using a logic model using the TOR I-B and II-C template. Spend approximately 30 minutes completing this activity. Be prepared to share with the larger group.

52 Session 2.2 Engaging Stakeholders in Evaluation

53 Stakeholder Groups Who are some common stakeholders?
People involved in program operations (e.g., staff, partners, funders). People served by or affected by the program (e.g., clients, community members, officials). People who intend to use the evaluation results (e.g., staff, funders, the general public).

54 Participatory Evaluation
What might be some advantages to involving program clients/beneficiaries in key evaluation processes? Better understanding of client perspectives. Makes organizations more accountable to beneficiaries. Encourages trust and transparency. Cultivates evaluative thinking and ongoing learning. Helps clarify indicators and stimulates innovative ways of measurement. May lead to more participatory decision making.

55 The Personal Factor Research has demonstrated the value of the personal factor; a key component of successful use of evaluations is: "the presence of an identifiable individual or group of people who personally care about the evaluation and the findings it generates. Where such a person or group was present, evaluations were used; where the personal factor was absent, there was a correspondingly marked absence of evaluation impact" (Patton 1997, 44).

56 How to involve stakeholders
What are some ways to meaningfully engage stakeholders?

57 Common Pitfalls to Involving Stakeholders
Making yourself (or evaluator) the primary stakeholder Identifying vague, passive audiences Automatically assuming the funder is the primary stakeholder Waiting until the evaluation is finished to identify its users

58 Exercise: Engaging Stakeholders
Conducting a stakeholder analysis is the optimal starting point. To begin developing one: Identify the different stakeholders mentioned in the case study. Who among these stakeholders should be involved in the program evaluation? Consider reasons why the stakeholders you listed should be involved in the evaluation. How might they use or be affected by the evaluation's results? What would be their role in the evaluation? Complete the template provided and prepare to give feedback in plenary. Notes: debrief this exercise with a plenary discussion after the group presentations.

59 Exercise: Engaging Stakeholders
Template columns: Should be involved? (Yes/No); Reasons for the listed stakeholder to be involved; How might evaluation results affect or be used by the stakeholder?; What would be the stakeholder's role in the evaluation?

60 Stakeholder Analysis Matrix
The template used here is a common type of stakeholder analysis matrix. It: identifies key stakeholders, assesses their interests, and considers how those interests affect the project. Considering stakeholders and their interests systematically helps ensure that evaluation results are useful and used.

61 Stakeholder Analysis Matrix
Power/Interest Matrices classify stakeholders based on the power they hold and how likely they are to be interested in the evaluation results:
The Players: high interest/high power
The Context Setters: low interest/high power
The Involved: high interest/low power
The Crowd: low interest/low power
Another type of stakeholder matrix.

62 TOR Exercise: Stakeholder Analysis
If you are working with a group, please break out into your group. Using the template provided in the TOR (II-A), complete the following tasks for your program: Identify the different stakeholders Identify what they want to know Consider why the evaluation is important for your stakeholders Identify how they will be involved in the evaluation Spend approximately 45 minutes completing this activity. Be prepared to share with the larger group. Can be homework if no class time

63 Session 2.3 Writing an Evaluation Purpose Statement

64 Evaluation Purposes Should be developed after the program logic is clear and stakeholders are engaged. A clear and well-written purpose statement is important in clarifying the aim of the evaluation. Key questions to be addressed in the purpose statement are: What will be evaluated? Why are we conducting the evaluation? How will the findings from the evaluation be used?

65 Evaluation Purposes Another way to write the purpose statement is to complete the blanks in the following sentence: “We are conducting an evaluation of ___________________(name of program) to find out ______________________ and will use that information in order to _____________________________________.”

66 TOR Exercise: Evaluation Purposes
If you are working with a group, please break out into your group. Develop an evaluation purpose statement for your evaluation using TOR template (I-C). Spend approximately 20 minutes completing this activity. Be prepared to share with the larger group.

67 Session 2.4 Evaluation Questions

68 Evaluation Questions Versus Purpose
What is the difference between the evaluation purpose and evaluation questions? Questions fulfill the purpose and further articulate what specifically the evaluation will answer. Evaluation questions are different from questions in a survey/instrument; evaluation questions are higher level, seeking to answer broader questions about the project.

69 Steps to Develop Evaluation Questions
Review the original program goals and objectives; the questions should relate to these. To ensure that the evaluation is relevant, be sure you know what is important to the organization and to other stakeholders who might use the evaluation, including their priorities and needs. Consider the timing of the evaluation; some questions are best asked at the beginning of the program, others at the end. Develop a list of potential evaluation questions, often in a small group with other stakeholders. Decide which questions are most important; focus on the questions for which you need answers, not those whose answers would merely be nice to know. Keep this list short (3-6 questions) to ensure that the evaluation remains focused on the most important issues. Check that the questions are answerable and realistic given the resources available. Also consider evaluation questions that come from the logic model, questions that test the program logic or the assumptions underlying it, and questions about implementation, effectiveness, efficiency, cost, or other aspects of the program. Section III of the TOR (page 90) presents the evaluation questions. It also includes a matrix that may be helpful in presenting the questions, why they are important and to whom, and your initial thoughts about what data are available and needed in order to answer the questions.

70 Choosing Evaluation Questions
There may be more evaluation questions that fall under the evaluation focus than can reasonably be answered. Consider the following when prioritizing: Can the question be answered based on available or attainable data? Is the question relevant to the project and evaluation focus? Which stakeholders care about this question? What resources would answering this question entail?

71 Session 2.5 Types of Evaluation Questions

72 Categories of Evaluation Questions
Descriptive Normative Cause/effect

73 Descriptive Questions
Descriptive questions focus on “what is” and provide a means to understand the present situation regarding processes, participants, stakeholder views, or environmental questions.

74 Descriptive Questions
Characteristics of descriptive questions: Have answers that provide insight into what is happening with program activities and implementation. Are straightforward, asking about who, what, where, when, and how. Can be used to describe inputs, activities, and outputs. May include gathering opinions or perceptions of clients or key stakeholders.

75 Examples: Descriptive Questions
What did participants learn from the program? Who benefited most (or least) from the program? How did the environment change during the years the program was implemented?

76 Normative Questions Normative questions compare program achievement with an established benchmark such as national standards or project targets.

77 Examples: Normative Questions
Did we spend as much as we had budgeted? Did we reach our goal of admitting 5,000 students per year? Did we vaccinate 80% of children as planned? Did we meet our objective of draining 100,000 hectares of land?

78 Cause-Effect Questions
Cause-effect questions intend to determine whether change was achieved because of project activities. These questions require showing that the program (not something else) caused any change observed, so other possible causes have to be ruled out.

79 Examples: Cause-Effect Questions
Did the women’s empowerment program increase the income of female-headed households? Did malnutrition rate drop substantially (by at least 20%) among orphaned and vulnerable children targeted by the nutrition program? Did increased knowledge and skills in water harvesting techniques result in increased crop yield and income for the subsistence farmers?

80 Mixing Questions Depending on the evaluation purpose, a mix of different types of questions may be most appropriate for the evaluation. A single evaluation can include multiple question types; let the evaluation goal and the available resources (money, time, and human capacity) determine the mix. Once you have decided on the evaluation questions, add them to the TOR (Section III, page 90). After the evaluation has been focused by determining its purpose, and after stakeholders have been engaged and specific evaluation questions developed, design of the evaluation effort can begin. Spending the necessary time up front to resolve these issues will make the rest of the process go much more smoothly.

81 Exercise: Brainstorming & Prioritizing Evaluation Questions
Think about your own program and, following your evaluation purpose, complete the following: Based on your organization's evaluation purpose statement and reflecting on your organization's context, brainstorm and identify a few key evaluation questions that could potentially be relevant to your program. Prioritize the questions you have identified using the Prioritizing Evaluation Questions template. This exercise will take approximately minutes to complete.

82 Exercise: Prioritizing Evaluation Questions
Template columns: Can this question be answered given the program? Which stakeholder cares about this? How important is this? Does this involve new data collection? Can it be answered given your time and resources? Priority: High, Medium, Low, Eliminate.
If you are working with a group, please break out into your group. Spend approximately 30 minutes completing this activity. Be prepared to share your statement with the group.

83 TOR Exercise: Developing Evaluation Questions
If you are working with a group, please break out into your group. Based on the previous exercise, further develop your evaluation questions (and complete the accompanying information) using TOR III. Spend approximately 30 minutes completing this activity. Be prepared to share with the group.

84 Session 3.1 Evaluation Design and Types
This section requires the facilitator to present tangible examples. Examples should be tailored to the type of program participants are involved in.

85 Session Learning Objectives:
Compare and contrast qualitative, quantitative, and mixed approaches Identify common methods used in evaluations. Match the best method with different evaluation questions Identify ways to avoid common pitfalls in data collection To continue building your TOR: Complete the TOR Evaluation Design and Approach (TOR IV-A).

86 What is the Evaluation Design?
Plan for answering the key evaluation questions. Begins as soon as program planning begins. The evaluation design process should involve key stakeholders. Encompasses a strategy to evaluate the project at baseline, mid-term, and end-line, including the types of evaluation and the evaluation plan/strategy.

87 Evaluation Design Specifies:
Which people or units will be studied. How they will be selected. The kinds of comparisons that should be made. By what approach the comparisons will be made. The timing of the evaluation. At what intervals the groups will be studied.

88 USAID’s Evaluation Policy
“Recommitment to learn as we do, updating evaluation standards and practices to address contemporary needs” Why evaluate? For accountability – inform resource allocation To learn – inform and improve project design and implementation

89 USAID’s Evaluation Practices
Integrated into the design of each project – Relevant to future decisions Based on best methods – counterfactual/control Reinforcement of local capacity Commitment to transparency Dedicated resources Minimized bias

90 Key Concepts Control group – counterfactual Bias

91 Control groups a few examples …
Omega 3 fatty acids: a one-year project to increase learning among school children aged 8 to 10, measuring IQ at the beginning and at the end of the project. After the program, the children's IQ had increased by 15%. Omega 3 worked really well; the project achieved its goals. What's wrong with this kind of logic? Other things increased IQ (teachers, growing older, …). How can we fix this in our design?
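A minimal numerical sketch of the fix (a difference-in-differences comparison), using made-up scores: if a similar control group that never received Omega 3 also improved, only the extra improvement in the treatment group can plausibly be attributed to the project.

```python
# Hypothetical average scores; all numbers are illustrative only.
treatment_before, treatment_after = 100.0, 115.0  # children in the Omega 3 project
control_before, control_after = 100.0, 112.0      # similar children, no Omega 3

naive_effect = treatment_after - treatment_before  # 15.0: the gain the naive logic claims
secular_change = control_after - control_before    # 12.0: teachers, growing older, etc.
attributable = naive_effect - secular_change       # 3.0: plausibly due to Omega 3

print(f"Before/after gain: {naive_effect}")
print(f"Gain attributable to the intervention: {attributable}")
```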

92 Parachute use to prevent death and trauma related to gravitational challenge: a systematic review of RCTs Systematic review of literature Main outcome measure: Death or major trauma We don’t always need a control group or counterfactual Why don’t we need a control group?

93 Bias What is bias? Different types of bias. How can we avoid bias? Evaluation design can help avoid bias. Bias is a systematic mistake/systematic error. Can you give examples of bias?

94 Types of Evaluation Design
Quantitative Experimental Quasi-experimental Non-experimental Qualitative Mixed Methods Quantitative=numerical comparisons; qualitative=non-numerical explanations; mixed=both. Don’t confuse APPROACHES (these) with data collection METHODS (tomorrow)

95 Experimental Design Counterfactual – Control group
Studies units both receiving and not receiving the intervention. The non-treatment group provides a counterfactual, or a picture of what probably would have happened to the treatment group if they had not received the treatment. Random intervention allocation: the intervention is randomly assigned, which means it is unlikely there are systematic differences (bias) between the groups. Take measurements before and after the intervention for both groups. This enables attribution of change to the treatment.
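As a rough illustration (not from the manual), random assignment can be as simple as shuffling the list of eligible units; the community names below are hypothetical.

```python
import random

# Hypothetical sampling frame of eligible communities.
communities = [f"village_{i:02d}" for i in range(1, 21)]

random.seed(42)              # fixed seed so the allocation can be documented and reproduced
random.shuffle(communities)  # random order removes systematic differences on average

half = len(communities) // 2
treatment = communities[:half]  # receive the intervention
control = communities[half:]    # measured the same way, but no intervention

print("Treatment:", treatment)
print("Control:", control)
```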

96 Experimental Design Challenges
Commonly used in drug trials or testing new types of treatment; only sometimes used in social science research or evaluation Randomization of assignment is usually not done during program implementation; intervention communities are targeted because they are either easiest to work with or most in need Ethical considerations usually limit viability of this approach in field settings Not providing program to those in need Answering a survey or otherwise participating in data collection can be burdensome Can be expensive

97 Experimental Design Sub-types
Pre-test and post-test control group design: both the treatment and control groups are measured before the intervention, the treatment group receives the intervention, and both groups are measured again afterwards. Post-test only control group design: the treatment group receives the intervention and both groups are measured only afterwards. Can also have multiple measurements before and after the intervention, other types of interventions in different combinations, etc.

98 Quasi-Experimental Design
What is it? What is the difference with an experimental design?

99 Quasi-Experimental Design
Compare groups that both receive and do not receive an intervention Groups are not randomly assigned Comparison groups are found that are as similar to the intervention group as possible EXCEPT that they do not receive the intervention Note that there are two characteristics of a comparison group: a) similar to treatment b) do not receive intervention. Consider both when finding a good comparison group.

100 Quasi-Experimental Design Challenges
How do you select your control group? What do you have to take into account when selecting a control group?

101 Quasi-Experimental Design
Commonly used in social science research Control group: Adjacent communities – random selection or matching General population Spill-over effects Ethical considerations: What do you do about problems you identify in the comparison sites? How do you minimize the burden of participating in data collection? Sometimes projects offer comparison sites an alternate benefit.

102 Quasi-Experimental Design
Select sub-types. Non-equivalent control group pre-test post-test: non-random comparison with before/after measurement. There are a number of ways to choose non-random comparison groups, and agreed-upon methods for analysing them afterwards, but it's beyond the scope of this training to go into detail. Generic control design: the comparison group is the general population. Requires good data on the general population from the same time periods, and the general population to be similar to the treatment group.

103 Threats to Quasi-Experimental Design
Consider whether the following may have influenced the comparison you are looking at: Program design shifts after baseline and the comparison group receives treatment (or the treatment group doesn't). Another organization begins similar work in comparison areas. Program target beneficiaries were chosen based on specific characteristics that can't be replicated in a comparison (e.g., proximity to a road). Program effects are wide-ranging, e.g., a change to national policy. What else can you think of that might influence whether or not your comparison group actually is a good comparison?

104 Non-Experimental Design
No comparison between groups Data are collected only from the group of individuals that received the intervention

105 Non-Experimental Design
Common characteristics of this design include: Allows documentation of status or change in beneficiaries, but not attribution, because there is no counterfactual. Can still use statistical methods to make these data more descriptive: measure your beneficiaries before, during, and after the intervention, and test any changes for statistical significance. Easy to collect but not very persuasive.

106 Interim Evaluation of Intervention
The slide shows several run charts of an indicator over time, each with a change made in June; all of the examples would yield the same bar graph, which is what you typically get with an 'audit' or QA approach. The patterns differ: a change with no impact on the slope of improvement (could be due to an increased drug supply chain of Artemisinin in March); a change with no impact, just a blip in September that was not sustained (could be due to increased rotating staff); a change with no impact on the overall average; and a change that indeed made an impact that was sustained.

107 Qualitative Approaches
Qualitative evaluation approaches synthesize people's perceptions of a situation and its meaning. Non-numerical data include: open-ended interviews, focus group discussion data, pictures, maps, and case studies. Qualitative approaches can still be used in before/after, longitudinal, cross-sectional, or post-test-only designs.

108 Qualitative Design
Helpful when: you need to know why things have happened; you don't know exactly what results you are looking for; you want the community to be engaged in the evaluation process; you want descriptions of change.
Not good for: rigorous counterfactuals; being able to quantify change that has taken place.

109 Quantitative vs Qualitative
Aim. Qualitative: identify common themes and patterns in how people think about and interpret something. Quantitative: classify features, count them, compare, and construct statistical models that explain what is observed as precisely as possible.
Foreknowledge. Qualitative: the evaluator may only know roughly in advance what he/she is looking for. Quantitative: the evaluator knows clearly in advance what he/she is looking for.
Data. Qualitative: words, pictures, or objects. Quantitative: numbers.
Scope. Qualitative: focuses on fewer selected cases, people, or events; greater depth and detail is possible. Quantitative: measures a limited range of responses from more people; facilitates comparison and statistical aggregation for concise conclusions.
Categories. Qualitative: categories for analysis must be developed and are specific to the particular evaluation. Quantitative: uses standardized measures that fit various opinions and experiences into predetermined categories, often using questions that have been verified and tested in other programs/studies.
View of the program. Qualitative: can ask questions about the program holistically or about specific parts of a program. Quantitative: views components of a program separately and uses data from the different pieces to describe the whole.

110 Mixed Approaches Best of both worlds
Combines qualitative and quantitative approaches When used together, these approaches can be stronger and give more meaningful results Qualitative data can give rise to a quantitative strategy once an evaluator knows what s/he is looking for Qualitative data can give meaning to confusing quantitative results Quantitative data can validate qualitative articulation of experiences

111 Choosing a Design Considerations in choosing an approach and design:
Purpose of the evaluation. Time, money, and other resources. Data collected to date. Stakeholder priorities. There is often a trade-off and compromise. All of these can be aggregated into a design matrix.

112 Design Matrix
Columns: Questions; Sub-questions; Data collection method; Data sources; Unit of analysis; Sampling approach; Comments.
A design matrix can help organize information relating to the program evaluation and is often included in the TOR (Section IV-B, page 92). Based on your study of chapters 1, 2, and 3, you should be able to complete the columns related to questions and sub-questions. You will be able to complete the columns related to data collection methods and data sources after Chapter 4, and unit of analysis and sampling approach will be addressed in Chapter 5.

113 What is a good basic design?
A good basic design maximizes the quality of the evaluation while also efficiently using resources. We can increase strength of the findings by minimizing threats to validity, or factors that might cause the audience to believe the evaluation is not accurate.

114 Threats to validity Threats to concept validity: the evaluation measures things that aren't relevant to the evaluation questions. Threats to internal validity: the data or findings are inaccurate. Threats to external validity: the results wouldn't be the same elsewhere (relevant if you are interested in scaling up or expanding the project).

115 Good basic design Recognize that there are always trade offs; it will never be possible to do an experimental evaluation for every project Judiciously evaluate available resources to choose the approach and design that is most appropriate for your evaluation questions

116 TOR Exercise: Evaluation Design
If you are working with a group, please break out into your group. Develop a draft for TOR section IV A: Evaluation design and approach Spend approximately 30 minutes completing this activity. Be prepared to share with the larger group.

117 Session 4 Data Sources and Collection Methods

118 Session Learning Objectives:
Compare and contrast different data collection methods. Select practical data collection methods for a project. Discuss ethical considerations in evaluations. To continue to build your TOR: Begin filling in the design matrix, specifically columns related to data collection methods and data sources (TOR IV-B). Identify and describe ethical considerations connected with the evaluation (TOR X).

119 Data Collection Data for evaluation come from a wide range of sources and can be collected through an equally wide range of techniques. Collection methods are not necessarily quantitative or qualitative; the same method can often be adapted to collect different kinds of data.

120 Deciding on Data Collection Methods
Decision depends on: What you need to know Where the data reside Resources and time available Complexity of the data to be collected Frequency of data collection

121 General Rules for Collecting Data
If you collect original data: Establish procedures, document the process and follow it Maintain accurate records of definitions and coding of the data Pre-test data collection tools Follow recommended Standard Operating Procedures for data collection, collation, analysis and reporting (see Pact’s Data Quality Management Module for more) Next, we’ll look at some common methods for data collection

122 Existing Records Reviewing data that already exist and analyzing them for project purposes. Can be quantitative or qualitative. If the data were collected by a source other than the program, they are secondary data, and you should ensure that they are of good quality. Can be very useful, but requires that data relevant to the project already exist and are accessible.

123 Surveys Surveys have a single questionnaire that is administered to all respondents Questions can be open-ended, which generates qualitative data, or close-ended, which generates quantitative data Can be self-administered (cheaper) or administered by a data collector (more reliable) Can be a good source of aggregated and comparable data Difficult to get the questions right (use previously validated questions when possible)

124 Direct Measurements Possible when the outcome of interest can be directly observed. For example: Height or weight of a person Contents of soil or water Presence of disease Presence or absence of a policy Very reliable if done properly Can be intrusive, time-consuming, or require special equipment and expertise

125 Observation The evaluator directly observes activities and notes patterns, routines and other relevant data Useful when evaluation question can be answered this way (i.e. about service delivery) May be more objective than interviews with participants, but still very subjective Can be intrusive or influence behavior Can be direct or participatory

126 Key Informant Interviews
In-depth interviews, usually with a semi-structured, open-ended questionnaire, with a handful of important stakeholders. Helpful in getting the big picture of a project and understanding context. Can also capture some quantitative data, e.g., approximate village population or distance from a road. Which key informants are chosen can have a big influence on findings.

127 Focus Group Discussions
Small-group (8-12 people) discussions facilitated by the evaluator, usually using a semi-structured open-ended questionnaire Participants are most often somewhat homogeneous in order to encourage people to feel comfortable expressing themselves (all same gender, age range, occupation) If well-facilitated, can encourage discussion that yields valuable information Usually several FGDs representing different populations are held to provide better representation of a community Time consuming to analyze data properly

128 FGD vs KII
Group interaction. Use focus groups if: interaction may stimulate richer and new responses. Use key informants if: group interaction is likely to be unproductive and limited.
Sensitivity of subject matter. Use focus groups if: the subject is not so sensitive that participants will withhold their opinions. Use key informants if: the subject is sensitive and a respondent will be unwilling to discuss it in a group.
Depth of individual responses. Use focus groups if: participants will be able to articulate their opinions quickly, giving others a chance. Use key informants if: the topic requires greater depth of response, or the subject is highly specialised.
Extent of issues to be covered. Use focus groups if: the themes to be interrogated are very few. Use key informants if: a greater volume of issues is to be covered.
Availability of qualified staff. Use focus groups if: facilitators are able to control and manage a group. Use key informants if: the interviewer is a supportive and skilled listener.

129 Most Significant Change
“Regular collection and participatory interpretation of “stories” about change” (Dart and Davies 2003) Beneficiaries give stories about the most significant change in their lives over a defined period of time to a panel of stakeholders, who identify the ones that best represent the project A larger group of stakeholders then meets to discuss the stories and their meanings Developed by Dart and Davies

130 Most Significant Change
What, in the last X period, was the most significant change regarding Y that happened to Z? The circles represent all the stories/responses people give in response to the question.

131 Most Significant Change

132 Most Significant Change
The product is not the final story, but rather the recording at each level of how/why each story was chosen; feedback is key Many of these stories are included in the final document Good method of qualitative data collection that is systematic, participatory, and open-ended

133 Outcome Mapping Illustrates a program's theory of change in a participatory manner. Set in motion at a workshop before program activities begin. Best suited to learning about outcome information of complex programs. Requires a skilled facilitator.

134 Mapping Mapping is the visualization of key landmarks
Can show landscape change, spark discussion, or enable people to describe how their environments fit into their daily activities Can be hand-drawn or use accurate GIS data

135 Other Methods There are many, many methods.
Sites like BetterEvaluation.org provide explanations and examples of how a method can be used. If you are unsure which method to use, consult a professional evaluator for advice on options. It is generally best to suggest a basic frame of acceptable methods in the terms of reference, so that the evaluation will yield the required type of data.

136 Consider the following when choosing which method to use:
Feasibility: Do you have the resources (personnel, skills, equipment, and time)? Can the method fulfill the evaluation purpose and answer the evaluation questions? What are the language/literacy requirements? Appropriateness: Does the method suit the project conditions and circumstances? Do all the stakeholders understand and agree on the method? Validity: Will it provide accurate information? Is it possible to assess the desired indicator with accuracy? Reliability: Will the method work whenever applied? Will the errors that occur be acceptable?

137 Consider the following when choosing which method to use:
Relevance: Does the method produce the information required, or is it actually assessing another type of outcome? Does the method complement the basic approaches of the project, e.g., is it participatory? Sensitivity: Is it sufficiently sensitive to assess variations in different population characteristics, e.g., differences between age groups or genders? Can it be adapted to changing conditions without excess loss of reliability? Cost effectiveness: Will the method produce useful information at relatively low cost? Is there a more cost-effective alternative method? Timeliness: Does the method use staff time wisely? Will it require withdrawing staff from their usual activities, leaving project work unattended? Is there an acceptable level of delay between information collection, analysis, and use? Can the method be incorporated into other daily tasks?

138 Ethical Review All study methods should undergo ethical review. Consider the following ethical questions when thinking about your proposed design: If the evaluation includes a control group, does it cause undue burden on a group receiving no program benefits, or raise expectations that the control group might receive services? Does the evaluation tool ask sensitive questions, for example regarding child abuse, insensitively or without the means to refer the respondent to appropriate resources? Would focus group discussions of sensitive topics result in negative consequences for the participants in their communities? Are children being asked questions without an adult present? Does the evaluation protocol collect informed consent?

139 Exercise: Evaluation Design
Round-Robin Table Conversations How can you have confidence that the outcomes you observe are the result of your program and not something else? How can you attribute the changes noticed in the intervention population (as opposed to non-intervention population) to your program support? How do you ensure that the problems identified prior to implementation are the real problems a successful project needs to address? How would you demonstrate that the responses you obtain from your beneficiaries are a true reflection of their perceptions or characteristics? What are the steps you might take to establish the cause and effect of the results achieved? Note 1- Working in small groups, select a host for each table. The host will stay at the table and record key points from the conversations. 2- Each group will have 10 minutes to discuss the question on the table. 3- When the whistle blows, individuals are to move to another table that has a different question. 4- Invite the recorders to share key points from the conversations for each question. 5- Encourage additional comments and questions

140 TOR Exercise: Design Matrix and Ethical Considerations
If you are working with a group, please break out into your group. In TOR IV-B, fill in the “data collection method” and “data sources” columns. In TOR X, complete the section related to ethical considerations. Spend approximately 60 minutes to complete these two activities. Be prepared to share with the larger group.

141 Session 5 Sampling

142 Session Learning Objectives:
Identify different units of analysis in evaluations. Compare and contrast probability and non-probability sampling. Identify potential biases in data collection. To continue to build your TOR: Complete the design matrix, specifically the columns related to unit of analysis and sampling approach (TOR IV-B). Describe an appropriate sampling strategy (TOR IV-C). Notes: Start the presentation on Sampling by asking the question, "Who here has worked a lot with sampling frames and sampling techniques and can help in supporting this discussion in keeping it true to the field setting?"

143 Session 5.1 Sampling Approaches
This section also requires the facilitator to draw on tangible examples; this will help the participants better understand the concepts. It is recommended that the participants read this section of the manual prior to the presentation.

144 Sampling Sampling is the answer to the question, “Who are we evaluating?”

145 Basic Concepts in Sampling
Unit of analysis: the person, group, place, or event of interest for the evaluation When might you have a unit of analysis that is not an individual?

146 Unit of Analysis Examples
Evaluation question: Did the women's empowerment program result in increased income levels among female-headed households? Unit of analysis: female-headed households.
Evaluation question: Did the rate of malnutrition drop substantially (by at least 20%) among orphaned and vulnerable children targeted by the nutrition program? Unit of analysis: orphaned and vulnerable children.
Evaluation question: Did the increased knowledge and skills in water harvesting techniques increase crop yield and income for subsistence farmers? Unit of analysis: subsistence farmers.
Evaluation question: What impacts (positive or negative) did the intervention have on the wider community? Unit of analysis: community.
Evaluation question: Did the clinics that received the training implement what they learned? Unit of analysis: clinics.
Evaluation question: Did local governments adopt more transparent policies as a result of civil society organizations' work? Unit of analysis: local governments.

147 Basic Concepts in Sampling
Study Population: the entire set of units we are interested in. For example: all intervention communities, all low-income people in a city, or all people participating in an education program.

148 Basic Concepts in Sampling
Sample: a subset of a study population Different from disaggregation Likely, you will not collect data on every program beneficiary for the evaluation. The ones you do collect data on are the sample. Good samples are representative

149 Sampling There are 2 broad categories of sampling techniques:
Probability sampling: every member of the population has an equal (or known weighted) chance of being selected for the sample Non-probability sampling: study participants are selected without this equal chance (some are more likely to be included in the sample than others)

150 Sampling Frames A sampling frame is necessary for true random sampling
This is a list of the entire population of interest If no such list exists or can be made, other techniques must be used to approximate randomization

151 Types of Probability Sampling
Simple random sampling Stratified sampling Cluster sampling Systematic sampling

152 Simple Random Sampling (SRS)
Every individual in the population has the same probability of being selected for the sample.

153 Sampling with a Non-Digital List
Example 1: Simple Random Sampling of Participants. Take a piece of paper. Write your surname and initial on it. Then fold the paper and put it in a box. We'll randomly select one quarter of the participants. Note: Collect the pieces of paper from each participant and put them in a box. Give the box a good shake to mix the papers, and ask one participant to draw one from the box without looking. Read out the name written on the paper. Make it fun by saying that the person who was selected will lead the class in the next energizer, and use the exercise to further explain SRS.

154 Sampling with a Non-Digital List
You can also use a random number table

155 Sampling with a Digital List
Example 2: Simple random sampling of 10 households from a list of 40 households. We have a list of 40 heads of households, each with a unique number, 1 through 40. We want to select 10 households randomly from this list. Using the Excel function RANDBETWEEN(x,y), we generate random numbers; if a random number matches a household's number, that household is added to the list of selected households. After each random number is used, it is crossed out so that it is never used again. We continue to select households until we have 10. The MS Excel syntax is =RANDBETWEEN(x,y), where x is the lowest number and y is the highest number. For example, to generate a random number between 10 and 100, type =RANDBETWEEN(10,100) in a cell and press Enter.
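For larger frames, the same draw can be scripted instead of done cell by cell. A minimal sketch in Python (not part of the course materials; the household numbering is hypothetical):

```python
import random

# Hypothetical sampling frame: 40 household heads numbered 1 through 40
households = list(range(1, 41))

random.seed(2024)  # fixing the seed makes the draw reproducible for audit
sample = random.sample(households, k=10)  # 10 draws without replacement
print(sorted(sample))
```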

156 Challenges with Simple Random Sample
Simple random sampling is the best way, so why don't we always use it? Can you think of challenges with simple random sampling?

157 Possible complications of SRS
If you don't have a sampling frame, it can be costly to create one. Traveling to all program areas (e.g., across Ethiopia) can be very expensive. If a particular subset of the population of interest is relatively small (e.g., an ethnic minority), it may be left out of a random sample by chance unless the sample is very large. We will look at techniques that address each of these weaknesses in turn.

158 Resources to help with SRS

159 Stratified Sampling First, divide the population of interest along a characteristic (or multiple characteristics) of interest; examples include gender, age group, and ethnic group. Determine how many people you want to sample from each stratum (group). Within each stratum, proceed to select individual units as you would for a simple random sample.

160 Stratified Sampling Stratified sampling assures representation of key subgroups of the population. If sub-populations of interest are small, stratified sampling with weighted over-sampling (choosing a higher share of respondents from a subgroup than its share of the entire population) ensures that there will be enough data from that subgroup to get precise estimates of subgroup characteristics.

161 Example of Stratified Sampling
Let’s say we have a population of 3 ethnic groups. We are very interested in how each of the 3 groups experienced the intervention, and making meaningful generalizations about how the intervention affected each group. The population breakdown is as follows: Ethnic group 1: 85% Ethnic group 2: 10% Ethnic group 3: 5% Though groups 2 and 3 are small, they are also marginalized and may have stood to gain the most from the project.

162 Example of Stratified Sampling
If we sample 100 people randomly, we are likely to have very small numbers of people in Groups 2 and 3 (likely 10 and 5 people, respectively). We cannot make any generalizations about a population from such a small sample. If we wanted to use a simple random sample and still make within-group generalizations, we would have to increase the overall sample several times over. An alternative is to stratify the population by ethnic group and then over-sample from Groups 2 and 3. For example, you may choose to sample as follows: Group 1: 50 people; Group 2: 25 people; Group 3: 25 people. This will give more precise data for group averages. If you also need whole-population statistics, you can weight each group's data by the group's actual population share when analyzing it (see the sketch below).
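A minimal sketch of that re-weighting step in Python; the subgroup means and variable names are invented for illustration:

```python
# Hypothetical subgroup estimates from the stratified sample (e.g., an income score)
group_means = {"group1": 54.0, "group2": 38.0, "group3": 31.0}  # illustrative values
pop_shares = {"group1": 0.85, "group2": 0.10, "group3": 0.05}   # known population breakdown

# Weight each stratum's estimate by its true population share,
# not by its deliberately inflated share of the sample.
overall = sum(group_means[g] * pop_shares[g] for g in group_means)
print(f"Weighted whole-population estimate: {overall:.2f}")
```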

163 Pros and Cons of Stratified Sampling
Stratified sampling is appropriate when you are very interested in data from particular subgroups and are concerned that simple random sampling will not yield those data. In some quasi-experimental methods, choosing samples of treatment and comparison units is a type of stratified sampling. Analyzing stratified data accurately requires that the percent of each subgroup in the population be known. As with simple random sampling, a sampling frame is necessary.

164 Examples and Illustrations of Stratified Sampling

165 What if we don't have a sampling frame?

166 Cluster Sampling Clusters are collections of study units that are grouped in some way. For example, a village is a cluster of beneficiaries; a classroom is a cluster of students. Cluster sampling means that the sampling happens at different levels; the first level is usually the cluster. Cluster sampling can be a more efficient way of collecting data: you can cut the costs of traveling to every village by sampling villages first.

167 Cluster Sampling Assumptions
Clusters are more or less similar to one another. Study units within each cluster are heterogeneous. Clusters do not need to be of equal size; if they differ, use weighting techniques to compensate for the possibility of oversampling very small clusters (e.g., selection with probability proportional to size).

168 Cluster Sampling There are two broad types of cluster sampling:
Single-stage cluster sampling: all units in the selected clusters are included in the sample. This makes sense when clustering by classroom, where it is most efficient to test all of the students in the sampled classes. Multi-stage cluster sampling: a sub-sample of units in the selected clusters is chosen. This makes sense when clustering by village, where surveying every household in a village would be time-consuming and inefficient.

169 Cluster Sampling Cluster samples can be useful when:
No list of the population exists. Well-defined clusters exist (these will often be geographic areas). It is possible to estimate the population of each cluster. Often the total sample size and total population must be fairly large for cluster sampling to be representative.

170 Example of Cluster Sampling
List all villages/towns with their populations; you must have at least an approximate idea of each population. Randomly select the desired number of villages from the list. Within each village, systematically select households through a random walk method. A code sketch of a two-stage version follows below.
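A rough Python sketch, assuming a hypothetical village list. The probability-proportional-to-size (PPS) step here samples with replacement for simplicity; a real survey design would use a without-replacement PPS routine:

```python
import random

# Hypothetical cluster frame: village name -> approximate population
villages = {"A": 1200, "B": 300, "C": 800, "D": 150, "E": 550}

random.seed(7)
# Stage 1: pick 3 villages with probability proportional to size
names, sizes = zip(*villages.items())
selected = random.choices(names, weights=sizes, k=3)  # with replacement

# Stage 2: within each selected village, draw a fixed number of households
for v in selected:
    hh_frame = list(range(1, villages[v] + 1))  # household IDs in the village
    hh_sample = random.sample(hh_frame, k=20)   # 20 households per village
    print(v, sorted(hh_sample)[:5], "...")
```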

171 Pros and Cons of Cluster Sampling
A complete list of clusters is more likely to exist than a complete list of units. It can be less expensive to administer surveys by cluster. You still need a strategy for identifying respondents within a cluster. You also need a relatively large population of interest and a large sample size, because there is some risk that respondents within a cluster will be too homogeneous to adequately represent the population.

172 Systematic Sampling Systematic sampling is different from random sampling. This can be done without a sampling frame. Systematic sampling can still yield a representative sample if done properly.

173 Systematic Sampling First, determine what proportion of the population you need to sample; say we will sample 10% of the population. Then, every nth unit you come to is part of your sample; this n is the sampling interval. For 10%, every tenth person is part of the sample. The first subject should be chosen at random. Because of this, we treat a systematic sample as having the same properties as a random sample (see the sketch below).
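A minimal Python sketch of the interval logic, assuming a hypothetical frame of 200 units:

```python
import random

population = list(range(1, 201))  # hypothetical frame of 200 units
k = 10                            # sampling interval for a 10% sample

random.seed(3)
start = random.randint(1, k)       # random start within the first interval
sample = population[start - 1::k]  # then every k-th unit
print(len(sample), sample[:5])
```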

174 Systematic Sampling Systematic sampling is useful when you do not have an actual sampling frame. Example 1: go to every 10th house in a village along a specified path to conduct a survey. Example 2: poll every 10th person who leaves a voting booth Can be used in combination with cluster sampling or stratified sampling.

175 Pros and Cons of Systematic Sampling
If the available data on the population are not very good, systematic sampling can be more precise than random sampling. Systematic sampling will be designed differently based on context, so someone who understands the issues must be involved in the design. There is a possible risk of periodicity if, for some reason, the sampling interval coincides with every nth person having a special characteristic.

176 Non-Probability Sampling
Non-probability sampling is non-random sampling. There are two broad types: accidental (commonly referred to as convenience sampling) and purposive.

177 Accidental Sampling A non-probability sampling strategy that uses the most easily accessible people to participate in a study. Not recommended: it is very likely to be biased, because the most easily accessible sample is likely to be similar in a confounding way. For example, many psychology studies use students in college psychology classes; when these studies are repeated with randomly selected participants, they can yield very different results.

178 Purposive Sampling Non-probability sampling strategies that are deliberate: sampling is done with some purpose in mind. There are two types: respondents specifically selected to represent the "average" population, and respondents specifically selected to represent particular points of view, such as key informants.

179 Purposive Sampling Purposive sampling makes sense when a population is not large and particular points of view are of interest (key informants). With a purposive sample, you are likely to overweight particular points of view, either intentionally or due to accessibility. Purposive samples are NOT representative of populations.

180 Categories of Purposive Sampling
Modal instance sampling: select the typical case. Expert sampling: select people with demonstrated expertise in the area of interest. Quota sampling: select a certain number of respondents who fit particular characteristics. Heterogeneity sampling: select people who represent different segments of the population. Snowball sampling: get referrals from previous respondents (useful for hard-to-reach populations, such as undocumented migrants).

181 Choosing a Sampling Strategy
For probability samples, it is important to ask the following questions: What is the unit of analysis? Does the evaluation purpose, question, or method require a probability sample? What is the sampling frame? How will we obtain the complete list of the population? How will we ensure the list is accurate and up to date? What sampling technique will we use and why? What is the necessary sample size? How was the sample size calculated?

182 Choosing a Sampling Strategy
For non-probability samples, it is important to ask the following questions: Will a non-probability sampling technique allow us to fulfill the evaluation purpose and question? Are we using purposive sampling? If so, what are our inclusion criteria? How will we know we have reached saturation? How will we select our participants? How will we determine the sample size?

183 Session 4.2 Sample Size

184 Sample Size How many people do you need in your sample?
Representativeness: small samples are more likely to be non-representative by chance. Comparisons: if you are comparing populations, the total sample size will have to be larger. Differences: if you expect small changes or differences, the sample size will have to be larger to detect them. Use a sample size calculator to figure out how many people you need.

185 Sample Size
Qualitative research: There are no fixed rules on sample size. We talk about "saturation": continue interviewing until the data seem saturated and clear patterns begin to emerge. Sample size may also be informed by available resources and the topic of interest. Quantitative research: Sample size formulae exist, based on the amount of change you expect to see, the level of confidence you want in the results, and the population size.

186 Why Calculate Sample Size?
A needlessly large sample may waste time, resources, and money. A sample that is too small may lead to inaccurate results. Calculating sample size in advance supports study planning and budgeting. It also forces specification of expected results, which helps in later analysis and in the program's consideration of results.

187 Sample size calculation
A number of sample size calculators exist online; a sketch of the standard formula many of them implement follows below.
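For estimating a proportion, many calculators implement some variant of Cochran's formula. The sketch below is offered as a standard reference rather than a tool prescribed by this course; the population figure is hypothetical:

```python
import math

def cochran_n(z, p, e, N=None):
    """Sample size for estimating a proportion (Cochran's formula).
    z: z-score for the confidence level (1.96 for 95%)
    p: expected proportion (0.5 is the most conservative choice)
    e: desired margin of error (e.g., 0.05)
    N: population size, enabling the finite-population correction (optional)
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    if N is not None:
        n0 = n0 / (1 + (n0 - 1) / N)  # finite-population correction
    return math.ceil(n0)

print(cochran_n(1.96, 0.5, 0.05))        # 385: 95% confidence, +/-5% margin
print(cochran_n(1.96, 0.5, 0.05, 2000))  # 323: same, for a population of 2,000
```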

188 Final Remarks This session was mainly to help you understand matters that will be important for budgeting, workplanning, and managing consultants. Experts will understand the most appropriate kind of sampling technique for a given project. Be sure to appropriately document sample selection in the evaluation plan and report.

190 Exercise: Choosing an Appropriate Sampling Approach
Read through the case scenario "Sampling at XX Child Care Centre". Identify one data collection method that you would use to answer the evaluation question, "In what ways have the supervisors and managers used their learning from the workshop?" Recommend two sampling methods (one probability and one non-probability) and identify the advantages and disadvantages of each method. Use the table on the handout to organize your thoughts. Note: Debrief the activity with the following questions: 1. What did you learn about sampling from this activity? 2. When conducting an evaluation, what are the implications of obtaining a low response rate from a sample? 3. What could you do to prevent a low response rate (in the context of the case or from current practice)? 4. Reflecting on your past evaluation experiences, what problems have you observed in how samples were determined or selected?

191 TOR Exercise: Sampling
If you are working with a group, please break out into your group. Complete the "unit of analysis" and "sampling approach" columns in the Design Matrix (TOR IV-B). Develop a draft of section IV-C: Sampling Strategy. Spend approximately 45 minutes to complete these two activities. Be prepared to share with the larger group.

192 Session 6 Basic Data Analysis

193 Session Learning Objectives:
Describe the basics of data analysis. Prepare data for analysis. Interpret the evaluation data and findings. To continue building your TOR: Devise data analysis procedures, including a data analysis plan and dummy tables (TOR IV-D).

194 Data Analysis Data analysis is the process of turning raw data (numbers or text) into usable information.

195 Developing a Data Analysis Plan
A plan should be developed before data are ever collected. Raw data → Analysis → Useful information. Give the example of Alison's presentation at AEA last year (from Namibia?)

196 Why Create a Data Analysis Plan?
Makes sure you don't collect unneeded data. Makes sure you do collect necessary data. Clarifies the kinds of results you are interested in and helps articulate the hypothesis. Enables other parts of data collection planning, such as sample size selection. Ensures that the most appropriate methods are used to get the necessary information. A data analysis plan should be completed at the TOR or inception report stage.

197 What Does a Data Analysis Plan Include?
The key variables for each evaluation question The type of analysis that will be performed for each type of variable What kinds of comparisons will be made How data will be presented (e.g. graphs, tables, quotes) Data analysis plans should be based on the evaluation questions developed at an earlier stage.

198 Dummy Tables Dummy tables are mock tables created before data analysis. They should be the tables we eventually want to put in the final report, but without the actual data. This technique enables us to better think through the key results that we will actually be showing in the report. This technique also helps us think through our analysis beforehand and the best way to visualize it.

199 Dummy Tables: Descriptive Statistics
Variable | Descriptive Statistics
Age | Mean (standard deviation)
Percent Female | % Female
Education: No Schooling / Primary School / Middle School / College | % No Schooling; % Primary School; % Middle School; % College

200 Dummy Tables: Statistical Significance
Indicator | Intervention | Non-Intervention | Statistical Significance
Condom use pre-intervention | % | % | p=0.x
Condom use post-intervention | % | % | p=0.x

201 Dummy tables: qualitative analysis
Themes | Key Informants in Town A | Key Informants in Town B
Government Corruption | |
Hopelessness | |
Environmental Degradation | |
Economic Uncertainty | |

202 Data Preparation After the data have been collected in line with the data analysis plan, the data must be prepared before being analyzed: data cleaning; data coding; organizing the data into a dataset.

203 Data Cleaning Data are cleaned to ensure that "bad data" are excluded from the analysis, in order to avoid drawing invalid conclusions. Sometimes bad data are obvious and can quickly be identified in the data set: a six-month-old baby that weighs 50 kg, or the age of a focus group participant recorded as 2 years.

204 Data Cleaning One method of cleaning data is to take a random selection of data entries and compare it to the source to check for transcription errors. If many errors are found, a more systematic comparison is made. Usually, data are double-entered by two different people, and the two versions can be quickly checked against each other for discrepancies. A sketch of a simple automated range check follows below.
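A minimal range-check sketch in Python with pandas; the column names, values, and thresholds are invented for illustration and are not clinical standards:

```python
import pandas as pd

# Hypothetical survey extract
df = pd.DataFrame({
    "id":         [1, 2, 3, 4],
    "age_months": [6, 18, 7, 240],   # 240 months is implausible in an infant study
    "weight_kg":  [7.2, 11.0, 50.0, 9.5],
})

# Flag records outside plausible ranges rather than silently deleting them,
# so they can be checked against the source forms.
flags = (df["age_months"] > 60) | (df["weight_kg"] > 25)
print(df[flags])
```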

205 Data Coding Coding is assigning a tag or identifier to responses
Coding should be done after the data are cleaned. It makes it easier to work with computer software to undertake the analysis. For quantitative data, numerical codes are assigned to key variables, e.g., 1=Male and 2=Female. For qualitative data, concepts that repeat can be grouped together or coded with tags or numbers. All data codes should be recorded in a master codebook to ensure accuracy later in data analysis.

206 Datasets Data are organized into a dataset, which is an organized matrix of columns and rows. Rows: each individual respondent. Columns: each variable. Each individual should have a unique ID number.

ID Number | Name | Gender | Province
001 | Vincent | 1 | 4
002 | Tsakani | 2 | 3
003 | Khensani | | 5
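A sketch of the same layout in Python with pandas; the third respondent's gender code is assumed for illustration, since it is missing from the slide:

```python
import pandas as pd

# Illustrative dataset: rows are respondents, columns are variables
df = pd.DataFrame({
    "id":       ["001", "002", "003"],
    "name":     ["Vincent", "Tsakani", "Khensani"],
    "gender":   [1, 2, 2],          # codebook: 1=Male, 2=Female (third value assumed)
    "province": [4, 3, 5],
})

# Decode for reporting, using the master codebook
df["gender_label"] = df["gender"].map({1: "Male", 2: "Female"})
print(df)
```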

207 Types of Quantitative Analysis
Descriptive measures: proportions, frequencies, rates, and ratios Measures of central tendency: averages Measures of dispersion: how widely data are spread

208 Descriptive Measures
Descriptive Measure | Definition | Example
Proportion | Number of observations with a given characteristic divided by the total number of observations | 1 out of 3 children in the study had a Vitamin A deficiency; 56% of participants completed the training
Frequency | Arrangement of values from lowest to highest with a count of the number of observations sharing each value; these counts are often converted into a percentage of the total count | 12 participants (40%) had attended school for less than 5 years, 12 participants (40%) attended school for between 5 and 8 years, and 6 participants (20%) graduated from high school
Rate | Occurrences per a certain constant over a certain time period | The infant mortality rate is the number of deaths of infants under one year old per 1,000 live births
Ratio | Number of observations in a given group with the characteristic, divided by the number of observations in the same group without the characteristic | 81 women were married and 27 were not married; the ratio of married to non-married women was 3:1

209 Measures of Central Tendency
Measure | Definition | Example
Mean | The average: the total of the values of all observations divided by the number of observations | Participants were ages 18, 18, 20, 21, and 26. The average age of participants was 20.6.
Median | The middle observation, i.e., half the observations are smaller and half are larger. Arrange the observations from lowest to highest; take the middle value for an odd number of observations, or the mean of the two middle values for an even number. | Participants were ages 18, 18, 20, 21, and 26. The median age of participants was 20.
Mode | The value in the set that occurs most frequently | Participants were ages 18, 18, 20, 21, and 26. The mode was 18.

210 Measures of Dispersion
Measure of Dispersion | Definition | Example
Range | The difference between the largest observation and the smallest; often expressed as the largest and smallest observation rather than the difference between them | Participants were ages 18, 18, 20, 21, and 26. The ages of participants ranged from 18 to 26.
Standard deviation | A measure of the spread of data around the mean, i.e., on average how far the observations are from the mean. If the standard deviation is 0, all the observations are the same. | Participants were ages 18, 18, 20, 21, and 26. The standard deviation is 2.9.
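The slide's worked example can be checked with Python's built-in statistics module; note that the 2.9 on the slide is the population standard deviation (the sample standard deviation would be about 3.3):

```python
import statistics

ages = [18, 18, 20, 21, 26]

print(statistics.mean(ages))              # 20.6
print(statistics.median(ages))            # 20
print(statistics.mode(ages))              # 18
print(max(ages) - min(ages))              # 8, i.e., a range of 18 to 26
print(round(statistics.pstdev(ages), 1))  # 2.9 (population standard deviation)
```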

211 Presenting Quantitative Data
Just embedding numbers in a report's narrative text makes it difficult to see the big picture. Data can be visualized in a number of ways; here, we will go through just a few of the most common.

212 Bar Graphs A bar graph is used to show relationships between groups.
The items being compared do not need to affect each other. Bar graphs are a good way to show big differences.
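A minimal bar graph sketch in Python with matplotlib; the districts and completion rates are invented for illustration:

```python
import matplotlib.pyplot as plt

# Hypothetical training-completion rates by district
districts = ["North", "South", "East", "West"]
completion_pct = [78, 64, 91, 55]

plt.bar(districts, completion_pct)
plt.ylabel("Completion rate (%)")
plt.title("Training completion by district")
plt.show()
```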

213 Line Graphs A line graph is used to show continuous data.
It's easy to see trends by the rises and falls on a line graph. Line graphs are useful in depicting the course of events over time.

214 Pie Charts Pie charts are used to show how a part of something relates to the whole. This kind of graph is an effective way to show percentages.

215 Tables Tables can be used to present absolute numbers or percentages.
Tables are very useful for providing detailed statistical data that may be too complex to be captured by a simple chart.

| Ages 18-29 | Ages 30-39
Male | 24 | 35
Female | 36 | 32

216 Example of quantitative data analysis and presentation

217 Exercise: Quantitative Data Analysis
Form groups of four to five people. Each group should have at least one laptop. You will receive a spreadsheet with program data in Microsoft Excel. Analyze the data within the group, then discuss useful information and possible conclusions.

218 Qualitative Data: Prepare Data
Data must be converted into written form. Interviews and FGDs should be transcribed verbatim and checked against the recording; translation may be necessary. Documents and notes should be checked for accuracy. Some data analysis will require software programs such as NVivo.

219 Qualitative Data Code and Analyze
The goal is to identify themes and patterns; this must be done systematically, although there are different approaches. How is this typically done?

220 Qualitative Data Common steps include:
Develop a list of expected codes (based on evaluation questions, KII guides, etc.). Read all of the transcripts; revise the codes. Develop a codebook and ensure agreement between coders. Read the transcripts, identifying and marking sections with the predetermined codes. If there are multiple coders, compare coding. Review the coded data and combine codes into a set of themes (usually 5-7). A sketch of a simple code tally follows below.
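A Python sketch of a simple code tally, with invented transcript IDs and codes, as a first step toward grouping codes into themes:

```python
from collections import Counter

# Hypothetical (transcript, code) pairs produced during coding
coded_segments = [
    ("KII_01", "corruption"), ("KII_01", "livelihoods"),
    ("KII_02", "livelihoods"), ("FGD_01", "environment"),
    ("FGD_01", "livelihoods"), ("FGD_02", "corruption"),
]

# Tally how often each code appears across all transcripts
print(Counter(code for _, code in coded_segments))
```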

221 Qualitative Data: Present Data
Qualitative data are often presented as narrative description. Quotes are often included, and images can be included. Tables are used to summarize findings; matrices can be used to compare themes or groups; diagrams can depict concepts, show a process, and so on.

222 TOR Exercise: Data Analysis Procedures
If you are working with a group, please break out into your group. Complete IV-D of the TOR to describe the data analysis plan and the steps and available tools you plan to use in both qualitative and quantitative data analysis. Spend approximately 45 minutes to complete this activity. Be prepared to share with the larger group.

223 Using and Communicating Results
Session 7 Using and Communicating Results

224 Session Learning Objectives:
Maximize the use of the evaluation findings. Write or oversee the writing of an evaluation report. Develop a plan to share the results. To continue building your TOR: Describe the layout and content of the evaluation report (TOR VII-A). Develop a dissemination plan to share evaluation findings with stakeholders (TOR VII-B).

225 Session 7.1 Evaluation Report

226 The Evaluation Report The evaluation report is the key product of the evaluation process. Its purpose is to provide: a transparent basis for accountability for results; recommendations for decision-making on policies and programs; and a summary of the information gathered to inform learning and improvement.

227 Evaluation Report Components
A title page. A list of acronyms and abbreviations. A table of contents, including a list of annexes. An executive summary (ES). An introduction describing the program’s background and context. A description of the program and the logic model. A statement of the purpose of the evaluation.

228 Evaluation Report Components
Key questions and a statement of the scope of the evaluation, with information on limitations and delimitations. An overview of the evaluation approach and methodology. Data sources. Findings. A summary and explanation of findings and interpretations. Conclusions. Recommendations. Lessons, generalizations, alternatives. Appendices, also known as annexes, including a special methods annex.

229 Characteristics of a Good Evaluation Report
Review the checklist on pages 67-69 of the manual:

230 Dissemination of Evaluation Results
Session 7.2 Dissemination of Evaluation Results

231 What informs a communication strategy?
Who needs to know or might use the results and lessons What they might want to know How to reach our audience When is the best time or opportunity to reach the audience How much budget, time, personnel and other resources are needed

232 Dissemination plan matrix
Using a table like the one below can be helpful in figuring out what to share with whom, when, and how.

Stakeholder | Key Findings | Channel of Communication | Product to Share
Donor | Quality of service; sustainability | Dissemination meeting | Abstract; PowerPoint slides

While you may take all of your stakeholders into account, you may determine that just a subset of them are most important to prioritize during the evaluation dissemination.

233 Channels of Communication
Besides publishing a report, keep in mind the following ways of reaching stakeholders: written reports and summaries; workshops; publications (e.g., journals, newsletters); participatory methods (e.g., community meetings, discussions); mass media (radio, TV, newspapers, or press releases); interactive media (websites, social media, etc.); research or professional meetings; political meetings.

234 TOR Exercise: Reporting and Dissemination Plan
If you are working with a group, please break out into your group. Complete VII-A and VII-B of the TOR. Spend approximately 45 minutes to complete these sections. Be prepared to share with the larger group.

235 Managing the Evaluation
Session 8 Managing the Evaluation

236 Session Learning Objectives:
Prepare for evaluations. Budget for an evaluation. Select an evaluation team. Develop calls for expressions of interest and TORs. Manage evaluation consultants. To continue building your TOR: Describe the evaluation team's roles and responsibilities (TOR V). Describe the schedule and logistics (TOR VI), and provide a timeline (TOR IX). Develop an evaluation budget (TOR VIII).

237 Pre-evaluation planning
Session 8.1 Pre-evaluation planning

238 Pre-Evaluation Planning
Whether evaluations are internal or external, it is important to have team documentation. Pre-planning ensures efficient use of resources and that evaluation questions are adequately addressed. CRS and the American Red Cross have developed a comprehensive guide to pre-evaluation planning:

239 Steps in Pre-Evaluation Planning
1. Identify, empower, and mentor the evaluation manager: an internal staff member who will manage the consultant, usually the M&E head or someone from program management. This clarifies the reporting chain. 2. Clarify relevant guidelines and requirements: the Performance Management Plan/MER Plan, donor guidance, the project description, and the internal evaluation policy.

240 Steps in Pre-Evaluation Planning
3. Prepare the evaluation scope of work or terms of reference and workplan. 4. Identify the evaluation team. 5. Organize project documentation into a bibliography (see page 75 of the manual). 6. Organize project information (briefing book). 7. Plan evaluation logistics.

241 Budgeting for Evaluation
Session 8.2 Budgeting for Evaluation

242 Key Budget Considerations
What funds are available in either the project or organizational budget? Who will fund the costs? Are there restrictions on the use of funds, such as deadlines or donor-specified scopes of work? Will the evaluation be outsourced? If so, what costs would be associated with outsourcing (advertising, etc.)? What internal resources (staff time, facility costs) would be required? What costs and resources will be associated with the activities in the evaluation plan? Brainstorm.

243 Barriers to Good Budgeting
No initial budgeting because evaluations were a low priority or forgotten. Lack of budgeting skills among staff. Limited funding for evaluation activities. M&E staff not involved in the budgeting process. M&E activities are usually 5-10% of overall project budgets, sometimes including multiple evaluations.

244 Session 8.3 Evaluation Team

245 External Evaluation Teams
External evaluation teams are solicited through expressions of interest. Expressions of interest typically mirror terms of reference developed for an evaluation, though they also include guidelines on the application process. Following the selection of a consultant, the project writes a selection memo summarizing the selection process.

246 Managing Consultants Though consultants are independent, they must still be managed in order to ensure that the evaluation products are adequate to project needs. Managing consultants includes: Briefing the consultant Reviewing and approving Inception Report ( feedback on data collection tools and detailed plan/methodology) Checking on data collection Reviewing and approving drafts and final reports Disseminating results

247 Final words Planning is one of the most important parts of evaluation
Be sure to link activities to evaluation questions, and ensure that evaluation questions are suited to project needs. Communicating and acting on results is important: a high-quality evaluation report is useless if the project does nothing with the findings and recommendations.

248 TOR Exercise: Evaluation Team Roles
If you are working with a group, please break out into your group. Complete TOR V to describe the evaluation team's roles and responsibilities. Spend approximately 30 minutes to complete this activity. Be prepared to share with the larger group.

249 TOR Exercise: Schedule, Logistics, and Timeline
If you are working with a group, please break out into your group. Complete the TOR sections to describe the schedule and logistics (TOR VI), and develop a timeline (TOR IX). Spend approximately 45 minutes to complete these sections. Be prepared to share with the larger group.

250 TOR Exercise: Budget If you are working with a group, please break out into your group. Complete TOR VIII, developing an evaluation budget. Spend approximately 30 minutes to complete this exercise. Be prepared to share with the larger group.

251 TOR Exercise: Finalizing your TOR
If you are working with a group, please break out into your group. Review and refine your TOR. If time permits, prepare to share your full TOR with the workshop participants for additional review/critique. Remember, developing the TOR is an iterative process!

252 The End

