
1 Sampling Strategies and Saturation [insert date]
Qualitative Methods in Evaluation of Public Health Programs Session 5

2 The evaluation process
[Instructions: Mention that this session continues with the design of the evaluation, focusing specifically on sampling. The issue of rigor in qualitative evaluation, which we mentioned earlier, is key.]

3 Learning objectives By the end of this session, participants will be able to: Identify types of sampling strategies employed in qualitative evaluation Explain the concept of data saturation and how to identify it Recognize considerations that have an impact on the sampling strategy(ies) Discuss strategies to reduce bias in sampling [Instructions: Read session objectives from slide. Link learning objectives with topics and activities in this session.]

4 Qualitative sampling approaches
"Qualitative inquiry typically focuses on depth in relatively small samples, even single cases, selected purposefully." Purpose should always be made clear. "The logic and power of purposeful sampling lies in selecting information-rich cases for study in-depth." (Patton, 1990) For example…
[Instructions: Mention that, by way of overview, the sampling strategies for qualitative evaluation typically focus on small numbers, or in some cases single individuals, selected with a purpose. The design of a qualitative evaluation should capture that purpose explicitly: why you select participant B from an entire population should be stated.] The logic of purposeful sampling lies in selecting information-rich cases. Suppose, per our group work, we have a team of experts in the positivist, constructivist, and emancipatory research paradigms. If I need an expert's opinion about the emancipatory paradigm, who should I talk to? The purpose for which participants are selected should always be made clear. For example, if the purpose of the intervention being evaluated was to increase the effectiveness of a program in reaching lower-socioeconomic groups, one may learn a great deal more by focusing in depth on understanding the needs, interests, and incentives of a small number of carefully selected poor families than by gathering standardized information from a large, statistically representative sample of the whole program. The purpose of purposeful sampling is to select information-rich cases whose study will illuminate the questions under study. Purpose is the key to rigor in the design of qualitative evaluation: we select a small number, but with a purpose. More is not necessarily better.

5 Examples of purposeful sampling
Examples of purposeful sampling: Strategies for qualitative evaluation
[Instructions: Ask participants to brainstorm types of sampling strategies or techniques in qualitative research. Summarize participants' answers and mention that there are several, but we will focus on the common ones on the next slide.]

6 Purposeful sampling strategies
*Extreme or deviant case sampling
Intensity sampling
*Maximum variation sampling/heterogeneous sampling
Homogeneous sampling
Typical case sampling
Critical case sampling
*Snowball sampling
Criterion sampling
Theory-based sampling
Confirming and disconfirming sampling
*Stratified purposeful sampling
Opportunistic/emergent sampling
Convenience sampling
Combination or mixed purposeful sampling
There are many qualitative sampling approaches proposed by Patton (1990) and others; however, we will only focus on four of these (marked with an asterisk), which are commonly used. We do not have time to get into the others, but you can investigate them outside of this training. [Instructions: You can ask participants to share other types of sampling methods they know and discuss before explaining these selected ones. If participants did not mention any on this slide, ask if anyone knows what these are.]

7 Maximum variation sampling
Selecting cases that maximize diversity in ways relevant to the evaluation question (i.e., principal outcomes that cut across a great deal of participant or program variation). Maximum variation sampling: This strategy for purposeful sampling aims at capturing and describing the central themes or principal outcomes that cut across a great deal of participant or program variation. For small samples, a great deal of heterogeneity can be a problem because individual cases are so different from each other. The maximum variation sampling strategy turns that apparent weakness into a strength by applying the following logic: Any common patterns that emerge from great variation are of particular interest and value in capturing the core experiences and central, shared aspects or impacts of a program. For example, maternal and child health care services (pregnant women, women within a specific trimester, first-time mothers, mothers who have just delivered, women attending postnatal care, etc.).

8 Maximum variation sampling
Example A statewide maternal nutrition program has project sites in rural areas, urban areas, and suburban areas With maximum variation sampling (MVS) the evaluator can at least be sure that the geographical variation among sites is represented This will be in addition to sampling different categories of women who access maternal services (age, educational background, those who had different experiences, etc.)

9 Maximum variation sampling
How it works Begin by identifying diverse characteristics: Setting, age, experience, educational background, sex, etc. For small samples a great deal of heterogeneity might be problematic, but MVS turns that into a strength Useful for documenting uniqueness and patterns identified How does one maximize variation in a small sample? One begins by identifying diverse characteristics or criteria for constructing the sample. Suppose a statewide program has project sites spread around the state, some in rural areas, some in urban areas, and some in suburban areas. The evaluation lacks sufficient resources to randomly select enough project sites to generalize across the state. The evaluator can at least be sure that the geographical variation among sites is represented in the study. The same strategy can be used within a single program in selecting individuals for study. By including in the sample individuals the evaluator determines have had quite different experiences, it is possible to more thoroughly describe the variation in the group and to understand variations in experiences—while also investigating core elements and shared outcomes. The evaluator using a maximum variation sampling strategy would not be attempting to generalize findings to all people or all groups, but instead would be looking for information that elucidates programmatic variation and significant common patterns within that variation.
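[Optional, for participants who work with data programmatically: a minimal sketch of how one might assemble a maximum variation sample from a participant roster. The roster fields and the greedy "add the case that contributes the most new trait values" rule are illustrative assumptions, not a prescribed algorithm.]
```python
# Illustrative sketch: greedy maximum variation sampling over categorical traits.
# Field names and the scoring rule are assumptions for demonstration only.

def diversity_gain(candidate, selected, traits):
    """Count how many trait values the candidate adds that are not yet represented."""
    gain = 0
    for trait in traits:
        seen = {person[trait] for person in selected}
        if candidate[trait] not in seen:
            gain += 1
    return gain

def max_variation_sample(roster, traits, n):
    """Greedily pick n cases, each time taking the case that adds the most new trait values."""
    selected = [roster[0]]                      # seed with any case
    remaining = list(roster[1:])
    while len(selected) < n and remaining:
        best = max(remaining, key=lambda c: diversity_gain(c, selected, traits))
        selected.append(best)
        remaining.remove(best)
    return selected

roster = [
    {"id": 1, "setting": "rural",    "age_group": "18-24", "parity": "first-time"},
    {"id": 2, "setting": "urban",    "age_group": "25-34", "parity": "first-time"},
    {"id": 3, "setting": "suburban", "age_group": "35+",   "parity": "multiparous"},
    {"id": 4, "setting": "rural",    "age_group": "25-34", "parity": "multiparous"},
    {"id": 5, "setting": "urban",    "age_group": "18-24", "parity": "multiparous"},
]

sample = max_variation_sample(roster, ["setting", "age_group", "parity"], n=3)
print([p["id"] for p in sample])
```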

10 What is stratified purposive sampling?
It can be described as samples within samples: purposeful samples are stratified or nested by selecting units or cases that vary according to a key dimension (e.g., sex, location). A stratified purposive sample captures major variations rather than a common core, although the latter may also emerge in the analysis. Each of the strata would constitute a fairly homogeneous sample. Stratified purposeful sampling: This is less than a full maximum variation sample. The purpose of a stratified purposeful sample is to capture major variations rather than to identify a common core, although the latter may also emerge in the analysis. Each of the strata, or groupings, would constitute a fairly homogeneous sample. Source: Patton, 2001

11 Stratified purposive sampling
For example, purposefully sample primary health care centers and stratify the purposeful sample by health center size and center setting. This strategy differs from other types of stratified sampling. Why? For example, you may purposefully sample primary care health centers and stratify this purposeful sample by health center size (small, medium, and large) and center setting (urban, suburban, and rural). If you have enough information to identify characteristics that may influence the issue of interest, then it may make sense to use a stratified purposeful sampling approach. [Explain that this strategy differs from stratified random sampling because the sample sizes are likely to be too small for generalization to a larger population or statistical representativeness.]
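[Optional illustration: a minimal sketch, under assumed data, of stratifying a purposeful sample of health centers by size and setting. The center list is invented, and the random pick within each stratum is only a stand-in; in practice the within-stratum selection would itself be purposeful.]
```python
# Illustrative sketch: stratified purposeful sampling of health centers by size and setting.
# Center data and strata keys are assumptions for demonstration only.
import random
from collections import defaultdict

centers = [
    {"name": "Center A", "size": "small",  "setting": "rural"},
    {"name": "Center B", "size": "large",  "setting": "urban"},
    {"name": "Center C", "size": "medium", "setting": "suburban"},
    {"name": "Center D", "size": "small",  "setting": "urban"},
    {"name": "Center E", "size": "large",  "setting": "rural"},
]

# Group centers into strata defined by (size, setting), then select a small
# number from each stratum (here: one per stratum; in practice the pick
# would be purposeful, e.g., the most information-rich center).
strata = defaultdict(list)
for center in centers:
    strata[(center["size"], center["setting"])].append(center)

selected = {stratum: random.choice(members) for stratum, members in strata.items()}
for stratum, center in selected.items():
    print(stratum, "->", center["name"])
```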

12 Extreme or deviant case sampling
What is it? A form of purposive sampling in which you select an "outlier" case, an "exception to the rule," or one that displays negative characteristics. Choosing extreme cases after knowing the typical or average case. Looking for a sample that will challenge the current understanding of the subject. [Mention that extreme or deviant case sampling is also sometimes referred to as negative case sampling.] For example, in an evaluation of adherence to a medication regime, it is observed that most patients, on average, do not adhere. However, a few adhere no matter what; these are the cases you might be interested in. How exactly can you explain an outlier case? You can only identify the outlier once you have been able to identify the typical cases.
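[Optional illustration: a small sketch of flagging extreme or deviant cases on the adherence example above. The adherence scores and the two-standard-deviation cutoff are assumptions for demonstration; how "extreme" is defined must fit the evaluation.]
```python
# Illustrative sketch: flagging extreme/deviant cases on a medication-adherence measure.
# Scores and the "more than 2 standard deviations from the mean" cutoff are assumptions.
import statistics

adherence = {"P01": 0.35, "P02": 0.40, "P03": 0.30, "P04": 0.45, "P05": 0.98, "P06": 0.38}

mean = statistics.mean(adherence.values())
sd = statistics.pstdev(adherence.values())

# "Typical" cases cluster near the mean; deviant cases sit far above (or below) it.
deviant = [pid for pid, score in adherence.items() if abs(score - mean) > 2 * sd]
print("Typical (mean adherence): %.2f" % mean)
print("Deviant cases to pursue for interviews:", deviant)
```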

13 Extreme or deviant case sampling
Example In many instances, one is elaborating on a subject, purposefully seeking exceptions, or looking for variation; refuting a pattern is as important as looking for supportive data. For example, an intervention assists families with resettlement after a natural disaster. Preliminary evidence suggests all who received the intervention resettled; you purposefully seek those who received the intervention but are not resettled. How will such information be useful? What evaluation paradigm is likely to employ this tactic? In fact, qualitative researchers actively look for "negative cases" to support their arguments. A "negative case" is one in which respondents' experiences or viewpoints differ from the main body of evidence. When a negative case can be explained, the general explanation for the "typical" case is strengthened. It helps us to focus on marginalized and vulnerable populations, who usually do not behave like an average or typical case. Speaking to such a negative case, someone who received the intervention but did not resettle, may give key information on other factors that determine resettlement. For instance, the resettlement package may have included financial assistance, but other factors to consider include differences in coping strategies, family histories, and resilience, among others. [Question for 2–3 minute discussion]: Thinking back to the paradigms we discussed in the first session, do you see this approach as positivist, constructivist, emancipatory, or pragmatist? And why? [Pragmatist or constructivist is the best fitting answer.]

14 Snowball sampling Involves asking people who have already been interviewed to identify others who fit the selection criteria. Useful for dispersed or hard-to-reach populations, e.g., men who have sex with men and intravenous drug users. Useful where key criteria might not be widely disclosed or are too sensitive for a screening interview, e.g., assessing how a peer counseling intervention program has improved the lives of drug users in a community. Note: Confidentiality and ethical concerns are especially important. Snowballing or chain sampling: These terms are used for an approach which involves asking people who have already been interviewed to identify other people they know who fit the selection criteria. It is a particularly useful approach for dispersed and small populations, and where the key selection criteria are characteristics which might not be widely disclosed by individuals or which are too sensitive for a screening interview (for example, sexual orientation). This is an approach for locating information-rich key informants or critical cases. The process begins by asking well-situated people: "Who knows a lot about the phenomenon being evaluated?" You then ask them: "Who should I talk to?" Current participants name others with specific characteristics. For instance, you can ask current participants in a community-based substance abuse prevention program to provide the names of other people who use drugs, to see whether the program has changed anything in their lives. Other examples: men who have sex with men; victims of gender-based violence; women who have experienced miscarriage; etc. When using snowball sampling to work with sensitive topics or populations, evaluators must take extra precautions to maintain confidentiality and pay attention to potential harm to participants and interviewers. The ethics and gender sessions will address this further.

15 Snowball sampling
[Figure: referral diagram. The evaluator is at the top with three initial contacts; each contact refers his/her own friends/contacts, producing further waves of participants.]
The evaluator has three contacts, who lead the evaluator to nine other participants who meet the key criteria. The diagram also shows that referral chains that diverge initially eventually converge. By asking a number of people who else to talk with, the snowball gets bigger and bigger as you accumulate new information-rich cases. In most programs or systems, a few key names or incidents are mentioned repeatedly. Those people or events recommended as valuable by a number of different informants take on special importance. The chain of recommended informants will typically diverge initially, as many possible sources are recommended, then converge as a few key names get mentioned over and over. Why? And how can this be addressed?
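[Optional illustration: a minimal sketch of the bookkeeping behind chain referral, with invented referral data. Repeated nominations are counted because, as noted above, names mentioned by several informants take on special importance.]
```python
# Illustrative sketch: tracking snowball (chain-referral) waves from a few seed contacts.
# The referral data are invented; in practice each name comes from asking participants
# "Who else should I talk to?" and repeated nominations signal information-rich cases.
from collections import Counter, deque

referrals = {            # who each interviewed person nominated
    "seed1": ["p4", "p5"],
    "seed2": ["p5", "p6", "p7"],
    "seed3": ["p7", "p8"],
    "p4": ["p9"],
    "p5": ["p7"],
}

seeds = ["seed1", "seed2", "seed3"]
nominations = Counter()
interviewed, queue = set(), deque(seeds)

while queue:
    person = queue.popleft()
    if person in interviewed:
        continue
    interviewed.add(person)                     # interview, then ask for referrals
    for name in referrals.get(person, []):
        nominations[name] += 1
        if name not in interviewed:
            queue.append(name)

print("Interviewed:", sorted(interviewed))
print("Most-nominated (likely key informants):", nominations.most_common(2))
```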

16 Danger of compromising diversity
Because the additional sample is generated through existing participants, it is likely to have similar characteristics This can be mitigated if: Required characteristics for the new sample are clearly defined Various sampling approaches are used to supplement one another Participants identify people who meet the criteria, but are dissimilar to themselves Family members or close friends are avoided Those identified by existing sample members are treated as links However, because new sample members are generated through existing ones, there is clearly a danger that the diversity of the sample frame is compromised. This can be mitigated to some extent, for example, by specifying the required characteristics of new sample members, by asking participants to identify people who meet the criteria but who are dissimilar to them in particular ways, and by avoiding family members or close friends. An alternative approach would be to treat those identified by existing sample members as link people—not interviewing them but asking them to identify another person who meets the criteria. Although this is more cumbersome, it creates some distance between sample members.

17 Think, pair & share: Activity
Read sampling scenarios… 10 minutes: For each scenario, identify the most appropriate sampling approach and give an explanation of why you would use it. Discuss your answer with your neighbor. 5–10 minutes: Share with the group. [Instructions: Tell participants to get out the handout for this exercise from the Participants' Guide. This activity is 20 minutes. Let participants read each scenario presented, then identify a specific sampling approach that is appropriate for that particular situation and give reasons for selecting that option. This is first done individually, then discussed with a neighbor. Then, answers are shared with the group for 5–10 minutes.]

18 Answers Scenario 1: Snowball sampling
Scenario 2: Maximum variation sampling Scenario 3: Stratified purposive Scenario 4: Stratified purposive Scenario 5: Extreme/deviant/negative case sampling [Note to facilitator: There are descriptions in the Facilitators’ Guide on why each of these is the correct answer.]

19 Summary of sampling approaches
Stratified purposive sampling: Illustrates characteristics of particular subgroups of interest; facilitates comparisons
Extreme/deviant case sampling: Disconfirming cases (i.e., those who are "exceptions to the rule")
Snowball sampling: Identifies cases of interest from people who know people who know which cases are information-rich
Maximum variation sampling: Documents unique or diverse variations that have emerged in adapting to different conditions
[Facilitator to summarize the various sampling techniques presented in the previous slides.]

20 Data saturation and sample size
[2–3 minutes: Ask participants about their understanding of data saturation. How have they applied data saturation in the past? Was it useful or not? Discuss their answers and then proceed with slide presentation.]

21 Data saturation What is it?
Data saturation means that no new themes, findings, concepts, or problems relevant to the study emerge through the data collection process. Data saturation means that no new themes, findings, concepts, or problems are evident in the data. Researchers commonly seek to collect data to explain a phenomenon of interest. Hence, an evaluator looks at this as the point at which no more data need to be collected. When the theory appears to be robust, with no gaps or unexplained phenomena, saturation has been achieved. [Also, explain that this is usually referred to as descriptive saturation]: The researcher finds that no new descriptive codes, categories, or themes are emerging from the analysis of data (Rebar et al., 2011). With descriptive saturation, you may have several codes flowing from the data without connections between them yet, but you have enough to describe a concept, phenomenon, or practice.

22 Saturation Intended for grounded theory approach, in which the theoretical model being developed stabilizes Studies not adopting grounded theory approach use the term broadly: The point in data collection and analysis when new information produces little or no change to the code book (Guest et. al., 2016) Saturation is commonly used as the criterion to determine when sampling should cease in qualitative evaluation So, whether you are adopting a snowball sampling technique, maximum variation sampling, negative case sampling, or stratified purposive sampling, we say that the actual sample size may be determined through the application of the concept of saturation.

23 How and when do you determine saturation?

24 Determining data saturation
Determined through constant comparison of data during or after data collection. Specify a priori at what sample size the first round of analysis will be completed. Specify a priori how many more interviews will be conducted, without new shared themes or ideas emerging, before the evaluation team can conclude that data saturation has been achieved. The analysis would ideally be conducted by at least two independent coders, and agreement levels reported, to establish that the analysis is robust and reliable. It should not be a solo decision. The decision that data saturation or data redundancy has been reached should be facilitated through constant comparison of data (Glaser & Strauss, 1967; Glaser, 1999). This can be done during data collection or after. The evaluator can move back and forth between the data and emerging tentative thematic identification and interpretation. In this process, they will witness recurring patterns and themes in the data (Cutcliffe & McKenna, 2002). Consequently, this constant comparison of data can be contingent upon concurrent data analysis and collection (Rose & Webb, 1998). You can also specify, before the data collection process, when the evaluator will begin the first round of analysis, indicating that this process will be useful for determining saturation. It is also key to state a priori how many more interviews will be conducted once saturation appears to have been reached.
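[Optional illustration: a minimal sketch of the a priori stopping rule described above, with invented themes-per-interview data. The thresholds (a first round of 10 interviews, then 3 further interviews with no new themes) are assumptions to be set by the evaluation team.]
```python
# Illustrative sketch of an a priori stopping rule:
# analyze after an initial batch of interviews, then stop once a pre-specified
# number of consecutive interviews contribute no new themes to the code book.
# The themes-per-interview data and the thresholds (10 and 3) are assumptions.

interviews = [                     # themes coded in each successive interview
    {"access", "cost"}, {"cost", "stigma"}, {"access"}, {"transport", "cost"},
    {"stigma"}, {"access", "trust"}, {"cost"}, {"trust"}, {"access"}, {"stigma"},
    {"cost"}, {"access"}, {"trust"},
]

FIRST_ROUND = 10     # a priori: complete the first round of analysis after 10 interviews
NO_NEW_RUN = 3       # a priori: stop after 3 further interviews with no new themes

codebook, no_new = set(), 0
for i, themes in enumerate(interviews, start=1):
    new = themes - codebook
    codebook |= themes
    if i <= FIRST_ROUND:
        continue                                  # keep collecting through the first round
    no_new = 0 if new else no_new + 1
    if no_new >= NO_NEW_RUN:
        print(f"Saturation judged at interview {i}; codebook: {sorted(codebook)}")
        break
```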

25 Practical ways to determine saturation
Proportion of identified themes at a given point in analysis divided by the total number of themes identified: 6 interviews (70% saturation); 12 interviews (92% saturation) (Guest et al., 2006). Point after conducting 10 interviews when 3 additional interviews yield no new themes; most themes identified within 5–6 interviews; saturation reached within 17 interviews in one study (Francis et al., 2010). Guest et al. (2006) and Francis et al. (2010), in their summaries of saturation findings from empirical studies of interviewing, used findings from their studies to suggest practical ways of determining saturation, as well as the estimated number of interviews to conduct. Guest and colleagues empirically determined saturation by calculating the proportion of identified themes at a given point in analysis divided by the total number of themes. With this method, 70% saturation was reached with six interviews, and 92% saturation was reached with 12 interviews. Francis and colleagues determined saturation by indicating the point, after 10 initial interviews, when three additional interviews yielded no new themes.
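[Optional illustration: the arithmetic behind the proportion metric described above, using the figures Guest et al. (2006) reported; the total theme count is assumed for the calculation.]
```python
# Illustrative arithmetic for the proportion metric:
# (unique themes identified by interview k) / (total unique themes in the full data set).
# The counts below are assumed so that they reproduce the roughly 70% (6 interviews)
# and 92% (12 interviews) figures reported by Guest et al. (2006).

total_themes = 100                       # total unique themes across all interviews (assumed)
themes_by_interview = {6: 70, 12: 92}    # cumulative unique themes at interview k (assumed)

for k, found in themes_by_interview.items():
    saturation = found / total_themes
    print(f"After {k:2d} interviews: {saturation:.0%} of all themes identified")
```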

26 Practical ways to determine saturation
This figure, from Guest et al. (2006), indicates that the relationship between the number of interviews and the level of saturation reached mirrors the pattern described on the previous slide.

27 Practical ways to determine saturation
Deductive approach: 8 FGDs to reach saturation; inductive approach: 5 FGDs to reach saturation (Conen et al., 2012). 80% of themes discovered within 2–3 FGDs; 90% of themes discovered within 3–6 FGDs (Guest et al., 2016). We can again learn from Guest and other evaluators and researchers. For focus group discussions (FGDs), using an approach similar to that used for interviews, Guest et al. (2016) determined that 80% of themes were discovered within 2–3 FGDs, and 90% within 3–6 FGDs. Conen et al. looked at the approach used (inductive versus deductive) and proposed that deductive approaches may require eight FGDs, whereas inductive approaches may require five FGDs. Note that these figures and percentages are only tentative. Most authors claim that there are no rules for determining the size of a sample, and indeed this is true. However, from experience in selecting samples, and from the observations of other authors and colleagues, there are certain recommendations that can be useful when building a sample. In order to carry out a qualitative assessment, we must do the following: find the "key" typologies or "characteristics" of the users to understand the success or lack of success of the program; cross-check these variables; and create a table in which all variables with these characteristics may be included. FGD: focus group discussion

28 Data saturation What it means for sample size planning
There is no formula for determining the "correct" sample size. For funded projects, and for ethical reviews, you still need to state a sample size. But how? You can still approximate using learning from other studies in the area of study, for example, studies proposing sample sizes for FGDs. Project what you anticipate as the sample size, using this sign to approximate it. Note: Unlike quantitative methods, where there is a formula determining the sample size needed for statistical power at a given level of confidence, there is no formula for determining the correct size of a qualitative sample. [Ask participants how they have determined sample size in their work. Let participants discuss experiences with describing this in a project. (2–3 minutes)]

29 Data saturation What it means for sample size planning
[Figure: scope, time, and cost, and their influence on quality.]
This figure is helpful for analyzing the various factors that influence the determination of qualitative sample size. Apart from the scientific factors, the elements of scope, cost, and time, and their impact on the quality of the evaluation, should also be considered.

30 Factors that influence sampling
Sampling depends on a variety of factors:
Scope/variation
Target audience
Evaluators
Resources (time/money)
Evaluation users/audience
Evaluation questions
Your sample size and sampling approach will depend on many things, including those factors listed here. [Instructions: Take 10 minutes and talk through each of these with the group, asking for their ideas on how each might affect the sample size and/or sampling method.]

31 Activity: Factors that influence sampling
Divide into three groups; each group is assigned a funder/audience type:
Government
International NGO
Local organization
How would having this entity as your funder and main audience potentially influence sampling? (20 mins.) Each group shares key points with plenary (15 mins.) [Put participants into three groups. Each group will focus on one type of funding agency (e.g., international NGO, government, local organization, doctoral student) and discuss for 20 minutes how that type of organization or individual can influence the determination of sample size.] [Then, discuss with the whole group (15 minutes).]

32 Reducing bias in sampling
Constantly maintaining reflexivity to be able to identify who to sample next
Reflexivity highlights limitations of sampling, and you work to overcome them
Transparency: Once the sampling process is documented and critiqued, researchers will likely identify biases (their own or those of others) and address them
Iterative nature: Working through sampling and data collection/analysis simultaneously allows one to address issues that may arise
Selecting cases based on a wide variation of characteristics can help to reduce bias
[Instructions: Ask participants to describe examples/potential sources of bias. (1 min.)] [Discuss strategies to avoid bias in sampling using the slide. (5 mins.)]

33 Tips Questions to ask about sampling
Can the evaluator explain how the sampling strategy meets the goals of the study? Does the sampling strategy make intuitive sense? Can the evaluator justify the sampling strategy? Has the evaluator provided an approximate sample size? Can the sample size provide reasonable coverage? Presented here are questions you can ask about the sampling design for an evaluation to try to evaluate its rigor and relevance. [Instructions: Discuss these tips with participants.]

34 References
Patton, M. (1990). Qualitative evaluation and research methods (pp. 169–186). Beverly Hills, CA: Sage Publications.
Devers, K. J., & Frankel, R. (2000). Study design in qualitative research—2: Sampling and data collection strategies. Education for Health, 13(2), 263–271.
Guest, G., Namey, E., & McKenna, K. (2016). How many focus groups are enough? Building an evidence base for nonprobability sample sizes. Field Methods, 20.
Office of Data Analysis, Research, and Evaluation. (2016). Qualitative research methods in program evaluation: Considerations for federal staff. Washington, DC: United States Department of Health and Human Services.
