
1 5-4-1 Unit 4: Sampling approaches

2 5-4-2 After completing this unit you should be able to: Outline the purpose of sampling Understand key theoretical concepts in sampling Understand the need for more complex sampling designs Understand the main sampling issues and primary sampling options for BSS Understand the criteria for choosing a sampling approach

3 5-4-3 Why do we sample? We sample when we desire to measure characteristics of a specified population (e.g., the proportion of the general population who have unsafe sex) but lack the time and resources to obtain information from all members of the population. Concentrating survey time and resources on a sample may also result in better quality data than if resources were spread over the whole population.

4 5-4-4 Key definitions The target population is the population that is the ideal one for meeting a survey’s measurement objective. (For example, all commercial sex workers in a city.) The survey population is the target population modified to take into account practical considerations. (For example, all commercial sex workers in a city over the age of 15, excluding those who are home-based.)

5 5-4-5 What do we want from our sample?

6 5-4-6 1. Unbiased estimates of our indicators for the survey population This requires a random/probability sample. Use the class as an example. In summary: A probability sample is one in which each person in the survey population has a known, non-zero probability of selection. Statistical tests are based on the assumption that the sample is a probability sample. A probability sample ensures that our sample is like, or can be weighted to be like, the population from which it was drawn, so the estimates of our indicators can be generalised to the larger population. Probability sampling requires a sample frame, which is a list of ‘units’ from which a sample may be selected.

7 5-4-7 A summary of probability and non-probability sampling
Issue | Probability sample | Non-probability sample
Prone to selection bias | No | Yes
Can generalise results to survey population | Yes | No
Can estimate precision of survey estimates (i.e., use statistical techniques) | Yes | No
Results considered credible | Yes | No
Requires sample frame | Yes | No
Requires following fixed procedures that are sometimes costly or unfeasible | Yes | No
Method replicable (important for measuring trends) | Yes | No

8 5-4-8 2. Precise estimates of our indicators for the survey population This requires an adequate sample size. In summary: –There are many possible samples that could be selected from the population. Because of chance, each sample would produce a different estimate. –In real life we only select one sample from the population. If we use probability sampling, we can estimate how precisely the population measure is estimated by the sample estimate. –We can increase the precision of our estimate by ensuring an adequate sample size. Standard equations are available to calculate sample size.
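As a rough illustration of how precision improves with sample size, the sketch below computes the approximate margin of error for a proportion under simple random sampling. The function name `margin_of_error` and the example values are assumptions made for illustration; they do not come from the slides.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate half-width of a confidence interval for a proportion.

    Assumes simple random sampling and the normal approximation.
    p: estimated proportion (0 to 1); n: sample size; z: z score (1.96 for 95%).
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    standard_error = math.sqrt(p * (1 - p) / n)
    return z * standard_error


# Hypothetical indicator level of 30%, with increasing sample sizes.
for n in (100, 400, 1600):
    print(n, round(margin_of_error(0.30, n), 3))
```

Under these assumptions, quadrupling the sample size halves the margin of error, which is why very precise estimates become expensive.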

9 5-4-9 Problems with simple random sampling

10 5-4-10 Problem 1: Can require the selection of a large number of random numbers. Solution: Use systematic sampling (i.e., sample people at regular intervals down the sample frame, starting from a randomly chosen point). Problem 2: Sample frames for an entire target population rarely exist and are often impractical to construct. Solution: Develop a sampling frame of larger units (clusters). Randomly select clusters and construct a sample frame of individuals in the selected clusters. Randomly sample individuals within those clusters.
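Systematic sampling can be sketched in a few lines. This is a minimal illustration, assuming the sample frame is an ordered list held in memory; the function name `systematic_sample` and the frame of numbered IDs are hypothetical.

```python
import random


def systematic_sample(frame, n, rng=None):
    """Select n units at regular intervals from a list, after a random start."""
    rng = rng or random.Random()
    frame_size = len(frame)
    if not 0 < n <= frame_size:
        raise ValueError("n must be between 1 and the frame size")
    interval = frame_size / n            # sampling interval (may be fractional)
    start = rng.random() * interval      # random start within the first interval
    selected = []
    for i in range(n):
        # Guard against floating-point rounding pushing the index past the end.
        index = min(int(start + i * interval), frame_size - 1)
        selected.append(frame[index])
    return selected


# Hypothetical frame of 100 numbered individuals; draw 10 of them.
frame = list(range(1, 101))
print(systematic_sample(frame, 10, random.Random(42)))
```

Only one random number is needed (the start), which is the practical advantage over drawing each unit separately.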

11 5-4-11 Notes on cluster sampling 1. All members of the target population must be included in one of the clusters on the sample frame in order to have a chance of being selected. 2. If clusters are unequal sizes, we need to take this into account to ensure that our sample is not biased by the fact that people in smaller clusters have a higher probability of being selected than those in larger clusters. We can do this by: –making the probability that a cluster is sampled dependent on its size –adjusting for cluster size during the analysis.
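One common way to make the probability of selecting a cluster depend on its size is systematic selection with probability proportional to size (PPS). The sketch below is an illustration under that assumption, not necessarily the procedure the unit intends; the function name and the cluster sizes are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(cluster_sizes, k, rng=None):
    """Select k clusters with probability proportional to size.

    cluster_sizes: dict mapping cluster name to its measure of size.
    A cluster larger than the sampling interval can be selected more than once.
    """
    rng = rng or random.Random()
    names = list(cluster_sizes)
    cumulative = list(accumulate(cluster_sizes[name] for name in names))
    total = cumulative[-1]
    interval = total / k
    start = rng.random() * interval
    selected = []
    for i in range(k):
        point = start + i * interval
        # Find the cluster whose cumulative size range contains this point.
        index = min(bisect_right(cumulative, point), len(names) - 1)
        selected.append(names[index])
    return selected


# Hypothetical clusters with unequal sizes.
sizes = {"A": 100, "B": 300, "C": 600}
print(pps_systematic(sizes, 2, random.Random(7)))
```

If the same number of people is then sampled in each selected cluster, individuals in large and small clusters end up with roughly equal overall selection probabilities.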

12 5-4-12 Notes on cluster sampling, cont. 3. Cluster sampling results in less precise estimates of our indicators than simple random sampling. As respondents within clusters may be similar to each other, we need to compensate for this by increasing the sample size.

13 5-4-13 Problem 3: Populations can be spread over a wide area, making logistics difficult. Solution: Use cluster sampling, as it concentrates fieldwork in specific clusters. Problem 4: The population consists of distinct sub-groups that we are interested in. Solution: Make precise estimates for each sub-group (‘stratum’) by using stratified sampling (i.e., take a sample of adequate size from each stratum). If we want an estimate for the entire population, we can combine the estimates for the strata, weighting each by the proportion of the population in that stratum.
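Combining stratum estimates into a population estimate is a weighted average. The sketch below assumes the population share of each stratum is known; the function name, strata and figures are hypothetical.

```python
def combine_strata(estimates, population_shares):
    """Weight each stratum estimate by that stratum's share of the population.

    estimates: dict mapping stratum name to its estimated proportion.
    population_shares: dict mapping stratum name to its population share;
    the shares must sum to 1.
    """
    if set(estimates) != set(population_shares):
        raise ValueError("estimates and shares must cover the same strata")
    if abs(sum(population_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(estimates[s] * population_shares[s] for s in estimates)


# Hypothetical example: the indicator is 40% in urban areas and 20% in rural areas,
# and 25% of the population lives in urban areas.
estimates = {"urban": 0.40, "rural": 0.20}
shares = {"urban": 0.25, "rural": 0.75}
print(combine_strata(estimates, shares))
```

A simple unweighted average of the two stratum estimates would give 30% here, which overstates the population figure because the smaller stratum has the higher indicator level.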

14 5-4-14 Sampling issues in behavioural surveillance 1. Consistent sampling is required across survey rounds: If sampling changes between rounds, we don’t know if any observed changes are real or a result of changes in methodology. 2. General population samples can rarely be used to access high-risk groups: Group members may not be found in households in sufficient numbers and may not want to talk in household settings. Instead, the locations where group members congregate can be defined as clusters.

15 5-4-15 Examples of possible clusters for high-risk groups

16 5-4-16 Sampling issues for behavioural surveillance, cont. 3. Cluster sampling is difficult when clusters are not stable. –A measure of cluster size is needed for cluster sampling. It is difficult to estimate cluster size when we use locations like sex worker sites as clusters, because the people in each cluster are rarely fixed. –The risk behaviour in a cluster may also vary by time of day. This makes it difficult to select a sample that is representative of the entire target population using conventional cluster sampling.

17 5-4-17 Sampling issues for behavioural surveillance, cont. 4. Members of high-risk groups may be difficult to identify and access. 5. Cluster sampling is impossible if group members do not congregate. Some groups do not congregate at all. In others, only some members of the population congregate and important sections of the group may be missed.

18 5-4-18 Potential solutions to sampling challenges Use different sampling strategies for different groups. Use conventional sampling methods in unconventional ways. Consider using experimental sampling techniques such as Respondent Driven Sampling (RDS).

19 5-4-19 Sampling options for behavioural surveillance

20 5-4-20 Conventional cluster sampling Appropriate for the general population, youth and a few high-risk groups, such as prisoners.

21 5-4-21 Time location sampling Use when high-risk groups congregate, but their clusters are not stable. Allows locations to be included as clusters more than once (e.g., at different times of the day or on different days of the week). Clusters are defined by both location and time. –For example: Cluster 1= Site 1 weekday afternoon Cluster 2= Site 2 weekday evening Cluster 3= Site 1 weekend Cluster 4= Site 2 weekday afternoon Cluster 5= Site 1 weekday evening Cluster 6= Site 2 weekend

22 5-4-22 Time location sampling, cont. This means: –The fact the cluster size is not fixed is not a problem, as we only need to know the number of individuals associated with the cluster at the sampling time interval. –The fact that the type of person in the location varies by time is not a problem, as the location is included at different times.
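The idea of defining clusters by both location and time can be sketched as follows. This is an illustration only: the site names and time slots follow the example on the slide, while the function names and the equal-probability draw are assumptions.

```python
import random
from itertools import product


def build_time_location_frame(sites, time_slots):
    """Each cluster is a (site, time slot) pair, so a site can appear several times."""
    return [(site, slot) for site, slot in product(sites, time_slots)]


def select_clusters(frame, k, rng=None):
    """Draw k distinct time-location clusters with equal probability."""
    rng = rng or random.Random()
    return rng.sample(frame, k)


sites = ["Site 1", "Site 2"]
time_slots = ["weekday afternoon", "weekday evening", "weekend"]
frame = build_time_location_frame(sites, time_slots)   # 6 clusters in total
print(select_clusters(frame, 3, random.Random(3)))
```

In practice the number of people present during each selected interval would also be recorded, since that count serves as the cluster size for that time-location cluster.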

23 5-4-23 Respondent-Driven Sampling Use when high-risk groups do not congregate. Steps: 1. Start with initial contacts or ‘seeds,’ who are surveyed and then become recruiters. 2. Each recruiter invites up to three people they know in the high-risk group to be interviewed. 3. The new recruits become the recruiters. 4. Five to six recruitment waves occur.

24 5-4-24 Theory behind respondent-driven sampling Given sufficiently long referral chains (five to six recruitment waves beyond the people you started with), the final sample will be like the network from which we recruit. By keeping track of the links between recruiters and recruits and the size of people’s networks, we can calculate the probability of selection and estimate how precisely the population measure is estimated by the sample estimate.
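The recruitment process and the record-keeping it requires can be sketched with a toy simulation. Everything here is hypothetical: the network is an invented dictionary of who knows whom, and the function `simulate_rds` only tracks recruiter, wave and network size. It does not implement the RDS estimators themselves.

```python
import random


def simulate_rds(network, seeds, coupons=3, waves=6, rng=None):
    """Simulate recruitment waves and record (recruiter, wave, network size).

    network: dict mapping each person to a list of people they know.
    seeds: initial contacts, recorded as wave 0 with no recruiter.
    """
    rng = rng or random.Random()
    recruited = {seed: (None, 0, len(network.get(seed, []))) for seed in seeds}
    current = list(seeds)
    for wave in range(1, waves + 1):
        next_wave = []
        for recruiter in current:
            # Only people not yet in the sample can be recruited.
            eligible = [p for p in network.get(recruiter, []) if p not in recruited]
            rng.shuffle(eligible)
            for person in eligible[:coupons]:
                recruited[person] = (recruiter, wave, len(network.get(person, [])))
                next_wave.append(person)
        if not next_wave:
            break   # recruitment chains have died out
        current = next_wave
    return recruited


# Hypothetical network: a simple chain a - b - c - d.
network = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
for person, record in simulate_rds(network, ["a"], rng=random.Random(0)).items():
    print(person, record)
```

The recruiter links and network sizes stored for each person are the inputs that the RDS analysis described above relies on.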

25 5-4-25

26 5-4-26 Sample size calculation The sample size can be based on the number of participants needed in each round (or year) to detect a change in the proportion of an indicator from one round to the next.

$$ n = D \, \frac{\left[ Z_{1-\alpha}\sqrt{2P(1-P)} + Z_{1-\beta}\sqrt{P_1(1-P_1) + P_2(1-P_2)} \right]^2}{(P_2 - P_1)^2} $$

Where: n = The sample size required in each survey round D = The design effect Z_{1-α} = The z score for the desired confidence level Z_{1-β} = The z score for the desired power P1 = The proportion of the sample reporting the indicator in year 1 P2 = The proportion of the sample reporting the indicator in year 2 P = (P1 + P2)/2

27 5-4-27 Sample size calculation, cont. D = design effect. The design effect can be thought of as a correction factor for how much a cluster sample differs from a simple random sample. The design effect accounts for the similarities people have when they are sampled within the same cluster. –The bigger the D, the larger the sample size needed.
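The slides do not say how D is obtained. One widely used approximation, given here as background rather than as the unit's own method, relates the design effect to the average number of respondents per cluster, m, and the intra-cluster correlation, ρ (both symbols are introduced here and do not appear in the slides):

$$ D \approx 1 + (m - 1)\,\rho $$

For example, assuming m = 20 respondents per cluster and ρ = 0.05, this gives D ≈ 1 + 19 × 0.05 = 1.95, close to a design effect of 2. With ρ = 0 (respondents within a cluster are no more alike than respondents in general), D = 1 and the cluster sample behaves like a simple random sample.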

28 5-4-28 Sample size calculation, cont. P1 and P2. P1 and P2 are the measures of interest for which you wish to see a change between survey rounds. –The smaller the change you wish to detect, the larger the sample size you will need. –The closer P1 and P2 are to 50%, the larger the sample size you will need.

29 5-4-29 Sample size calculation, cont. Z_{1-α}. The Z_{1-α} score is a statistic that corresponds to the level of significance desired. –The smaller the significance level (i.e., the higher the confidence level), the larger the sample size you will need. Z_{1-β}. The Z_{1-β} score is a statistic that corresponds to the power desired. –The higher the power, the larger the sample size you will need.

30 5-4-30 Table 4.5. Pre-calculated sample size estimates
Indicator level in wave 1 (P1) | Indicator level in wave 2 (P2) | Sample size needed each wave with a design effect of 1.25 | Sample size needed each wave with a design effect of 2.0
.10 | .20 | 247 | 395
.10 | .25 | 123 | 197
.20 | .30 | 363 | 581
.20 | .35 | 171 | 274
.30 | .40 | 441 | 706
.30 | .45 | 201 | 322
.40 | .50 | 480 | 768
.40 | .55 | 214 | 343
.50 | .60 | 480 | 768
.50 | .65 | 210 | 336
.60 | .70 | 441 | 706
.60 | .75 | 188 | 301
.70 | .80 | 363 | 581
.70 | .85 | 149 | 239
.80 | .90 | 247 | 395
.80 | .95 | 93 | 149

31 5-4-31 Example of sample size calculation Suppose you are planning a survey of sex workers using a two-stage cluster design. You wish to show that condom use will increase from 20% in the baseline survey (this year) to 30% or greater in the survey wave next year. How many sex workers do you need to include each year?

32 5-4-32 Example of sample size calculation, cont. Solution: D = 2 (moderate) Z_{1-α} = 1.96 (95% confidence level) Z_{1-β} = 0.83 (80% power) P1 = 20% condom use in year 1 P2 = 30% condom use in year 2 P = (.20 + .30)/2 = .25

$$ n = 2 \times \frac{\left[ 1.96\sqrt{2(.25)(1-.25)} + 0.83\sqrt{.20(1-.20) + .30(1-.30)} \right]^2}{(.30 - .20)^2} \approx 582 $$

sex workers per survey wave
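The worked example can be checked with a short function. This is a sketch of the sample size formula from this unit, using the z scores from the example as defaults (1.96 and 0.83); the function name `sample_size` and the choice to round up are assumptions.

```python
import math


def sample_size(p1, p2, design_effect=2.0, z_alpha=1.96, z_beta=0.83):
    """Participants needed per round to detect a change from p1 to p2.

    Defaults follow the worked example: 95% confidence level and 80% power.
    The result is rounded up to a whole person.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(design_effect * numerator / (p2 - p1) ** 2)


# Condom use rising from 20% to 30%, with a design effect of 2.
print(sample_size(0.20, 0.30, design_effect=2.0))   # 582
```

The unrounded value is about 581.5, so rounding up gives 582 as in the example, while rounding to the nearest whole number gives 581, which may explain the one-person difference from the pre-calculated table.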

33 5-4-33 Small group discussion a. What sampling strategies have you had experience with? b. What difficulties and successes did you have with the strategy?

34 5-4-34 Case study For each of the following groups, decide what is the best sampling strategy. Why is this the best strategy? What are the strong and weak points of using this method for the group? a. Group 1: Youth b. Group 2: MSM

