Sampling
A population is the total collection of units or elements you want to analyze, whether those units are residents of Nebraska, schools, editorials in newspapers, or local businesses. When the population is small enough, survey every element or unit of the population. The unit of analysis is the element about which you are observing and collecting data, such as a person responding to a questionnaire, a school, an editorial, or a local business. Data about the population's variables are called parameters. A selected subset of the population is called a sample. We collect data from a sample when the population is too large to study in full. A statistic refers to the information or data we have about the variables in a sample. Population parameters are what we estimate or infer from sample statistics when we were unable to do a full population study. In most cases, there will be a difference between the information (the statistics) we gather about the sample and the true parameters of the population. This difference is called the sampling error.
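To make the terms concrete, here is a minimal Python sketch of a parameter, a statistic, and the sampling error between them. The population of ages is made-up data and the sample size of 200 is an arbitrary assumption, chosen only for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 10,000 residents (made-up data).
population = [rng.randint(18, 90) for _ in range(10_000)]

# Parameter: the true mean, known here only because we hold the whole population.
parameter = statistics.fmean(population)

# Statistic: the mean of a simple random sample of 200 units.
sample = rng.sample(population, 200)
statistic = statistics.fmean(sample)

# Sampling error: the gap between the sample statistic and the population parameter.
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f} statistic={statistic:.2f} error={sampling_error:+.2f}")
```

In a real study the parameter is unknown, so the sampling error can only be estimated, not computed directly as it is here.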
Probability Sampling
Simple Random Sampling: each unit has an equal chance of being chosen for the sample. This allows you to generalize to the population from which the sample was selected. First, provide a complete list of all possible units in the population from which to choose a sample. This list becomes the sampling frame. Then, using a table of random numbers or computer-generated numbers, select the units. This may require numbering each unit, or using pre-existing numbers (such as ID numbers and mailbox numbers). Telephone surveys employ random sampling through random digit dialing techniques, in which machines generate phone numbers within various area codes and then dial the numbers.
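Selecting units with computer-generated numbers might look like the following Python sketch. The sampling frame of ID numbers 1 to 1000 and the sample size of 100 are assumptions made for illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: one ID number per unit, 1 through 1000.
sampling_frame = list(range(1, 1001))

# Draw 100 distinct IDs without replacement; every unit has the same chance.
sample_ids = rng.sample(sampling_frame, k=100)
print(sorted(sample_ids)[:10])
```

Because `random.sample` draws without replacement, no unit can be selected twice.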
Stratified Random Sampling
Stratify (categorize) your sample along the lines you want to analyze by establishing quotas for certain kinds of respondents. Divide the sampling frame into categories, let's say males and females. Within each category (stratum), take a simple random sample until you get the desired proportion of respondents for that category. You don't strictly need to stratify if your desired proportion matches the actual proportion in the population, since simple random sampling should result in approximately the same proportion. However, you would stratify in order to guarantee that you get exactly the same mix as in the population. So if the population is 60% female and 40% male and this is what you want, stratifying is optional, but if you want 50% male and 50% female then you must use stratified sampling techniques. For example, imagine you have 1000 people in the population and you want a sample of 100. You decide you want an equal number of First-Year, Sophomore, Junior, and Senior students. What would you do? Now imagine you want to oversample First-Year students: 40% First-Year, 30% Sophomores, 20% Juniors, and 10% Seniors. What do you do?
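One way to work through the two exercises is sketched below in Python. With a sample of 100, equal strata mean 25 students per class year, and the oversampled design means 40, 30, 20, and 10. The frame of 1000 students split evenly across class years and the helper `stratified_sample` are hypothetical, introduced only for this sketch.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable
class_years = ["First-Year", "Sophomore", "Junior", "Senior"]

# Hypothetical sampling frame: 1000 students, 250 per class year (assumed split).
frame = [(student_id, class_years[student_id % 4]) for student_id in range(1000)]

def stratified_sample(frame, proportions, sample_size, rng):
    """Draw a simple random sample inside each stratum, sized by `proportions`."""
    sample = []
    for stratum, share in proportions.items():
        members = [unit for unit in frame if unit[1] == stratum]
        quota = round(sample_size * share)  # number of units to take from this stratum
        sample.extend(rng.sample(members, quota))
    return sample

# Equal allocation: 25% of the sample from each class year.
equal = stratified_sample(frame, {year: 0.25 for year in class_years}, 100, rng)

# Oversampling First-Year students.
oversampled = stratified_sample(
    frame,
    {"First-Year": 0.40, "Sophomore": 0.30, "Junior": 0.20, "Senior": 0.10},
    100,
    rng,
)
```

Within each stratum the selection is still a simple random sample; only the number taken from each stratum is fixed in advance.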
Systematic Random Sampling
When there is a large number of units in the population, a quicker method is systematic random sampling. This involves taking every nth element in the sampling frame until the desired sample size is reached. Imagine you need a sample of 100 and you have 1000 people to select from. If you divide 1000 by 100, you get a sampling interval of 10. So you take every tenth person from the population list, beginning with a person whose position is a randomly selected number between 1 and 10. How would you calculate the sampling interval for a sample of 200 people you want to select from a population of 4000? Then what do you do?
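For the exercise, the sampling interval is 4000 / 200 = 20, so you pick a random starting position between 1 and 20 and then take every 20th person. A minimal Python sketch follows; the numbered list of people and the helper `systematic_sample` are hypothetical, and the sketch assumes the population size divides evenly by the sample size.

```python
import random

def systematic_sample(frame, sample_size, rng):
    """Take every k-th unit after a random start, where k = len(frame) // sample_size."""
    interval = len(frame) // sample_size
    start = rng.randrange(interval)  # random 0-based start within the first interval
    return frame[start::interval][:sample_size]

rng = random.Random(3)  # fixed seed so the sketch is repeatable
frame = list(range(1, 4001))  # hypothetical list of 4000 people, numbered 1..4000
sample = systematic_sample(frame, 200, rng)
print(sample[:5])
```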
Multistage or Cluster Sampling
This is a technique used by the large public opinion poll organizations. It involves randomly selecting units from larger clusters and moving to smaller ones at each stage until the desired number is achieved. For example, to do a national survey of college students, your first cluster might be regions of the U.S. (Northeast, Midwest, Southeast, Northwest, etc.). Randomly select a few regions at this first stage. Then list the states within the selected regions and randomly select states from this cluster. At the next stage, list the colleges and universities within the selected states and randomly select some colleges. Then make a list of students attending these selected colleges and, in this final stage, randomly select the students to whom you will send your questionnaire. At each stage, you could use any of the probability sampling methods: simple random, stratified random, or systematic random. The key point of probability sampling is the randomness of selection at each stage: every unit on the list at that stage has an equal chance of being chosen.
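The stages can be sketched in Python as nested random selections. Everything in the frame below (region, state, college, and student names, and the counts at each level) is made up for illustration. As a simplification, this sketch selects states separately within each chosen region rather than from one pooled list of states.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical nested frame: region -> state -> college -> list of student IDs.
frame = {
    f"region{r}": {
        f"state{r}.{s}": {
            f"college{r}.{s}.{c}": [f"student{r}.{s}.{c}.{i}" for i in range(50)]
            for c in range(5)
        }
        for s in range(4)
    }
    for r in range(6)
}

def multistage_sample(frame, n_regions, n_states, n_colleges, n_students, rng):
    """Randomly select clusters at each stage, then students within the final clusters."""
    selected = []
    for region in rng.sample(sorted(frame), n_regions):
        states = frame[region]
        for state in rng.sample(sorted(states), n_states):
            colleges = states[state]
            for college in rng.sample(sorted(colleges), n_colleges):
                selected.extend(rng.sample(colleges[college], n_students))
    return selected

# 2 regions x 2 states x 3 colleges x 10 students = 120 students.
students = multistage_sample(
    frame, n_regions=2, n_states=2, n_colleges=3, n_students=10, rng=rng
)
```

Only the clusters chosen at one stage need to be listed at the next, which is why this approach is practical when no single list of the whole population exists.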
Non-Probability Sampling
Not every element has an equal chance of being selected for the study. Therefore, you are limited to making conclusions only about those who have completed the survey. You cannot generalize to the entire population. Convenience or accidental sampling is based on whoever just happens to be available at a particular moment, such as those taking a particular class and attending on the day of the survey, or those walking by a certain street corner or section of the mall. Volunteer samples are also samples of convenience: those who respond to a sign asking for research subjects, participate because of some incentive (a course grade or money), or hear about a survey on some web site may be different kinds of people from those who do not even see those announcements.
Quota Sampling
As with stratified random sampling, researchers occasionally want to be sure that particular groups are represented in the final sample. Breaking your sample into various strata can entail any number of categories, such as race/ethnicity, gender, age, and any other characteristic used to screen respondents. Respondents are solicited conveniently (accidentally) or through other non-probability techniques until the number needed for each of the various criteria is met. The final sample could be representative of the population, reflecting the same proportion of each category as in the population. Or, quota sampling could be used to weight, oversample, or undersample certain groups if so desired. For example, because you are doing a study of religious beliefs, you decide to have several equal-sized categories of religions. So you establish a quota of 25% Catholic, 25% Jewish, 25% Protestant, and 25% Atheist. If your desired sample size is 200, what do you do next?
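For the exercise, 25% of 200 gives a quota of 50 respondents per category. You then keep accepting whoever is conveniently available until each quota is full, turning away people whose category is already complete. The Python sketch below simulates this; the function `next_respondent` is a hypothetical stand-in for approaching people, and its random answers are made-up data.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is repeatable
religions = ["Catholic", "Jewish", "Protestant", "Atheist"]
sample_size = 200

# 25% of 200 gives a quota of 50 respondents per category.
quotas = {religion: sample_size // 4 for religion in religions}

def next_respondent(rng):
    """Stand-in for whoever happens to be available (hypothetical, made-up data)."""
    return rng.choice(religions)

counts = {religion: 0 for religion in religions}
approached = 0
while any(counts[r] < quotas[r] for r in religions):
    religion = next_respondent(rng)
    approached += 1
    if counts[religion] < quotas[religion]:
        counts[religion] += 1  # accept: this category's quota is not yet full
    # otherwise turn the person away, since that quota is already met

print(counts, "approached:", approached)
```

Unlike stratified random sampling, nothing here is drawn at random from a sampling frame, so the result remains a non-probability sample.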
Purposive or Judgmental Sampling
Designate a group of people for selection because you know they have some traits you want to study. For example, marketing researchers test products in a particular city because they have made the judgment that shoppers in that city represent a cross-section of potential buyers. Or perhaps you purposively select a particular dorm of students to study because you know from past research that their opinions tend to represent those of the entire campus. However, as with all non-probability sampling, you cannot generalize to the population beyond those who were sampled.
Snowball Sampling
This non-probability technique is useful when you need a sample of people who are difficult to locate (that is, when it is impossible to identify the total sampling frame). Researchers first identify a handful of people who are members of the category they wish to study, perhaps through personal contacts or organizations. Each person who volunteers is then asked to pass along a questionnaire to (or give the researcher the name of) someone he or she knows who also belongs to that category. Like a snowball rolling down a hill, the sample becomes larger and larger as it picks up more snow, that is, respondents with the characteristics you are looking for. For example, imagine you want to do a study of skinheads, but it is impossible to get a list of all skinheads. What do you do next?
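The referral process can be sketched in Python as a wave-by-wave walk through a network of contacts. The names, the referral network, and the helper `snowball_sample` are all hypothetical, made up only to show how the sample grows from a few initial contacts.

```python
from collections import deque

# Hypothetical referral network: each person names others in the same category.
referrals = {
    "Ana": ["Ben", "Cal"],
    "Ben": ["Dee"],
    "Cal": ["Dee", "Eli"],
    "Dee": [],
    "Eli": ["Fay"],
    "Fay": [],
    "Gus": ["Ana"],  # never reached: nobody in the chain names Gus
}

def snowball_sample(seeds, referrals, max_size):
    """Grow the sample wave by wave from the seed contacts until max_size is reached."""
    sample = []
    queue = deque(seeds)
    seen = set(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # do not recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return sample

sample = snowball_sample(["Ana"], referrals, max_size=5)
print(sample)
```

People whom no respondent names, like Gus in this made-up network, can never enter the sample, which is one reason the results cannot be generalized to the whole category.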
Cross-sectional vs. Longitudinal Samples
Design                 Respondents                               Time line
Cross-sectional        One set                                   One time
Longitudinal: Panel    One set                                   Two or more time periods
Longitudinal: Trend    Different people                          Two or more time periods
Longitudinal: Cohort   Different people, shared characteristics  Two or more time periods