Chapter 6 Sampling
Introduction Sampling - The process of selecting observations Often not possible to collect information from all persons or other units you wish to study Often not necessary to collect data from everyone out there Allows researcher to make a small subset of observations and then generalize to the rest of the population
OBSERVATION AND SAMPLING Polls and other forms of social research rest on observations The task of researchers is to select the key aspects to observe (sample) Probability sampling, which relies on random selection, is what allows generalizing from a sample to a larger population
POPULATIONS AND SAMPLING FRAMES Findings based on a sample represent the aggregation of elements that compose the sampling frame All elements must have equal representation in the frame
SAMPLING FRAME That list or quasi list of units composing a population from which a sample is selected If the sample is to be representative of the population, it is essential that the sampling frame include all (or nearly all) members of the population
REPRESENTATIVENESS Representativeness - Quality of a sample having the same distribution of characteristics as the population from which it was selected EPSEM - Equal probability of selection method. A sample design in which each member of a population has the same chance of being selected into the sample
POPULATION The theoretically specified aggregation of study elements Study population - Aggregation of elements from which the sample is actually selected Element - Unit about which information is collected and that provides the basis of analysis
RANDOM SELECTION Each element has an equal chance of selection independent of any other event in the selection process
PARAMETER VS. STATISTIC Parameter - Summary description of a given variable in a population Statistic - Summary description of a given variable in a sample
The Logic of Probability Sampling Enables us to generalize findings from observing cases to a larger unobserved population Representative - Each member of the population has a known and equal chance of being selected into the sample Since we are not completely homogeneous, our sample must reflect – and be representative of – the variations that exist among us
Conscious and Unconscious Sampling Bias What is the proportion of FAU students who have been to an FAU football game? Be conscious of bias – When sample is not fully representative of the larger population from which it was selected Equal Probability of Selection Method (EPSEM) A sample is representative if its aggregate characteristics closely match the population’s aggregate characteristics; basis of probability sampling
Probability theory and sampling distribution Sample Element: Who or what are we studying (student) Population: Whole group (college freshmen) Population Parameter: The value for a given variable in a population Sample Statistic: The summary description of a given variable in the sample; we use sample statistics to make estimates or inferences of population parameters
Probability theory and sampling distribution Purpose of sampling: To select a set of elements from a population in such a way that descriptions of those elements (sample statistics) accurately portray the parameters of the total population from which the elements are selected The key to this process is random selection Sampling Distribution: The range of sample statistics we will obtain if we select many samples
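A sampling distribution can be illustrated with a small simulation. This is a minimal sketch, not a procedure from the text: the population of 10,000 students, the 40% attendance figure, the sample size, and the function name `sampling_distribution` are all assumptions made for illustration.

```python
import random
import statistics

def sampling_distribution(population, sample_size, num_samples, seed=0):
    """Draw many simple random samples and return each sample's proportion."""
    rng = random.Random(seed)
    return [
        sum(rng.sample(population, sample_size)) / sample_size
        for _ in range(num_samples)
    ]

# Hypothetical population: 1 = attended a game, 0 = did not (parameter = 0.40).
population = [1] * 4000 + [0] * 6000
estimates = sampling_distribution(population, sample_size=100, num_samples=1000)

print("population parameter:", sum(population) / len(population))
print("mean of sample statistics:", round(statistics.mean(estimates), 3))
print("range of sample statistics:", min(estimates), "to", max(estimates))
```

Individual sample statistics vary from sample to sample, but they cluster around the population parameter, which is what makes estimation from a single sample defensible.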
From sampling distribution to parameter estimate Sampling Frame: list of elements in our population Increasing the number of samples selected and interviewed increases the range of estimates provided by the sampling operation
Estimating sampling error If many independent random samples are selected from a population, then the sample statistics provided by those samples will be distributed around population parameter in a known way Probability theory gives us a formula for estimating how closely the sample statistics are clustered around the true value Standard Error: A measure of sampling error Tells us how sample statistics will be dispersed or clustered around a population parameter
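The slide does not give the formula itself. As a sketch, assuming the statistic is a sample proportion drawn by simple random sampling, the usual binomial standard error is sqrt(p * (1 - p) / n). The function name and the example values are illustrative, and the finite population correction is ignored.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)

# Hypothetical values: a 50/50 split, with samples of 100 and 400.
se_small = standard_error(0.5, 100)
se_large = standard_error(0.5, 400)
print(se_small, se_large)
```

Quadrupling the sample size halves the standard error, so larger samples cluster more tightly around the parameter.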
Confidence levels and confidence intervals Two key components of sampling error We express the accuracy of our sample statistics in terms of a level of confidence that the statistics fall within a specified interval from the parameter The logic of confidence levels and confidence intervals also provides the basis for determining the appropriate sample size for a study
Probability theory & sampling distribution summed up Random selection permits the researcher to link findings from a sample to the body of probability theory so as to estimate the accuracy of those findings All statements of accuracy in sampling must specify both a confidence level and a confidence interval The researcher must report that he or she is x percent confident that the population parameter is between two specific values
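A statement of the form "x percent confident that the parameter is between two values" can be computed as follows. This is a minimal sketch under stated assumptions: a sample proportion, simple random sampling, and the normal approximation. The function name and the example figures (40% of 400 respondents) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (low, high) bounds for a proportion at the given confidence level."""
    # z multiplier for a two-sided interval, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

low, high = proportion_confidence_interval(0.40, 400)
print(f"95% confident the parameter lies between {low:.3f} and {high:.3f}")
```

Raising the confidence level widens the interval; raising the sample size narrows it.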
Probability sampling: populations & sampling frames Different types of probability sampling designs can be used alone or in combination for different research purposes Key feature of all probability sampling designs: the relationship between populations and sampling frames Sampling frame: The quasi-list of elements from which a probability sample is selected
TYPES OF PROBABILITY SAMPLING DESIGN Simple random sampling (SRS) Systematic sampling Stratified sampling Cluster sampling
Simple random sampling Each element in a sampling frame is assigned a number, choices are then made through random number generation as to which elements will be included in your sample Forms the basis of probability theory and the statistical tools we use to estimate population parameters, standard error, and confidence intervals Feasible only with the simplest sampling frame Not the most accurate method available
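The numbering-and-random-selection procedure above can be sketched in a few lines. The frame of 100 labeled elements and the function name are hypothetical; `random.Random.sample` draws without replacement, so no element is selected twice.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n elements from the sampling frame, each with equal probability."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: 100 numbered elements.
frame = [f"element_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Fixing the seed makes the draw reproducible, which is useful for documenting how a sample was selected.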
A SIMPLE RANDOM SAMPLE
Systematic sampling Systematic Sampling – Elements in the total list are chosen (systematically) for inclusion in the sample List of 10,000 elements, we want a sample of 1,000, select every tenth element Choose first element randomly Danger: “Periodicity" A periodic arrangement of elements in the list can make systematic sampling unwise Slightly more accurate than simple random sampling Arrangement of elements in the list can result in a biased sample
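The 10,000-element example above can be sketched as follows. The function name is an assumption; the sampling interval is the frame size divided by the desired sample size, and the first element is chosen at random within the first interval.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element after a random start, where k = len(frame) // n."""
    interval = len(frame) // n
    if interval < 1:
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start within the first interval
    return frame[start::interval][:n]

# The slide's example: a list of 10,000 elements and a sample of 1,000.
frame = list(range(1, 10001))
sample = systematic_sample(frame, 1000, seed=1)
print(sample[:5])
```

If the list has a periodic pattern whose cycle matches the interval, every selected element falls at the same point in the cycle, which is the periodicity danger noted above.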
Stratified sampling Stratified sampling: Ensures that appropriate numbers are drawn from homogeneous subsets of that population Method for obtaining a greater degree of representativeness—decreasing the probable sampling error Disproportionate stratified sampling: Way of obtaining sufficient # of rare cases by selecting a disproportionate # To purposively produce samples that are not representative of a population on some variable
Stratification Grouping of units composing a population into homogeneous groups before sampling This procedure, which may be used in conjunction with simple random, systematic, or cluster sampling, improves the representativeness of a sample, at least in terms of the stratification variables
Stratified Sampling Rather than selecting a sample from the population at large, researcher draws from homogeneous subsets of the population Results in a greater degree of representativeness by decreasing the probable sampling error
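Proportionate stratified sampling can be sketched as grouping the frame into strata and then sampling the same fraction from each. The frame of first-year and senior students, the 10% fraction, and the function name are hypothetical.

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(frame, stratum_of, fraction, seed=None):
    """Group elements into strata, then draw the same fraction from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical frame: 600 first-year students and 400 seniors.
frame = [("first-year", i) for i in range(600)] + [("senior", i) for i in range(400)]
sample = proportionate_stratified_sample(frame, lambda e: e[0], 0.10, seed=7)
counts = {s: sum(1 for e in sample if e[0] == s) for s in ("first-year", "senior")}
print(counts)
```

The sample reproduces the 60/40 split of the population exactly on the stratification variable, which a simple random sample would match only approximately. A disproportionate design would use a different fraction per stratum.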
A STRATIFIED, SYSTEMATIC SAMPLE WITH A RANDOM START
CLUSTER SAMPLING A multistage sampling in which natural groups are sampled initially with the members of each selected group being subsampled afterward.
MULTISTAGE CLUSTER SAMPLING Used when it's not possible or practical to create a list of all the elements that compose the target population Involves repetition of two basic steps: listing and sampling Highly efficient but less accurate
Multistage cluster sampling Compile a stratified group (cluster), sample it, then subsample that set... May be used when it is either impossible or impractical to compile an exhaustive list of the elements that compose the target population, (Ex.: All law enforcement officers in the US) Involves the repetition of two basic steps: Listing Sampling
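The two repeated steps, listing and sampling, can be sketched for a two-stage design. The 20 agencies with 50 officers each, the stage sizes, and the function name are hypothetical. Only the selected clusters need a full list of members.

```python
import random

def multistage_cluster_sample(clusters, num_clusters, per_cluster, seed=None):
    """Two-stage design: sample clusters, then subsample members within each."""
    rng = random.Random(seed)
    # Stage 1: list the clusters and sample among them.
    chosen = rng.sample(sorted(clusters), num_clusters)
    sample = {}
    for name in chosen:
        # Stage 2: list the members of each selected cluster and subsample.
        members = clusters[name]
        sample[name] = rng.sample(members, min(per_cluster, len(members)))
    return sample

# Hypothetical clusters: 20 agencies, each with 50 officers.
clusters = {
    f"agency_{a:02d}": [f"officer_{a:02d}_{i:02d}" for i in range(50)]
    for a in range(20)
}
sample = multistage_cluster_sample(clusters, num_clusters=5, per_cluster=10, seed=3)
for agency, officers in sample.items():
    print(agency, len(officers))
```

Each stage introduces its own sampling error, which is why the design is efficient but less accurate than a single-stage sample of the same size.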
National Crime Victimization Survey Seeks to represent the nationwide population of persons 12+ living in households (≈ 42K units, 74K occupants in 2004) First defined are primary sampling units (PSUs) Largest are automatically included, smaller ones are stratified by size, population density, reported crimes, and other variables into about 150 strata Census enumeration districts are selected (CED) Clusters of 4 housing units from each CED are selected
British Crime Survey First stage – 289 Parliamentary constituencies, stratified by geographic area and population density Two sample points were selected, which were divided into four segments with equal #’s of delivery addresses One of these four segments was selected at random, then disproportionate sampling was conducted to obtain a greater number of inner-city respondents Household residents aged 16+ were listed, and one was randomly selected by interviewers (n=37,213 in 2004)
NONPROBABILITY SAMPLING Technique in which samples are selected in a way that is not suggested by probability theory Reliance on available subjects: Only justified if less risky sampling methods are not possible Researchers must exercise caution in generalizing from their data when this method is used
Nonprobability Sampling Purposive sampling: Selecting a sample on the basis of your judgment and the purpose of the study Quota sampling: Units are selected so that total sample has the same distribution of characteristics as are assumed to exist in the population being studied Reliance on available subjects Snowball sampling - You interview some individuals, and then ask them to identify others who will participate in the study, who ask others…etc.
Purposive (Judgmental) Sampling Selecting a sample based on knowledge of a population, its elements, and the purpose of the study Used when field researchers are interested in studying cases that don’t fit into regular patterns of attitudes and behaviors
Snowball Sampling Appropriate when members of a population are difficult to locate Researcher collects data on members of the target population she can locate, then asks them to help locate other members of that population
Quota Sampling Begin with a matrix of the population Data is collected from people with the characteristics of a given cell Each group is assigned a weight appropriate to their portion of the population Data should represent the total population
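One common way to assign the cell weights is to divide each cell's share of the population by its share of the sample. This is a sketch under that assumption; the two-cell matrix, the counts, and the function name are hypothetical.

```python
def cell_weights(population_shares, sample_counts):
    """Weight for each cell = population share / sample share."""
    total = sum(sample_counts.values())
    weights = {}
    for cell, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"no respondents in cell {cell!r}")
        weights[cell] = population_shares[cell] / (count / total)
    return weights

# Hypothetical matrix: the population is split 50/50, the sample 60/40.
population_shares = {"women": 0.5, "men": 0.5}
sample_counts = {"women": 60, "men": 40}
weights = cell_weights(population_shares, sample_counts)
print(weights)
```

Overrepresented cells receive weights below 1 and underrepresented cells receive weights above 1, so the weighted sample matches the assumed population distribution.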
Chapter 7 Survey Research and Other Ways of Asking Questions
Introduction Survey research is perhaps the most frequently used mode of observation in sociology and political science, and surveys are often used in criminal justice research as well You have no doubt been a respondent in some sort of survey, and you may have conducted a survey yourself
Survey research is the most frequently used method Fast Cheap Individual as the unit of analysis (usually) Cross-sectional All types of research (exploration, description, explanation, application)
Steps in Survey Research Target population Types of respondent Types of survey Develop the questionnaire Pre/pilot test the instrument Plan a system for recording answers
Topics Appropriate to Survey Research Counting Crime: asking people about victimization counters problems of data collected by police Self-Reports: dominant method for studying the etiology of crime Frequency/type of crimes committed Prevalence (how many people commit crimes) committed by a broader population
Topics Appropriate to Survey Research Perceptions and Attitudes: To learn how people feel about crime and CJ policy Targeted Victim Surveys: Used to evaluate policy innovations & program success Other Evaluation Uses: e.g., Measuring community attitudes, citizen responses, etc. Chicago Community Policing Evaluation Consortium General Purpose Crime Surveys
Guidelines for Asking Questions How questions are asked is the single most important feature of survey research Open-Ended: Respondent is asked to provide his or her own answer Closed-Ended: Respondent selects an answer from a list Choices should be exhaustive and mutually exclusive Questions and Statements – (Likert scale)
Types of Questions Open-ended questions Respondent is asked to provide his or her own answer to the question Closed-ended questions Respondent is asked to select an answer from among a list provided by the researcher
Guidelines for asking questions Make Items Clear: Avoid ambiguous questions; do not ask “double-barreled” questions Short Items are Best: Respondents like to read and answer a question quickly Avoid Negative Items: Leads to misinterpretation Avoid Biased Items and Terms: Do not ask questions that encourage a certain answer Designing Self-Report Items: Use of computer assisted interviewing techniques
Questionnaire Construction General questionnaire format – critical, must be laid out properly – uncluttered Be aware of issues with ordering items Include instructions for the questionnaire Pretest all or part of the questionnaire Contingency Questions: Relevant only to some respondents – answered only based on their previous response Matrix Questions: Same set of answer categories used by multiple questions
Guidelines for Questionnaire Construction Be aware of issues with ordering items. Include instructions for the questionnaire. Pretest all or part of the questionnaire.
Contingency Question Survey question intended only for some respondents, determined by their response to some other questions
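Contingency logic can be expressed as a skip rule attached to a question. This is a minimal sketch: the three questions, the dictionary layout, and the function names are hypothetical and are not taken from any particular survey instrument.

```python
# Hypothetical questionnaire: Q2 is a contingency question asked only after "yes" to Q1.
questionnaire = {
    "Q1": {"text": "Have you ever been the victim of a burglary?",
           "skip": {"no": "Q3"}, "next": "Q2"},
    "Q2": {"text": "Did you report the burglary to the police?", "next": "Q3"},
    "Q3": {"text": "How safe do you feel in your neighborhood?", "next": None},
}

def next_question(current_id, answer, questionnaire):
    """Return the id of the next question, honoring any skip rule for this answer."""
    question = questionnaire[current_id]
    return question.get("skip", {}).get(answer, question["next"])

def administer(answers, questionnaire, start="Q1"):
    """Return the ids of the questions a respondent with these answers would see."""
    asked = []
    current = start
    while current is not None:
        asked.append(current)
        current = next_question(current, answers.get(current), questionnaire)
    return asked

print(administer({"Q1": "no"}, questionnaire))
print(administer({"Q1": "yes", "Q2": "yes"}, questionnaire))
```

A respondent who answers "no" to Q1 never sees Q2, which is the behavior a printed questionnaire achieves with arrows and boxed instructions.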
Contingency Question Format
Matrix Question Format
Ordering Questions in a Questionnaire Ordering may affect the answers given Estimate the effect of question order Perhaps devise more than one version Begin with most interesting questions End with duller, demographic data This is opposite for in-person interview surveys
MAIL SURVEY Costs Warning letters Consents Follow up mailings Postage Response rate
MAIL SURVEY: RESPONSE RATE Number of people participating in a survey divided by the number selected in the sample Acceptable response rate 50% - adequate for analysis and reporting 60% - good 70% - very good
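The response rate definition and the rule-of-thumb thresholds above translate directly into code. The function names, the example counts, and the "below adequate" label for rates under 50% are assumptions added for illustration.

```python
def response_rate(num_participating, num_selected):
    """Number of people participating divided by the number selected in the sample."""
    if num_selected <= 0:
        raise ValueError("num_selected must be positive")
    return num_participating / num_selected

def rate_label(rate):
    """Classify a response rate using the 50% / 60% / 70% rules of thumb."""
    if rate >= 0.70:
        return "very good"
    if rate >= 0.60:
        return "good"
    if rate >= 0.50:
        return "adequate"
    return "below adequate"  # label assumed; the slide gives no name for this range

# Hypothetical mailing: 1,000 questionnaires sent, 620 returned.
rate = response_rate(620, 1000)
print(f"{rate:.0%} is {rate_label(rate)}")
```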
Self-Administered Questionnaires Can be home-delivered Researcher delivers questionnaire to home of sample respondent, explains the study, and then comes back later Mailed (sent and returned) survey is most common Researchers must reduce the trouble it takes to return a questionnaire
Warning Mailings & Cover Letters Used to increase response rates Warning Mailings: “Address correction requested” card sent out to determine incorrect addresses and to “warn” residents to expect questionnaire in mail Cover Letters: Detail why survey is being conducted, why respondent was selected, why is it important to complete questionnaire Include institutional affiliation or sponsorship
Other Aspects of Self-Administered Questionnaires Monitoring returns: Pay close attention to the response rate, assign #’s serially Follow-up mailings: Nonrespondents can be sent a letter, or a letter and another questionnaire; timing Acceptable response rates: 50%? 60%? 70%? We would rather have a lack of response bias than a high response rate?
Computer-Based Self-Administration Via Fax, Email, Web Site/Page Issues Representativeness Mixed in with, or mistaken for, spam Requires access to Web Sampling frame?
In-Person Interview Surveys Typically achieve higher response rates than mail surveys (80-85% is considered good) Demeanor and appearance of interviewer should be appropriate; interviewer should be familiar with questionnaire and ask questions precisely When more than one interviewer administers, efforts must be coordinated and controlled Practice interviewing
GUIDELINES FOR INTERVIEW SURVEY Dress in a similar manner to the people who will be interviewed. Study and become familiar with the questionnaire. Follow question wording exactly. Record responses exactly. Probe for responses when necessary.
TRAINING FOR INTERVIEWERS Discussion of general guidelines and procedures. Specify how to handle difficult or confusing situations. Conduct demonstration interviews. Conduct “real” interviews.
Computer-Assisted Interviews Reported success in enhancing confidentiality Reported higher rates of self-reporting Computer-assisted personal interview (CAPI) – Interviewers read questions from screens and then type in respondents' answers Computer-assisted self-interviewing (CASI) – Respondent keys in answers, which are scrambled so that interviewer cannot access them
Telephone Surveys 95.5% of all households have telephones (2005, US Census Bureau) Random-Digit Dialing Obviates unlisted number problem Often results in business, pay phones, fax lines Saves money and time, provides safety to interviewers, more convenient May be interpreted as bogus sales calls; ease of hang-ups
TELEPHONE SURVEYS Advantages: Money and time Control over data collection Disadvantages: Surveys that are really ad campaigns Representativeness
Computer-Assisted Telephone Interviewing (CATI) A set of computerized tools that aid telephone interviewers and supervisors by automating various data collection tasks Easier, faster, more accurate but more expensive Formats responses into a data file as they are keyed in Can automate contingency questions and skip sequences
Comparison of the Three Methods Self-administered questionnaires are generally cheaper, better for sensitive issues than interview surveys Using mail: Local and national surveys are same cost Interviews: More appropriate when respondent literacy may be a problem, produce fewer incompletes, achieve higher completion rates Validity low, reliability high in survey research Surveys are also inflexible, superficial in coverage
Specialized Interviewing Two variations: General interview guide: Less structured, lists issues, topics, questions you wish to cover; no standardized order Standardized open-ended interview: More structured, specific questions in specific order; useful in case studies, retrieves rich detail in responses
Focus Groups 12-15 people brought together to engage in guided group discussion of some topic Members are selected to represent a target population, but cannot make statistical estimates about population Most useful when precise generalization to larger group is not necessary May be used to guide interpretation of questionnaires following survey administration
STRENGTHS OF SURVEY RESEARCH Useful in describing the characteristics of a large population Make large samples feasible Flexible - many questions can be asked on a given topic
WEAKNESSES OF SURVEY RESEARCH Can seldom deal with the context of social life Inflexible in some ways Subject to artificiality Weak on validity
Should You Do It Yourself? Consider start-up costs Finding, training, paying interviewers is time consuming and not cheap, and requires some expertise Mail surveys are less expensive, and can be conducted well by 1-2 persons The method you use depends on your research question