SAMPLING Purposes Representativeness “Sampling error”


1 SAMPLING Purposes Representativeness “Sampling error”

2 Review: essential definitions
Population (N = size)
- The largest group to which we intend to project (apply) the findings of a study
- Examples: all prisoners in Jay’s prison; all students in his class
- “Parameter”: any summary measure (e.g., the mean) of a population
Sample (n = size)
- Any subgroup of the population
- Samples intended to represent a population must be selected in special ways (covered later)
Unit of analysis
- The “container” for the variables
- Here, the variables under study are sentence length and type of crime (property or violent)
- What “contains” them? Prisoners!
Case
- A single occurrence of a unit of analysis
- Here, it’s any one prisoner
- Cases are “members” or “elements” of the population from which one or more samples are drawn
Sampling frame
- A list of all “elements” or members of the population
[Diagram: a sample drawn from a population (Jay’s prison / Jay’s class)]

3 Purposes & representativeness
Purposes of sampling
- Descriptive: describe characteristics of a population (e.g., age, height, gender) without having to measure every member
- Explanatory: help test hypotheses of cause and effect (e.g., gender determines height)
Representativeness
- Samples should accurately reflect, or represent, the population from which they are drawn
- We will be exploring ways to make samples “representative”
- If a sample is representative, we can apply findings from that sample (“make inferences”) to the population from which it was drawn
[Figure: individual cases with numeric values and gender (M/F); panels “Population parameters” and “Sample statistics”]

4 Sampling error
- Sampling error: unintended differences between a population parameter and the equivalent statistic from an unbiased sample
- An inevitable result of sampling
Try it out in class!
- Calculate the population mean for age. Then take samples of different sizes and calculate their means.
- Any difference between a population parameter and a sample statistic is “sampling error.” It tends to decrease as sample size increases.
Rule of thumb
- To minimize sampling error, sample size should be at least 30 for populations up to about 500
- For larger populations, sample sizes should be larger
[Figure: panels “Population parameters,” “Sample 1 statistics,” and “Sample 2 statistics”]
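The in-class exercise can also be simulated. The following is a minimal sketch, assuming a made-up population of 200 ages (not the data shown on the slide). The function name `mean_abs_sampling_error` and the trial count are illustrative choices.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, trials=2000, seed=0):
    """Average |sample mean - population mean| over repeated simple random samples."""
    rng = random.Random(seed)
    mu = statistics.mean(population)  # the population parameter
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, n)  # one sample, drawn without replacement
        errors.append(abs(statistics.mean(sample) - mu))  # its sampling error
    return statistics.mean(errors)

# Hypothetical population of 200 ages between 18 and 75 (made up, not the slide's data).
age_rng = random.Random(42)
population = []
for _ in range(200):
    population.append(age_rng.randint(18, 75))

# Typical sampling error for samples of 3, 10, and 30 cases.
for n in (3, 10, 30):
    print(n, round(mean_abs_sampling_error(population, n), 2))
```

Averaged over many repeated samples, the error shrinks as n grows. Any single sample can still land closer to or farther from the parameter by chance.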

5 PROBABILITY SAMPLING
- With/without replacement
- Simple random sampling
- Stratified random sampling / proportionate
- Stratified random sampling / disproportionate

6 Sampling with / without replacement
With replacement: return each case to the population before drawing the next
- Makes it possible to redraw the same case (not good)
- Keeps the probability of a case being drawn the same from beginning to end (good)
Without replacement: drawn cases are not returned to the population
- The probability of each undrawn case being selected increases as cases are drawn
Sampling without replacement is by far the most common
- Most sampling frames are large enough that the probability of drawing any particular case changes only slightly as cases are drawn
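The two approaches can be compared in a few lines. This is a sketch that assumes a hypothetical sampling frame of 20 case IDs. The helper `draw_probabilities` is an illustrative name, not part of any library.

```python
import random
from fractions import Fraction

rng = random.Random(1)
frame = list(range(1, 21))  # hypothetical sampling frame of 20 case IDs

with_repl = rng.choices(frame, k=10)    # with replacement: duplicates are possible
without_repl = rng.sample(frame, k=10)  # without replacement: every case is distinct

def draw_probabilities(frame_size, draws, replacement):
    """Chance that a particular not-yet-drawn case is picked on each successive draw."""
    if replacement:
        return [Fraction(1, frame_size)] * draws
    probabilities = []
    for already_drawn in range(draws):
        probabilities.append(Fraction(1, frame_size - already_drawn))
    return probabilities

print(draw_probabilities(20, 3, replacement=True))     # stays at 1/20
print(draw_probabilities(20, 3, replacement=False))    # 1/20, then 1/19, then 1/18
print(draw_probabilities(2000, 3, replacement=False))  # large frame: barely changes
```

With a frame of 2,000 cases the probability moves from 1/2000 to 1/1998 over three draws, which is why the difference is usually ignored in practice.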

7 Simple random sampling
In simple random sampling we draw at random from the entire population, so every case has the same chance of being selected.
[Figure: the population of individual cases, each with numeric values and gender (M/F)]

8 Using simple random sampling to describe a population
Population: 200 inmates
Mean sentence: 2.94 years
Assignment
- Draw a random sample of 10 and compare its mean to the population parameter.
- Then do the same with a random sample of 30.
- How much error is there? Does it change with sample size?
[Figure: chart of frequency (# prisoners) by sentence length in years. Data from Jay’s correctional center; Koko Wachtel, warden]

9 Stratified random sampling - proportionate
1. Draw a random sample (say, n = 20) from the population
2. “Stratify”: group these cases according to their value or score on your variable of interest. These groups are called “strata” (sing., “stratum”)
3. If you did a good job randomly sampling, the size of each group (its “n,” or number of cases) should be roughly proportionate to that value’s share of the population
4. Proceed with your analysis
[Figure: a population (M = 21, F = 10) and a random sample of n = 20 split into strata M (n = 14) and F (n = 6)]
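The proportionality check can be done with the counts shown on this slide. This sketch reads (M=21 F=10) as the population and (M=14 F=6) as the sample of 20, which is an interpretation of the figure labels. The helper name `proportions` is illustrative.

```python
def proportions(counts):
    """Convert stratum counts into each stratum's share of the total."""
    total = sum(counts.values())
    shares = {}
    for stratum, count in counts.items():
        shares[stratum] = count / total
    return shares

population_counts = {"M": 21, "F": 10}  # read from the slide's figure labels
sample_counts = {"M": 14, "F": 6}       # the n = 20 sample after stratifying

population_share = proportions(population_counts)
sample_share = proportions(sample_counts)

# Compare each stratum's share of the population with its share of the sample.
for stratum in population_counts:
    print(stratum, f"{population_share[stratum]:.1%}", f"{sample_share[stratum]:.1%}")
```

The population is about 68% male and the sample is 70% male, so the strata are roughly proportionate.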

10 Stratified random sampling - disproportionate
1. First, designate a variable of interest (gender)
2. Separate the population into subgroups (“strata”) that correspond with the variable’s values (M, F)
3. Draw random samples of equal size from each subgroup
4. Proceed with your analysis

11 Using stratified random sampling to describe a population
Population: 200 inmates; mean sentence 2.94 years
- Property crimes: mean sentence 2.88 years
- Violent crimes: 50 inmates; mean sentence 3.12 years
Assignment
- Draw a random sample of 30 from each stratum and compare its mean to the corresponding population parameter.
- How much error is there?
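The stratum means and the overall mean are tied together by the stratum sizes. The sketch below checks that relationship. The transcript does not state the number of property-crime inmates, so the code assumes it is the remainder, 200 - 50 = 150.

```python
total_inmates = 200
violent_n = 50
property_n = total_inmates - violent_n  # assumption: count not stated in the transcript

property_mean = 2.88  # years
violent_mean = 3.12   # years

# The population mean is the size-weighted average of the stratum means.
combined_mean = (property_n * property_mean + violent_n * violent_mean) / total_inmates
print(round(combined_mean, 2))
```

Under that assumption the weighted average comes out to 2.94 years, matching the population mean given on the slide.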

12 Using stratified random sampling to test a hypothesis - exercises
Exercise 1
- Hypothesis: A pre-existing personal relationship between criminal and victim is more likely in violent crimes than in crimes against property
- You have full access to crime data for Sin City. These statistics show that in 2014 there were 200 crimes, of which 75 percent were property crimes and 25 percent were violent crimes. For each crime, you know whether the victim and the suspect were acquainted (yes/no).
Exercise 2
- Hypothesis: Male cops are more cynical than female cops
- Sin City Police Department has 200 officers; 150 are male and 50 are female.
Questions
1. Identify the population.
2. How would you sample proportionally?
3. How would you sample disproportionately?
4. In this example which of the above is preferable? Why?

13 Stratified proportionate random sampling
Hypothesis: A pre-existing personal relationship between criminal and victim is more likely in violent crimes than in crimes against property
Sin City: 200 crimes in 2014
- 50 violent (25%)
- 150 property (75%)
Randomly select 30 cases (15% of the population)
- Expect about 7.5 violent (25%)
- Expect about 22.5 property (75%)
Compare the proportions within each category where suspect and victim were acquainted
BUT: The frequency (number of cases) for violent crime is very small!
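The expected counts above follow from multiplying the sample size by each stratum's share of the population. A minimal sketch of that arithmetic follows. The function name `proportional_allocation` is illustrative.

```python
def proportional_allocation(stratum_sizes, n):
    """Expected number of sampled cases per stratum when sampling proportionately."""
    total = sum(stratum_sizes.values())
    expected = {}
    for name, size in stratum_sizes.items():
        expected[name] = n * size / total  # sample size times the stratum's share
    return expected

sin_city_crimes = {"violent": 50, "property": 150}
print(proportional_allocation(sin_city_crimes, 30))  # {'violent': 7.5, 'property': 22.5}
```

Only seven or eight violent crimes would be expected in the sample, which is the small-frequency problem the slide points out.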

14 Stratified disproportionate random sampling
Hypothesis: A pre-existing personal relationship between criminal and victim is more likely in violent crimes than in crimes against property
Sin City: 200 crimes in 2014
- 50 violent (25%)
- 150 property (75%)
Randomly select 30 cases from each category
- 30 property
- 30 violent
Compare the proportions within each category where suspect and victim were acquainted
Note: don’t recombine these into a single sample to compute a mean, because the two categories were sampled at different rates!
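Drawing equal-sized samples from each stratum might look like the sketch below. The crime records are hypothetical: the acquaintance flags are random filler, not Sin City data. The function name `stratified_sample` is illustrative.

```python
import random

def stratified_sample(records, key, n_per_stratum, rng):
    """Draw an equal-sized simple random sample from every stratum."""
    strata = {}
    for record in records:
        strata.setdefault(key(record), []).append(record)  # group cases by stratum
    sample = {}
    for name, members in strata.items():
        sample[name] = rng.sample(members, n_per_stratum)  # random draw within the stratum
    return sample

rng = random.Random(7)

# Hypothetical records: (crime_id, crime_type, acquainted).
crimes = []
for crime_id in range(200):
    crime_type = "violent" if crime_id < 50 else "property"
    crimes.append((crime_id, crime_type, rng.random() < 0.5))

sample = stratified_sample(crimes, key=lambda record: record[1], n_per_stratum=30, rng=rng)

# Compare the share of acquainted cases within each stratum separately.
for name, cases in sample.items():
    acquainted_share = sum(case[2] for case in cases) / len(cases)
    print(name, len(cases), round(acquainted_share, 2))
```

The shares are reported per stratum and never pooled, in line with the note above.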

15 Stratified proportionate random sampling
Hypothesis: Male cops are more cynical than female cops
Sin City: 200 officers
- 150 male (75%)
- 50 female (25%)
Randomly select 30 officers
- Expect about 22.5 males
- Expect about 7.5 females
Compare average cynicism scores
BUT: The frequency (number of cases) for females is very small!

16 Stratified disproportionate random sampling
Hypothesis: Male cops are more cynical than female cops
Sin City: 200 officers
- 150 male (75%)
- 50 female (25%)
Randomly select 30 officers from each stratum
- 30 males
- 30 females
Compare average cynicism scores
Note: don’t recombine these into a single sample to compute a mean, because the two strata were sampled at different rates!
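The warning about recombining can be illustrated with numbers. In this sketch the cynicism scores (6.0 and 4.0) are invented for illustration. Only the stratum sizes come from the slide. If an overall mean is needed, each stratum mean has to be weighted by its population size, not its sample size.

```python
def weighted_mean(stratum_means, weights):
    """Average of stratum means, weighted by the supplied stratum weights."""
    total_weight = sum(weights.values())
    weighted_sum = 0.0
    for stratum, mean in stratum_means.items():
        weighted_sum += mean * weights[stratum]
    return weighted_sum / total_weight

population_sizes = {"male": 150, "female": 50}  # from the slide
sample_sizes = {"male": 30, "female": 30}       # equal-sized (disproportionate) samples
sample_means = {"male": 6.0, "female": 4.0}     # hypothetical cynicism scores

pooled = weighted_mean(sample_means, sample_sizes)        # treats the 60 cases as one sample
reweighted = weighted_mean(sample_means, population_sizes)  # restores population shares

print(pooled)      # 5.0
print(reweighted)  # 5.5
```

Pooling overstates the influence of female officers, who make up half the sample but only a quarter of the department.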

17 OTHER SAMPLING TECHNIQUES
- Quasi-probability sampling: systematic sampling, cluster sampling
- Non-probability sampling

18 Quasi-probability sampling
Systematic sampling
- Randomly select the first element, then choose every 5th, 10th, etc., depending on the size of the sampling frame (number of cases or elements in the population)
- If done with care, can give results equivalent to fully random sampling
- Caution: if elements in the sampling frame are ordered in a particular way, a non-representative sample might be drawn
Cluster sampling
- Method: divide the population into equal-sized groups (clusters) chosen on the basis of a neutral characteristic, then draw a random sample of clusters. The study sample contains every element of the chosen clusters.
- Often done to study public opinion (city divided into blocks)
- The rule of equal-sized clusters is usually violated
- The “neutral” characteristic may not actually be neutral, which can affect outcomes!
- Since not everyone in the population has an equal chance of being selected, there may be considerable sampling error
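Both techniques can be sketched in a few lines. The frames, the roster, and the city blocks below are hypothetical, and the function names `systematic_sample` and `cluster_sample` are illustrative. The sampling interval is taken as the frame size divided by the desired sample size, which is one common way to choose it.

```python
import random

def systematic_sample(frame, n, rng):
    """Random start, then every k-th element, where k = len(frame) // n."""
    k = len(frame) // n       # sampling interval
    start = rng.randrange(k)  # random position within the first interval
    return frame[start::k][:n]

def cluster_sample(clusters, n_clusters, rng):
    """Pick clusters at random and keep every element of the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return sample

rng = random.Random(3)

# Hypothetical frame of 200 case IDs: a sample of 20 uses an interval of 10.
case_ids = list(range(200))
picked = systematic_sample(case_ids, 20, rng)
print(picked)

# The caution in practice: if every 10th entry on the roster is a sergeant,
# an interval of 10 returns either all sergeants or no sergeants.
roster = ["sergeant" if i % 10 == 0 else "officer" for i in range(200)]
print(set(systematic_sample(roster, 20, rng)))

# Hypothetical city blocks as clusters of 4 households each.
blocks = {f"block{b}": [f"b{b}-h{h}" for h in range(4)] for b in range(10)}
households = cluster_sample(blocks, 3, rng)
print(len(households))  # 12
```

The roster example shows why the ordering of the sampling frame matters: when the interval matches a pattern in the list, the sample contains only one rank.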

19 Non-probability sampling
Accidental (“convenience”) sample
- Subjects who happen to be encountered by researchers
- Example: observer ride-alongs in police cars
Quota sample
- Elements are included in proportion to their known representation in the population
Purposive (judgmental) sample
- Researcher uses best judgment to select elements that typify the population
- Example: interview all burglars arrested during the past month
Issues
- Can findings be “generalized” or projected to a larger population?
- Are findings valid only for the cases actually included in the samples?

20 Practical exercise: SYSTEMATIC SAMPLING

21 Class assignment - systematic sampling
Hypothesis: Higher-income persons drive more expensive cars (Income → Car Value)
Independent variable: income
- Categorical, nominal: student or faculty/staff
Dependent variable: car value
- Categorical, ordinal: 1 (cheapest), 2, 3, 4 or 5 (most expensive)
Panel assignment (worth 5 points)
- Select a panel coordinator
- Visit a student lot
- Select ten vehicles in each lot using systematic sampling
- Use the operationalized car values to code each car’s value
- Give each team member a filled-in copy and turn one in per team next week
- The copy you turn in must have the printed name and signature of each panelist who participated in collecting data
PLEASE BRING THIS FORM TO EVERY CLASS SESSION!
