Why sample? Diversity in populations Practicality and cost.
1 Why sample?
Diversity in populations
Practicality and cost
2 Terms
Population = large group about which conclusions are drawn. Real, but unknown.
Sample = small group that represents the population. Real, known.
[Figure: one population with several samples drawn from it.]
3 Element = individual member of a population.
Sampling unit = element or group of elements selected in a sample.
Unit of analysis = element or group of elements compared in the analysis.
The above units can be the same or different.
4 Element, Sampling Unit, Unit of Analysis: Examples
Opinion survey of UMD students
  Element = individual student
  Sampling unit = individual student
  Unit of analysis = individual student (student opinions measured)
Survey of family incomes
  Element = adult household member
  Sampling unit = household or address
  Unit of analysis = family (total family income measured)
5 Element, Sampling Unit, Unit of Analysis: More Examples
Voter polls
  Element = individual voter
  Sampling unit = telephone number
  Unit of analysis = individual voter (voter opinions measured)
U.S. Census of housing
  Element = household or address
  Sampling unit = household or address
  Unit of analysis = household or address (# of rooms measured)
6 Sampling frame = list of all the sampling units in the population. Needed for probability sampling.
Probability sample = researcher knows and controls the probability of selection.
Main advantage: only probability samples permit accurate estimation of sampling error.
7 Simple Random Sample
Every element in the population has an equal and constant chance of selection.
1. Physical sampling with replacement
2. Table of random numbers
3. Random selection by computer
Probability of selection = sample size / population size
Requires a list (frame) of all elements in the population.
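Random selection by computer can be sketched in a few lines. This is a minimal illustration, not the presenter's procedure: the function name `simple_random_sample` and the frame of ID numbers 1 to 5000 are hypothetical, and the draw is made without replacement (each element can appear at most once), whereas the slide's physical method samples with replacement. Either way, every element has the same chance of being chosen.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw a simple random sample, without replacement, from a sampling frame."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, sample_size)

# Hypothetical frame: ID numbers standing in for a complete list of elements.
frame = list(range(1, 5001))
sample = simple_random_sample(frame, 100, seed=42)

# Probability of selection = sample size / population size
selection_probability = len(sample) / len(frame)
print(selection_probability)  # 0.02
```

The frame has to list every element, which is why this design is only practical when such a list exists.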
8 Systematic Random Sample
Every “kth” element is drawn from a list (e.g. every 50th name).
1. K = sampling interval = population size / sample size (e.g. 5000/100 = 50).
2. Random starting point between 1 and K (e.g. between 1 and 50).
3. Statistically equivalent to a simple random sample.
4. List must be randomly ordered (no repeating pattern that lines up with K).
5. Convenient, since lists are available for many populations.
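The interval-and-random-start procedure can be sketched as follows. The function name `systematic_sample` and the frame of 5000 ID numbers are illustrative assumptions; the sketch uses integer division for K, so it is exact only when the population size is a multiple of the sample size.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Take every Kth element of the frame after a random start."""
    rng = random.Random(seed)
    k = len(frame) // sample_size   # sampling interval K
    start = rng.randint(1, k)       # random start between 1 and K, inclusive
    # Positions start, start + K, start + 2K, ... (1-based), capped at sample_size
    return frame[start - 1::k][:sample_size]

# Hypothetical frame of 5000 ID numbers; K = 5000 / 100 = 50
frame = list(range(1, 5001))
sample = systematic_sample(frame, 100, seed=42)
print(len(sample), sample[1] - sample[0])  # 100 50
```

Only one random number is needed (the start), which is what makes the method convenient with printed lists.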
9 Stratified Random Sample
Population is first divided into groups (strata).
A simple random sample is taken from within each stratum.
The separate random samples are combined into a single total sample.
10 Example of stratified sample
[Figure: the UMD population is divided into four strata (Freshmen, Sophomores, Juniors, Seniors). A separate sample (Sample 1 to Sample 4) is drawn from each stratum, and the four samples are combined into the stratified sample.]
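A stratified draw by class year might look like the sketch below. The student records, the function name `stratified_sample`, and the choice of an equal number of students per stratum are all assumptions made for illustration; the slides do not say whether strata are sampled equally or in proportion to their size.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Split the population into strata, sample each one, and combine the results."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)   # divide into strata
    combined = []
    for name in sorted(strata):
        # simple random sample within each stratum
        combined.extend(rng.sample(strata[name], n_per_stratum))
    return combined

# Hypothetical student records: (student_id, class_year)
years = ["Freshman", "Sophomore", "Junior", "Senior"]
students = [(i, years[i % 4]) for i in range(2000)]
sample = stratified_sample(students, lambda s: s[1], 25, seed=1)
print(len(sample))  # 100
```

The stratifying variable (here, class year) must be known for every element before sampling begins.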
11 Considerations in Stratified Sampling
Requires knowledge of the stratifying variable.
Best used when there is much variation between strata in the variable being measured (example: stratify by year in school if measuring opinions of advising).
Lowest sampling error.
Most costly.
12 Sampling error = estimated difference between sample value and actual population value (e.g. ± 3%)
13 Cluster Sample
Elements in the population are naturally grouped together (“clusters”).
A simple random sample of clusters is taken.
Every element in the selected clusters is studied.
[Figure: a population made up of clusters, with the selected clusters forming the sample.]
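A rough sketch of cluster sampling follows. The clusters (city blocks of households), their labels, and the function name `cluster_sample` are hypothetical; the point is that random selection happens at the cluster level and every element of a chosen cluster is kept.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # SRS of cluster names
    sample = [element for name in chosen for element in clusters[name]]
    return chosen, sample

# Hypothetical clusters: 20 city blocks with 10 households each
clusters = {
    f"block_{b:02d}": [f"block_{b:02d}_house_{h}" for h in range(10)]
    for b in range(20)
}
chosen, sample = cluster_sample(clusters, 4, seed=7)
print(len(chosen), len(sample))  # 4 40
```

Note that only a list of clusters is needed up front, not a list of individual households.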
14 Considerations in Cluster Sampling
Best when there is little variation between clusters in the variable being measured.
Does not require a list of individual elements (only clusters).
May be used to cover a large geographic area (smaller areas = clusters).
May be less expensive.
Highest sampling error.
15 Multistage Designs
Combines two or more sampling designs.
Example: sampling voters in MN
  Stage 1: Stratify by geographic area (e.g. county).
  Stage 2: Sample census tracts (clusters) in selected counties.
  Stage 3: Take SRS of households in each tract.
Commonly used in large, diverse populations.
Design is best left to experts!
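The three stages can be chained together as in the toy sketch below. Everything here is assumed for illustration: the nested dictionary of counties, tracts, and households, the function name `multistage_sample`, and the decision to treat every county as a stratum and draw the same number of tracts and households in each. A real design would weight and allocate far more carefully.

```python
import random

def multistage_sample(counties, tracts_per_county, households_per_tract, seed=None):
    """Stratify by county, sample tracts within each county, then households within each tract."""
    rng = random.Random(seed)
    sample = []
    for county in sorted(counties):                    # Stage 1: each county is a stratum
        tracts = counties[county]
        chosen_tracts = rng.sample(sorted(tracts), tracts_per_county)   # Stage 2: clusters
        for tract in chosen_tracts:
            # Stage 3: simple random sample of households in the tract
            sample.extend(rng.sample(tracts[tract], households_per_tract))
    return sample

# Hypothetical data: 3 counties, 8 tracts per county, 50 households per tract
counties = {
    f"county_{c}": {
        f"tract_{c}_{t}": [f"hh_{c}_{t}_{h}" for h in range(50)]
        for t in range(8)
    }
    for c in range(3)
}
sample = multistage_sample(counties, 2, 5, seed=3)
print(len(sample))  # 30
```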
16 Sampling
Why use sampling?
Terms and definitions
Probability sampling designs
  Simple random
  Systematic
  Stratified
  Cluster
  Multistage designs
Estimation from samples
17 Estimation from Samples
Find a likely range of values for a population parameter (e.g. average, %).
Parameter = characteristic of a population
Statistic = characteristic of a sample
Statistical inference = drawing conclusions about a population based on sample data. Usually connected with a probability of error.
18 Sampling Distribution
Distribution of results of all possible samples of size N taken from the same population.
Theoretical, not actually done in practice.
Properties of sampling distributions are known to statisticians.
Used as the basis for inferring from samples to populations.
19 Example: estimating proportion of homes with internet access
Suppose the population proportion = .62
Take 1 sample of size 200: 120 homes have internet access. Sample p = 120/200 = .60
Can we conclude that the population proportion is .60?
A different sample might produce a different answer.
20 What if we took all possible samples?
Most sample proportions would be close to the population value.
A few would be much higher or lower.
The average of the sample proportions would be the true population proportion.
The distribution would be a bell-shaped curve.
[Figure: bell-shaped curve. Horizontal axis: all possible sample proportions. Vertical axis: % of samples.]
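Taking all possible samples is not feasible, but a simulation of many samples shows the same pattern. The sketch below assumes a true proportion of .62 and samples of 200 homes, as in the internet-access example; the function name, the 2000 repetitions, and the treatment of each home as an independent draw from a very large population are assumptions.

```python
import random
import statistics

def simulate_sample_proportions(true_p, sample_size, n_samples, seed=None):
    """Draw many samples and record the sample proportion from each one."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_samples):
        # Each home has internet access with probability true_p
        hits = sum(1 for _ in range(sample_size) if rng.random() < true_p)
        proportions.append(hits / sample_size)
    return proportions

props = simulate_sample_proportions(0.62, 200, 2000, seed=0)
mean_p = statistics.mean(props)

# Range that holds the middle 95% of the simulated sample proportions
ordered = sorted(props)
low = ordered[int(0.025 * len(ordered))]
high = ordered[int(0.975 * len(ordered)) - 1]

print(f"average of sample proportions: {mean_p:.3f}")
print(f"middle 95% of samples: {low:.3f} to {high:.3f}")
```

The average lands very close to .62, and a histogram of `props` would show the bell shape described above.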
21 What we know from the sampling distribution:
We DON’T know the true population proportion.
We DO know how many sample proportions fall within a given distance of the true proportion.
Sampling error = estimated difference between sample value and actual population value (example: 95% of sample proportions fall within ± 3% of the true proportion).
22 How we make an estimate
Find the sample proportion.
Add the sampling error (margin of error) on either side.
The true proportion probably falls within this interval.
[Figure: two bell-shaped curves of all possible sample proportions (vertical axis: % of samples), with sample proportions p marked along the horizontal axis.]
23 Examples of estimates
If 95% of sample proportions (p) fall within ± 3% of the true proportion, then 95% of all intervals p ± 3% will contain the true population proportion.
If p = .60, we estimate the true proportion is .57 to .63
If p = .62, we estimate the true proportion is .59 to .65
If p = .57, we estimate the true proportion is .54 to .60
If p = .70, we estimate the true proportion is .67 to .73
95% of the time this procedure yields a correct estimate.
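The arithmetic behind these intervals is simply the sample proportion plus and minus the margin of error. The short sketch below reproduces the four examples, taking the ± 3% margin as given; the function name `interval_estimate` is hypothetical.

```python
def interval_estimate(p, margin=0.03):
    """Return (low, high): the sample proportion minus and plus the margin of error."""
    return round(p - margin, 2), round(p + margin, 2)

for p in (0.60, 0.62, 0.57, 0.70):
    low, high = interval_estimate(p)
    print(f"p = {p:.2f}: estimate {low:.2f} to {high:.2f}")
```

If the true proportion were .62, as in the earlier internet-access example, the first two intervals would contain it and the last two would not, which is the sense in which the procedure is right only about 95% of the time.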