Formalizing the Concepts: Simple Random Sampling.


1 Formalizing the Concepts: Simple Random Sampling

2 Purpose of sampling: to study a sample of the population in order to acquire knowledge, by observing the selected units (typically households, persons, institutions, or physical objects), and to make quantitative statements about the entire population

3

4 Purpose of sampling: why sampling?
- Saves cost compared to full enumeration
- Easier to control quality of sample
- More timely results from sample data
- Measurement can be destructive

5 Some concepts used in Sampling: Unit of analysis
- An object on which a measurement is taken
- Most common units of analysis are persons, households, farms, and economic establishments

6 Target population or universe
- The complete collection of all the units of analysis to study
- Examples: population living in households in a country; students in primary schools

7 Sampling frame
- List of all the units of analysis whose characteristics are to be measured
- Should be comprehensive and non-overlapping, and must not contain irrelevant elements
- Should be updated to ensure complete coverage
- Examples: list of establishments; census; civil registration

8 Parameter
- Quantity computed from all N values in a population
- Typically a descriptive measure of a population, such as a mean or a variance
  - Poverty rate, average income, etc.
- The objective of sampling is to estimate parameters of a population

9 Estimation
- Estimator: a mathematical formula or function that uses sample results to produce an estimate for the entire population
- Estimate: a numerical quantity computed from sample observations of a characteristic and intended to provide information about an unknown population value (parameter)
- Examples: mean (average), total, proportion, ratio

10 Unbiased estimator
- When the mean of the individual sample estimates, taken over all possible samples, equals the population parameter, the estimator is unbiased
- Formally, an estimator is unbiased if the expected value of the (sample) estimates is equal to the (population) parameter being estimated
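The definition of unbiasedness can be checked by brute force on a tiny population. The sketch below assumes a made-up population of five values and enumerates every possible simple random sample of size 2; since each sample is equally likely, the plain average of the sample means is the expected value of the estimator.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of N = 5 values (illustrative only).
population = [2, 4, 6, 8, 10]
n = 2

# Every possible sample of size n drawn without replacement.
# Under simple random sampling each of these is equally likely.
sample_means = [mean(sample) for sample in combinations(population, n)]

# Expected value of the estimator = average over all possible samples.
expected_value = mean(sample_means)
population_mean = mean(population)

print(expected_value, population_mean)
```

Both printed values coincide, which is what "the sample mean is an unbiased estimator of the population mean" means in practice.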

11 Random sampling
- Also known as scientific sampling or probability sampling
- Each unit has a non-zero and known probability of selection
- Mathematical theory is available to assess the sampling error (the error caused by observing a sample instead of the whole population)

12 Random sampling techniques
- Single stage, equal probability sampling
  - Simple Random Sampling (SRS)
  - Systematic sampling with equal probability
- Stratified sampling
- Multi-stage sampling
In real life these techniques are usually combined in various ways; most sampling designs are complex.

13 Single stage, equal probability sampling
- Random selection of n "units" from a population of N units, so that each unit has an equal probability of selection
  - N (population) → n (sample)
  - Probability of selection (sampling fraction) = f = n/N
- It is the most basic form of probability sampling and provides the theoretical basis for more complicated techniques

14 Single stage, equal probability sampling (continued)
1. Simple Random Sampling. The investigator mixes up the whole target population before grabbing n units.
2. Systematic Random Sampling. The N units in the population are numbered 1 to N in some order (e.g., alphabetic). To select a sample of n units, calculate the step k (k = N/n), pick one unit at random from the first k units, and then take every k-th unit after it.
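The two selection procedures can be sketched in a few lines of Python. This is a minimal illustration, not a production sampler: the frame is a made-up list of unit numbers 1 to 100, the function names are hypothetical, and the systematic version assumes N is an exact multiple of n so that the step k is an integer.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n distinct units, every subset of size n being equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start within the first k units, then every k-th unit."""
    k = len(frame) // n          # step; assumes len(frame) is a multiple of n
    start = rng.randrange(k)     # random position among the first k units
    return frame[start::k][:n]

rng = random.Random(42)          # fixed seed only to make the sketch repeatable
frame = list(range(1, 101))      # hypothetical frame: units numbered 1..100

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
print(sorted(srs))
print(systematic)
```

In both cases each unit has selection probability n/N = 10/100, which is why both designs count as equal probability sampling.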

15 Single stage, equal probability sampling (continued)
- Advantage
  - Self-weighting (simplifies the calculation of estimates and variances)
- Disadvantages
  - Sampling frame may not be available
  - May entail high transportation costs

16 Stratified sampling
- The population is divided into mutually exclusive subgroups called strata
- Then a random sample is selected from each stratum
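A minimal sketch of stratified selection follows. The slide does not say how the sample is allocated across strata, so the code assumes proportional allocation (the same sampling fraction in every stratum); the stratum names, sizes, and the function name are all made up for illustration.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw an independent simple random sample inside each stratum.

    Assumes proportional allocation: every stratum is sampled
    at the same fraction, with at least one unit per stratum.
    """
    sample = {}
    for name, units in strata.items():
        n_h = max(1, round(fraction * len(units)))
        sample[name] = rng.sample(units, n_h)
    return sample

rng = random.Random(1)
# Hypothetical frame split into two mutually exclusive strata.
strata = {
    "urban": [f"U{i}" for i in range(60)],
    "rural": [f"R{i}" for i in range(40)],
}
selected = stratified_sample(strata, 0.10, rng)
print({name: len(units) for name, units in selected.items()})
```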

17 Two-stage sampling
- Units of analysis are divided into groups called Primary Sampling Units (PSUs)
- A sample of PSUs is selected first
- Then a sample of units is chosen in each of the selected PSUs
This technique can be generalized (multi-stage sampling).
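The two stages can be sketched as two nested random draws. The example assumes simple random sampling at both stages, which is only one of several options, and uses invented village and household identifiers; the function name is hypothetical.

```python
import random

def two_stage_sample(psus, n_psus, n_units, rng):
    """Stage 1: draw PSUs. Stage 2: draw units inside each selected PSU.

    Assumes simple random sampling at both stages.
    """
    chosen_psus = rng.sample(sorted(psus), n_psus)
    return {
        psu: rng.sample(psus[psu], min(n_units, len(psus[psu])))
        for psu in chosen_psus
    }

rng = random.Random(7)
# Hypothetical frame: 5 villages (PSUs) with 20 households each.
psus = {
    f"village_{i}": [f"v{i}_hh{j}" for j in range(20)]
    for i in range(5)
}
selected = two_stage_sample(psus, n_psus=2, n_units=5, rng=rng)
for psu, households in selected.items():
    print(psu, households)
```

Only the selected PSUs need a full list of their units, which is the practical appeal of the design when no complete frame exists.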

18 Random sampling
- Estimates obtained from random samples can be accompanied by measures of the uncertainty associated with the estimate
- The uncertainty is measured by the standard error. Confidence intervals around the estimate can be calculated taking advantage of the Central Limit Theorem.

19 Central limit theorem
- The central limit theorem states that, given a variable with mean μ and variance σ², the sampling distribution of the mean of n independent observations approaches a normal distribution with mean μ and variance σ²/n as n grows
- This is true even when the distribution of the variable is not normal
- The normal distribution is widely used. Part of its appeal is that it is well behaved and mathematically tractable.
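A small simulation can make the theorem concrete. The sketch below assumes an exponential variable with μ = 1 and σ² = 1 (chosen only because it is clearly not normal), draws many samples of size n = 50, and compares the mean and variance of the resulting sample means with μ and σ²/n.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 50                   # size of each sample
replicates = 2000        # number of independent samples

# Exponential(1) has mean 1 and variance 1 and is strongly skewed.
sample_means = [
    fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(replicates)
]

mean_of_means = fmean(sample_means)
variance_of_means = pvariance(sample_means)

print(mean_of_means)       # should be close to mu = 1
print(variance_of_means)   # should be close to sigma^2 / n = 0.02
```

A histogram of `sample_means` would look close to a bell curve even though the underlying variable is skewed.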

20 Sample variance and standard error
- Variance of the sample mean of an SRS of n units from a population of size N:
  $$\mathrm{Var}(\bar{x}) = \left(1 - \frac{n}{N}\right)\frac{\mathrm{Var}(X)}{n}, \qquad e = \sqrt{\mathrm{Var}(\bar{x})}$$
- e = standard error
- Measure of sampling error. Depends on 3 factors:
  - (1 - n/N) = Finite Population Correction (fpc)
  - n = sample size
  - Var(X) = Population variance. Unknown, but can be estimated without bias by:
    $$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$
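The three factors combine as e = sqrt((1 - n/N) · s² / n), where s² is the sample variance with an n - 1 denominator. The sketch below applies this to a made-up sample of five observations from a population assumed to have N = 100 units; the function name is hypothetical.

```python
from math import sqrt
from statistics import variance

def srs_standard_error(sample, population_size):
    """Standard error of the sample mean under simple random sampling."""
    n = len(sample)
    s2 = variance(sample)               # sample variance, n - 1 denominator
    fpc = 1 - n / population_size       # finite population correction
    return sqrt(fpc * s2 / n)

# Hypothetical observations (illustrative only).
sample = [12, 15, 11, 14, 18]
se = srs_standard_error(sample, population_size=100)
print(se)
```

Here s² = 7.5 and the fpc is 0.95, so the variance of the mean is 0.95 × 7.5 / 5 = 1.425 and the standard error is its square root.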

21 Proportions
- A proportion P (or prevalence) is equal to the mean of a dummy (0/1) variable
- In this case the population variance is Var(X) = P(1-P), and
  $$e = \sqrt{\left(1 - \frac{n}{N}\right)\frac{P(1-P)}{n}}$$

22 Confidence intervals
- It is not sufficient to simply report the sample proportion obtained by Mr Green in the sample survey; we also need to give an indication of how accurate the estimate is
- Confidence intervals are used to indicate the accuracy of an estimate
- In other words, instead of estimating the parameter of interest by a single value, an interval of likely estimates is given

23 Confidence intervals (continued)
  $$\text{estimate} \pm t_\alpha \, e$$
where e is the standard error and:
- t_α = 1.28 for confidence level α = 80%
- t_α = 1.64 for confidence level α = 90%
- t_α = 1.96 for confidence level α = 95%
- t_α = 2.58 for confidence level α = 99%

24 Confidence intervals In a sample of 1,000 electors, 280 of them (28 percent) say they will vote Green. Standard error is 1.42 percent.
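The quoted standard error can be reproduced from the proportion formula e = sqrt((1 - n/N) · P(1-P) / n). The sketch assumes the electorate is large enough for the finite population correction to be treated as 1, which is consistent with the 1.42 percent on the slide; the function name is hypothetical.

```python
from math import sqrt

def proportion_standard_error(p, n, population_size=None):
    """Standard error of a sample proportion under SRS.

    If population_size is None, the finite population correction
    is taken as 1 (assumption: n is tiny relative to N).
    """
    fpc = 1.0 if population_size is None else 1 - n / population_size
    return sqrt(fpc * p * (1 - p) / n)

n = 1000
p_hat = 280 / n                      # 28 percent say they will vote Green
se = proportion_standard_error(p_hat, n)
print(round(100 * se, 2))            # standard error in percent: 1.42
```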

25 Confidence intervals
In a sample of 1,000 electors, 280 of them (28 percent) say they will vote Green. Standard error is 1.42 percent.
[Figure: number line from 24 to 32 percent, with the standard error marked around the estimate]
- 95 percent confidence interval: 28 ± 1.42 × 1.96 = 28 ± 2.78
- 99 percent confidence interval: 28 ± 1.42 × 2.58 = 28 ± 3.66
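The interval endpoints follow directly from estimate ± t_α × standard error. A minimal sketch, using the coefficients listed on the earlier slide and working in percentage points (the function and table names are hypothetical):

```python
# Coefficients t_alpha by confidence level (percent), as listed in the slides.
T_ALPHA = {80: 1.28, 90: 1.64, 95: 1.96, 99: 2.58}

def confidence_interval(estimate, standard_error, level):
    """Return (lower, upper) as estimate -/+ t_alpha * standard_error."""
    margin = T_ALPHA[level] * standard_error
    return estimate - margin, estimate + margin

low95, high95 = confidence_interval(28.0, 1.42, 95)
low99, high99 = confidence_interval(28.0, 1.42, 99)

print(round(low95, 2), round(high95, 2))   # 25.22 30.78
print(round(low99, 2), round(high99, 2))   # 24.34 31.66
```

The higher confidence level gives the wider interval, which is the trade-off between confidence and precision.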

26 The required sample size n is determined by
- The variability of the variable of interest, Var(X)
  - But we don't know it!
- The maximum margin of error E we are willing to accept
- How confident we want to be that the error of our estimation will not exceed that maximum
  - For each confidence level α there is a coefficient t_α
- The size of the population
  - But this is not very important!
For a proportion, Var(X) = P(1-P).
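The formula on this slide did not survive in the transcript. One standard way to combine the listed ingredients, offered here as an assumption rather than as the slide's own formula, is to set the margin of error E equal to t_α times the standard error and solve for n. Ignoring the population size this gives n₀ = t_α² · P(1-P) / E²; with the finite population correction it becomes n = n₀ / (1 + n₀/N). The function name is hypothetical.

```python
from math import ceil

def sample_size_for_proportion(p, margin, t_alpha, population_size=None):
    """Sample size so that t_alpha * standard_error <= margin.

    Assumed formula: n0 = t^2 * p * (1 - p) / E^2, optionally adjusted
    for a finite population as n0 / (1 + n0 / N).
    """
    n0 = t_alpha ** 2 * p * (1 - p) / margin ** 2
    if population_size is not None:
        n0 = n0 / (1 + n0 / population_size)
    return ceil(n0)

# P is unknown in advance; P = 0.5 is the most conservative guess
# because it maximizes P(1 - P).
n_infinite = sample_size_for_proportion(0.5, 0.03, 1.96)
n_small_pop = sample_size_for_proportion(0.5, 0.03, 1.96, population_size=10_000)
n_large_pop = sample_size_for_proportion(0.5, 0.03, 1.96, population_size=1_000_000)

print(n_infinite, n_small_pop, n_large_pop)   # 1068 965 1066
```

Going from a population of ten thousand to one million changes the required sample by only about a hundred units, which illustrates why population size "is not very important".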

