Presentation on theme: "Chap 7: Survey Sampling Introduction Simple Random Sampling Stratified Random Sampling."— Presentation transcript:
Chap 7: Survey Sampling Introduction Simple Random Sampling Stratified Random Sampling
7.1: Introduction For small popn, a census study are used because data can be gathered on all. For large popn, sample surveys will be used to obtain information from a small (but carefully chosen) sample of the popn. The sample should reflect the characteristics of the popn from which it is drawn. Sampling methods are classified as either probabilistic or non-probabilistic in nature.
Sampling methods: Probability Sampling: Random Sampling Systematic Sampling Stratified Sampling NonProbabilitySampling ConvenienceSampling Judgment Sampling Quota Sampling Snowball Sampling
The winner is: Probability Sampling. In non-probability sampling, members are selected from the popn in some non- random manner and the sampling error (=degree to which a sample might differ from the popn) is unknown. In probability sampling, each member of the popn has a specified probability of being included in the sample. Its advantage is that sampling error can be calculated.
7.2: Popn parameters Definition: Parameters are those numerical characteristics of the popn that we will estimate from a sample. Notations:
Popn mean, total, variance: Popn mean: Popn total: Popn variance: its square root is the StdDev
7.3: Simple Random Sampling s.r.s. The most elementary form of sampling is s.r.s. where each member of the popn has an equal and known chance of being selected at most once. There are possible samples of size n taken without replacement. In this section, we will derive some statistical properties of the sample mean.
7.3.0: The Sample Mean: The sample mean estimates the popn mean where so that will estimate the popn total
7.3.1: Expectation & Variance of the Sample Mean: Theorem A (UNBIASEDNESS) : s.r.s. under s.r.s., s.r.s. Theorem B: under s.r.s., Recall: The variance of in sampling without replacement differs from that in sampling with replacement by the factor which is called the finite population correction. The ratio is called the sampling fraction.
7.3.2: Estimation of the Population Variance: s.r.s. Theorem A: under s.r.s., where Corollary A: An unbiased estimator of is given by where
7.3.3: Normal approximation to the sampling distn of the sample mean We will be using the CLT (Central Limit Theorem, see Section 5.3) in order to find the probabilistic bounds for the estimation error. Application 1: the probability that the error made in estimating by is using the CLT. Application 2: a CI (Confidence Interval) for the popn mean is given by using the CLT.
7.4: Estimating a ratio: Ratio arises frequently in Survey Sampling. If a bivariate sample is drawn, then the ratio is estimated by. We wish to derive E(R) and Var(R) using approximation methods seen in Section 4.6 because R is a nonlinear function.
7.4: Estimating a ratio (contd) Theorem A: s.r.s. With s.r.s., the approximate variance of R is Since the population correlation popn is then
7.4: Estimating a ratio (contd) Theorem B: s.r.s. With s.r.s., the approximate expectation of R is
Standard Error estimate of R: The estimate variance of R is where and the popn covariance is estimated by
Confidence Interval for r: An approximate CI (Confidence Interval) for the ratio of interest r is given by
7.5: Stratified Random Sampling: (S.R.S.) 7.5.1: Introduction (S.R.S.) The popn is partitioned into sub-pops or strata (stratum, singular) that are then independently sampled and are combined to estimate popn parameters. A stratum is a subset of the population that shares at least one common characteristic Example: males & females; age groups;…
Why is Stratified Sampling superior to Simple Random Sampling? S.R.S.S.R.S. reduces the sampling error S.R.S. s.r.s.S.R.S. guarantees a prescribed number of observations from each stratum while s.r.s. cant S.R.S. s.r.s.The mean of a S.R.S. can be considerably more precise than the mean of a s.r.s., if the popn members within each stratum are relatively homogeneous and if there is enough variation between strata.
7.5.2: Properties of Stratified Estimates: Notation: Let be the total popn size if denote the popn sizes in the L strata. The overall popn mean is a weighted average of the popn means of the L strata, where denotes the fraction of the popn in the stratum.
7.5.2: Properties of Stratified Estimates: (contd) Stratified sampling requires two steps: Identify the relevant strata in the popn s.r.s.Use s.r.s. to get subject from each stratum s.r.s. Within each stratum, a s.r.s. of size is taken to obtain the sample mean in the stratum will be denoted by where denotes the observation in the stratum.
7.5.2: Properties of Stratified Estimates: (contd) Theorem A: The stratified estimate,, of the overall popn mean is UNBIASED. Since we assume that the samples from different strata are independent of one another and that within each stratum a s.r.s. is taken, then the variance of can be easily calculated in:Since we assume that the samples from different strata are independent of one another and that within each stratum a s.r.s. is taken, then the variance of can be easily calculated in: Theorem B: The variance of the stratified sample is
Neglecting / Ignoring the finite population correction: Approximation: If the sampling fractions within all strata were small, Theorem B will then reduce to:
Expectation and Variance of the stratified estimate of the popn total: This is a corollary of Theorems A & B. Practice with examples A & B in the textbook on pages to get healthy with these calculations.
7.5.3: Methods of allocation: For small sampling fractions within strata i.e. when neglecting/ignoring the finite popn correction, Question: How to choose to minimize subject to the constraint when resources of a survey allowed only a total of n units to be sampled? Note: We could include finite popn corrections but the results will be more complicated. Try it!
7.5.3 a: Neyman allocation Theorem A: The samples sizes that minimize subject to the constraint are given by Corollary A: stratified estimate & optimal allocations
7.5.3 b: Proportional allocation If a survey measures several attributes for each popn member, it will be difficult to find an allocation that is simultaneously optimal for each of those variables. Using the same sampling fraction in each stratum will provide a simple and popular alternative method of allocation.
7.5.3b:Proportional allocation (cont Theorem B: With stratified sampling based on proportional allocation, ignoring the finite popn correction, Theorem C: With stratified sampling based on both allocation methods, ignoring the finite popn correction,
7.6:Conclusions s.r.s. A mathematical model for Survey Sampling was built using s.r.s. and probabilistic error bounds for the estimates derived. probability non-probability The theory and techniques of survey probability sampling include Systematic Sampling, Cluster Sampling, etc…as well as non-probability sampling methods such as Quota Sampling, Snowball Sampling,…