A new sampling method: stratified sampling

Presentation transcript:

A new sampling method: stratified sampling. In stratified sampling, we conduct an SRS in each stratum. Outline: definition and motivation; statistical inference (theory of stratified sampling); advantages of stratified sampling; sample size calculation.

Stratified sampling: definition and motivation. A motivating example: the average number of words in the saved messages of the people in this room. What is stratified sampling? To stratify is to make layers. Strata are subpopulations. Strata do not overlap: each sampling unit belongs to exactly one stratum, and together the strata constitute the whole population.

Why do we use stratified sampling? To be protected from obtaining a really bad sample. Example: the population size is N = 500 (250 women and 250 men), and we take an SRS of size n = 50. It is possible to obtain a sample with no men or only a few men: Pr(15 or fewer men in an SRS) = 0.003 and Pr(20 or fewer men in an SRS) = 0.10. In stratified sampling, we can sample 25 men and 25 women.
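The two probabilities quoted in the example can be checked with the hypergeometric distribution, since an SRS draws units without replacement. The following is a minimal sketch using only the standard library; the helper name `hypergeom_cdf` is made up for illustration, and the computed values should land close to the rounded figures on the slide.

```python
from math import comb

def hypergeom_cdf(k, population, successes, draws):
    """P(X <= k), where X counts 'successes' among `draws` units
    taken without replacement from `population` units."""
    total = comb(population, draws)
    favourable = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(0, k + 1)
    )
    return favourable / total

# Values from the slide: N = 500 (250 men, 250 women), SRS of size n = 50.
p_at_most_15 = hypergeom_cdf(15, 500, 250, 50)
p_at_most_20 = hypergeom_cdf(20, 500, 250, 50)
print(f"P(at most 15 men) = {p_at_most_15:.4f}")
print(f"P(at most 20 men) = {p_at_most_20:.4f}")
```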

Why do we use stratified sampling? Stratified sampling allows us to compare subgroups. It can be convenient, reduce cost, and make sampling easier. It is often more precise; see the following example.

Total number of farm acres (3078 counties). SRS of 300 counties from the Census of Agriculture. Estimate: , standard error: . Stratified sampling: about 10% of the counties are sampled in each stratum (region).

Total number of farm acres (3078 counties) Estimate: Standard error:

Theory of stratified sampling

Notation for Stratification: Population

Notation for Stratification: Sample

Stratified sampling: estimation

Statistical Properties: Bias and Variance

Variance Estimates for stratified samples

Confidence intervals for stratified samples. Some books use a t distribution with n - H degrees of freedom.
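The estimation slides above lost their formulas in this transcript. As a hedged sketch, the code below uses the standard textbook estimator of the population mean under stratified sampling (each stratum mean weighted by N_h/N) and its usual variance estimate with a finite population correction. The function name and the data are hypothetical, and the 1.96 multiplier assumes a normal approximation rather than the t distribution mentioned above.

```python
from math import sqrt
from statistics import mean, variance

def stratified_mean(strata):
    """strata: list of (N_h, sample_values) pairs, one pair per stratum.

    Returns (estimate, standard_error) for the population mean.
    """
    N = sum(N_h for N_h, _ in strata)
    estimate = 0.0
    var_hat = 0.0
    for N_h, y in strata:
        n_h = len(y)
        W_h = N_h / N              # stratum share of the population
        estimate += W_h * mean(y)
        fpc = 1 - n_h / N_h        # finite population correction
        # statistics.variance uses the n_h - 1 divisor (sample variance)
        var_hat += fpc * W_h**2 * variance(y) / n_h
    return estimate, sqrt(var_hat)

# Hypothetical data: two strata with 60 and 40 units, 4 sampled from each.
strata = [
    (60, [10, 12, 11, 13]),
    (40, [20, 22, 19, 23]),
]
est, se = stratified_mean(strata)
low, high = est - 1.96 * se, est + 1.96 * se   # approximate 95% interval
print(f"estimate = {est:.2f}, SE = {se:.3f}, CI = ({low:.2f}, {high:.2f})")
```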

Sampling probabilities and weights. Suppose a population has 1600 men and 400 women, and the stratified sample design specifies sampling 200 men and 200 women. Each man in the sample has weight 8 and each woman has weight 2. Each woman in the sample represents herself and 1 other woman not selected; each man represents himself and 7 other men not in the sample.

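Sampling probabilities and weights. When $n_h$ units are sampled from the $N_h$ units in stratum h, the sampling probability for the jth unit in the hth stratum is $\pi_{hj} = n_h / N_h$. Sampling weight: $w_{hj} = 1/\pi_{hj} = N_h / n_h$. The sum of the sampling weights over all sampled units is N.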
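The men/women example can be reproduced in a few lines. This sketch assumes the usual definition of the weight as the inverse of the inclusion probability; the dictionary names are illustrative only.

```python
# Stratum sizes and sample sizes from the men/women example.
population = {"men": 1600, "women": 400}
sample = {"men": 200, "women": 200}

# Inclusion probability n_h / N_h and weight N_h / n_h in each stratum.
prob = {h: sample[h] / population[h] for h in population}
weight = {h: population[h] / sample[h] for h in population}

# Every sampled unit carries its stratum weight; the weights add up to N.
total_weight = sum(sample[h] * weight[h] for h in population)

print(prob)          # inclusion probabilities per stratum
print(weight)        # sampling weights per stratum
print(total_weight)  # equals the population size
```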

Sampling probabilities and weights

Sampling probabilities and weights example

Sampling probabilities and weights in proportional allocation. In proportional allocation, the number of sampled units in each stratum is proportional to the size of the stratum, i.e., $n_h / N_h = n / N$ for every stratum h. Every unit in the sample has the same weight and represents the same number of units in the population. The sample is called self-weighting.

Sampling probabilities and weights in proportional allocation. The sampling probability for all units is about 10%. All the weights are the same: 10.

An example of stratified sampling

Observed data

Spreadsheet for calculations in the example

Stratified sampling for proportions

Allocating observations to strata. In the theoretical derivation and examples of stratified sampling, we assumed that someone had already designed the survey. Survey design is the most important part of using a survey in research: if we use a badly designed survey, there is no way to get the correct result. The problem of allocating observations to strata concerns how to determine the sample size (or relative sample size) of each stratum.

Proportional Allocation. In proportional allocation, the number of sampled units in each stratum is proportional to the size of the stratum. The probability of selection is the same (= n/N) for all strata. Every unit in the sample has the same weight (= N/n) and represents the same number of units in the population. The sample is a self-weighting sample.
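A small sketch of proportional allocation follows. The stratum sizes are hypothetical, and the helper simply rounds n * N_h / N; with less convenient numbers the rounded sizes may not add up to n exactly and would need a manual adjustment.

```python
def proportional_allocation(stratum_sizes, n):
    """Return n_h = n * N_h / N for each stratum, rounded to an integer."""
    N = sum(stratum_sizes.values())
    return {h: round(n * N_h / N) for h, N_h in stratum_sizes.items()}

# Hypothetical population of N = 1000 split into three strata.
stratum_sizes = {"A": 500, "B": 300, "C": 200}
allocation = proportional_allocation(stratum_sizes, n=100)

# Self-weighting check: every stratum has the same weight N_h / n_h = N / n.
weights = {h: stratum_sizes[h] / allocation[h] for h in stratum_sizes}
print(allocation)
print(weights)
```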

Stratified sampling (with proportional allocation) vs SRS. What is the benefit of using stratified sampling (with proportional allocation)? Under what conditions is it better than SRS? To compare the two sampling methods, we need to compare the between-strata and within-strata variances.

Analysis of Variance (ANOVA) for the population

Stratified sampling (with proportional allocation) vs SRS

Stratified sampling (with proportional allocation) vs SRS

Stratified sampling (with proportional allocation) vs SRS. Stratified sampling with proportional allocation rarely gives a larger variance than SRS when the strata sizes are large. The more unequal the stratum means, the more precision we gain by using stratified sampling with proportional allocation.
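The comparison can be illustrated numerically. The sketch below assumes the standard variance formulas for the sample mean, (1 - n/N) S^2 / n under SRS and (1 - n/N) * sum_h (N_h/N) S_h^2 / n under proportional allocation; the population values are hypothetical and chosen so that the stratum means differ widely.

```python
from statistics import variance

def srs_variance(population_values, n):
    """Variance of the sample mean under SRS of size n."""
    N = len(population_values)
    return (1 - n / N) * variance(population_values) / n

def proportional_variance(strata_values, n):
    """Variance of the stratified mean under proportional allocation."""
    N = sum(len(v) for v in strata_values)
    # Weighted average of the within-stratum variances S_h^2
    within = sum(len(v) / N * variance(v) for v in strata_values)
    return (1 - n / N) * within / n

# Hypothetical population: two strata with very different means.
strata_values = [
    [1, 2, 3, 4, 5, 6],
    [11, 12, 13, 14, 15, 16],
]
everyone = [y for v in strata_values for y in v]

v_srs = srs_variance(everyone, n=4)
v_prop = proportional_variance(strata_values, n=4)
print(f"SRS: {v_srs:.3f}, stratified (proportional): {v_prop:.3f}")
```

Because almost all of the variability here is between the strata, the stratified design removes most of it, which matches the statement that unequal stratum means give the largest gain.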

Optimal Allocation. Stratified sampling with proportional allocation is easy to conduct, and it is more precise than SRS in most situations. But it is not necessarily the most efficient form of stratified sampling. This is especially true when the variances vary substantially from stratum to stratum.

Optimal allocation. The goal of optimal allocation is to gain the most information for the least cost. We can assume that the total cost is fixed; given that, we want to minimize the variance. Different types of cost: the total cost, C; the overhead cost, such as maintaining an office, C0; the cost of taking an observation in stratum h, Ch.

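Optimal allocation. Recall that $V(\bar{y}_{str}) = \sum_{h=1}^{H} (1 - n_h/N_h)(N_h/N)^2 S_h^2 / n_h$. We want to minimize $V(\bar{y}_{str})$ subject to $C = C_0 + \sum_{h=1}^{H} C_h n_h$.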

Optimal allocation. Introducing a Lagrange multiplier λ, we minimize the resulting Lagrangian: take the partial derivative with respect to each $n_h$ and set it to zero.

Optimal allocation

Optimal allocation. We still need to find the value of n. Recall that the total cost C is fixed, i.e., $C = C_0 + \sum_{h=1}^{H} C_h n_h$.

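Optimal allocation. Combining the results, we have $n_h = (C - C_0) \dfrac{N_h S_h / \sqrt{C_h}}{\sum_{l=1}^{H} N_l S_l \sqrt{C_l}}$, where $S_h$ is the standard deviation within stratum h.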
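The derivation slides lost their formulas in this transcript, so the following is a sketch of the standard textbook result rather than a reproduction of the slides: with a fixed budget, n_h is proportional to N_h S_h / sqrt(C_h), scaled so that the variable cost is spent exactly. All stratum sizes, standard deviations, and costs below are hypothetical, and the result is left unrounded and unchecked against n_h <= N_h.

```python
from math import sqrt

def optimal_allocation(strata, total_cost, overhead):
    """strata: dict name -> (N_h, S_h, c_h).

    Returns unrounded sample sizes n_h that spend exactly
    total_cost - overhead on observations.
    """
    budget = total_cost - overhead
    denom = sum(N_h * S_h * sqrt(c_h) for N_h, S_h, c_h in strata.values())
    return {
        h: budget * (N_h * S_h / sqrt(c_h)) / denom
        for h, (N_h, S_h, c_h) in strata.items()
    }

# Hypothetical strata: (size, within-stratum SD, cost per observation).
strata = {"A": (400, 10.0, 4.0), "B": (600, 5.0, 1.0)}
alloc = optimal_allocation(strata, total_cost=1200.0, overhead=200.0)

# The variable cost implied by the allocation should equal the budget.
spent = sum(alloc[h] * strata[h][2] for h in strata)
print(alloc, spent)
```

Stratum A is more variable but also more expensive to sample, so the allocation balances the two effects instead of following stratum size alone.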

Optimal allocation: two special situations

An example

Optimal allocation: two special situations

Optimal allocation for fixed variance (v). One may instead want to minimize the cost for a fixed variance. Mathematically, we want to minimize the total cost C subject to the variance of the estimator being equal to v. One can use a Lagrange multiplier to derive the allocation.

Some practical issues. Stratified sampling often gives higher precision than SRS. But how should the strata be defined? Stratification is most efficient when the stratum means differ widely.

Define strata. Try to find some variables closely related to y. For example, for farm income, use the size of a farm as a stratification variable; for estimating total business expenditures on advertising, stratify by number of employees or by type of product. Get information from experts, old data, preliminary data, etc.

Effects of unknown strata sizes and variances. Unknown strata sizes and variances cause bias. One can use a pilot study to obtain good estimates of the strata sizes and variances.

Summary. Stratified sampling almost always gives higher precision than SRS. Stratification adds complexity to a survey, e.g., when the strata sizes and variances are unknown. In many situations, the potential gains from stratification are large enough to justify the effort of stratifying the population and the expense of conducting pilot studies.

Poststratification. Suppose a sampling frame lists all households in an area, and you would like to estimate the average amount spent on food in a month. One desirable stratification variable is household size, since large households are expected to have higher food bills. The distribution of household size is known (from U.S. census data).

An example of poststratification The distribution of household size from U.S. census

An example of poststratification. The sampling frame does not include information on household size, so we cannot conduct stratified sampling based on household size. Instead, we take an SRS and record the amount spent on food and the household size. If n (the size of the SRS) is large enough, we expect about 26% one-person households, about 31% two-person households, and so on.

An example of poststratification. We can use the methods of stratified sampling to estimate the average amount spent on food for each category of household size. After the observations are taken, we can form a "stratified" estimate of the population mean.
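A poststratified estimate weights each observed poststratum mean by its known population share. In the sketch below, the 26% and 31% shares come from the example above; lumping all larger households into a single "3+" category with the remaining 43% is an assumption made for brevity, and the spending figures are entirely hypothetical.

```python
from statistics import mean

def poststratified_mean(sample, population_share):
    """sample: list of (poststratum, y) pairs observed in an SRS.
    population_share: dict poststratum -> known N_h / N (shares sum to 1).

    Raises KeyError if some poststratum has no sampled units.
    """
    by_stratum = {}
    for h, y in sample:
        by_stratum.setdefault(h, []).append(y)
    return sum(share * mean(by_stratum[h])
               for h, share in population_share.items())

# Known shares by household size; "3+" is an assumed catch-all category.
share = {1: 0.26, 2: 0.31, "3+": 0.43}

# Hypothetical SRS: (household size category, monthly food spending).
sample = [
    (1, 200), (1, 240),
    (2, 350), (2, 370), (2, 360),
    ("3+", 600), ("3+", 560),
]
est = poststratified_mean(sample, share)
srs_est = mean(y for _, y in sample)   # unweighted SRS mean, for comparison
print(f"poststratified: {est:.1f}, plain SRS mean: {srs_est:.1f}")
```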

An example of poststratification

An example of poststratification. Discuss the example.

An example of poststratification. Poststratification can be dangerous: you can obtain arbitrarily small variances if you choose the strata after seeing the data. Poststratification is most often used to correct for the effects of differential nonresponse in the poststrata (chapter 8).

A new sampling method. Motivating example: we want to study the average amount of water used per person. How would you design a survey?