Sadeq R Chowdhury JSM 2019, Denver

Presentation transcript:

Comparing Alternative Estimation Methods When Using Multi-hit Approach to PSU Selection. Sadeq R Chowdhury, JSM 2019, Denver

Disclaimer The views expressed in this presentation are those of the authors and no official endorsement by the Department of Health and Human Services or the Agency for Healthcare Research and Quality is intended or should be inferred.

Outline: Background; Multi-stage Sampling; PSU/Cluster Selection (Usual Method vs. Multi-hit Approach); Multi-Hit Approach; Calculating Selection Probabilities; Comparison of Alternative Estimators; Conclusion.

Background. The Medical Expenditure Panel Survey (MEPS) is a subsample of the National Health Interview Survey (NHIS). Therefore, both surveys are based on the same design, which was a multi-stage area sample design until 2016. The 2016 NHIS redesign utilized the USPS listing of all households in a PSU instead of traditional listings within selected segments. Clusters of households within PSUs were selected directly from the PSU-wide listing of households.

Background (cont.). The number of clusters to be selected from a PSU is based on a multi-hit approach. All clusters have equal size and equal probability of selection, as if in a single-stage cluster sample design. This presentation compares alternative methods of calculating selection probabilities and sample weights for this design.

Usual Multi-stage Sampling. Ultimate Sampling Units (USUs) are selected in multiple stages for cost and operational convenience. First-stage units are called Primary Sampling Units (PSUs). USUs are selected in one or more stages within selected PSUs (e.g., segments, households, persons). The overall selection probability of a USU is the product of the selection probabilities at all stages, e.g., $P_{ijk} = P_i \, P_{j|i} \, P_{k|ij}$. Usually, the design ensures an equal overall probability of USU selection across all PSUs.

Usual PSU Selection Procedure. Systematic PPS sampling is used with measure of size (MOS) = # of USUs in a PSU. Skip interval: $SI = M_0 / n$, where $M_0 = \sum_{i=1}^{N} M_i$, $M_i$ = MOS of PSU $i$, and $n$ = # of PSUs to be selected. PSUs with $M_i \ge SI$ are selected with certainty (selection prob = 1.0) and are called certainty or self-representing (SR) PSUs. PSUs with $M_i < SI$ are sampled with probability < 1.0 and are called non-certainty or non-self-representing (NSR) PSUs. Certainty PSUs are identified first, and then non-certainty PSUs are sampled.

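Usual Procedure of Identifying Certainty PSUs (Method 0): Usual Iterative Method. Iteration 1: identify as certainty the PSUs with $M_i \ge SI_1 = M_0 / n$. Iteration 2: recalculate $SI_2 = \left(M_0 - \sum_{i \in c_1} M_i\right) / (n - n_{c_1})$, where $c_1$ is the set of certainty PSUs from iteration 1 and $n_{c_1}$ is their number, and select PSUs with $M_i \ge SI_2$. Continue iterating until no more certainty PSUs are found. Then select a sample of NSR PSUs to represent the rest of the population, using $SI_{nc} = \left(M_0 - \sum_{i \in c} M_i\right) / (n - n_c) = M_{nc} / n_{nc}$.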
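The iterative rule above can be sketched in a few lines of Python. This is a minimal illustration, not the survey's production procedure: the function name `identify_certainty_psus`, the list-of-sizes input, and the toy MOS values are all assumptions made for the example.

```python
def identify_certainty_psus(mos, n):
    """Iteratively flag certainty (SR) PSUs, in the spirit of Method 0.

    mos: list of measures of size M_i, one per PSU (hypothetical input format).
    n:   total number of PSUs to select.
    Returns (sorted certainty indices, skip interval SI_nc for the NSR PSUs).
    SI_nc is None if every selection is used up by certainty PSUs.
    """
    certainty = set()
    while True:
        remaining_mos = sum(m for i, m in enumerate(mos) if i not in certainty)
        remaining_n = n - len(certainty)
        if remaining_n <= 0:
            # Guard added for this sketch: nothing left to sample.
            return sorted(certainty), None
        si = remaining_mos / remaining_n  # recalculated skip interval
        new = {i for i, m in enumerate(mos) if i not in certainty and m >= si}
        if not new:
            # No more certainty PSUs: si is now SI_nc = M_nc / n_nc.
            return sorted(certainty), si
        certainty |= new


# Toy data (assumed): M_0 = 100 and n = 3, so SI_1 = 33.3, then SI_2 = 25, then SI_nc = 20.
certainty, si_nc = identify_certainty_psus([50, 30, 10, 5, 5], n=3)
```

In this toy run the first PSU is flagged in iteration 1, the second only after the skip interval is recalculated, and the remaining three PSUs are left to be sampled with the final interval.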

Multi-Hit Approach to PSU or Cluster Selection. Certainty PSUs are not identified up front. A systematic sampling skip interval ($SI = M_0 / n$) is calculated only once. This skip interval is applied through all PSUs: PSUs with $MOS \ge SI$ receive at least one hit, while PSUs with $MOS < SI$ receive either zero or one hit, depending on the random process. The number of clusters selected from a PSU equals the number of hits the PSU receives. Usually, equal-size clusters are selected.
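As a rough sketch of how hits could be counted, the Python below walks a single skip interval through an ordered list of PSUs. The function name `multi_hit_counts`, the convention that the random start lies in $(0, SI]$, and the four MOS values are assumptions for illustration only.

```python
import math


def multi_hit_counts(mos, n, start):
    """Count systematic-sampling hits per PSU for one fixed skip interval.

    mos:   measures of size M_i, in the order the PSUs are listed.
    n:     total number of hits (clusters) to select.
    start: random start, assumed here to lie in (0, SI].
    """
    si = sum(mos) / n
    if not 0 < start <= si:
        raise ValueError("start must lie in (0, SI]")
    hits = []
    lower = 0
    for m_i in mos:
        upper = lower + m_i
        # Selection points are start, start + SI, start + 2*SI, ...
        # A PSU is hit once for every point falling in (lower, upper].
        count = math.floor((upper - start) / si) - math.floor((lower - start) / si)
        hits.append(count)
        lower = upper
    return si, hits


# Toy data (assumed): M_0 = 100K and n = 5 give SI = 20K.
si, hits = multi_hit_counts([25_000, 15_000, 40_000, 20_000], n=5, start=5_000)
```

With a start of 5,000 the 25K PSU receives two hits and the 15K PSU none; with a start of 10,000 the same two PSUs receive one hit each, which illustrates why the number of hits per PSU is random.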

Method A: Multi-hit Selection Probability. A cluster or hit represents the population covered by the skip interval, i.e., $P_{ij} = m / SI = n m / M_0$, with $SI = M_0 / n$. There is no designation of certainty or non-certainty PSUs and no explicit stratum for a certainty PSU. All clusters are selected with equal probability, as if in a single-stage selection of equal-size clusters. A cluster represents a whole PSU, part of a PSU, or more than one PSU, depending on $SI$ and $M_i$.

Method A (cont.): Multi-hit Selection Probability. For example, if $SI = 20K$ and $M_i = 25K$ for PSU $i$, then the PSU can have 1 or 2 hits (clusters). If only 1 cluster is selected, it represents $SI = 20K$ units from the current PSU, and the cluster selected from the next PSU represents the remaining $5K$ units from this PSU plus $15K$ units from the next PSU. If 2 clusters are selected, together they represent the whole current PSU ($25K$ units) plus $15K$ units from the next PSU.
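The possible hit counts in the example above follow from a standard property of systematic sampling with a uniformly random start: a PSU receives either $\lfloor M_i / SI \rfloor$ hits or one more, and the extra hit occurs with probability equal to the fractional part of $M_i / SI$. The sketch below computes that distribution with exact fractions; the function name `hit_distribution` is hypothetical, and the uniform random start is an assumption.

```python
from fractions import Fraction
import math


def hit_distribution(mos_i, si):
    """Return {number_of_hits: probability} for one PSU.

    Assumes systematic sampling with a uniformly random start, so the PSU
    gets floor(M_i / SI) hits, plus one more with probability frac(M_i / SI).
    """
    ratio = Fraction(mos_i, si)
    base = math.floor(ratio)
    extra = ratio - base
    if extra == 0:
        return {base: Fraction(1)}
    return {base: 1 - extra, base + 1: extra}


# The slide's example: SI = 20K and M_i = 25K.
dist = hit_distribution(25_000, 20_000)
expected_hits = sum(k * p for k, p in dist.items())
```

For the slide's numbers this gives one hit with probability 3/4 and two hits with probability 1/4, so the expected number of hits is $M_i / SI = 1.25$.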

Method B: Multi-hit Selection Probability. PSUs with $MOS \ge SI$ receive at least one hit and are treated as certainty PSUs with selection prob = 1.0. A certainty PSU is treated like a separate stratum. The selection prob of a USU depends on the size of the PSU and the number of hits the PSU receives: $P_{ij} = 1 \times k m / M_i$ in a certainty PSU with $k$ hits. In all non-certainty (NSR) PSUs, the selection prob is the same: $P_{ij} = \frac{n M_i}{M_0} \cdot \frac{m}{M_i} = \frac{m}{SI}$.

Method B (cont.): Multi-hit Selection Probability. The selection probability is the same ($m / SI$) in all non-certainty PSUs under both Methods A and B; the methods differ only in certainty PSUs. Using the same example, if $SI = 20K$ and $M_i = 25K$, then the PSU can have either 1 or 2 hits, and the selection prob depends on the # of hits the PSU receives: $P_{ij} = m / M_i = m / 25K$ if one hit, or $P_{ij} = 2m / M_i = 2m / 25K$ if two hits. Selection probabilities are random here because the # of hits (i.e., 1 or 2) the PSU receives is random. In expectation, $E[P_{ij}] = 0.75 \cdot \frac{m}{25K} + 0.25 \cdot \frac{2m}{25K} = \frac{m}{20K} = \frac{m}{SI}$, which is the Method A probability.
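The expectation on this slide can be checked numerically. The sketch below is only an arithmetic check under assumptions: `m = 100` is a made-up cluster size (the slides do not give one), the function names are hypothetical, and the hit probabilities 0.75 and 0.25 are taken from the slide.

```python
from fractions import Fraction


def method_a_prob(m, si):
    """Method A: every cluster has the same probability m / SI."""
    return Fraction(m, si)


def method_b_prob(m, mos_i, hits):
    """Method B in a certainty PSU: 1 * k * m / M_i for k realized hits."""
    return Fraction(hits * m, mos_i)


m = 100                      # hypothetical cluster size, not from the slides
si, mos_i = 20_000, 25_000   # the slide's example values
hit_probs = {1: Fraction(3, 4), 2: Fraction(1, 4)}  # from the slide

# Method B's probability averaged over the random number of hits.
expected_b = sum(p * method_b_prob(m, mos_i, k) for k, p in hit_probs.items())
prob_a = method_a_prob(m, si)

# Implied weights (inverse probabilities) under each method.
weights_b = {k: 1 / method_b_prob(m, mos_i, k) for k in hit_probs}
weight_a = 1 / prob_a
```

With these numbers Method A assigns a weight of 200 regardless of the draw, while Method B assigns 250 after one hit and 125 after two, even though its probability averages to the Method A value.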

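PSU Selection Probability: Method 0 vs. Method B. SR PSUs (Method 0 or Method B): PSU selection prob = 1. NSR PSUs under Method 0: PSU selection prob = $n_{nc} M_i / M_{nc}$, where $M_{nc} = \sum_{i \in nc} M_i$, with $n_{nc}$ NSR PSUs. NSR PSUs under Method B: PSU selection prob = $n M_i / M_0$, where $M_0 = \sum_i M_i$, with $n$ PSUs. Method A: PSU type (SR/NSR) is not relevant.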

An Example of Multi-Hit Selection Procedure

Comparison of Methods A & B for Estimating Known ‘Total MOS’

When Actual MOS Differs From the Design MOS

Comparison of Methods A & B When Actual MOS Differs from Design MOS

Summary: Multi-Hit Estimation Method A. No distinction between SR and NSR PSUs. No separate stratum for an SR PSU. Similar to a single-stage selection of clusters. All clusters have equal probability of selection and equal weight. Uses the expected selection probability.

Summary: Multi-Hit Estimation Method B. Identifies and treats SR and NSR PSUs differently. Each SR PSU is treated as an explicit stratum. Similar to a two-stage design. Selection probability is the same across all NSR PSUs but varies among SR PSUs. Selection probability is random: it depends on each random draw.

Conclusion. Both Methods A and B produce unbiased estimates. However, Method B is less efficient (i.e., has a higher variance of the estimate) than Method A because Method B: (1) ignores the variation of selection probabilities across all possible selections, i.e., assumes they are fixed; (2) uses a selection probability based on the realized sample, which is random, and does not take the expectation over that randomness; (3) makes the selection probability vary among SR PSUs, which increases variation in weights; and (4) makes it a two-stage design subsequently, after selecting the sample.

Thank You! Sadeq.Chowdhury@ahrq.hhs.gov