Sadeq R Chowdhury JSM 2019, Denver


1 Sadeq R Chowdhury JSM 2019, Denver
Comparing Alternative Estimation Methods When Using Multi-hit Approach to PSU Selection Sadeq R Chowdhury JSM 2019, Denver

2 Disclaimer The views expressed in this presentation are those of the authors and no official endorsement by the Department of Health and Human Services or the Agency for Healthcare Research and Quality is intended or should be inferred.

3 Outline
Background
Multi-stage Sampling
PSU/Cluster selection
– Usual Method vs. Multi-hit Approach
Multi-Hit Approach
Calculating Selection Probabilities
Comparison of Alternative Estimators
Conclusion

4 Background
Medical Expenditure Panel Survey (MEPS) is a subsample of the National Health Interview Survey (NHIS)
Therefore, both surveys are based on the same design, which was a multi-stage area sample design until 2016
2016 NHIS Redesign
– Utilized USPS listing of all households in a PSU instead of traditional listings within selected segments
– Clusters of households within PSUs were selected directly from the PSU-wide listing of households

5 Background (cont.)
Number of clusters to be selected from a PSU is based on a multi-hit approach
All clusters have equal size and equal probability of selection, as if in a single-stage cluster sample design
This presentation compares alternative methods of calculating selection probabilities and sample weights for this design

6 Usual Multi-stage Sampling
Ultimate Sampling Units (USUs) are selected in multiple stages for cost and operational convenience
First-stage units are called Primary Sampling Units (PSUs)
USUs are selected in one or more stages within selected PSUs (e.g., segments, households, persons)
Overall selection probability of a USU is the product of the selection probabilities at all stages, e.g., $P_{ijk} = P_i \, P_{j|i} \, P_{k|ij}$
Usually, the design ensures equal overall probability of USU selection across all PSUs
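As a small illustration of the multiplicative rule, the sketch below multiplies stage-wise probabilities and inverts the result to get a base weight. The function name `overall_probability` and the three probabilities are hypothetical values chosen for illustration, not MEPS or NHIS figures.

```python
from math import prod

def overall_probability(stage_probs):
    """Overall selection probability of a USU: the product of the
    conditional selection probabilities at each stage."""
    return prod(stage_probs)

# Hypothetical values for P_i (PSU), P_{j|i} (segment), P_{k|ij} (household)
p_ijk = overall_probability([0.2, 0.1, 0.05])
base_weight = 1 / p_ijk  # base sampling weight = inverse selection probability
```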

7 Usual PSU Selection Procedure
Systematic PPS sampling is used with measure of size (MOS) = # of USUs in a PSU
Skip interval, $SI = M_0 / n$, where $M_0 = \sum_{i=1}^{N} M_i$, with $M_i$ = MOS of PSU $i$ and $n$ = # of PSUs to be selected
PSUs with $M_i \ge SI$ are selected with certainty (selection prob = 1.0) and are called certainty or self-representing (SR) PSUs
PSUs with $M_i < SI$ are sampled with probability < 1 and are called non-certainty or NSR PSUs
Certainty PSUs are identified first and then non-certainty PSUs are sampled

8 Usual Procedure of Identifying Certainty PSUs (Method 0)
Usual Iterative Method
Iteration 1: Select PSUs with $M_i \ge SI_1$, where $SI_1 = M_0 / n$
Iteration 2: Recalculate $SI_2 = \dfrac{M_0 - \sum_{i \in c1} M_i}{n - n_{c1}}$ and select PSUs with $M_i \ge SI_2$
Continue iteration until no more certainty PSUs
Select a sample of NSR PSUs to represent the rest of the population, with $SI_{nc} = \dfrac{M_0 - \sum_{i \in c} M_i}{n - n_c} = M_{nc} / n_{nc}$
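The iterative identification of certainty PSUs can be sketched as follows. This is a minimal illustration rather than production code: the function name `find_certainty_psus`, the dictionary input, and the MOS values are assumptions made for the example.

```python
def find_certainty_psus(mos, n):
    """Iteratively flag certainty (SR) PSUs.

    mos: dict mapping PSU id -> measure of size M_i
    n:   total number of PSUs to select
    Returns (set of certainty PSU ids, final skip interval for NSR PSUs).
    """
    certainty = set()
    while True:
        remaining = {i: m for i, m in mos.items() if i not in certainty}
        n_left = n - len(certainty)
        if n_left <= 0 or not remaining:
            return certainty, None  # nothing left to sample as NSR
        si = sum(remaining.values()) / n_left  # recalculated skip interval
        new = {i for i, m in remaining.items() if m >= si}
        if not new:
            return certainty, si  # no more certainty PSUs; si is SI_nc
        certainty |= new

# Hypothetical frame: M_0 = 100, n = 3
# Iteration 1: SI = 33.3, so A is certainty; iteration 2: SI = 25, so B is
# certainty; iteration 3: SI = 20, no further certainty PSUs.
certainty, si_nc = find_certainty_psus({"A": 50, "B": 30, "C": 10, "D": 5, "E": 5}, 3)
```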

9 Multi-Hit Approach to PSU or Cluster Selection
Certainty PSUs are not identified up front
A systematic sampling skip interval ($SI = M_0 / n$) is calculated only once
This skip interval is applied through all PSUs, and the PSUs with $MOS \ge SI$ receive at least one hit
The PSUs with $MOS < SI$ receive either zero or one hit based on the random process
Number of clusters selected from a PSU is equal to the number of hits the PSU receives
Usually equal size clusters are selected
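A minimal sketch of the multi-hit pass is given below, assuming a random start in [0, SI) and PSUs laid end to end in frame order. The function name `multi_hit_selection`, the `rng` parameter, and the MOS values are hypothetical and used only for illustration.

```python
import random

def multi_hit_selection(mos, n, rng=None):
    """One systematic pass with a single skip interval SI = M_0 / n.

    mos: list of PSU measures of size, in frame order
    Returns (hits, SI), where hits[i] is the number of clusters
    to select from PSU i.
    """
    if rng is None:
        rng = random.Random()
    si = sum(mos) / n
    start = rng.random() * si                  # random start in [0, SI)
    points = [start + k * si for k in range(n)]
    hits, lower = [], 0.0
    for size in mos:
        upper = lower + size                   # PSU covers [lower, upper)
        hits.append(sum(lower <= p < upper for p in points))
        lower = upper
    return hits, si

# Hypothetical frame with M_0 = 100 and n = 5, so SI = 20
hits, si = multi_hit_selection([25, 10, 40, 5, 20], 5, random.Random(1))
```

In this frame the PSUs of size 25, 40 and 20 always receive at least one hit, while the PSUs of size 10 and 5 receive zero or one hit depending on the random start.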

10 Method A Multi-hit Selection Probability
A cluster or hit represents the population covered by the skip interval, i.e., $P_{ij} = m / SI = n m / M_0$, with $SI = M_0 / n$
No designation of any certainty or non-certainty PSU, and no explicit stratum for a certainty PSU
All clusters are selected with equal probability, as if in a single-stage selection of equal size clusters
A cluster represents a whole PSU, part of a PSU, or more than one PSU, depending on $SI$ and $M_i$

11 Method A (cont.) Multi-hit Selection Probability
For example, if $SI = 20K$ and $M_i = 25K$ for PSU $i$, then it can have 1 or 2 hits (clusters)
If only 1 cluster is selected: it will represent $SI = 20K$ units from the current PSU, and the cluster selected from the next PSU will represent the remaining $5K$ units from this PSU and $15K$ units from the next PSU
If 2 clusters are selected: the clusters will represent the whole current PSU ($25K$ units) plus $15K$ units from the next PSU

12 Method B Multi-hit Selection Probability
PSUs with $MOS \ge SI$ receive at least one hit and are treated as certainty PSUs with selection prob = 1.0
A certainty PSU is treated like a separate stratum
The selection prob of a USU depends on the size of the PSU and the number of hits the PSU receives: $P_{ij} = 1 \times \dfrac{k m}{M_i}$ in a certainty PSU with $k$ hits
In all non-certainty PSUs, the selection prob is the same: $P_{ij} = \dfrac{n M_i}{M_0} \cdot \dfrac{m}{M_i} = \dfrac{m}{SI}$ in NSR PSUs
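The two probability rules can be put side by side in a short sketch. Here `m` is assumed to be the common cluster size (the slides do not define it explicitly), and the function names and the value m = 100 are hypothetical.

```python
def prob_method_a(m, si):
    """Method A: every cluster (hit) has the same probability m / SI."""
    return m / si

def prob_method_b(m, m_i, si, hits):
    """Method B: a certainty PSU (M_i >= SI) with k realized hits uses
    1 * k * m / M_i; a non-certainty PSU uses m / SI, as in Method A."""
    if m_i >= si:
        return 1.0 * hits * m / m_i
    return m / si

# Slide example: SI = 20K, M_i = 25K; m = 100 is an assumed cluster size
p_a = prob_method_a(100, 20_000)               # same for every cluster
p_b_one = prob_method_b(100, 25_000, 20_000, 1)  # PSU received one hit
p_b_two = prob_method_b(100, 25_000, 20_000, 2)  # PSU received two hits
```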

13 Method B (cont.) Multi-hit Selection Probability
Selection probability is the same ($m / SI$) in all non-certainty PSUs under both Methods A and B; the difference is only in certainty PSUs
Using the same example, if $SI = 20K$ and $M_i = 25K$, then the PSU can have either 1 or 2 hits, and the selection prob will depend on the # of hits the PSU receives: $P_{ij} = m / M_i = m / 25K$ if one hit, or $P_{ij} = 2m / M_i = 2m / 25K$ if two hits
Selection probabilities are random here because the # of hits (i.e., 1 or 2) the PSU receives is random
On expectation, $E(P_{ij}) = 0.75 \cdot \dfrac{m}{25K} + 0.25 \cdot \dfrac{2m}{25K} = \dfrac{m}{20K} = \dfrac{m}{SI}$ (Method A)
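The expectation can be verified with exact arithmetic. The sketch below assumes a random start uniform on [0, SI), under which a PSU with SI ≤ M_i < 2·SI receives two hits with probability (M_i − SI)/SI and one hit otherwise; the function name and the cluster size m = 100 are hypothetical.

```python
from fractions import Fraction

def expected_prob_method_b(m, m_i, si):
    """Expected Method B selection probability for a PSU with
    SI <= M_i < 2*SI, averaging over the random number of hits."""
    p_two = Fraction(m_i - si, si)  # chance that the PSU receives two hits
    p_one = 1 - p_two               # chance that it receives one hit
    return p_one * Fraction(m, m_i) + p_two * Fraction(2 * m, m_i)

m, m_i, si = 100, 25_000, 20_000    # m = 100 is an assumed cluster size
expected_b = expected_prob_method_b(m, m_i, si)
method_a = Fraction(m, si)          # Method A probability m / SI
```

With these inputs the one-hit and two-hit probabilities are 3/4 and 1/4, and `expected_b` equals `method_a`.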

14 PSU Selection Probability Method 0 vs Method B
PSU Type | Method | PSU Selection Prob
SR | Method 0 or Method B | 1
NSR | Method 0 | $n_{nc} M_i / M_{nc}$, where $M_{nc} = \sum_{i \in nc} M_i$ and $n_{nc}$ = # of NSR PSUs
NSR | Method B | $n M_i / M_0$, where $M_0 = \sum_i M_i$ and $n$ = # of PSUs
Method A: PSU, SR/NSR not relevant

15 An Example of Multi-Hit Selection Procedure

16 Comparison of Methods A & B for Estimating Known ‘Total MOS’

17 When Actual MOS Differs From the Design MOS

18 Comparison of Methods A & B When Actual MOS Differs from Design MOS

19 Summary Multi-Hit Estimation Method A
No distinction between SR or NSR PSUs
No separate stratum for a SR PSU
Similar to a single-stage selection of clusters
All clusters have equal probability of selection and equal weight
Uses expected selection probability

20 Summary Multi-Hit Estimation Method B
Identifies and treats SR/NSR PSUs differently
Each SR PSU is treated as an explicit stratum
Similar to a two-stage design
Selection probability is the same across all NSR PSUs but varies among SR PSUs
Selection probability is random, depends on each random draw

21 Conclusion
Both Methods A and B produce unbiased estimates
However, Method B is less efficient (i.e., higher variance of estimate) than Method A because Method B
– ignores variation of selection probabilities across all possible selections, i.e., assumes they are fixed
– uses selection probability based on the realized sample, which is random; does not take expectation over randomness
– makes selection probability vary among SR PSUs, which increases variation in weights
– makes it a two-stage design subsequently, after selecting the sample
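The efficiency comparison can be illustrated with a small simulation that estimates the known total MOS under both methods. This is a sketch under stated assumptions, not the author's computation: the five-PSU frame, the cluster size m = 1, the seed, and the function names are all hypothetical.

```python
import random
from statistics import mean, pvariance

def multi_hit(mos, n, rng):
    """Hits per PSU from one systematic pass with SI = M_0 / n."""
    si = sum(mos) / n
    start = rng.random() * si
    points = [start + k * si for k in range(n)]
    hits, lower = [], 0.0
    for size in mos:
        hits.append(sum(lower <= p < lower + size for p in points))
        lower += size
    return hits, si

def estimate_total_mos(mos, hits, si, m, method):
    """Weighted total of cluster sizes; each selected cluster holds m units."""
    total = 0.0
    for size, k in zip(mos, hits):
        if k == 0:
            continue
        if method == "B" and size >= si:
            prob = k * m / size   # Method B, certainty PSU with k hits
        else:
            prob = m / si         # Method A, and Method B in NSR PSUs
        total += k * m / prob     # k clusters of m units, each weighted 1/prob
    return total

rng = random.Random(2019)
mos, n, m = [25, 10, 40, 5, 20], 5, 1   # hypothetical frame: M_0 = 100, SI = 20
est_a, est_b = [], []
for _ in range(10_000):
    hits, si = multi_hit(mos, n, rng)
    est_a.append(estimate_total_mos(mos, hits, si, m, "A"))
    est_b.append(estimate_total_mos(mos, hits, si, m, "B"))

print("Method A: mean", mean(est_a), "variance", pvariance(est_a))
print("Method B: mean", mean(est_b), "variance", pvariance(est_b))
```

For this frame, Method A returns 100 on every draw because each of the n hits carries weight SI/m. Method B returns 85 when the two non-certainty PSUs receive no hit and 105 when one of them does, so its estimates average close to 100 but have nonzero variance.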

22 Thank You!

