Enumeration using frozen versions (based on slides produced by Peter Stoltze, Chief Consultant, Statistical Methods, SD)


1 Enumeration using frozen versions (based on slides produced by Peter Stoltze, Chief Consultant, Statistical Methods, SD)

2 Foundation of the Statistics
Defining the population is essential for interpreting the statistics.
If we do not have a firm grip on the population, everything else is unimportant!
The SBR is central in this respect:
- Updated from administrative sources
- Updated with information from the different Statistical Divisions
- A benefit for all Statistical Divisions

3 Definition of the Population (1)
Population of interest: the collection of objects in which we are interested.
  Example: All businesses in Ukraine
Target population: the section of the population of interest that we, for practical reasons, must confine ourselves to observing.
  Example: All businesses with at least 10 employees
Sampling frame: the data representation of the target population available to us; it is from here that the sample is drawn.
  Example: Extracts from SBR

4 Definition of the Population (2)
[Diagram with the labels: Sample, Sampling frame, Target population, Population of interest]

5 Frame Imperfections
The difference between the target population and the sampling frame is due to the fact that our registers are not perfect.
Over-coverage: Businesses which are included in the sampling frame, but ought not to be included
- Can be discovered during data collection
- Example: The business went bankrupt long before the starting date of the reference period
Under-coverage: Businesses which ought to be included in the sampling frame, but are not included
- Can be discovered if we have knowledge of the area via other sources
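The two kinds of frame imperfection can be read as set differences between the target population and the sampling frame. The sketch below is only an illustration; the unit identifiers are hypothetical and do not come from the slides.

```python
# Hypothetical unit identifiers, invented for this illustration.
target_population = {"A", "B", "C", "D", "E"}    # units that ought to be covered
sampling_frame = {"B", "C", "D", "E", "F", "G"}  # units actually in the register extract

# Over-coverage: in the frame, but ought not to be included.
over_coverage = sampling_frame - target_population

# Under-coverage: ought to be in the frame, but is missing.
under_coverage = target_population - sampling_frame

print(sorted(over_coverage))   # ['F', 'G']
print(sorted(under_coverage))  # ['A']
```

In practice the target population is not known unit by unit, which is why over-coverage tends to surface during data collection and under-coverage only through other sources.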

6 Estimation based on an updated population
- Design weights are sacred
- Selection probabilities are sacred
- Stratum changes should be handled by calibration and domain estimation
- Estimation may account for cut-off sampling

7 Dynamic Frame Population
[Diagram: the current version and the historic versions of the frame at Time t, Time t+1 and Time t+2, along a time axis t, t+1, t+2]

8 Frozen Frame Population
[Diagram: the current version, historic versions and frozen versions of the frame at Time t through Time t+4, along a time axis t, t+1, t+2]

9 Population at Estimation stage
[Diagram: current, historic and frozen versions together with the sample, showing the estimation of the structural survey and the estimation of the short-term survey along a time axis t, t+1, t+2]

10 SBS statistics (all kinds) (1)
Purpose:
- Give information about the structure
- Be able to compare across statistics
When: Year t (a period) or ultimo t (a point in time)
Based on: a survey with the following sampling fractions
- 100 %   Big enterprises
- 50 % ?  Medium-sized enterprises
- 25 % ?  Small enterprises
- 0 % ?   Micro enterprises
possibly divided into sub-strata
The survey is drawn on the basis of a frozen SBR version of 15th Nov year t
The survey is carried out during e.g. March-June t+1
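A minimal sketch of drawing such a stratified sample from a frozen frame, assuming simple random sampling within each size class. The sampling fractions are the tentative ones listed above; the function name, the frame layout and the unit identifiers are hypothetical.

```python
import random

# Tentative sampling fractions per size class, as listed on the slide.
SAMPLING_FRACTION = {"big": 1.0, "medium": 0.5, "small": 0.25, "micro": 0.0}


def draw_stratified_sample(frozen_frame, seed=0):
    """Draw a simple random sample within each size class.

    frozen_frame: list of (unit_id, size_class) tuples from the frozen version.
    Returns a dict {unit_id: design_weight}, where the design weight is N_h / n_h.
    """
    rng = random.Random(seed)
    strata = {}
    for unit_id, size_class in frozen_frame:
        strata.setdefault(size_class, []).append(unit_id)

    weights = {}
    for size_class, units in strata.items():
        n = round(len(units) * SAMPLING_FRACTION[size_class])
        # With n == 0 (e.g. micro enterprises) nothing is drawn from the stratum.
        for unit_id in rng.sample(units, n):
            weights[unit_id] = len(units) / n
    return weights


# Hypothetical frozen frame: 4 big, 10 medium, 20 small and 50 micro enterprises.
frame = (
    [(f"B{i}", "big") for i in range(4)]
    + [(f"M{i}", "medium") for i in range(10)]
    + [(f"S{i}", "small") for i in range(20)]
    + [(f"X{i}", "micro") for i in range(50)]
)
sample = draw_stratified_sample(frame)
print(len(sample))  # 14 sampled units: 4 big, 5 medium, 5 small
```

Within each sampled stratum the design weights sum to the stratum size in the frozen frame, which is the property the later estimation relies on. The micro enterprises are a cut-off stratum here and receive no weight at all.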

11 SBS statistics (2) New information on Year t requires updating of the frozen SBR
What?
- All enterprises and other units (e.g. LKAU) that were active during year t have to be in the frozen version
- All relevant changes/corrections (that is, changes related to year t and not t+1) have to be in the frozen version
- But be aware of possible bias: not only information from surveys has to be taken in
- What could be the sources for updating the SBR for year t?
When?
- Before the first SBS statistic is produced
- Hopefully this is also when the information is available
(Note: How should it be interpreted that not only information from surveys has to be incorporated?)

12 SBS statistics (3) And now to the Enumeration
The sample was drawn 15th Nov t
The new frozen version is formed ddmmyy year t+1?
Principle:
- At the estimation stage we discover that a unit selected in stratum ha with π = 0.1 has moved to stratum hb
- We then have to believe that 9 other (unobserved) units from ha have made a similar move
- Instead of changing the selection probabilities, the combinations Activity*Size are regarded as domains, and calibration is conducted on the basis of these new domains
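The arithmetic behind the principle: a unit selected with π = 0.1 carries a design weight of 1/π = 10, so it represents itself plus 9 unobserved units. The sketch below keeps every selection probability as it was at the draw and only changes the domain in which a unit is counted. All units, probabilities and turnover figures are hypothetical, and the calibration step is left out.

```python
# Each sampled unit keeps the selection probability of the stratum it was drawn from.
# (unit id, sampling stratum, current domain, selection probability, turnover)
sample = [
    ("u1", "ha", "ha", 0.1, 120.0),
    ("u2", "ha", "hb", 0.1, 300.0),  # moved from ha to hb after the draw
    ("u3", "hb", "hb", 0.5, 800.0),
    ("u4", "hb", "hb", 0.5, 650.0),
]


def domain_estimates(sample):
    """Estimate the number of units and the total turnover per current domain."""
    counts, totals = {}, {}
    for _unit, _stratum, domain, pi, turnover in sample:
        weight = 1.0 / pi  # design weight, never altered
        counts[domain] = counts.get(domain, 0.0) + weight
        totals[domain] = totals.get(domain, 0.0) + weight * turnover
    return counts, totals


counts, totals = domain_estimates(sample)
print(counts)  # estimated number of units per domain
print(totals)  # estimated turnover per domain
```

Unit u2 contributes a weight of 10 to domain hb, which is the "9 other units have made a similar move" assumption expressed in numbers.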

13 SBS statistics (4) And what does that mean?
(The table has been removed because the matter is not as simple as the table showed. Regression analysis has to be used. What is important is to know the population at the time of enumeration! See theory!)

14 SBS statistics (5) A few names:
Horvitz-Thompson estimator or π-expansion
- The design weights, summed over the sample within a stratum, have to equal the size of the stratum
Calibration can be implemented in the form of a regression estimator
- SD uses SCB CLAN survey (a collection of Swedish macros for SAS), but other possibilities exist, e.g. the package survey for R by Thomas Lumley
- Google: "regression estimator sampling", "model assisted survey sampling" or "SCB CLAN survey"
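A minimal sketch of the two ideas named above, under stated assumptions. It is not how CLAN or the R survey package work internally. The calibration shown is the simplest special case, where design weights are scaled within each domain so that they sum to a known domain count (post-stratification, which coincides with the regression estimator when the auxiliary variables are domain indicators). All data and the known counts are hypothetical.

```python
def horvitz_thompson_total(values, probs):
    """Horvitz-Thompson (pi-expansion) estimate of a total: sum of y_k / pi_k."""
    return sum(y / p for y, p in zip(values, probs))


def calibrate_to_domain_counts(design_weights, domains, known_counts):
    """Scale design weights within each domain so they sum to the known domain size.

    The design weights themselves are left untouched; a new list is returned.
    """
    weighted = {}
    for w, d in zip(design_weights, domains):
        weighted[d] = weighted.get(d, 0.0) + w
    return [w * known_counts[d] / weighted[d] for w, d in zip(design_weights, domains)]


# Hypothetical sample: turnover, selection probability and current domain per unit.
values = [120.0, 300.0, 800.0, 650.0]
probs = [0.1, 0.1, 0.5, 0.5]
domains = ["ha", "hb", "hb", "hb"]
design_weights = [1.0 / p for p in probs]

# Hypothetical domain sizes taken from an updated frozen version.
known_counts = {"ha": 12, "hb": 15}

ht_total = horvitz_thompson_total(values, probs)
calibrated = calibrate_to_domain_counts(design_weights, domains, known_counts)
calibrated_total = sum(w * y for w, y in zip(calibrated, values))
print(ht_total, calibrated_total)
```

The calibrated weights reproduce the known domain counts exactly, while the design weights and selection probabilities stay as they were at the draw.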

15 SBS statistics (6) Problems
- How do you get to know the 'correct' population when frozen version 2 is formed?
- How do you distribute the units between strata?
- It is risky to include information from surveys only
- New units should not be included in the sample
- Deceased units have to be placed in the stratum for deceased units so that they get a weight, but it can be tricky to estimate the size of that stratum (it depends on whether the information comes from the survey or from the population, i.e. the frozen version)

16 STS statistics (1)
Purpose: Give information about development
When: Quarter x year t+1 (a period) or ultimo quarter x year t+1 (a point in time)
Based on: a survey with the following sampling fractions
- 100 %   Big enterprises
- 50 % ?  Medium-sized enterprises
- 25 % ?  Small enterprises
- 0 % ?   Micro enterprises
divided into sub-strata
The survey is drawn on the basis of a frozen SBR version of 15th Nov year t
The survey is carried out during April t+1, July t+1, October t+1 and January t+2

17 STS statistics (2) What is the problem?
- What about new enterprises in year t (from 15th Nov to 31st Dec)?
- What about any change from 31st Dec year t up to April, July, … t+1(2)?

18 STS statistics (2) What is the solution? Two possibilities
1. Keep the frozen version and look at changes
   Disadvantage: this does not take new enterprises into account
   Advantage: easy
2. Make new frozen versions for each quarter (or even use the current version of SBR*) and continue as for SBS
   Disadvantages: time consuming; what are the sources for producing new versions?; in addition, the disadvantages mentioned for SBS
   Advantage: a more correct description of the development
Either possibility makes it possible to compare SBS and STS
* But it is important to know the whole population and be able to distribute it to strata

19 Frozen versions Statistics Denmark
SBS year t
- Version 1 (temporary): 5th March t+1 (Turnover/Employees 15th March)
- Version 2 (temporary): 5th Sept. t+1 (Turnover/Employees 15th Sept.)
- Version 3 (final): 5th Dec. t+1 (Turnover/Employees 15th Dec.)
STS 1st quarter year t+1
- Version 1 (temporary): 5th May t+1 (Turnover/Employees 15th May)
- Version 2 (final): 5th Aug. t+1 (Turnover/Employees 15th Aug.)
Samples might be drawn from any version before enumeration

