
1 IMPUTING MISSING ADMINISTRATIVE DATA FOR SHORT-TERM ENTERPRISE STATISTICS Pieter Vlag – Statistics Netherlands Joint work with DESTATIS, Statistics Estonia, Statistics Finland, ISTAT, Statistics Lithuania, ONS

2 Imputing missing admin data for STS-estimates: outline of the presentation
- Scope of the project: use of admin data for STS
- Two situations:
  a. VAT fairly complete and representative ("VAT representative")
  b. VAT not complete and not representative ("VAT not representative")
- VAT representative: imputing missing values
- Imputing missing values:
  a. methods for imputations
  b. which units to impute
- Conclusions and implications for other projects

3 Scope of the project
Scope: turnover (VAT registration), wages + employees ("social security data")
Final situation (after a year):
- all admin data are available for NSIs
- data cover the population
Monthly and quarterly estimates: part of the admin data are "missing"
[Diagram: final situation = L.E. (survey) + admin data; monthly/quarterly situation = L.E. (survey) + admin data + missing part]
Assumption: if admin data are complete, they can be used for statistics
Challenge: how to estimate for "missing" admin data in monthly and quarterly estimates

4 Additional value of ESSnet AdminData
VAT = Value Added Tax
The European Union value added tax (EU VAT) is a value added tax encompassing member states in the European Union VAT area. Joining is compulsory for member states of the European Union. Each Member State's national VAT legislation must comply with the provisions of EU VAT law as set out in Directive 2006/112/EC.
TRANSLATION TO STATISTICS
- INPUT: available VAT information is quite similar across Europe
- OUTPUT: obligations are also similar across Europe (STS, SBS, ESR regulations)
CONCLUSIONS ESSNET: the methodological challenges in the use of admin data are identical; solutions may differ, but only to a limited extent

5 Two situations
Situation A: L.E. (100 % sample) + VAT almost complete
- general situation for Q; t+45 days
Situation B: L.E. (100 % sample) + VAT not available or very limited
- general situation for M; t+30 days
Whether situation A or B applies to other estimates (Q-flash; M at t+45/50 days) differs per country

6 Methods
Situation A: admin data coverage almost complete
- estimation based only on admin data
- imputation of missing data (with available VAT)
- established techniques
Situation B: admin data coverage incomplete
- admin data = auxiliary information
- experimental methods; not discussed further
Quality of STS estimates: revision compared to the final estimate (average bias, average error)
[Diagram: final situation = 100 % sample + admin data; STS = 100 % sample + admin data + missing part; labels: level estimates, L.E., SME, sample, VAT, estimation, VAT T-x]
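The slide names average bias and average error of the revision as quality measures. A minimal Python sketch of one common way to compute them is given here, assuming (this is not taken from the slides) that bias is the mean percentage revision and error is the mean absolute percentage revision. The function name `revision_summary` and the sample figures are hypothetical.

```python
def revision_summary(provisional, final):
    """Summarise revisions of provisional estimates against final estimates.

    Assumption (not from the slides): the revision is the percentage
    difference between provisional and final estimate; "average bias" is
    the mean revision, "average error" the mean absolute revision.
    """
    revisions = [100.0 * (p - f) / f for p, f in zip(provisional, final)]
    n = len(revisions)
    return {
        "mean_revision": sum(revisions) / n,
        "mean_absolute_revision": sum(abs(r) for r in revisions) / n,
    }


# Hypothetical provisional and final turnover estimates for two periods.
summary = revision_summary(provisional=[102.0, 99.0], final=[100.0, 100.0])
```

A mean revision close to zero with a large mean absolute revision would indicate estimates that are unbiased on average but noisy.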

7 Methods for imputations
Several production systems were analysed, e.g. DE, F, "Nordic countries", NL, I
Imputation of "missing VAT" is based on O_t / O_{t-1} or O_t / O_{t-12} of the available VAT, or on similar approaches
Stratification levels for calculating the stratum imputations range from NACE 2-digit x 2 size classes to NACE 4-digit x 9 size classes
KEY QUESTION: do these different approaches lead to different output, given that the methods are generally applied when the coverage of L.E. survey + available VAT exceeds 90 % of the target variable?
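The ratio-based imputation described on this slide can be sketched as follows. This is an illustration under stated assumptions, not any country's production system: the record layout (`stratum`, `prev`, `cur`), the function name and the sample values are hypothetical, and `prev` may stand for either t-1 or t-12.

```python
from collections import defaultdict


def impute_by_stratum_ratio(units):
    """Impute missing current turnover as previous turnover x stratum ratio.

    The stratum ratio O_t / O_{t-1} (or O_t / O_{t-12}) is computed only
    from units that reported in both periods. Units without a previous
    value, or in a stratum without any reporters, are left missing.
    """
    sum_cur = defaultdict(float)
    sum_prev = defaultdict(float)
    for u in units:
        if u["cur"] is not None and u["prev"] is not None:
            sum_cur[u["stratum"]] += u["cur"]
            sum_prev[u["stratum"]] += u["prev"]

    result = []
    for u in units:
        filled = dict(u, imputed=False)
        s = u["stratum"]
        if u["cur"] is None and u["prev"] is not None and sum_prev[s] > 0:
            filled["cur"] = u["prev"] * sum_cur[s] / sum_prev[s]
            filled["imputed"] = True
        result.append(filled)
    return result


# Hypothetical stratum (NACE class x size class) with one late reporter.
units = [
    {"id": 1, "stratum": "47-small", "prev": 100.0, "cur": 110.0},
    {"id": 2, "stratum": "47-small", "prev": 200.0, "cur": 220.0},
    {"id": 3, "stratum": "47-small", "prev": 50.0, "cur": None},
]
completed = impute_by_stratum_ratio(units)
```

A finer stratification (for example NACE 4-digit x 9 size classes) only changes the `stratum` key; the trade-off is that small strata give less stable ratios.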

8 Methods for imputations: testing of different methodologies (example Estonia)
Conclusion: the imputation methods provide similar results if the population is fixed and VAT covers > 80 % of the population

9 Comparing imputations with realisations (approach Statistics Finland)
Five imputation rules for the current period at micro-level:
- Mean annual change
- Geometric mean of monthly changes
- Previous turnover
- Mean turnover
- Turnover of comparison month
The imputation rules are automatically evaluated and compared by calculating maximum proportional forecast errors using data for the five latest months.
The selection rules are:
- An imputation rule with < 20 % maximum proportional forecast error and the same direction of change as in the last two months is automatically admissible
- The model with the smallest maximum error is considered best
Main differences with other detected practices:
- No assumption that "available VAT = representative"
- Not all missing data are imputed (in practice 20 - 50 %)
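The error-based part of this selection can be sketched in Python as follows. This is an illustration under assumptions, not Statistics Finland's implementation: the slide does not define the rules precisely, so the four rule definitions below (12-month mean, 3-month geometric change, t-12 as comparison month) are hypothetical, the "mean annual change" rule is left out, and the direction-of-change check is omitted because its exact form is not given.

```python
from math import prod  # Python 3.8+


def previous_turnover(h):
    return h[-1]


def mean_turnover(h):
    last = h[-12:]  # assumed window: the latest 12 months
    return sum(last) / len(last)


def comparison_month(h):
    return h[-12]  # assumed: same month one year earlier


def geometric_monthly_change(h, n=3):
    # Previous turnover times the geometric mean of the last n monthly changes.
    ratios = [h[i] / h[i - 1] for i in range(len(h) - n, len(h))]
    return h[-1] * prod(ratios) ** (1.0 / n)


def max_proportional_error(rule, series, n_eval=5):
    """Largest |forecast - actual| / |actual| over the n_eval latest months."""
    errors = []
    for k in range(len(series) - n_eval, len(series)):
        forecast = rule(series[:k])  # use only data before month k
        errors.append(abs(forecast - series[k]) / abs(series[k]))
    return max(errors)


def select_rule(series, rules, threshold=0.20):
    """Return (best admissible rule name or None, all scores)."""
    scores = {name: max_proportional_error(r, series) for name, r in rules.items()}
    admissible = {name: e for name, e in scores.items() if e < threshold}
    if not admissible:
        return None, scores  # the unit is not imputed
    return min(admissible, key=admissible.get), scores


RULES = {
    "previous_turnover": previous_turnover,
    "mean_turnover": mean_turnover,
    "comparison_month": comparison_month,
    "geometric_monthly_change": geometric_monthly_change,
}

# Hypothetical unit with a purely seasonal monthly turnover pattern (24 months).
pattern = [100.0, 120.0, 90.0, 150.0, 80.0, 130.0,
           110.0, 95.0, 140.0, 85.0, 125.0, 105.0]
series = pattern * 2
best, scores = select_rule(series, RULES)
```

Returning `None` when no rule passes the threshold mirrors the point on the slide that not all missing data are imputed. The sketch needs at least 17 months of non-zero turnover per unit because of the t-12 rule.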

10 Comparing imputations with realisations (more precise conclusions)
Explanations:
- Outlier effect on the calculated O_t / O_{t-1} or O_t / O_{t-12} values
- Late VAT reporters are likely a selective group in countries with automatic fining systems for late VAT reporting
The impact of selectivity on output is generally negligible because of the high coverage of the available data.

11 Which units to impute

12 Impact on results (example Italy)
[Figure labels: imputation technique; uncertainty of provisional population]
Conclusion: the effect on revisions of the uncertainty about which units to impute is larger than the effect of the imputation technique itself

13 Conclusions
- When admin data are used for STS, missing data are imputed.
- The most widely used imputation rules are O_t / O_{t-1} or O_t / O_{t-12}.
- Given the large coverage of the available data, the exact imputation technique chosen has only a limited impact on the outcome, despite indications that the main assumption of these techniques ("available VAT = representative") might not be 100 % correct.
- More important than the imputation technique is the estimate for the provisional population.

