Sampling Error Estimation – SORS practice Rudi Seljak, Petra Blažič Statistical Office of the Republic of Slovenia.


1 Sampling Error Estimation – SORS practice
Rudi Seljak, Petra Blažič
Statistical Office of the Republic of Slovenia

2 Content of the presentation
– Introduction to the problem
– Application for sampling error estimation – basic principles
– Short description of the application
– Discussion

3 Introduction to the problem
In sample surveys, the standard error is still the most indicative “accuracy” indicator.
Producers of official statistics are obliged to provide at least some information on the level of accuracy together with the disseminated statistics.
Two main challenges:
– How to estimate the standard error correctly and on time for the full volume of disseminated results
– How to present these errors to a wide range of different users in a clear and understandable way

4 Standard error estimation at SORS
(Not so distant) past:
– Calculation of the sampling error was quite "survey dependent" → each survey had its own system
– Direct estimates only for the key statistics and the key domains → models for the other statistics and (sub)domains
– Results with a lower degree of precision were marked, and the coefficient of variation was the “exclusive” criterion used
Significant revision of the system a few years ago:
– General rules were set up for sampling error estimation
– New rules were set up for dissemination and presentation
– A special (SAS) application was built that incorporates all of the above rules

5 Application – general principles
The application calculates the standard error for seven types of statistics.
It can be used for most of the statistics produced at SORS, with a few exceptions:
– EU-SILC (Laeken) indicators (separate SAS macro)
– Indices (separate SAS macro)
The application performs aggregation, standard error calculation and, if needed, flagging of the results with special signs.

6 Application – general principles cont’d
The application “merges” aggregation, sampling error estimation and tabulation into one fully automated process.
It is designed as a metadata-driven (MDD) system → the parameters for a particular survey are provided outside the core computer code.
The application uses the following software:
– The core (processing) part of the application is built in the SAS environment, using the PROC SURVEYMEANS “facilities”
– The metadata are (for now) stored in an Access database
– Outputs are provided as Excel tables

7 Application – technical description
Hypothetical example: survey on internet usage in enterprises, stratified one-stage sample.
Input variables:
– Emp … Number of employees
– Turn … Turnover
– Wpage … Does the enterprise have a webpage (yes/no)
– Nace2 … Nace 2-digit group
– Nace3 … Nace 3-digit group
– SizeC … Size class
Output statistics:
– STAT01 … Proportion of enterprises with a webpage
– STAT02 … Total turnover in enterprises with a webpage
– STAT03 … Turnover per employee in enterprises with a webpage
Dissemination needed by the following domains:
– Nace 2-digit group
– Nace 2-digit group * Size class
Strata:
– Nace 3-digit group * Size class

8 Metadata tables – Description of the statistics

Table   Stat_code  Stat_desc                                            Type  Dummy    Variable  Variable_en  Variable_den
Table1  STAT01     Proportion of enterprises with a webpage             02    Dummy01
Table1  STAT02     Total turnover in enterprises with a webpage         03             Var02
Table1  STAT03     Turnover per employee in enterprises with a webpage  05                       Var02        Var03

Columns:
– Type: type of statistic (02 = Proportion, 03 = Total, 05 = Ratio)
– Dummy: name of the dummy variable (0/1 values) needed for the calculation of the proportion
– Variable: name of the variable required for the calculation of the total
– Variable_en: name of the variable in the numerator of the ratio
– Variable_den: name of the variable in the denominator of the ratio
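How a metadata row of this kind could drive the computation can be sketched in a few lines. The sketch below is in Python for illustration only; the application itself is built in SAS, and the function names, the `weight` field and the three sample units are assumptions, not part of the SORS system. It computes point estimates only (no standard errors), and the proportion is returned as a fraction rather than a percentage.

```python
# Hypothetical metadata rows mirroring the "Description of the statistics" table.
STATISTICS = [
    {"stat_code": "STAT01", "type": "02", "dummy": "Dummy01"},
    {"stat_code": "STAT02", "type": "03", "variable": "Var02"},
    {"stat_code": "STAT03", "type": "05", "variable_en": "Var02", "variable_den": "Var03"},
]


def weighted_total(units, var):
    """Weighted sum of one variable over the sampled units."""
    return sum(u["weight"] * u[var] for u in units)


def estimate(stat, units):
    """Point estimate for one metadata row, selected by the Type code."""
    if stat["type"] == "02":  # proportion: weighted mean of a 0/1 dummy
        return weighted_total(units, stat["dummy"]) / sum(u["weight"] for u in units)
    if stat["type"] == "03":  # total
        return weighted_total(units, stat["variable"])
    if stat["type"] == "05":  # ratio of two weighted totals
        return (weighted_total(units, stat["variable_en"])
                / weighted_total(units, stat["variable_den"]))
    raise ValueError(f"unsupported statistic type: {stat['type']}")


# Made-up sample units; "weight" is an assumed sampling weight.
units = [
    {"weight": 2.0, "Dummy01": 1, "Var02": 100.0, "Var03": 10},
    {"weight": 2.0, "Dummy01": 0, "Var02": 0.0, "Var03": 0},
    {"weight": 1.0, "Dummy01": 1, "Var02": 50.0, "Var03": 5},
]

results = {s["stat_code"]: estimate(s, units) for s in STATISTICS}
```

Adding a statistic then means adding a metadata row rather than changing the code, which is the point of the metadata-driven design.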

9 Metadata tables – derived variables

Table   Var_name  Condition       Value
Table1  Dummy01   If Wpage='yes'  1
Table1  Dummy01   If Wpage='no'   0
Table1  Var02     If Wpage='yes'  Turn
Table1  Var02     If Wpage='no'   0
Table1  Var03     If Wpage='yes'  Emp
Table1  Var03     If Wpage='no'   0

Columns:
– Var_name: name of the derived variable needed
– Condition: condition which determines the units to which a given rule is applied
– Value: value of the derived variable
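The derived-variable rules can be read as a small rule table applied to every unit. A minimal Python sketch follows, assuming each condition is a simple equality test on Wpage (as in the table above) and that a string in the value position names an input variable; the function name `derive` and the example enterprise are hypothetical.

```python
# Rules from the "derived variables" table:
# (Var_name, required Wpage value, constant value or name of an input variable)
DERIVED_RULES = [
    ("Dummy01", "yes", 1),
    ("Dummy01", "no", 0),
    ("Var02", "yes", "Turn"),
    ("Var02", "no", 0),
    ("Var03", "yes", "Emp"),
    ("Var03", "no", 0),
]


def derive(unit):
    """Return a copy of the unit with the derived variables added."""
    out = dict(unit)
    for var_name, wpage_value, value in DERIVED_RULES:
        if unit["Wpage"] == wpage_value:
            # A string refers to an input variable; anything else is a constant.
            out[var_name] = unit[value] if isinstance(value, str) else value
    return out


# Made-up enterprise record using the input variables of the example survey.
enterprise = {"Emp": 12, "Turn": 3400.0, "Wpage": "yes"}
derived = derive(enterprise)
```

Setting the derived variables to 0 for enterprises without a webpage keeps those units in the sample, which matters for the variance estimate of a domain statistic.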

10 Metadata tables – domains

Table   Domain_code  Dom_var1  Dom_var2  …  Dom_var10
Table1  Dom1         Nace2
Table1  Dom2         Nace2     SizeC

Dom_var1 … Dom_var10: list of the variables which define the dimensions of the domain.
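A domain definition of this kind amounts to a list of grouping variables. The Python sketch below splits units into domain cells for Dom1 and Dom2; the function name `split_by_domain` and the three units are illustrative assumptions.

```python
from collections import defaultdict

# Domain definitions mirroring the "domains" metadata table.
DOMAINS = {"Dom1": ["Nace2"], "Dom2": ["Nace2", "SizeC"]}


def split_by_domain(units, dom_vars):
    """Group units into cells keyed by the values of the domain variables."""
    cells = defaultdict(list)
    for u in units:
        cells[tuple(u[v] for v in dom_vars)].append(u)
    return dict(cells)


# Made-up units carrying only the classification variables.
units = [
    {"Nace2": "26", "SizeC": 1},
    {"Nace2": "26", "SizeC": 2},
    {"Nace2": "33", "SizeC": 1},
]

cells = {code: split_by_domain(units, dom_vars) for code, dom_vars in DOMAINS.items()}
```

Here Dom1 yields two cells (Nace2 = 26 and 33) and Dom2 yields three, one per Nace2 * SizeC combination present in the data.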

11 Metadata tables – sample design information

Information on sample design (strata, PSU):

Table   Strata  PSU
Table1  Nace3
Table2  SizeC

Information on sampling rate by strata cells:

Table   Nace3  SizeC  _rate_
Table1  26.2   1      1
Table1  26.2   2      0.3
Table1  26.2   3      0.01
…
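To show why the sampling rate per stratum is needed, here is a textbook estimator for a total under stratified simple random sampling, with a finite population correction of (1 - rate) in each stratum. This is a Python sketch of the standard formula, not the code of the SAS application; the function name and the data are made up.

```python
import math
from statistics import variance


def stratified_total_se(strata):
    """Estimate a total and its standard error.

    strata: list of (rate, values) pairs, where rate = n_h / N_h is the
    sampling rate of the stratum and values are the sampled observations.
    """
    total = 0.0
    var = 0.0
    for rate, y in strata:
        n = len(y)
        N = n / rate                      # population size implied by the rate
        total += N * sum(y) / n           # expanded stratum total
        if rate < 1:                      # a census stratum adds no variance
            if n < 2:
                raise ValueError("need at least two units in a sampled stratum")
            var += N ** 2 * (1 - rate) * variance(y) / n
    return total, math.sqrt(var)


# Made-up data: one census stratum (rate 1) and one sampled stratum (rate 0.5).
strata = [(1.0, [10.0, 20.0]), (0.5, [4.0, 6.0, 8.0])]
total, se = stratified_total_se(strata)
```

A stratum with rate 1 is fully enumerated, so only the sampled strata contribute to the standard error.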

12 Metadata tables – other information
– Type of criterion used to flag the statistics with lower precision
– Limits for flagging the statistics with lower precision
– Formats of the results in the final tables (decimals, percentages, …)
– Form and content of the output tables

13 Output – “raw results”
Each row of the table gives the information on one aggregate.

Dom1   Dom_val1  Dom2   Dom_val2  …  Stat_code  Value       No. of units  SE         CV     Stat_diss
Nace2  26.2                          Stat01     55.423      229           2.394      4.32   55.4
Nace2  26.3                          Stat02     757801.234  102           116625.56  15.39  116625_M
Nace2  33.3      SizeC  3            Stat03     852.273     25            300.256    35.23  N

Column groups:
– Dom1 … Dom_val2: specification of domains
– Stat_code: identification of the statistic
– Value, No. of units, SE, CV: information on the estimated statistic
– Stat_diss: value to be disseminated
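The CV column is the standard error expressed as a percentage of the estimate, and the dissemination value depends on where the CV falls relative to the limits stored in the metadata. The slides do not state those limits, so the 10% and 30% thresholds below are assumptions chosen only to be consistent with the three rows above; the output format of a flagged value is likewise an assumption. A Python sketch:

```python
# Hypothetical limits; the real ones are stored in the metadata and not given here.
CV_LIMIT_M = 10.0   # above this: publish with an "M" flag (less precise)
CV_LIMIT_N = 30.0   # above this: suppress and publish "N" only


def cv_percent(value, se):
    """Coefficient of variation in percent."""
    return 100.0 * se / abs(value)


def disseminate(value, se, decimals=1):
    """Return the text to publish for one estimate."""
    cv = cv_percent(value, se)
    if cv > CV_LIMIT_N:
        return "N"
    text = f"{value:.{decimals}f}"
    return text + " M" if cv > CV_LIMIT_M else text


# (Value, SE) pairs taken from the raw results table.
rows = [(55.423, 2.394), (757801.234, 116625.56), (852.273, 300.256)]
cvs = [round(cv_percent(value, se), 2) for value, se in rows]
```

With these assumed limits the first row is published as is, the second is published with an M flag, and the third is replaced by N.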

14 Output – formatted tables

Nace 2-digit groups  Proportion of enterprises  Total turnover in enterprises  Turnover per employee in
                     with a webpage             with a webpage                 enterprises with a webpage
                     32.4                       124675                         340
                     56.5                       85738 M                        N
                     45.5 M                     N                              578
…

15 Conclusions
The application is an important contribution to the modernization of statistical processes.
It can be managed by subject-matter personnel alone → significant rationalization of survey execution.
Planned improvements:
– Development of user interfaces for metadata management
– Transfer of the metadata database into the Oracle environment
– Extension of the application's functionality with the possibility to estimate the sampling error for indices

