The challenge of a mixed-mode design survey and new IT tools application: the case of the Italian Structure Earning Surveys Fabiana Rocci Stefania Cardinleschi.

Slides:



Advertisements
Similar presentations
Innovation data collection: Advice from the Oslo Manual South East Asian Regional Workshop on Science, Technology and Innovation Statistics.
Advertisements

Innovation Surveys: Advice from the Oslo Manual South Asian Regional Workshop on Science, Technology and Innovation Statistics Kathmandu,
Innovation Surveys: Advice from the Oslo Manual National training workshop Amman, Jordan October 2010.
Do Economic and Demographic Characteristics Differ between Web and Mail Respondents to the 2005 Census of Agriculture Content Test? By Nancy J. Dickey.
Labour market statistics in Poland. Labour supply, labour demand employment, job vacancy, unemployment Current statistics How we collect the data Household.
Riku Salonen Regression composite estimation for the Finnish LFS from a practical perspective.
Quality assurance -Population and Housing Census Alma Kondi, INSTAT, Albania.
Quality in Italian consumer price survey: optimal allocation of resources and indicators to monitor the data collection process Federico Polidoro, Rosabel.
2006 August Labour statistics The usage of administrative data sources for Lithuanian data of earnings Milda Šličkutė-Šeštokienė Statistics Lithuania.
1 Editing Administrative Data and Combined Data Sources Introduction.
Trade and business statistics: use of administrative data Lunch Seminar Enrico Giovannini Italian National Statistical Institute (ISTAT) New York, February,
United Nations Economic Commission for Europe Statistical Division Applying the GSBPM to Business Register Management Steven Vale UNECE
Role of editing and imputation in integration of sources for structural business statistics Svein Gåsemyr, Statistics Norway Svein Nordbotten, University.
Combining administrative and survey data: potential benefits and impact on editing and imputation for a structural business survey UNECE Work Session on.
FARMS MULTIFUNCTIONALITY AND HOUSEHOLDS INCOMES IN SUSTAINABLE RURAL DEVELOPMENT Session 4: Income and Employment of the Rural Household By Marco Ballin.
Using survey data collection as a tool for improving the survey process Silvia Biffignandi, Antonio Laureti Giulio Perani University of Bergamo Istat Istat.
Medical Statistics (full English class) Ji-Qian Fang School of Public Health Sun Yat-Sen University.
The converging pattern between Business statistics and Administrative data. Towards an “industrialized” statistical production process The Italian LCS2012.
Carmela Pascucci – Istat - Italy Meeting of the Working Party on International Trade in Goods and Trade in Services Statistics (WPTGS) Linking business.
Electronic reporting in Poland 27th Voorburg Group Meeting Warsaw, Poland October 1st to October 5th, 2012 Central Statistical Office of Poland.
Joint UNECE/Eurostat Meeting on Population and Housing Censuses (28-30 October 2009) Accuracy evaluation of Nuts level 2 hypercubes with the adoption of.
© Federal Statistical Office, Research Data Centre, Maurice Brandt Folie 1 Analytical validity and confidentiality protection of anonymised longitudinal.
Q2010, Helsinki Development and implementation of quality and performance indicators for frame creation and imputation Kornélia Mag László Kajdi Q2010,
European Conference on Quality in Official Statistics Different Quality Tests on the Automatic Coding Procedure for the Economic Activities Descriptions.
Record matching for census purposes in the Netherlands Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics Netherlands.
Impact of using fiscal data on the imputation strategy of the Unified Enterprise Survey of Statistics Canada Ryan Chepita, Yi Li, Jean-Sébastien Provençal,
The Italian new survey COEN – 2011: an innovative editing procedure, Giovanni Seri – Nuremberg, 11/09/2013 The Italian new survey on Enterprises Final.
Use of web scraping and text mining techniques in the Istat survey on “Information and Communication Technology in enterprises” Giulio Barcaroli(*), Alessandra.
Eurostat Overall design. Presented by Eva Elvers Statistics Sweden.
Q20101 National accounts revisions: Italian manufacturing productivity analysis Alessandro Faramondi Istat – National Statistical Institute.
Imputation and Editing of Income from the Administrative File in Israel ’ s Censuses prepared by Orly Furman and Dmitri Romanov UNITED NATIONS ECONOMIC.
1 Seminar on 2008 SNA Implementation April 2011, Addis Ababa, Ethiopia Gulab Singh UN Statistics Division/ DESA DIAGNOSTIC FRAMEWORK: National Accounts.
BES Equitable and Sustainable Well-being in Italy
European Conference on Quality in Official Statistics Session 26: Quality Issues in Census « Rome, 10 July 2008 « Quality Assurance and Control Programme.
Developing Business Data Collection Nordic Statistical Meeting in Copenhagen August 2010 Theme 3. Statistics Production Johanna Leivo
The new multiple-source system for Italian Structural Business Statistics based on administrative and survey data Orietta Luzi, Ugo Guarnera, Paolo Righi.
Using administrative registers in sample surveys European Conference on Quality in Official Statistics 3-–6 May 2010 Kaja Sõstra Statistics Estonia.
Use of Administrative Data Seminar on Developing a Programme on Integrated Statistics in support of the Implementation of the SNA for CARICOM countries.
MGT-491 QUANTITATIVE ANALYSIS AND RESEARCH FOR MANAGEMENT OSMAN BIN SAIF Session 5.
Editing of linked micro files for statistics and research.
United Nations Statistics Division Work Programme on Economic Census Vladimir Markhonko, Chief Trade Statistics Branch, UNSD Youlia Antonova, Senior Statistician,
Outlining a Process Model for Editing With Quality Indicators Pauli Ollila (part 1) Outi Ahti-Miettinen (part 2) Statistics Finland.
Improving of Household Sample Surveys Data Quality on Base of Statistical Matching Approaches Ganna Tereshchenko Institute for Demography and Social Research,
1 For a Population Statistical Register Characteristics and Potentials for the Official Statistics Central department for administrative data and archives.
26 April 2010 The unadjusted gender pay gap in the EU Didier Dupré, Eurostat unit F2 UNECE Work Session on Gender Statistics.
Experience and response in developing countries: the twinning project with the Tunisian National Statistical Institute Monica Consalvi ISTAT, Division.
Evolution of Census Statistics on Enterprises in Italy : from the Traditional Census to a Register of Local Units Monica Consalvi, Luigi Costanzo,
Towards an improvement of current migration estimates for Italy Domenico Gabrielli, Maria Pia Sorvillo Istat - Italy Joint UNECE-Eurostat Work session.
QUALITY ASSESSMENT OF THE REGISTER-BASED SLOVENIAN CENSUS 2011 Rudi Seljak, Apolonija Flander Oblak Statistical Office of the Republic of Slovenia.
1 Regional Seminar on 2008 SNA Implementation June 2010, Antigua, Antigua and Barbuda Gulab Singh UN Statistics Division/ DESA DIAGNOSTIC FRAMEWORK:
Joint Eurostat Unece Worksession on Statistical Data Confidentiality 2011, Tarragona Initial analyses on comparable dissemination from the Essnet project.
An Overview of Editing and Imputation Methods for the next Italian Censuses Gianpiero Bianchi, Antonia Manzari, Alessandra Reale UNECE-Eurostat Meeting.
Interstate Statistical Committee of the Commonwealth of Independent States (CIS-STAT) CES seminar “Challenges for future population and housing censuses.
Meeting of Task Force on Small and Medium Sized Enterprise Data (SMED ) 13 th April 2015, 10:00-17:00 Inclusion of all economic sectors in SBS Giampiero.
R&D statistics in Denmark organization of data collection, and dissemination of R&D statistics.
Administrative Data and Official Statistics Administrative Data and Official Statistics Principles and good practices Quality in Statistics: Administrative.
Compilation and Dissemination of Distributive Trade Statistics
DIAGNOSTIC FRAMEWORK: National Accounts and Supporting Statistics
Implementation of a more efficient way of collecting data SBS: use of administrative data Statistics Belgium June 2009.
Dublin, april 2012 Role of Business Register in coordinated sampling
Survey phases, survey errors and quality control system
ESTP COURSE ON PRODCOM STATISTICS
Working Group on Labour Statistics for MEDSTAT countries October 2013
Survey phases, survey errors and quality control system
Organization of efficient Economic Surveys
Labour Price Index Labour Market Statistics (LAMAS) Working Group
SES 2014 IN SLOVENIA Miran Žavbi, SURS.
Regional Seminar on Developing a Program for the Implementation of the 2008 SNA and Supporting Statistics Gülçin ERDOĞAN September 2013 Ankara.
Sampling and estimation
Parallel Session: BR maintenance Quality in maintenance of a BR:
Presentation transcript:

The challenge of a mixed-mode design survey and new IT tools application: the case of the Italian Structure Earning Surveys Fabiana Rocci Stefania Cardinleschi Stefanio De Santis ISTAT June 2014 European conference on quality in statistics Q2014

Presentation  Description of the survey SES  Introduction to the re-engeneering of the flow  Focus on the direct survey flow: Edititng and Imputation process (E&I)  Selective editing  imputation by administrative data  Results of the direct survey: quality indicators and analysis of by mode

Description of the survey SES  Final aim: to gather micro-data about individual and economic characteristics of a representative sample of employees to obtain detailed and comparable information on relationships between the level of remuneration and individual characteristics of employees and their employer  Observation unit: the enterprise, through which the data are obtained on a representative sample of their employees  Coverage: economic activities from B to S of NACE Rev. 2 on business with 10 or more employees (the section O and units with less than 10 employees are optional)  Sample design: two stage sampling design - a sample of employees nested in a sample of enterprises. The size of the second stage sample varies according to the enterprise size, from 10 – 19 employees a module for every employee, till 200 modules required to the biggest size-class (more than employees)  Collected variables: sex, age, occupation (ISCO08), education (ISCED 97), full time/part time, mean monthly gross earnings in the reference month, mean annual gross earnings, number of hours paid includes all normal and overtime hours worked and remunerated by the employer during the reference month

Description of the survey SES Starting frame:  Private sector: the official register of active enterprises defined by Istat (Asia). From this frame a random stratified sample is drawn on business with 10 till 249 employees, it has been stratified according to the size class, Nace Rev. 2 groups and macro geographical area (NUTS1). On the other side a census is done on units with more than 250 employees  Public sector (school excluded): the starting frame is the official Istat list of the Institutional units belonging to the PA sector, it is supplied with several information about every units but the number of employees. On this list a census is drawn  Public School: the universe representing the starting frame has been elaborated on the basis of an integration of administrative data sources. Hence, the sample design of employers and employees belonging to the public Education sector has been realized through the integration of statistical archives provided by LFS and EU-SILC as well as cross linking techniques with administrative and fiscal data.

Re-engeneering of the flow The process has been re-engeneerized in order to improve the results in terms of:  coverage  response rate  quality of the microdata and fo the final estimates

Re-engeneering of the flow Public sector \ schoolPrivate sector Public School Direct Survey Integration of Administrative, Fiscal and Statistical archives Final sata set edited and imputed data Statistical matching Imputation of partial missing data Statistical matching Imputation of partial missing data

back office registration paper ad hoc file web Online interactive editing : Is there any problem ? s tatus  check ok  check not ok  paper s tatus  check ok  check not ok  paper Selective editing Tested for being potential influential error interactive editing Edited data Automatic editing no yes RESPONDENTS no si NON RESPONDENTS selection of non response in critical strata Estimation of basic variable by admisnitrative sample

Direct survey: coverage Questionnaire by sector and mode 42% 51% in most cases the full answer were gain about the sample of employees resulting in a very huge amount of microdata : about , compared to the of the previous session of the survey

final coverage by imputaion Imputed enterprises by administrative data by size class and geographical area

E&I process: first step – selective editing selective editing: to detect influential errors i.e. to search for those units particularly abnormal and dangerous, according to the values ​​ of some variables of interest, which may affect the final estimates selected enterprise by size

Direct survey and E&I: quality indicators of the raw data and analysis by mode  To evaluate the efficiency of the interactive editing developed to be used on the web questionnaire  To asses whether the mode used implies different quality of data

enterprises by size and mode enterprises by geographical area and mode enterprises by economic activity and mode Frequency table of mode by…

Chi squared on … Test of Chi squared has been run, in order to test whether there is association between the chosen mode and given characteristics. For each of the previous table the test has been calculated and the results are in the following table: Because the null hypothesis is lack of association, the diagnostic give us results to reject it in any model has been tested.

Quality of data according the mode Synthetic table of percentage of fails by mode 17 edits out of the number 87, the most common failed edits

Classification regression tree a classification method has been used to identify which are the main components to determine the difference in recording fails in edits basic idea: to model as dependent variable the number of failed explained by the independent variables size geographical area, economic activity and mode y ~ mode + geographical area + size + economic activity a classification regression tree method has been used: it divide the units of the starting dataset into groups in a recursive way to reach a final tree with nodes and attributes the tree: to describe which are the variables that mainly determine those differences

Classification regression tree

conclusion and comments at first glance… 1.The use of the web and the interaction it allows give a good instrument to the statistical manager to handle the needs and problem of the respondents, resulting in a better quality and quantity of data 2.The use of the web does not resolve the problem of the total non response! 3.The use of administrative data as auxiliary information is becoming essential to assure a good coverage for the final estimates 4.An efficient editing design that would take into account different tool and method could guarantee a good quality data into budget constraints

…and development 1. to optimize the use of IT tools for the direct survey with the web questionnaire 2. to extend the use of administrative data in order to fully integrate them with survey data only were necessary, it means for the not available data from the administrative sources 3. to optimize the use of selective editing also for other, using as auxiliary variable also the ones from the administrative data 4. in the case there the will to give possibility to choose between different modes, the information about how it interact with the enterprise characteristics can be used in a very innovative way to better study a proper sample

thank you for your attention