Redesigning French structural business statistics, using more administrative data. ICES-III, Montréal, June 2007
Outline: general principles of the future system; methodological studies raised by the use of multiple sources
General principles of the future system. In 2005 Insee started redesigning its structural business statistics, with two main objectives: (1) more systematic use of administrative data, especially since they are now available earlier than before (annual income statements, annual statements of payroll data, customs data); (2) greater use of the concept of « enterprise group » in business statistics.
The components of the future system: administrative data (annual income statements, annual statements of payroll data, customs data) and a statistical mail survey, carried out on a sample of enterprises.
General principles: calendar. [Timeline figure running from 01/01 to 31/12, showing: Survey; Tax data (1); Tax data (2); Other administrative data; First results; Definitive results]
Many methodological studies are necessary for the implementation of the new system. Four kinds of studies: the potential of administrative data; the statistical estimates; data editing of the different flows of data; renewing the questionnaire of the statistical survey.
The potential of administrative data. For the « core » administrative data, the question is essentially how well enterprises can be identified using the id-number of the business register. Infra-annual data (on turnover and number of employees) might be used early as proxy variables, but what about their quality? Other administrative sources are also under study, concerning restructuring of enterprises and the financial links existing between enterprises.
The final database. [Figure: for the sample only, the variables of the survey; for all enterprises (exhaustive), the administrative variables: accounting data, « social » data, customs data, activity code of the register]
The statistical estimates (1). The estimates have to combine survey data and administrative data: how can they be used in an optimal way? Some estimates use only the variables of the statistical survey. For estimates mixing the two kinds of information, it is more complex, especially for sector-based estimates.
The statistical estimates (2). The breakdown of turnover (collected through the statistical survey) is used to assign the « principal activity code » to the enterprise. This code is used for the sector-based estimates, such as the turnover of a given economic sector.
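
To illustrate how a breakdown of turnover can yield an activity code, here is a minimal sketch. It assumes the simplest possible rule, namely that the principal activity is the one with the largest share of turnover; the rule actually applied may be more elaborate (for example, working down the levels of the classification). The function name and the input layout are hypothetical.

```python
def principal_activity_code(turnover_by_activity):
    """Return the activity code carrying the largest turnover.

    turnover_by_activity: dict mapping activity code -> turnover amount.
    ASSUMPTION: a plain "largest share wins" rule; ties are broken by
    taking the smallest code, which is an arbitrary choice.
    """
    if not turnover_by_activity:
        raise ValueError("no turnover breakdown reported")
    return min(
        turnover_by_activity,
        key=lambda code: (-turnover_by_activity[code], code),
    )


# Hypothetical enterprise reporting turnover in three activities.
breakdown = {"47.11": 120.0, "10.71": 300.0, "56.10": 80.0}
code = principal_activity_code(breakdown)
```

In this example the enterprise would be classified under "10.71", the activity with the largest turnover.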
The statistical estimates (3). A « possible » estimator for the turnover of sector X combines three elements: the sampling weight; an indicator with value 1 if the principal activity code in the register is equal to X, 0 otherwise; and the same indicator based on the activity code calculated in the survey.
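
One estimator that fits this description is a difference-type estimator: sum the administrative turnover over the whole population using the register code, then correct it on the sample with the weighted difference between the survey-based and register-based indicators. The sketch below assumes this form, which is not necessarily the formula retained by Insee; the field names are hypothetical.

```python
def sector_turnover_estimate(units, sector):
    """Difference-type estimate of the turnover of `sector` (ASSUMED form).

    units: list of dicts with keys
      'turnover'     administrative turnover, known for every unit
      'reg_code'     principal activity code in the register
      'weight'       sampling weight, or None if the unit is not sampled
      'survey_code'  activity code calculated in the survey, or None
    """
    # Exhaustive part: register classification over the whole population.
    total = sum(u["turnover"] for u in units if u["reg_code"] == sector)
    # Sample part: weighted correction, survey indicator minus register indicator.
    for u in units:
        if u["weight"] is None:
            continue
        in_survey = 1 if u["survey_code"] == sector else 0
        in_register = 1 if u["reg_code"] == sector else 0
        total += u["weight"] * u["turnover"] * (in_survey - in_register)
    return total


# Hypothetical population of four enterprises, two of them sampled.
population = [
    {"turnover": 100, "reg_code": "X", "weight": None, "survey_code": None},
    {"turnover": 50, "reg_code": "X", "weight": 2, "survey_code": "Y"},
    {"turnover": 40, "reg_code": "Y", "weight": 2, "survey_code": "X"},
    {"turnover": 30, "reg_code": "Y", "weight": None, "survey_code": None},
]
estimate_x = sector_turnover_estimate(population, "X")
```

When the survey code and the register code agree for every sampled unit, the correction vanishes and the estimate reduces to the exhaustive register-based total.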
The statistical estimates (4). The calibration of the estimators will modify the values of the weights, compared to their initial values. This may have consequences for the sampling plan.
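
As an illustration of how calibration changes the weights, the sketch below uses the simplest case, ratio calibration on a single auxiliary variable whose population total is known (for example, turnover from the administrative data). This choice of method and the names used are assumptions made for the example; the calibration actually planned may involve several variables and a different distance function.

```python
def ratio_calibrate(weights, x_sample, x_total):
    """Rescale sampling weights so that the weighted sample total of the
    auxiliary variable x equals its known population total.

    ASSUMPTION: one auxiliary variable and a single common adjustment
    factor (ratio calibration), the simplest form of calibration.
    """
    estimated = sum(w * x for w, x in zip(weights, x_sample))
    if estimated == 0:
        raise ValueError("weighted sample total of x is zero")
    factor = x_total / estimated
    return [w * factor for w in weights]


# Hypothetical sample of three enterprises with initial weight 2 each.
initial = [2.0, 2.0, 2.0]
admin_turnover = [10.0, 20.0, 30.0]
calibrated = ratio_calibrate(initial, admin_turnover, x_total=150.0)
```

Here the initial weights reproduce a total of 120 against a known total of 150, so every weight is multiplied by 1.25.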
The data editing of different flows of data. The different flows of data arrive at different dates. First comes the data editing of the survey data: is it possible to use the proxies given by infra-annual data? Micro edits and selective editing are considered. Then comes the data editing of administrative data. For selective editing, the fact that the weights may be modified will have consequences.
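
To make the role of the weights in selective editing concrete, here is a minimal sketch of a commonly used type of score: the weighted absolute gap between the reported value and an anticipated value (here, a proxy such as an infra-annual figure), relative to the estimated total. The score definition, the threshold and the field names are assumptions for illustration, not the rules adopted for the new system.

```python
def selective_editing_scores(records, estimated_total):
    """Score each unit by its potential impact on the estimated total.

    records: list of dicts with keys 'id', 'weight', 'reported', 'proxy'.
    ASSUMPTION: score = weight * |reported - proxy| / estimated_total.
    """
    return {
        r["id"]: r["weight"] * abs(r["reported"] - r["proxy"]) / estimated_total
        for r in records
    }


def units_to_review(records, estimated_total, threshold):
    """Return ids whose score exceeds the threshold, highest score first."""
    scores = selective_editing_scores(records, estimated_total)
    flagged = [uid for uid, s in scores.items() if s > threshold]
    return sorted(flagged, key=lambda uid: -scores[uid])


# Hypothetical survey returns with infra-annual proxies.
returns = [
    {"id": "a", "weight": 1, "reported": 1000, "proxy": 990},
    {"id": "b", "weight": 10, "reported": 500, "proxy": 100},
    {"id": "c", "weight": 5, "reported": 200, "proxy": 180},
]
to_check = units_to_review(returns, estimated_total=10000, threshold=0.005)
```

Because the weight enters the score directly, a change in weights through calibration can change which units are flagged for manual review.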
Renewing the questionnaire of the statistical survey. What we learned from the present survey. Tests of different ways of collecting the information, especially concerning the breakdown of turnover. Also, since the questionnaire will be sent to enterprises early, tests of how to collect some information (for example about some costs) even if the accounts are not yet closed.