Towards a Better Integration of Survey and Tax Data in the Unified Enterprise Survey Claude Turmelle Statistics Canada ICES-III Montréal, Québec, Canada.

1 Towards a Better Integration of Survey and Tax Data in the Unified Enterprise Survey
Claude Turmelle, Statistics Canada
ICES-III, Montréal, Québec, Canada, June 18-21, 2007

2 Outline
- Overview of the UES
- Characteristics of the target population
- Current use of tax data
  - At sampling
  - At imputation
  - At estimation
- Issues and challenges
- Towards a better use of tax data
- Conclusion

3 Overview of the UES
- The Unified Enterprise Survey (UES) started in 1997
- Objectives
  - Integrate all annual business surveys into one unified survey framework
  - Produce quality financial and commodity estimates at
    - National and sub-national levels
    - Industrial levels

4 Overview of the UES
- Target population
  - All Canadian businesses within the covered industries
  - The UES is an establishment-based survey
- Coverage over time
  - 1997: Seven industries
  - 1998: Sixteen more (including Wholesale)
  - 1999: Four more (including Retail)
  - 2000: Four more (including Manufacturing)
  - ...
  - 2007: Now covers over 60 major industries

5 Characteristics of the Target Population
- Divided into two main types of businesses: unincorporated (T1) and incorporated (T2)
- General Index of Financial Information (GIFI) data are available electronically for the entire T2 population
- T1 data are available electronically for only about half of the T1s (the e-filers)

6 Characteristics of the Target Population
- An enterprise is
  - Complex: multi-provincial and/or multi-industry and/or multi-legal
  - Simple: the opposite
- An enterprise is also
  - Single: only one establishment
  - Multi: more than one establishment
- Simple-Single enterprises represent about 95% of the population, but only about 40% of the economy

7 Current Use of Tax Data
- Why would someone use tax data?
  - Improve the efficiency of the sample design
  - Reduce the response burden
  - Reduce the collection cost
  - Improve the quality of the estimates

8 Current Use of Tax Data
- At sampling
  - Some key variables taken from different tax files are put on the sampling frame
    - Total Revenue and Total Expenses from GIFI
    - Total Sales from the Goods & Services Tax (GST)
    - Salaries & Wages and number of employees from Payroll Deductions (PD7)
  - Used to define a size measure (Total Revenue) for each establishment on the frame
  - Used to stratify the population by size and to define the Take-None (T-N) portion
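The sampling step described on this slide can be sketched as follows. This is only an illustration: the slide says that Total Revenue is the size measure and that it drives both the size stratification and the Take-None portion, but the threshold, the stratum boundary, the `Establishment` class and the `assign_stratum` helper are all hypothetical, as is the choice to treat units strictly below the threshold as Take-None.

```python
from dataclasses import dataclass


@dataclass
class Establishment:
    ident: str
    total_revenue: float  # size measure taken from tax data on the frame


def assign_stratum(unit, exclusion_threshold, size_boundaries):
    """Return 'take-none' or a size stratum index (0 = smallest sampled stratum).

    Assumption: units strictly below the exclusion threshold are Take-None.
    """
    if unit.total_revenue < exclusion_threshold:
        return "take-none"
    stratum = 0
    for boundary in sorted(size_boundaries):
        if unit.total_revenue >= boundary:
            stratum += 1
    return stratum


# Hypothetical frame, threshold and boundary, for illustration only.
frame = [
    Establishment("A", 20_000.0),
    Establishment("B", 250_000.0),
    Establishment("C", 5_000_000.0),
]
strata = {u.ident: assign_stratum(u, 30_000.0, [1_000_000.0]) for u in frame}
```

Here unit A falls below the hypothetical threshold and lands in the Take-None portion, while B and C are placed in the smaller and larger sampled strata.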

9 Current Use of Tax Data
- At imputation
  - Used to replace survey data (financial variables) for a predetermined sub-sample of selected Simple-Single units
  - Also used to replace survey data for some non-respondents
  - Used as auxiliary data during imputation

10 Current Use of Tax Data
- At estimation
  - GIFI data are used to produce estimates for all T2 units falling in the T-N portion
  - T1 e-filer data are used to produce estimates for all T1 units falling in the T-N portion

11 UES Survey Design at a Glance
[Diagram of the T2 and T1 populations, each split by an exclusion threshold; labels recovered below]
- T2 Take-None: census of GIFI
- T1 Take-None: sample of e-filers
- T2 and T1 above the exclusion threshold: main sample to be surveyed
  - Not eligible for tax: full questionnaire
  - Tax replaced: characteristic questionnaire (services surveys) or full questionnaire (other surveys)
- For variables available from tax: Total estimate = Survey estimate (T1, T2) + T2 Take-None + T1 Take-None e-filer estimate
- For variables not available from tax (characteristics): Total estimate = Survey estimate (T1, T2)
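The two total-estimate formulas on this slide can be written as one small function. This is a sketch only; the function name, its parameters and the numbers in the example are hypothetical.

```python
def total_estimate(survey_estimate, t2_take_none, t1_take_none_efiler,
                   available_from_tax):
    """Combine the components of the total estimate as stated on the slide.

    For variables available from tax, the two Take-None components are added
    to the survey estimate. For characteristics (not available from tax),
    only the survey estimate is used, so the Take-None portion is not covered.
    """
    if available_from_tax:
        return survey_estimate + t2_take_none + t1_take_none_efiler
    return survey_estimate


# Hypothetical figures, for illustration only.
financial_total = total_estimate(100.0, 8.0, 2.0, available_from_tax=True)
characteristic_total = total_estimate(100.0, 8.0, 2.0, available_from_tax=False)
```

The second call makes the coverage gap explicit: the Take-None components are simply dropped for characteristics.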

12 Issues and Challenges
- At sampling
  - Sometimes we get inconsistent tax data
    - Example: GIFI Total Revenue = $2M but GST Total Sales = $25M
  - What do we do?
    - We use a conservative approach, i.e. we take the maximum
    - We manually verify and adjust the extreme cases (making use of survey data if available)
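A minimal sketch of the conservative rule described above, using the slide's own example values. Taking the maximum comes from the slide; the `size_measure` helper and the `review_ratio` cut-off used to flag extreme cases for manual verification are assumptions made for illustration.

```python
def size_measure(gifi_total_revenue, gst_total_sales, review_ratio=5.0):
    """Return (size measure, needs_manual_review).

    The size measure is the maximum of the available sources (the
    conservative approach). The review flag uses a hypothetical rule:
    flag the unit when the two sources differ by a factor of
    `review_ratio` or more.
    """
    values = [v for v in (gifi_total_revenue, gst_total_sales)
              if v is not None and v > 0]
    if not values:
        return None, False
    measure = max(values)
    needs_review = len(values) == 2 and measure / min(values) >= review_ratio
    return measure, needs_review


# The slide's example: GIFI Total Revenue = $2M, GST Total Sales = $25M.
measure, needs_review = size_measure(2_000_000.0, 25_000_000.0)
```

With the slide's figures the size measure is $25M, and the unit is flagged because the sources differ by a factor of 12.5.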

13 Issues and Challenges
- At sampling (cont'd)
  - Sometimes all we get is the number of employees or Salaries & Wages (Revenue is missing or $0)
  - What do we do?
    - We model Total Revenue using what's available
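The slide does not say which model is used to derive Total Revenue. As one possible illustration, the sketch below assumes a simple ratio model fitted on units for which both Salaries & Wages and Total Revenue are known; the function names and the figures are hypothetical.

```python
def fit_ratio(pairs):
    """Estimate revenue per dollar of salaries from (salaries, revenue) pairs.

    Assumption: a ratio model, revenue = ratio * salaries, fitted on units
    where both values are available and positive.
    """
    total_salaries = sum(salaries for salaries, _ in pairs)
    total_revenue = sum(revenue for _, revenue in pairs)
    return total_revenue / total_salaries


def predict_revenue(salaries, ratio):
    """Predict Total Revenue for a unit that only reports salaries."""
    return ratio * salaries


# Hypothetical units with both variables present.
donors = [(100_000.0, 400_000.0), (50_000.0, 250_000.0), (200_000.0, 750_000.0)]
ratio = fit_ratio(donors)
predicted = predict_revenue(80_000.0, ratio)
```

In practice such a ratio would presumably be fitted within homogeneous groups (for example by industry), but that detail is not given in the presentation.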

14 Issues and Challenges
- At imputation
  - Sometimes we can't find the link to tax data (e.g. not-for-profit organizations)
  - Sometimes we link to two or more tax files
  - We currently use direct tax replacement (i.e. Y_survey = X_tax). Should we instead use a modelling approach (i.e. Y_survey = f(X_tax))?
  - Studies have shown that in some cases it might be more appropriate to use f(X_tax)
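To make the contrast between direct replacement and a modelling approach concrete, the sketch below fits a straight line by ordinary least squares and compares the two on made-up data. The presentation does not specify the form of f; the linear form, the `fit_line` helper and the data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical units reporting the same concept to tax and to the survey.
tax = [1.0, 2.0, 3.0, 4.0]
survey = [2.5, 4.5, 6.5, 8.5]

intercept, slope = fit_line(tax, survey)

# Total absolute error of direct replacement (Y = X) versus the fitted f(X).
direct_error = sum(abs(y - x) for x, y in zip(tax, survey))
model_error = sum(abs(y - (intercept + slope * x)) for x, y in zip(tax, survey))
```

On this artificial data the survey values follow the tax values systematically but are not equal to them, so direct replacement is consistently off while the fitted line reproduces the survey values.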

16 Issues and Challenges
- At estimation
  - Currently, we use the one-phase Horvitz-Thompson estimator
  - It's a very simple and fairly efficient estimator
  - Unfortunately, it could be severely biased if the model y = x doesn't hold
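The Horvitz-Thompson estimator of a total weights each sampled value by the inverse of its inclusion probability. The sketch below shows that computation and how the estimate shifts when tax values are substituted for survey values that they do not match. The data, and the idea of comparing the two runs side by side, are illustrative assumptions rather than part of the presentation.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Horvitz-Thompson estimate of a population total: sum of y_i / pi_i."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical sample of three units and their inclusion probabilities.
inclusion_probs = [0.5, 0.5, 1.0]
survey_values = [10.0, 20.0, 30.0]   # what the survey would have collected
tax_values = [12.0, 20.0, 33.0]      # tax values used as direct replacements

estimate_from_survey = horvitz_thompson_total(survey_values, inclusion_probs)
estimate_from_tax = horvitz_thompson_total(tax_values, inclusion_probs)
```

Because the estimator treats the replaced values as if they were the true survey values, any systematic difference between y and x carries straight through to the estimated total.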

17 Issues and Challenges
- At estimation (cont'd)
  - Estimates for variables not available from the tax files (characteristics/commodities) do not cover the T-N portion
  - For some characteristics, the T-N portion can account for a lot more than 10%

18 Issues and Challenges
- Data quality
  - Response rates (what is a respondent?)
    - Units that respond to tax but not to the characteristic questionnaire
    - Reported tax data vs imputed tax data
    - Planned tax replacement vs tax replacement for non-response
  - Variance & CV
    - A lot of imputation occurs in the current strategy (including tax replacement)
    - Shouldn't we include the variance due to imputation?

19 Towards a Better Use of Tax Data
- Understand the particularities of the different tax data sources (e.g. GST vs T2 is currently under investigation)
- Explore different administrative files to help with particular sub-populations (e.g. not-for-profit organizations)

20 Towards a Better Use of Tax Data
- Keep investigating why Y_survey ≠ X_tax even when they should conceptually be equal
- Explore the idea of using Y_survey = f(X_tax)
- Fine-tune our definition of who is eligible for tax replacement and who is not
- Currently studying the possibility of using a more robust estimator to protect against the potential bias
- Developing a strategy to cover the entire population for all variables of interest

21 Towards a Better Use of Tax Data
- Start taking into account the variability introduced by imputation when computing variances and CVs
- A framework is under development to define response rates when both tax data and survey data are used for the same units
- Explore the possibility of making use of all the GIFI data, not only for the T-N and the sample

22 Towards a Better Use of Tax Data
[Diagram of the T2 and T1 populations, each split by an exclusion threshold and now also into Eligible and Ineligible units; labels recovered below]
- T2 Take-None: census of GIFI
- T1 Take-None: sample of e-filers
- Main sample to be surveyed
  - Not eligible for tax: full questionnaire
  - Tax replaced: characteristic questionnaire (services surveys) or full questionnaire (other surveys)
- For variables available from tax: Total estimate = Survey estimate (T1, T2) + T2 Take-None + T1 Take-None e-filer estimate
- For variables not available from tax (characteristics): Total estimate = Survey estimate (T1, T2)

23 Conclusion
- Since the introduction of the UES, the use of tax data has increased consistently
- It has significantly reduced the response burden and the cost of the survey
- Unfortunately, this has sometimes come at the expense of reduced data interpretability
- Fortunately, it was recently decided that we would take a few steps back to evaluate how we currently do things and to determine how we could improve our strategy

24 For more information please contact Claude Turmelle (613)

