ESTIMATION IN THE PRESENCE OF TAX DATA IN BUSINESS SURVEYS. David Haziza, Gordon Kuromi and Joana Bérubé, Université de Montréal & Statistics Canada, ICES-III.

Presentation transcript:

1 ESTIMATION IN THE PRESENCE OF TAX DATA IN BUSINESS SURVEYS
David Haziza, Gordon Kuromi and Joana Bérubé
Université de Montréal & Statistics Canada
ICES-III, June 20, 2007

2 OUTLINE
- Introduction
- Current sampling design
- Current point estimators
- Alternative sampling design
- Alternative estimators
- Domain estimation
- When the tax variable is missing

3 TAX DATA PROGRAM
Goal: to increase the use of tax data in business surveys in order to
- reduce respondent burden
- reduce costs
- potentially improve the quality of point estimators

4 TYPES OF VARIABLES
We distinguish between 3 types of variables:
- Financial survey variables (total revenue, total expenditure, etc.)
- Financial tax variables (total revenue, total expenditure, etc.)
- Non-financial variables
There is a direct link between the financial survey variables and the financial tax variables.
There is no direct link between non-financial variables and tax variables.

5 TAX DATA
3 types of tax data:
- T1 data: unincorporated businesses (Unified Enterprise Survey)
- T2 data: incorporated businesses (Unified Enterprise Survey)
- GST data: both incorporated and unincorporated businesses (monthly surveys)

6 CURRENT SAMPLING DESIGN
Stratification by Province, NAICS and Size
3 types of strata:
- Take-all strata (typically complex units)
- Take-some strata (simple and complex units)
- Take-none strata (simple units)
Use of tax data is limited to take-some strata (for simple units only) and take-none strata

7 CURRENT SAMPLING DESIGN
[Figure: STRATUM = PROVINCE x NAICS; panels: Eligible units, Noneligible units]

8 CURRENT SAMPLING DESIGN
Advantages:
- The current design fits the imputation and estimation systems
Disadvantages:
- It is a two-phase sampling design
- The sample sizes for collection in both the eligible and noneligible strata are random variables, which may increase the variance of the estimators and add uncertainty to the collection costs
- The use of tax data is limited to the first-phase sample

9 FINANCIAL VARIABLES
- Survey variables: y, available only for the units in the sample s
- Tax variables: x, available for all the units in the population
- For many financial variables, there is a corresponding tax variable
- We assume that both types of variables are known without error (no measurement error, no nonresponse)
- These two assumptions are not satisfied in practice!

10 CURRENT TAX REPLACEMENT METHODS
Model describing the relationship between x and y:
Special cases:
- Direct tax replacement:
- Ratio type replacement:

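11 PREDICTED VALUES
Predicted value for the survey variable y of a unit:
- Direct tax replacement: the tax value x is used as is (used in UES)
- Ratio type replacement: the tax value x multiplied by an estimated ratio (used in monthly surveys)
The estimate of the ratio is obtained from the units in s.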
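
The two replacement rules can be sketched in a few lines of Python. This is an illustration only, not the UES or monthly-survey production code: the function names, the optional design weights and the toy figures are assumptions that do not appear in the slides, which state only that the ratio is estimated from the units in s.

```python
def direct_replacement(x_values):
    """Direct tax replacement: the predicted survey value is the tax value itself."""
    return list(x_values)


def estimate_ratio(y_sample, x_sample, weights=None):
    """Estimate the ratio from sampled units as (weighted sum of y) / (weighted sum of x).

    The weighting scheme is an assumption; with weights=None every unit counts once.
    """
    if weights is None:
        weights = [1.0] * len(y_sample)
    numerator = sum(w * y for w, y in zip(weights, y_sample))
    denominator = sum(w * x for w, x in zip(weights, x_sample))
    if denominator == 0:
        raise ValueError("weighted sum of x is zero; the ratio is undefined")
    return numerator / denominator


def ratio_replacement(x_values, ratio):
    """Ratio type replacement: the predicted survey value is ratio * tax value."""
    return [ratio * x for x in x_values]


# Hypothetical toy data: two sampled units with both y and x, one unit with x only.
ratio = estimate_ratio(y_sample=[110.0, 220.0], x_sample=[100.0, 200.0])
predicted_direct = direct_replacement([50.0])
predicted_ratio = ratio_replacement([50.0], ratio)
```

Direct replacement is the special case where the ratio is fixed at 1 rather than estimated.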

12 NOT ONLY DIRECT TYPE REPLACEMENT?
- Considerable efforts have been made to standardize the concepts and definitions between the tax variables and the survey variables (Chart of Accounts compliance for T1 and T2)
- As a result, we expect that the model should be valid
- Sometimes it is not, because of
  - differences in the reporting of data and other issues (Jocelyn, Mach and Pelletier, 2006)
  - differences in the reference period (GST data)

13 CURRENT POINT ESTIMATORS: PREDICTION TYPE
- In the noneligible portion: Horvitz-Thompson estimator
- In the eligible portion: prediction (or imputed) type estimator
  - y is observed for all i in s
  - the predicted value is used for the units not in s
We have
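
As a rough sketch of how the two pieces could be combined, the Python below adds a Horvitz-Thompson total for the noneligible portion to a prediction-type total for the eligible portion. The function names, the optional weights and the numbers are hypothetical; the slides do not give the exact formula, so this should be read as one plausible form rather than the estimator actually in production.

```python
def horvitz_thompson_total(y_sample, inclusion_probs):
    """Horvitz-Thompson total: each sampled value is weighted by 1 / inclusion probability."""
    return sum(y / p for y, p in zip(y_sample, inclusion_probs))


def prediction_type_total(y_observed, y_predicted, w_observed=None, w_predicted=None):
    """Prediction (imputed) type total: observed y for sampled units, predicted y otherwise.

    Weights default to 1, which corresponds to summing over every eligible unit.
    Passing first-phase weights instead is an assumption about the two-phase design.
    """
    if w_observed is None:
        w_observed = [1.0] * len(y_observed)
    if w_predicted is None:
        w_predicted = [1.0] * len(y_predicted)
    observed_part = sum(w * y for w, y in zip(w_observed, y_observed))
    predicted_part = sum(w * y for w, y in zip(w_predicted, y_predicted))
    return observed_part + predicted_part


# Hypothetical toy data.
noneligible_total = horvitz_thompson_total([10.0, 20.0], [0.5, 0.25])
eligible_total = prediction_type_total([110.0, 220.0], [55.0, 66.0])
overall_total = noneligible_total + eligible_total
```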

14 CURRENT POINT ESTIMATORS: PREDICTION TYPE
Advantages:
- Similar to imputed estimators in the context of imputation
- They are simple and fit the current imputation and estimation systems
- They fit the so-called micro approach for displaying the data
Disadvantages:
- They are generally p-biased
- They may be pm-biased if the tax replacement model is incorrectly specified

15 DISPLAYING THE DATA: MICRO VS. MACRO APPROACH
We distinguish between two approaches for displaying the data:
(i) Micro approach: consists of reporting the observed y-values as well as the predicted values (similar to an imputed file in the context of item nonresponse)
(ii) Macro approach: consists of reporting the observed y-values along with a calibration weight
Currently, the micro approach is used

16 PREDICTION TYPE ESTIMATORS
Micro approach [Table: columns Unit, Domain]
Domain estimators potentially p-biased and pm-biased

17 ALTERNATIVE SAMPLING DESIGN
[Figure: STRATUM = PROVINCE x NAICS; panels: Noneligible units, Eligible units]

18 ALTERNATIVE SAMPLING DESIGN
Advantages:
- It is a single-phase sampling design, which simplifies the estimation procedures, particularly variance estimation
- The sample sizes are known prior to sampling
- Full use of available tax data is now made
Disadvantages:
- The estimation systems need to be modified to fit the new procedure

19 POINT ESTIMATION
For financial variables, we have 2 options:
- Tax/survey based framework: we simply use the tax data for the eligible part and a design consistent estimator for the noneligible part
- Survey based framework: we want to estimate the total of the survey variable y. Use design consistent estimators (calibration estimators such as the GREG or optimal estimator) that make use of all the available tax data (monthly surveys)

20 GREG TYPE ESTIMATORS
The GREG estimator is usually written as
The GREG estimator fits the macro approach but it can also fit the micro approach
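
For readers who want a concrete form, the sketch below uses the standard textbook GREG estimator with a single auxiliary variable and no intercept: the design-weighted sample total of y plus a regression adjustment for the gap between the known tax total and its design-weighted estimate. The formula on the original slide was not preserved, so the notation here (design weights, variance factors, calibrated weights) is an assumption, as are the function names and toy numbers.

```python
def greg_weights(x_sample, design_weights, x_total, variance_factors=None):
    """Calibrated weights w_i * g_i for a one-variable, no-intercept GREG.

    g_i = 1 + (x_total - HT estimate of x total) * x_i / (c_i * sum_j w_j x_j^2 / c_j),
    where c_i are model variance factors (assumed equal to 1 if not supplied).
    """
    if variance_factors is None:
        variance_factors = [1.0] * len(x_sample)
    ht_x = sum(w * x for w, x in zip(design_weights, x_sample))
    denom = sum(w * x * x / c
                for w, x, c in zip(design_weights, x_sample, variance_factors))
    if denom == 0:
        raise ValueError("auxiliary variable is zero for every sampled unit")
    return [w * (1.0 + (x_total - ht_x) * x / (c * denom))
            for w, x, c in zip(design_weights, x_sample, variance_factors)]


def greg_total(y_sample, calibrated_weights):
    """Macro approach: the estimate is the calibrated-weight sum of observed y."""
    return sum(w * y for w, y in zip(calibrated_weights, y_sample))


# Hypothetical toy data: variance proportional to x gives the ratio estimator.
x_s = [100.0, 200.0]
y_s = [110.0, 220.0]
w_s = [2.0, 2.0]
cal_w = greg_weights(x_s, w_s, x_total=660.0, variance_factors=x_s)
estimate = greg_total(y_s, cal_w)
```

The calibrated weights reproduce the known tax total exactly when applied to x, which is the calibration property the macro approach relies on.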

21 GREG TYPE ESTIMATORS
Micro approach [Table: columns Unit, Domain]
Domain estimators asymptotically p-unbiased

22 DOMAIN ESTIMATION
Three situations are encountered in practice:
(i) The domain is identical to the model group
(ii) The domain is contained in the model group
(iii) The domain intersects more than one model group

23 DOMAIN ESTIMATION
- Even if the prediction type estimators are pm-unbiased at the model group level, they could be significantly biased if the model prevailing at the domain level differs from the model prevailing at the model group level
- The GREG type estimators are always asymptotically p-unbiased at the domain level. However, they could be inefficient if the model prevailing at the domain level differs from the model prevailing at the model group level

24 DOMAIN ESTIMATION: MICRO vs. MACRO
- Macro and micro approaches lead to identical estimators of parameters at the model group level
- At the domain level, the two approaches lead to different estimators
- No definite comparison is possible but we expect that will perform better than if the domain size is small
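
The toy Python example below illustrates why the two approaches can agree at the model group level yet differ by domain. It rests on one possible reading of the micro approach for a GREG-type estimator under a ratio model: predicted values for every population unit in the domain plus design-weighted residuals for the sampled units in the domain. The slides do not spell out the formulas, so this reading, the function names and all the numbers are assumptions.

```python
def ratio_coefficient(y_s, x_s, w_s):
    """Design-weighted ratio of y to x, estimated from the sample."""
    return sum(w * y for w, y in zip(w_s, y_s)) / sum(w * x for w, x in zip(w_s, x_s))


def macro_domain_total(y_s, x_s, w_s, dom_s, x_total, domain):
    """Macro approach: calibrated-weight sum of observed y over sampled units in the domain.

    Under the ratio model every design weight is scaled by x_total / (HT estimate of x total).
    """
    g = x_total / sum(w * x for w, x in zip(w_s, x_s))
    return sum(w * g * y for y, w, d in zip(y_s, w_s, dom_s) if d == domain)


def micro_domain_total(y_s, x_s, w_s, dom_s, x_pop, dom_pop, domain):
    """Micro approach (assumed form): predictions for all domain units plus weighted residuals."""
    b = ratio_coefficient(y_s, x_s, w_s)
    predicted = sum(b * x for x, d in zip(x_pop, dom_pop) if d == domain)
    residuals = sum(w * (y - b * x)
                    for y, x, w, d in zip(y_s, x_s, w_s, dom_s) if d == domain)
    return predicted + residuals


# Hypothetical population of 4 units in one model group, split into domains "a" and "b".
x_pop = [100.0, 200.0, 50.0, 150.0]
dom_pop = ["a", "b", "a", "b"]
# The first two units are sampled, each with design weight 2.
y_s, x_s, w_s, dom_s = [120.0, 210.0], [100.0, 200.0], [2.0, 2.0], ["a", "b"]

macro = {d: macro_domain_total(y_s, x_s, w_s, dom_s, sum(x_pop), d) for d in ("a", "b")}
micro = {d: micro_domain_total(y_s, x_s, w_s, dom_s, x_pop, dom_pop, d) for d in ("a", "b")}
```

In this example the domain estimates differ between the two approaches, while their sums over both domains coincide at the model group level.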

25 WHEN THE TAX VARIABLE IS MISSING
- In practice, the tax variable is subject to nonresponse and it is imputed
- Let z be a new variable defined as: x if the tax variable is observed, and the imputed value if the tax variable is missing
- Inference can be made conditional on z
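
A minimal sketch of how z could be built, assuming missing tax values are coded as None and that imputed values come from some separate imputation step. Both the coding convention and the function name are hypothetical.

```python
def build_z(x_values, imputed_values):
    """Return z: the observed tax value where available, otherwise the imputed value.

    x_values uses None to mark a missing tax value (an assumed convention);
    imputed_values must supply a value wherever x is missing.
    """
    z = []
    for x, imputed in zip(x_values, imputed_values):
        if x is not None:
            z.append(x)
        elif imputed is not None:
            z.append(imputed)
        else:
            raise ValueError("tax value is missing and no imputed value was supplied")
    return z


# Hypothetical toy data: the second unit has a missing tax value imputed as 180.
z = build_z([100.0, None, 50.0], [None, 180.0, None])
```

Once z is built, it takes the place of x as the auxiliary variable, in line with making inference conditional on z.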

26 FUTURE WORK
- Find a compromise calibration weight if the macro approach is used
- For non-financial variables, find the best set of auxiliary variables and use it to calibrate

27 For more information, please contact (613)