The Many Ways of Improving the Industrial Coding for Statistics Canada’s Business Register Yanick Beaucage ICES III June 2007.

1 The Many Ways of Improving the Industrial Coding for Statistics Canada’s Business Register Yanick Beaucage ICES III June 2007

2 Overview Background Automatic Coding Manual Coding Quality Evaluation of Classification Updates Quality Assurance Survey Conclusion

3 Background STC’s Business Register Redesign Improve administrative data link Improve treatment of births/deaths Reflect business reality Give update privileges to a larger set of people Develop a quality assurance program Part of the quality assurance program is ensuring good industrial classification

4 Background Good industrial classification Leads to better population identification Leads to smaller sample size Leads to reduced collection cost Leads to better precision Prevents frustration from respondents (and interviewers)

5 Background Business Register Statistics Canada

6 Background Business Register Canada Revenue Agency Statistics Canada

7 Background Business Register Canada Revenue Agency Automatic Manual Statistics Canada

8 Background Business Register Updates Canada Revenue Agency Automatic Manual QE Statistics Canada

9 Background Business Register Updates Canada Revenue Agency Automatic Manual QE QAS Statistics Canada

10 Automatic Coding New businesses apply for a Business Number (BN) (done at Canada Revenue Agency - CRA) In person, over the phone, over the internet,... What is the description of the main Business activity? Decision tree tool used by CRA Prompts for details needed for coding Returns a robot-phrase to Statistics Canada
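
The slide describes a decision tree that prompts for the details needed for coding and returns a robot-phrase. A minimal sketch of that idea follows, assuming a nested question/option structure. The questions, options, and phrases are hypothetical and are not taken from the CRA tool.

```python
# Hypothetical decision tree: inner nodes ask a question, leaves are robot-phrases.
TREE = {
    "question": "What is the main business activity?",
    "options": {
        "selling goods": {
            "question": "Who are the main customers?",
            "options": {
                "general public": "RETAIL SALE OF GOODS TO PUBLIC",
                "other businesses": "WHOLESALE OF GOODS TO BUSINESSES",
            },
        },
        "growing crops": "CROP FARMING",
    },
}


def robot_phrase(tree, answers):
    """Walk the tree with the applicant's answers and return the leaf phrase."""
    node = tree
    for answer in answers:
        if isinstance(node, str):  # already at a leaf, ignore extra answers
            break
        node = node["options"][answer]
    if not isinstance(node, str):
        # The tree still needs a detail, so the tool would prompt again.
        raise ValueError("more detail needed: " + node["question"])
    return node


print(robot_phrase(TREE, ["selling goods", "general public"]))
```

Because every leaf is a fixed, standardized phrase, the receiving side can map each phrase to a classification with a simple lookup rather than interpreting free text.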

14 Automatic Coding Assign classification based on robot-phrase Improving decision tree tool and usage Re-developed on micro (originally mainframe) Expand use for Web BN application (currently used for phone or in person registration) Develop questions for all sectors Currently used for 75% of all industrial sectors Covers 90% of all descriptions to be coded

15 Automatic Coding Automated Coding by Text Recognition (ACTR) If description too general → Manual coding Used to assign classification based on descriptions Reference file (French and English) Parsing strategy Word weighting algorithm Score derived
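
The slide names a reference file, a parsing strategy, a word weighting algorithm, and a derived score, without giving details. The sketch below illustrates that general pattern only; it is not ACTR's actual algorithm. The reference phrases, codes, stopword list, weighting formula, and threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical reference file: phrase -> industry code (illustrative values only).
REFERENCE = {
    "retail bakery": "A1",
    "commercial bakery": "A2",
    "retail shoe store": "B1",
}
STOPWORDS = {"the", "of", "and", "a"}


def parse(text):
    """Parsing step: lower-case, keep alphabetic words, drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def word_weights(reference):
    """Weighting step: words that are rare in the reference file weigh more."""
    n = len(reference)
    counts = Counter(w for phrase in reference for w in parse(phrase))
    return {w: math.log(1 + n / c) for w, c in counts.items()}


def score(description, phrase, weights, unknown_weight):
    """Weighted overlap in [0, 1]; 1 means the parsed word sets are identical."""
    d, p = parse(description), parse(phrase)

    def weight(word):
        return weights.get(word, unknown_weight)

    shared = sum(weight(w) for w in d & p)
    total = sum(weight(w) for w in d) + sum(weight(w) for w in p)
    return 2 * shared / total if total else 0.0


def code_description(description, reference=REFERENCE, threshold=0.8):
    """Return (code, score), or (None, score) when the unit goes to manual coding."""
    weights = word_weights(reference)
    unknown = math.log(1 + len(reference))  # unseen words count as very rare
    best_phrase, best = max(
        ((p, score(description, p, weights, unknown)) for p in reference),
        key=lambda item: item[1],
    )
    return (reference[best_phrase], best) if best >= threshold else (None, best)


print(code_description("Retail bakery"))
print(code_description("consulting"))
```

A description that is too general shares few weighted words with any reference phrase, so its best score falls below the threshold and the unit is routed to manual coding.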

16 Automatic Coding Improving use of ACTR Improve reference file Each year new phrases are added Currently 7 000 phrases Study score needed for match Opening the weighting algorithm Improve parsing rules Revisit the rules Create an environment for testing purposes Evaluate impact of changing input/rules/score

17 Automatic Coding 40 000 new businesses a month to code 45% are coded using robot-phrases 5% are coded using ACTR Leaves 20 000 new businesses to code Need manual coding Done at Statistics Canada

18 Manual Coding Other units to code manually Survey feedback New operating entity found when profiling Tool Search engine for industrial coding Improve manual coding Add on-line ACTR or ACTR results Add decision tree tool

19 Manual Coding New businesses Goal: code all of them Reality: do as many as we can Result: backlog of businesses to code

20 Manual Coding New businesses Goal: code all of them Reality: do as many as we can Result: backlog of businesses to code Business Register Automatic Manual Automatic CRA May batch CRA June batch Backlog Manual

21 Manual Coding Which units should be coded first? First in, first out? Economic activity signal? Economic activity is determined by administrative data Both! Select a sample from backlog Take-all (large economic activity) Take-some 1 (economic activity / older units) Take-some 2 (economic activity / newer units) Take-none (no economic activity)
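
The four strata on this slide can be sketched as a simple classification followed by sampling within each stratum. This is an illustrative sketch only: the activity threshold, the age cut-off, the sampling fractions, and the field names are assumptions, not values from the presentation.

```python
import math
import random

# Hypothetical sampling fractions per stratum.
FRACTIONS = {"take-all": 1.0, "take-some 1": 0.5, "take-some 2": 0.25, "take-none": 0.0}


def stratum(unit, large_threshold=1_000_000, old_months=6):
    """Assign a backlog unit to one of the four strata named on the slide."""
    if unit["activity"] <= 0:
        return "take-none"
    if unit["activity"] >= large_threshold:
        return "take-all"
    if unit["months_in_backlog"] >= old_months:
        return "take-some 1"  # active, older units
    return "take-some 2"      # active, newer units


def select_sample(backlog, fractions=FRACTIONS, seed=0):
    """Draw a simple random sample within each stratum."""
    rng = random.Random(seed)
    groups = {}
    for unit in backlog:
        groups.setdefault(stratum(unit), []).append(unit)
    sample = []
    for name, units in groups.items():
        n = math.ceil(len(units) * fractions[name])
        sample.extend(rng.sample(units, n))
    return sample


backlog = [
    {"id": 1, "activity": 5_000_000, "months_in_backlog": 1},
    {"id": 2, "activity": 0, "months_in_backlog": 12},
    {"id": 3, "activity": 50_000, "months_in_backlog": 8},
    {"id": 4, "activity": 50_000, "months_in_backlog": 9},
    {"id": 5, "activity": 50_000, "months_in_backlog": 1},
]
print(sorted(u["id"] for u in select_sample(backlog)))
```

Sampling older units at a higher rate than newer ones combines the "first in, first out" idea with the economic activity signal, which is one way to read the "Both!" on the slide.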

22 Manual Coding Prioritize units to code Can produce under-coverage estimates of the backlog by industrial sector Ultimate goal Improve automatic coding 80% - 90%? Code all remaining active units
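
The effect of the 80% to 90% goal on manual workload can be worked out from the figures on slide 17 (40 000 new businesses a month, 45% + 5% coded automatically). The sketch below assumes the monthly volume stays at 40 000; the function name is hypothetical.

```python
def manual_workload(monthly_births, automatic_pct):
    """Units left for manual coding, given the automatic coding rate in percent."""
    return monthly_births * (100 - automatic_pct) // 100


for pct in (50, 80, 90):
    print(pct, manual_workload(40_000, pct))
```

Under that assumption, raising automatic coding from 50% to between 80% and 90% would cut the monthly manual workload from 20 000 units to between 8 000 and 4 000.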

23 Quality Evaluation of Classification Updates Update privileges will be expanded Subject-matter specialists Collection personnel Need to evaluate the quality of updates Prevent systematic errors Where to focus training

24 Quality Evaluation of Classification Updates Two processes Notification and sample selection 1- Notification Specialist determines set of enterprises to look at Every update to a targeted enterprise is sent to specialist Agree/Disagree/Do nothing Make use of expertise of specialist Specialists keep up-to-date with their frame

25 Quality Evaluation of Classification Updates 2- Sample selection and evaluation Based on industry, source of industry, size and complexity of enterprise Re-code and compare Minimize respondent input when re-coding Using notification and sample Produce error rate for industrial coding Target specific problems
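
The "re-code and compare" step can be sketched as a comparison of the code on file with the independently assigned code, summarized by source of update so that specific problems can be targeted. This is an unweighted illustration; a production estimate would need the sampling weights. The record layout, source labels, and codes are assumptions.

```python
from collections import defaultdict


def error_rates(records):
    """records: iterable of (source, code_on_file, recoded) tuples.

    Returns the share of disagreements for each source of update.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for source, on_file, recoded in records:
        totals[source] += 1
        if on_file != recoded:
            errors[source] += 1
    return {source: errors[source] / totals[source] for source in totals}


sample = [
    ("collection", "A1", "A1"),
    ("collection", "A1", "B1"),
    ("subject-matter", "A2", "A2"),
    ("subject-matter", "B1", "B1"),
]
print(error_rates(sample))
```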

26 Quality Assurance Survey Goal: assess the quality of classification on the BR on an on-going basis Assess dead/alive status as well Point in time surveys done in the past 1993, 1995, 1997, 2002 Implement a continuous survey Produce overall results monthly Produce detailed results combining 12 months

27 Quality Assurance Survey Stratification Industrial sectors 2 or 3 size strata Have higher sampling fraction for larger size Recently contacted Considered to have valid classification Sample allocation Target 3.5% standard error for annual industrial classification error rate 550 units a month
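
For orientation, the textbook relation between sample size and the standard error of an estimated proportion is shown below. It assumes simple random sampling within a single domain and ignores stratification, the finite population correction, and non-response. The slide does not state the domains or design assumptions behind the 550 units a month, so this is not a reconstruction of the actual allocation.

```latex
\mathrm{SE}(\hat{p}) \approx \sqrt{\frac{p(1-p)}{n}}
\quad\Longrightarrow\quad
n \approx \frac{p(1-p)}{\mathrm{SE}^2},
\qquad
\text{e.g. } p = 0.5,\ \mathrm{SE} = 0.035:\quad
n \approx \frac{0.25}{0.001225} \approx 204
```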

28 Quality Assurance Survey Currently doing a pilot test Monthly estimates produced Yearly estimates based on weighted average of 12 monthly measures Weighted average based on 1/12 Weighted average based on population ratio over the year ($N_m / (N_1 + \dots + N_{12})$)
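
The two weighting options on this slide can be sketched as follows: equal weights of 1/12, or weights proportional to each month's population size. The function name and the example figures are illustrative assumptions.

```python
def annual_estimate(monthly_rates, monthly_populations=None):
    """Combine 12 monthly error rates into one annual estimate.

    With no populations given, each month gets weight 1/12.
    Otherwise month m gets weight N_m / (N_1 + ... + N_12).
    """
    if len(monthly_rates) != 12:
        raise ValueError("expected 12 monthly rates")
    if monthly_populations is None:
        weights = [1 / 12] * 12
    else:
        if len(monthly_populations) != 12:
            raise ValueError("expected 12 monthly population sizes")
        total = sum(monthly_populations)
        weights = [n / total for n in monthly_populations]
    return sum(w * r for w, r in zip(weights, monthly_rates))


rates = [0.10] * 6 + [0.20] * 6
populations = [100] * 6 + [300] * 6
print(annual_estimate(rates))
print(annual_estimate(rates, populations))
```

The two options agree when the population is stable over the year. When it changes, the population-ratio weights give more influence to the months with more units on the register.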

29 Quality Assurance Survey Survey will be used to Clean-up the register as an independent source Evaluate industrial in and out-of-scope rate Evaluate industrial error rate for non-surveyed portion of the register (e.g. small enterprises) Evaluate death rate in order to adjust sample sizes Potential use Evaluate frame quality for new surveys Clean-up part of the register

30 Conclusion Classification is essential to the BR Redesign provides an opportunity To improve coding To standardize tools used for coding To measure quality of coding adequately To set-up good practices/good reports Results Better quality of business survey frames More efficient surveys

31 For more Information please contact Pour plus d’information, veuillez contacter Visit our web site at www.statcan.ca Yanick Beaucage 613-951-4622 yanick.beaucage@statcan.ca

