Coding occupations The new coding process Sue Westerman, Marc Houben.


1 Coding occupations The new coding process Sue Westerman, Marc Houben

2 Structure
Why change current process?
New coding process + data collection
Features of Cascot
Test results
Pros and cons
Statistics Netherlands - Coding occupations, the new process

3 Why change our coding process
Redesign social surveys
CAWI / CATI / CAPI
Three modes, one questionnaire
Shortening of the interview time
Legislative obligations to deliver data given priority
IT policy
No custom-made software applications, only standard tools

A few years ago Statistics Netherlands started a project to redesign the social surveys so that they run more efficiently and cost less. This was achieved in several ways. The following decisions taken in this project were relevant to measuring occupations. A web-based interviewing technique (CAWI) will be used alongside personal (CAPI) and telephone (CATI) interviewing, with the same questionnaire in all three modes. The total interviewing time had to be reduced, and it was decided to give priority to the data that we are legally obliged to deliver. IT policy aimed to use standard tools in our statistical process and to minimize the cost of maintaining and using custom-made software applications.

4 Current system
Computer assisted
Coding by interviewers
Semi-automatic, interactive
Interviewers need training
Interview time varies

In our current system, codes are assigned during the interview, interactively with the respondent, in a semi-automatic way. Interviewers need a one-day training to get to know this system. The system is also unsuitable for the CAWI mode, and adapting it would be very costly. In addition, the time needed to assign a code depends very much on the quality of the answers given to the questions on occupation.

5 Overview of new process
Data collection
Data processing

This is a high-level overview of our new process. Coding is no longer done during data collection; it is done during data processing. A specific survey, e.g. the LFS, collects its data, which is then stored in our data collection storage. This data is then sent on to the data processing process of that survey. In addition, the collected data concerning occupation is fetched by our new coding process. This is a generic process that codes the occupation data in exactly the same way for each survey. It uses a coding database in which the coding results, the ISCO codes, are eventually stored. The data processing process of e.g. the LFS then fetches its coded occupations, the ISCO codes, and couples them with its other data. I will now say a little more about the data collection and about the coding process.
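The flow described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`survey`, `record_id`, `job_title`), the in-memory `coding_db` dictionary and the `toy_coder` function are hypothetical and do not describe the actual Statistics Netherlands systems.

```python
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (survey name, record id); an assumed key layout


def run_coding_process(
    records: List[dict],
    coder: Callable[[str], Optional[str]],
    coding_db: Dict[Key, Optional[str]],
) -> None:
    """Generic coding process: codes occupation data the same way for every survey."""
    for rec in records:
        coding_db[(rec["survey"], rec["record_id"])] = coder(rec["job_title"])


def couple_codes(
    records: List[dict], coding_db: Dict[Key, Optional[str]]
) -> List[dict]:
    """Survey-specific data processing: fetch the codes and attach them to the other data."""
    return [
        {**rec, "isco": coding_db.get((rec["survey"], rec["record_id"]))}
        for rec in records
    ]


def toy_coder(job_title: str) -> Optional[str]:
    """Hypothetical stand-in for the real coding step; the code value is illustrative."""
    return {"garage mechanic": "7231"}.get(job_title.strip().lower())


# Stand-in for the data collection storage of one survey.
collected = [
    {"survey": "LFS", "record_id": 1, "job_title": "Garage mechanic", "hours": 38},
    {"survey": "LFS", "record_id": 2, "job_title": "Manager", "hours": 40},
]
coding_db: Dict[Key, Optional[str]] = {}
run_coding_process(collected, toy_coder, coding_db)
processed = couple_codes(collected, coding_db)
```

The point of the sketch is the separation: the coding process only knows about occupation fields and a key, so any survey can reuse it, while each survey couples the resulting codes to its own data.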

6 Data collection for occupation
Questions
Job title? - Open question
Some questions about managerial occupations
Distinction between managers and supervisors
Main tasks? - Open question

We designed a new set of questions regarding occupation. There are two open questions, one about the job title and one about the main tasks. There are also some questions specifically about managerial occupations. That is because we had some difficulty coding these occupations in the past, and because coding to ISCO requires detailed information to code managerial tasks correctly, e.g. to make the distinction between managers and supervisors. We are aware of the risk with open questions that the answers are not detailed enough for coding. We overcame this by adding examples to the questions. So as a note we put e.g. “Don’t answer ‘Mechanic’ but ‘Garage mechanic’” or “Don’t put just ‘Manager’ but ‘Manager ICT’”. We have seen in an LFS pilot that this helps; respondents are willing to give more detail.

7 Coding process
This is an overview of our coding process. It consists of 4 steps. The first 3 steps are fully automatic coding. Step 4 is the manual coding. In step 1, coding is done on job titles. Records which could not be coded in step 1 go on to step 2, where the main task is used in addition to the job title. In the third step, some specific jobs that cannot be coded in steps 1 and 2 are converted into ISCO codes by using additional variables. This is where the answers to the managerial questions are used, as well as the code of the economic activity of the company where the respondent works. In the first 3 steps the ISCO code is assigned at 4 digits. In step 4 it is also possible to code to fewer digits. As you see, in steps 1, 2 and 4 we use a coding tool, namely Cascot. Cascot was developed in the UK by the University of Warwick. It was also used in the EurOccupations project. Probably not all of you are familiar with Cascot, so I will describe some of its features.
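The four-step cascade can be sketched as follows. This is a minimal illustration, not the actual implementation: the lookup tables, the field names (`job_title`, `main_task`, `supervises_managers`, `sector`) and the step 3 rule are all invented for the example, the code values are illustrative, and in the real process steps 1, 2 and 4 are carried out with Cascot rather than with dictionary lookups.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]

# Toy stand-ins for the real index and rules (assumptions, not real data).
TITLE_INDEX = {"garage mechanic": "7231"}
TITLE_TASK_INDEX = {("teacher", "teaching children aged 6 to 12"): "2341"}


def step1_title(rec: Record) -> Optional[str]:
    """Step 1: automatic coding on the job title alone."""
    return TITLE_INDEX.get(rec["job_title"].strip().lower())


def step2_title_and_task(rec: Record) -> Optional[str]:
    """Step 2: automatic coding on job title plus main task."""
    key = (rec["job_title"].strip().lower(), rec["main_task"].strip().lower())
    return TITLE_TASK_INDEX.get(key)


def step3_additional_variables(rec: Record) -> Optional[str]:
    """Step 3: use additional variables (managerial answers, economic activity)."""
    # Hypothetical rule, for illustration only.
    if (
        rec["job_title"].strip().lower() == "manager"
        and rec.get("supervises_managers") == "yes"
        and rec.get("sector") == "ICT"
    ):
        return "1330"
    return None


AUTOMATIC_STEPS: List[Callable[[Record], Optional[str]]] = [
    step1_title,
    step2_title_and_task,
    step3_additional_variables,
]


def code_record(rec: Record) -> Tuple[Optional[str], int]:
    """Return (code, step). A result of (None, 4) means: route to manual coding."""
    for number, step in enumerate(AUTOMATIC_STEPS, start=1):
        code = step(rec)
        if code is not None:
            return code, number
    return None, 4
```

Each record stops at the first step that yields a code, so the later and more expensive steps only see the records the earlier steps could not handle.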

8 Cascot features
Classification independent
Uses a classification file
Classification (e.g. ISCO08)
Index (search phrases)
Coding rules
Input file -> coding per record -> Output file
Result: code + score
Batch version + interactive version

Cascot is a generic coding tool. That means that it can be used for coding all kinds of classifications. E.g. we will also use it for coding education. What you do is make a so-called classification file. This contains the classification itself and the search phrases and coding rules which are used to code a text to a classification code. Cascot is file-based. So you need an input file, Cascot codes it record by record, and the results are put in an output file. The result is the code, here the ISCO code, and a score between 1 and 100. The score is a probability, but it also gives insight into coding quality. And of course better coding quality is achieved by a good index and good rules. At the request of Statistics Netherlands, Warwick made a batch version of the tool. This is because we wanted steps 1 and 2 in our coding process to run fully automatically. The batch version is also available for other customers.
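The file-based pattern (input file, coding per record, output file with code plus score) can be illustrated with a toy stand-in. This is not Cascot and does not reproduce its file formats, matching algorithm or scoring: the CSV layout, the `INDEX` table and the similarity measure (Python's `difflib.SequenceMatcher`) are assumptions chosen only to make the sketch runnable, and unlike Cascot's 1 to 100 range this toy score can be 0.

```python
import csv
import io
from difflib import SequenceMatcher
from typing import TextIO, Tuple

# Toy index of search phrases -> codes (illustrative values only).
INDEX = {"garage mechanic": "7231", "primary school teacher": "2341"}


def best_match(text: str) -> Tuple[str, int]:
    """Return the best-matching code and a 0-100 similarity score."""
    text = text.strip().lower()
    best_code, best_score = "", 0
    for phrase, code in INDEX.items():
        score = round(100 * SequenceMatcher(None, text, phrase).ratio())
        if score > best_score:
            best_code, best_score = code, score
    return best_code, best_score


def code_file(infile: TextIO, outfile: TextIO) -> None:
    """Read records from infile, code them one by one, write code + score to outfile."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=["id", "job_title", "code", "score"])
    writer.writeheader()
    for row in reader:
        code, score = best_match(row["job_title"])
        writer.writerow(
            {"id": row["id"], "job_title": row["job_title"], "code": code, "score": score}
        )


# In-memory files keep the example self-contained.
source = io.StringIO("id,job_title\n1,Garage mechanic\n2,Garage mechanik\n")
target = io.StringIO()
code_file(source, target)
results = list(csv.DictReader(io.StringIO(target.getvalue())))
```

An exact match against an index phrase gets the maximum score, and a misspelled title gets a lower one, which is the sense in which a score gives insight into coding quality.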

9 Results
LFS pilot: CAWI, CATI, CAPI
74% automatically coded (step 1, 2, 3)
> 90% coded correctly
So, 26% will be coded manually (30% in current situation)

This summer we carried out a pilot LFS. The objective of the pilot was, among other things, to test the new questions concerning occupation, and mainly to see how they worked in CAWI mode. We had about 4000 households in the sample. We used 3 modes: CAWI, CAPI and CATI. I will not elaborate on the design and the general results of this pilot, but we did use the collected data to test our new coding process. The results are shown on the slide. We were very satisfied with these results, and based on them we decided to use the newly developed coding process in production. From 1 April 2012 it will be used in the LFS in a parallel survey next to the regular survey, and as of 1 October 2012 in the regular LFS process.
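The percentages on this slide relate to each other as in the short calculation below. The last figure rests on an assumption that the slide does not state explicitly, namely that "> 90% coded correctly" refers to the automatically coded records only.

```python
auto_coded_pct = 74           # slide: automatically coded in steps 1 to 3
manual_current_pct = 30       # slide: manual share in the current situation
correct_lower_bound_pct = 90  # slide: "> 90% coded correctly"

# Share left for manual coding in the new process.
manual_new_pct = 100 - auto_coded_pct

# Change in manual workload compared with the current situation.
reduction_points = manual_current_pct - manual_new_pct
relative_reduction_pct = round(100 * reduction_points / manual_current_pct, 1)

# Assumption: the 90% applies to automatically coded records only, so this is
# a lower bound on the share of all records that is auto-coded correctly.
auto_and_correct_pct = round(auto_coded_pct * correct_lower_bound_pct / 100, 1)
```

Under that assumption, the manual share drops by 4 percentage points (about 13% of the current manual workload), and more than about two thirds of all records receive a correct code without manual intervention.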

10 Pros and cons
+ Less interviewing time
+ No training needed for interviewers; no divergence of interviewer interpretations
- No feedback to respondent
+ Good balance of auto coding % and code quality
- Manual coding still necessary
+ Same system for auto and manual coding (same index, rules)
+ Same questions and coding process for CAWI, CATI, CAPI
+ …and also for all surveys

An advantage of the system is that we have reduced the interviewing time and the cost of training interviewers to assign codes. We now also avoid the risk of divergence in the assignment of occupational codes caused by interviewer interpretations. A disadvantage of back-office coding, however, is that no feedback can be given to the respondent when a vague job title is given. We have a good balance between automatic coding percentage and coding quality, but manual coding by a coding expert is still necessary. Nevertheless, we are pleased that we have developed a coding process that is the same in CAWI, CATI and CAPI modes, using the same questions, and that uses the same system for automatic and manual coding. We think that the advantages are decisive. That means that, within Statistics Netherlands, the new coding process will be used not only in the LFS but in all surveys where occupation is asked.

11 Questions?
(statistician, team educational and occupational classifications)
(project manager and business process designer, team development and support)

