Working with EU-SILC: data files, variables and data management Practical computing session I – Part 1 Heike Wirth GESIS – Leibniz Institut für Sozialwissenschaften.


1 Working with EU-SILC: data files, variables and data management
Practical computing session I – Part 1
Heike Wirth, GESIS – Leibniz Institut für Sozialwissenschaften
DwB Training Course on EU-SILC, February 13-15, 2013
Romanian Social Data Archive at the Department of Sociology, University of Bucharest, Romania

2 Overview
EU-SILC datasets
EU-SILC variables
Differences between data collected and the anonymised User Database (UDB)
Hands on
- Transform CSV file into SPSS/Stata system file
- Number of households/persons in the file

3 EU-SILC Data
Four separate files:
Household (= 1 observation per household)
- Register data (D)
- Household data (H)
Individuals (= 1 observation per person)
- Register data (R)
- Personal data (P)
Since cross-sectional and longitudinal data are provided separately => 8 files

4 EU-SILC Data
For example:
UDB_c10D_ver 2010-1 from 01-03-12.csv
UDB_c10H_ver 2010-1 from 01-03-12.csv
UDB_c10R_ver 2010-1 from 01-03-12.csv
UDB_c10P_ver 2010-1 from 01-03-12.csv
_c = cross-sectional; _l = longitudinal
10 = year of the survey (2010)
D = Household Register File
H = Household Data File
R = Personal Register File
P = Personal Data File
2010-1 = version of the data (e.g. 1st version of the 2010 data)
csv = type of data (comma-separated values)
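The naming scheme above can be checked mechanically. The following is a minimal Python sketch, not part of the original session (which used SPSS and Stata). The regular expression is inferred only from the four example names on this slide, so other releases may need a different pattern. The function name `parse_udb_name` and the class `UdbFile` are hypothetical.

```python
import re
from typing import NamedTuple

# Pattern inferred from the example names on the slide, e.g.
# "UDB_c10D_ver 2010-1 from 01-03-12.csv". This is an assumption.
_NAME = re.compile(
    r"^UDB_(?P<kind>[cl])(?P<year>\d{2})(?P<file>[DHRP])_ver (?P<version>\d{4}-\d+)"
)

KINDS = {"c": "cross-sectional", "l": "longitudinal"}
FILES = {
    "D": "Household Register File",
    "H": "Household Data File",
    "R": "Personal Register File",
    "P": "Personal Data File",
}


class UdbFile(NamedTuple):
    kind: str      # cross-sectional or longitudinal
    year: int      # survey year
    file: str      # which of the four files
    version: str   # data version, e.g. "2010-1"


def parse_udb_name(name: str) -> UdbFile:
    """Split a UDB file name into the components listed on the slide."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"not a recognised UDB file name: {name!r}")
    return UdbFile(
        kind=KINDS[m["kind"]],
        year=2000 + int(m["year"]),  # assumes a two-digit year in 2000-2099
        file=FILES[m["file"]],
        version=m["version"],
    )


print(parse_udb_name("UDB_c10P_ver 2010-1 from 01-03-12.csv"))
```

A name that does not follow the pattern raises `ValueError` rather than being guessed at.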

5 EU-SILC Data
Household Register File (D)
- one record for every household, including information regarding sample units, household weights, etc.
- e.g. UDB_c10D_ver 2010-2: N = 225 972 households
Household Data File (H)
- one record for every household, including household data
- e.g. UDB_c10H_ver 2010-2: N = 225 972 households
Personal Register File (R)
- one record for every person currently living in the household or temporarily absent
- e.g. UDB_c10R_ver 2010-2: N = 576 531 persons
Personal Data File (P)
- reference population: members of the household aged 16 and over
- e.g. UDB_c10P_ver 2010-2: N = 476 705 persons

6 Domains & Areas - Households (Source: Guidelines_Doc65_2010.pdf, p.73)

7 Domains & Areas - Persons (Source: Guidelines_Doc65_2010.pdf, p.73)

8 EU-SILC Variables
Variable names in EU-SILC are composed of 3 parts:
- 1st character refers to the dataset (D; H; R; P)
- 2nd character refers to the domain
- 3 digits represent a sequential number
e.g. PE040 = Highest ISCED level attained
Most important piece of data documentation: the guideline 'Description of Target Variables', which refers to variables delivered by the NSIs to EUROSTAT
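The three-part naming rule can be illustrated with a short Python sketch. It only splits a name into its parts; the meaning of each domain letter is not reproduced here and should be looked up in the guidelines. The optional `_F` suffix is an assumption based on the flag variables shown later in the deck (e.g. HH021_F). The function name `parse_variable` is hypothetical.

```python
import re

# Dataset letters as listed on the slide.
FILES = {
    "D": "household register",
    "H": "household data",
    "R": "personal register",
    "P": "personal data",
}

# file letter + domain letter + 3 digits, optionally followed by "_F" (flag).
_VAR = re.compile(r"^([DHRP])([A-Z])(\d{3})(_F)?$")


def parse_variable(name):
    """Split an EU-SILC variable name into file, domain letter and number."""
    m = _VAR.match(name.upper())
    if m is None:
        raise ValueError(f"not an EU-SILC style variable name: {name!r}")
    file_letter, domain, number, flag = m.groups()
    return {
        "file": FILES[file_letter],
        "domain": domain,          # domain letter only; see the guidelines
        "number": int(number),     # sequential number
        "is_flag": flag is not None,
    }


print(parse_variable("PE040"))
print(parse_variable("HH021_F"))
```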

9 Guidelines – Target Variables (collected)

10 Guidelines – Target Variables (collected)

11 Guidelines – Target Variables (derived) (...)

12 Different variable names but same labels


15 Check HH020 & HH021 (using flag variables)
Rows: HH020_F flag. Columns: HH021_F flag.
HH021_F codes: -5 = m.v. of HH020 because HH021 is still used; -1 = missing; 1 = filled

HH020_F flag                                    -5      -1       1   Total
-5 m.v. of HH020 because HH021 is used           0       1   17245   17246
-1 missing                                       1       0       0       1
1 filled                                      1499       0    1500    2999
Total                                         1500       1   18745   20246
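A cross-tabulation of two flag variables like the one above can be reproduced outside SPSS/Stata. The sketch below uses only the Python standard library and a handful of made-up records, not the real data. The function names `crosstab` and `format_crosstab` are hypothetical, and the records are assumed to be dictionaries with string values, as `csv.DictReader` would return them.

```python
from collections import Counter


def crosstab(rows, row_var, col_var):
    """Count records for every (row_var, col_var) value pair."""
    return Counter((r[row_var], r[col_var]) for r in rows)


def format_crosstab(counts):
    """Render the counts as tab-separated text with row and column totals."""
    row_vals = sorted({r for r, _ in counts}, key=int)
    col_vals = sorted({c for _, c in counts}, key=int)
    lines = ["\t".join(["", *col_vals, "Total"])]
    for r in row_vals:
        cells = [counts[(r, c)] for c in col_vals]  # absent pairs count as 0
        lines.append("\t".join([r, *map(str, cells), str(sum(cells))]))
    col_totals = [sum(counts[(r, c)] for r in row_vals) for c in col_vals]
    lines.append("\t".join(["Total", *map(str, col_totals), str(sum(col_totals))]))
    return "\n".join(lines)


# Made-up illustration records; flag codes as on the slide (-5, -1, 1).
rows = [
    {"HH020_F": "-5", "HH021_F": "1"},
    {"HH020_F": "-5", "HH021_F": "1"},
    {"HH020_F": "1", "HH021_F": "-5"},
    {"HH020_F": "1", "HH021_F": "1"},
]
table = crosstab(rows, "HH020_F", "HH021_F")
print(format_crosstab(table))
```

Cells where both flags are "filled", or where neither is, are the ones worth inspecting before choosing between HH020 and HH021.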

16 Additional important information
DIFFERENCES BETWEEN DATA COLLECTED (as described in the guidelines) AND THE ANONYMISED USER DATABASE
- All income variables are in € (euro)
- Variables removed
- Top/bottom coding
- Variables added
In addition: country-specific rules

17 Anonymised User Database – Variables added
Names of the variables added:
- 1st character refers to the file (D; H; R; P)
- 2nd character is 'X'
- 3 digits represent a sequential number
e.g.
HX040: Household size
HX060: Household type
HX080: Poverty indicator
(...)

18 Anonymised User Database – Variables added

19 Hands on – Exercise 1
Step 1: Open the 4 SPSS and/or Stata system files
Step 2: Check the data
- How many households are included in the data (H- & D-File)?
  - total
  - by country
- How many persons are included in the data (P- & R-File)?
  - total (any differences between the P- & R-File?)
  - by country
There are 15 countries in the training files.
Fill in the table (next slide)
- What are the main differences across countries?
- Are there differences in the % of unemployed depending on whether you use RB210 or PL031? Why?
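The session itself works with SPSS or Stata, but the same checks can be sketched in Python on the CSV files. The block below is an illustration under several assumptions that should be verified against the guidelines: that PB020 is the country variable of the P-file, that PB030 is the person ID, and that code 5 of PL031 means "unemployed". The data are a made-up miniature file, the shares are unweighted, and the function names are hypothetical.

```python
import csv
import io
from collections import Counter


def count_by_country(rows, country_var):
    """Number of records per country."""
    return Counter(r[country_var] for r in rows)


def share_with_code(rows, var, code, country_var):
    """Unweighted % of records per country where var == code.

    Records with an empty value for var are left out of the denominator.
    """
    valid = Counter()
    hits = Counter()
    for r in rows:
        value = r[var].strip()
        if value == "":
            continue
        valid[r[country_var]] += 1
        if value == code:
            hits[r[country_var]] += 1
    return {c: 100.0 * hits[c] / n for c, n in valid.items()}


# Made-up miniature P-file; variable names and codes are assumptions.
SAMPLE = """PB020,PB030,PL031
AT,101,1
AT,102,5
AT,103,
RO,201,5
RO,202,7
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(count_by_country(rows, "PB020"))
print(share_with_code(rows, "PL031", "5", "PB020"))
```

Running `share_with_code` once on the R-file with RB210 and once on the P-file with PL031 (each with its own country variable and unemployment code) gives the two percentages the exercise asks to compare. For published figures the appropriate weights would have to be applied.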

20 Exercise 1.3: Fill in the table (Mean)

21 Exercise 1.3: Fill in the table

