ESDS Using working with surveys: v.10/07 1 Further Applications of Linking and matching Anthony Rafferty & Jo Wathan Economic and Social Data Service (Government Data)
Other linking applications
1) Complex datasets: file linking within and over the hierarchy, across data files (across a database)
2) Pooling to form repeated cross-sectional datasets
3) Combining panel survey waves
Final practice: two exercises, a) GHS (simple) and b) Family Resources Survey (slightly more advanced)
1) File linking across a database
In some datasets, different information or levels of the hierarchy are stored in separate data files, e.g.:
– Family Resources Survey (FRS)
– British Crime Survey (BCS)
– Family Expenditure Survey (FES)
– British Household Panel Survey (BHPS)
So using the hierarchy requires linking and combining information from different files.
Example: Family Resources Survey
– A continuous, cross-sectional, voluntary survey
– 28,000 private households in the UK
– Northern Ireland added to the survey in 2002-03
– Fieldwork by a consortium of ONS and NatCen
A typical FRS year database
Terminology
– The complete collection of files for a given year of a survey is often referred to as a database
– Individual files are often referred to as tables (think of them as tables of microdata)
– We will still refer to ID variables as linking variables or keys
Linking across hierarchy in SPSS
– Sort both datasets by the linking variable first
– Use the Merge / Add Variables command
– SPSS v14 onwards allows more than one dataset to be open at the same time
– Definition: the key / look-up table
Linking across hierarchy in Stata
– Sort by the linking variable(s)
– Use the merge command
– merge creates the variable _merge; inspect it with: tabulate _merge
– 1 = case in the master dataset only (the one in memory); 2 = case in the using dataset only; 3 = case in both datasets
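To make the logic of linking across hierarchy concrete, here is a minimal sketch in plain Python rather than SPSS or Stata syntax. It joins a household-level table to a person-level table on a key and tags each resulting row with a code that mimics Stata's _merge values. The key name hh_id, the function merge_one_to_many and all data values are invented for illustration; they are not taken from any real survey file.

```python
# Toy data: the key name "hh_id" and every value are made up for illustration.
households = [  # "master" table: one row per household
    {"hh_id": 1, "region": "London"},
    {"hh_id": 2, "region": "East Midlands"},
    {"hh_id": 3, "region": "West Midlands"},
]
adults = [  # "using" table: one row per adult
    {"hh_id": 1, "person": 1, "age": 34},
    {"hh_id": 1, "person": 2, "age": 36},
    {"hh_id": 2, "person": 1, "age": 71},
    {"hh_id": 4, "person": 1, "age": 50},  # this household is not in the master table
]


def merge_one_to_many(master, using, key):
    """Join master rows to using rows on `key`, tagging rows like Stata's _merge."""
    using_by_key = {}
    for row in using:
        using_by_key.setdefault(row[key], []).append(row)

    master_keys = set()
    merged = []
    for m in master:
        master_keys.add(m[key])
        matches = using_by_key.get(m[key])
        if matches:
            for u in matches:
                merged.append({**m, **u, "_merge": 3})  # case in both tables
        else:
            merged.append({**m, "_merge": 1})  # master only
    for u in using:
        if u[key] not in master_keys:
            merged.append({**u, "_merge": 2})  # using only
    return merged


linked = merge_one_to_many(households, adults, "hh_id")

# Equivalent in spirit to "tabulate _merge": count rows by match status.
merge_counts = {}
for row in linked:
    merge_counts[row["_merge"]] = merge_counts.get(row["_merge"], 0) + 1
print(sorted(merge_counts.items()))
```

Checking the match-status counts before analysis is the point of the exercise: unexpected master-only or using-only rows usually signal a wrong key or a file at a different level of the hierarchy.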
2) Pooling repeated cross-sectional datasets
– Repeated cross-sectional: multiple measurement time points, but different people interviewed at each time point (so independent samples)
– Most ESDS Government datasets are of this type
– Special cases:
  – General Household Survey (pre-2005)
  – LFS (has a five-quarter panel element as well)
Types of data
Table columns: Survey; Repeated cross-sectional; Longitudinal element
– LFS: 1992 onwards
– GHS: 2005 onwards
– FRS
– EFS
– TUS: 2000 (2005 in Omnibus)
– BSAS: 1984-1986
– Omnibus (modules)
– APS
– NTS
– BCS
– HSE
– SEH
Definitions
– Cross-sectional: one point in time
– Repeated cross-sectional: survey repeated (each year) on different samples
– True longitudinal: same people at multiple points in time
– Retrospective
Why pool data over time?
– Increase sample sizes and so reduce standard errors
– Examine trends over time
– Include year-specific controls in regression models (e.g. year dummy, regional unemployment rate)
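The sample-size point can be illustrated with a little arithmetic. For a simple random sample, the standard error of a mean is the standard deviation divided by the square root of the sample size, so pooling three equally sized independent years shrinks it by a factor of 1/sqrt(3), roughly 0.58. The sketch below assumes simple random sampling and a quantity that is stable across the pooled years; the standard deviation and sample size are invented numbers, and real survey estimates would also need weights and design effects.

```python
import math


def standard_error(sd, n):
    """Standard error of a mean under simple random sampling."""
    return sd / math.sqrt(n)


# Assumed values for illustration only.
SD_AGE = 18.0      # hypothetical standard deviation of age within a small subgroup
N_PER_YEAR = 400   # hypothetical subgroup sample size in a single survey year

se_one_year = standard_error(SD_AGE, N_PER_YEAR)
se_three_years = standard_error(SD_AGE, 3 * N_PER_YEAR)
shrink_factor = se_three_years / se_one_year  # equals 1 / sqrt(3)

print(round(se_one_year, 3), round(se_three_years, 3), round(shrink_factor, 3))
```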
Figure: Change in vehicle ownership over time. Source: GHS
Pooling data
– SPSS: Merge / Add Cases
– Stata: append command
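Pooling is a row-wise stack of the yearly files, with one extra step that is easy to forget: recording which year each case came from, so that trends and year dummies remain possible afterwards. A minimal sketch in plain Python follows; the variable names (age, cars, year), the function pool and the rows themselves are made up and do not reflect real GHS variable names.

```python
# Toy yearly files; all names and values are invented for illustration.
ghs_2003 = [{"age": 30, "cars": 1}, {"age": 52, "cars": 0}]
ghs_2004 = [{"age": 41, "cars": 2}]
ghs_2005 = [{"age": 25, "cars": 1}, {"age": 67, "cars": 1}]


def pool(files_by_year):
    """Stack yearly files row-wise, adding a 'year' variable to every case."""
    pooled_rows = []
    for year, rows in sorted(files_by_year.items()):
        for row in rows:
            pooled_rows.append({**row, "year": year})
    return pooled_rows


pooled = pool({2003: ghs_2003, 2004: ghs_2004, 2005: ghs_2005})

# Cases per year, useful as a check that every file was appended once.
cases_per_year = {}
for row in pooled:
    cases_per_year[row["year"]] = cases_per_year.get(row["year"], 0) + 1
print(len(pooled), cases_per_year)
```

In practice the yearly files also need the same variable names and coding before stacking; variables that were renamed or recoded between years have to be harmonised first.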
3) Combining panel survey waves
– Same individuals interviewed at different waves
– Cross-sectional (i) and time-series (t) dimensions
– Often stored as separate wave files, e.g. British Household Panel Survey (BHPS)
– The same linking commands can be used to join the files
Long and wide format: see Appendix E of the workbook
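As a rough illustration of the two layouts: wide format holds one row per person with a separate column for each wave, while long format holds one row per person per wave. The sketch below converts a toy wide table to long in plain Python. The identifier pid, the income_1/income_2 column pattern and the function wide_to_long are hypothetical and are not BHPS variable names.

```python
# Hypothetical wide-format panel: one row per person, one income column per wave.
wide = [
    {"pid": 101, "income_1": 200, "income_2": 210},
    {"pid": 102, "income_1": 150, "income_2": 155},
]


def wide_to_long(rows, id_var, stub):
    """Turn columns named '<stub>_<wave>' into one row per person per wave."""
    prefix = stub + "_"
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if name.startswith(prefix):
                wave = int(name[len(prefix):])
                long_rows.append({id_var: row[id_var], "wave": wave, stub: value})
    return long_rows


long_format = wide_to_long(wide, "pid", "income")
for row in long_format:
    print(row)
```

Merging wave files on the person identifier produces the wide layout; stacking them with a wave variable produces the long layout.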
Exercises
– FRS exercise: using data from three levels of hierarchy across three data tables
– GHS exercise: pooling years of repeated cross-sectional surveys (easier)
FRS exercise
What percentage of people in London, the East Midlands and the West Midlands are claiming state retirement pensions?
Method: combine three files at different levels of the hierarchy:
– HOUSEHOL
– ADULT
– BENEFITS
Then run the cross-tab syntax at the bottom. If the data linking is done correctly, you get the right answer.
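One possible shape for the three-table link is sketched below in plain Python with toy data. Only the table names HOUSEHOL, ADULT and BENEFITS come from the exercise; the key names (hh, person), the benefit labels and every value are assumptions for illustration, and the result is unweighted. The sketch first collapses the benefit records to one flag per person, because a person can hold several benefit records and joining them directly would duplicate adults.

```python
# Toy stand-ins for the three tables; keys, labels and values are invented.
househol = [
    {"hh": 1, "region": "London"},
    {"hh": 2, "region": "East Midlands"},
    {"hh": 3, "region": "West Midlands"},
]
adult = [
    {"hh": 1, "person": 1}, {"hh": 1, "person": 2},
    {"hh": 2, "person": 1},
    {"hh": 3, "person": 1}, {"hh": 3, "person": 2},
]
benefits = [  # one row per benefit received, so several rows per person are possible
    {"hh": 1, "person": 2, "benefit": "pension"},
    {"hh": 2, "person": 1, "benefit": "pension"},
    {"hh": 2, "person": 1, "benefit": "other"},
    {"hh": 3, "person": 1, "benefit": "other"},
]

# Step 1: collapse BENEFITS to person level (who has at least one pension record).
pension_claimants = {
    (b["hh"], b["person"]) for b in benefits if b["benefit"] == "pension"
}

# Step 2: build a household look-up table for the region variable.
region_of = {h["hh"]: h["region"] for h in househol}

# Step 3: attach both pieces of information to the person-level ADULT table.
people = [
    {
        **a,
        "region": region_of[a["hh"]],
        "pension": (a["hh"], a["person"]) in pension_claimants,
    }
    for a in adult
]

# Cross-tab: percentage of adults in each region with a pension record.
totals, claimants = {}, {}
for p in people:
    totals[p["region"]] = totals.get(p["region"], 0) + 1
    claimants[p["region"]] = claimants.get(p["region"], 0) + int(p["pension"])
pct_claiming = {r: 100 * claimants[r] / totals[r] for r in totals}
print(pct_claiming)
```

The number of rows in the linked file should equal the number of adults; if it is larger, the benefit records were joined without being collapsed first.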
General Household Survey (GHS) exercise
How does the age of the UK population vary by ethnicity?
– Estimate the average age of different ethnic groups as coded in the variable ethnigp2
– Pool three years of GHS data
– Consider the effects of pooling on sample size and estimation
Units of analysis
Fundamental to your research question!
– Who do you want to generalise to?
– What are your cases?
– What units is your population composed of?
– Who is your research question applicable to?
Some typical units:
– Individuals
– Households
– Schools
– Businesses
– Farms
– Doctors
– Wards
Hierarchy in some key datasets
Survey | Household hierarchy? Levels | Type
GHS | Household, Family, Individual, Sub-individual | Flat file
LFS | Household, Family, Individual | Flat files (QLFS/Hhd data)
FES | Multiple, inc. household, person, family unit, benefit unit | Multiple files
FRS | Household, Benefit Unit, Individual | Multiple files
HSE | Household, Individual (watch out for variable samples) | Flat files (1 all inds, 1 all resps)
BSAS | Individual | Flat file
BCS | Individual, Incident (Hhd context only) | Multiple files
BHPS | Household, Individual (& below) | Multiple files
Household SARs | Household, Family, Individual | Flat file
Quarterly Labour Force Survey
Quarters shown: Spring, Summer, Autumn, Winter, Spring +1
Each quarter's sample: W1 12k, W2 12k, W3 12k, W4 12k, W5 12k
– Purple indicates those cases who were in wave 1 in spring of year 1, i.e. they are in wave 2 in summer, and so on
– Each household participates for 5 consecutive waves (one every 3 months/quarter)
– Total: 60k households per quarter
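The rotating design can be checked with a short calculation. The sketch below, in plain Python, works out which wave an entry cohort is in during any given quarter and how many households two consecutive quarters share. It uses the slide's round figures (5 waves, about 12k households per wave); numbering quarters as consecutive integers and the function names are assumptions for illustration.

```python
WAVES = 5             # each household is interviewed in 5 consecutive quarters
COHORT_SIZE = 12_000  # approximate households per wave, from the slide


def wave_in_quarter(entry_quarter, quarter):
    """Wave number of a cohort in a given quarter, or None if it is not in the sample."""
    wave = quarter - entry_quarter + 1
    return wave if 1 <= wave <= WAVES else None


def cohorts_in_sample(quarter):
    """Map each entry cohort present in `quarter` to its current wave."""
    return {
        entry: wave_in_quarter(entry, quarter)
        for entry in range(quarter - WAVES + 1, quarter + 1)
    }


this_quarter = cohorts_in_sample(10)
next_quarter = cohorts_in_sample(11)

total_households = len(this_quarter) * COHORT_SIZE
overlap_households = len(set(this_quarter) & set(next_quarter)) * COHORT_SIZE

print(this_quarter)
print(total_households, overlap_households)
```

Because four of the five cohorts carry over, consecutive quarters are not independent samples, which matters when pooling quarterly files.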