Housekeeping
- Fire alarm: LOUD continuous ringing
  – Turn right down corridor
  – Down stairs
  – Gather on Oxford Road side of building
- Men's and women's toilets
  – Turn right, toilets at end of corridor
Using the hierarchy of the government surveys
Jo Wathan
Centre for Census and Survey Research
Economic and Social Data Service (Government Data)
ESDS Using Hierarchy: v.06/04
ESDS Government
- Part of the wider Economic and Social Data Service, an ESRC-funded data dissemination and support service
- ESDS is headed by the UK Data Archive and also involves MIMAS and CCSR at the University of Manchester and ISER at the University of Essex
- ESDS Government is headed by CCSR
- Supports the large-scale, continuous, cross-sectional surveys collected by ONS and NatCen
  – Data dissemination carried out by UKDA
  – Value-added services and user support carried out by CCSR
This afternoon…
- What is hierarchical data?
- What is the research purpose of hierarchical data?
- What hierarchy is available in ESDS Government datasets?
- Working with hierarchy in SPSS and Stata
- Practical exercise
What is hierarchy?
- Data which can be analysed at more than one level, where lower levels are nested within higher levels
- Most commonly seen in the form of household data, where information is collected on all individuals within the household
  – Data contains a variable indicating which household an individual lives in
  – Data can be analysed at the household level or the individual level
  – Often possible to analyse at the family level too
- Other forms of hierarchy are available, e.g. the sub-individual level (information per hospital stay, per crime reported)
Compared with flat files…
- Contextual information may be present, e.g. individual asked about size of household, but:
  – Information collected from only one level
  – Not usually appropriate to use data at other levels
  – Not usually possible to create additional derived variables at other levels
  – E.g. information collected from one individual within household
Hierarchical data: conceptually
- Household 1: North West, social rented
  – Person 1: HoH, female, 28, GCSE, P/T work, no LTILL
  – Person 2: son of HoH, male, 12, N/A, no LTILL
- Household 2: Wales, owner occupier
  – Person 1: HoH, male, 33, degree, F/T employee, no LTILL
  – Person 2: spouse of HoH, female, 31, degree, P/T employee, no LTILL
  – Person 3: parent of HoH, female, 72, no quals, econ inactive, LTILL
More complex hierarchy…
- Household 1
  – Family 1: HoH, son of HoH
  – In-patient stay 1
- Household 2
  – Family 2: HoH, wife of HoH
  – Family 3: mother of HoH
  – In-patient stays 1 and 2
What does the data look like?
Flattened data (GHS)
What does the data look like? (2)
Multiple tables (FES): Household.por, Jobmain.por
Use the hierarchy to…
- Better describe the household
- Describe the household context of an individual
- Look at intra-household differences (& sameness)
Describing the household
- E.g. is the household deprived / in poverty?
- Equivalising income (e.g. FRS)
  – Need information on total income (all members, not just the Household Reference Person)
  – Need information on household composition
- Identifying workless households
  – E.g. Gregg and Wadsworth (1999)
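As a rough illustration of describing a household from person records, the Python sketch below flags a household as workless when none of its working-age members is in work. The records, the field layout and the 16 to 64 working-age band are assumptions made for the example; they are not definitions taken from Gregg and Wadsworth or from any of the surveys.

```python
from collections import defaultdict

# Hypothetical person records: (household id, age, currently in work?)
persons = [
    (1, 28, True), (1, 12, False),
    (2, 33, False), (2, 31, False),
    (3, 72, False),
]

def workless_households(records, min_age=16, max_age=64):
    """Return {hid: flag}; flag is True when no working-age member is in work.

    Households with no working-age members are left out of the result,
    which is itself an assumption of this sketch.
    """
    in_work = defaultdict(list)
    for hid, age, working in records:
        if min_age <= age <= max_age:
            in_work[hid].append(working)
    return {hid: not any(flags) for hid, flags in in_work.items()}

flags = workless_households(persons)
print(flags)
```

The point of the sketch is that the flag depends on every member of the household, so it cannot be derived from a file that holds only one respondent per household.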
Source: Richard Dickens, Paul Gregg and Jonathan Wadsworth (2000), New Labour and the Labour Market, CMPO Working Paper Series 00/19, Table 5
The effect of partnership on employment (mothers)
Ethnic homogeneity: % of household members in same ethnic group as HoH
Source: 1991 Household SAR
Hierarchy in some key datasets
Survey | Hhd hierarchy? | Levels | Type
GHS | | Household, Family, Individual, Sub Individual | Flat file
LFS | | Household, Family, Individual | Flat files (QLFS/Hhd data)
FES | | Multiple, inc. household, person, family unit, benefit unit | Multiple files
FRS | | Household, Benefit Unit, Individual | Multiple files
HSE | | Household, Individual (watch out for variable samples) | Flat files (1 all inds, 1 all resps)
BSAS | | Individual | Flat file
BCS | | Individual, Incident (Hhd context only) | Multiple files
BHPS | | Household, Individual (& below) | Multiple files
Household SARs | | Household, Family, Individual | Flat file
Main Levels
- Household
  – Group who have the accommodation as their only or main residence and who either share one meal a day or share the living accommodation
  – Useful for coresidence or policy-related issues
- Family Unit
  – An individual plus partner plus any unmarried children
  – The census definition of family unit excludes single childless individuals
  – Useful for identifying partnership and parenthood relationships
- Benefit Unit
  – Adult children in separate unit from parents
  – Useful when considering income and benefits
- Check your definitions (despite harmonisation)
Identifying the units
- You will need a unique identifier for the unit at each level
- Several variables may need to be used in combination
- You may need to compute a unique identifier
- You will need to read the documentation to assess this
Straightforward: GHS 00-01
- To identify a household use HSERIAL
- To identify an individual within the household use PERSNO
- To identify a family unit use FSERIAL
- To identify a family unit within a household use AFAM
- To identify the household reference person test for PERSNO = HRP (HRP gives the person number of the HRP)
- Similarly, to locate the Family Unit head test for FUH = PERSNO
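The identifier logic on this slide can be sketched in Python as follows. The records are invented for illustration and carry only the variables named above; the assumption that FUH holds the person number of the family unit head mirrors what the slide says about HRP.

```python
# Hypothetical GHS-style person records (values invented for illustration).
people = [
    {"HSERIAL": 101, "PERSNO": 1, "HRP": 2, "AFAM": 1, "FUH": 2},
    {"HSERIAL": 101, "PERSNO": 2, "HRP": 2, "AFAM": 1, "FUH": 2},
    {"HSERIAL": 101, "PERSNO": 3, "HRP": 2, "AFAM": 2, "FUH": 3},
    {"HSERIAL": 102, "PERSNO": 1, "HRP": 1, "AFAM": 1, "FUH": 1},
]

# A person is uniquely identified by household serial plus person number.
person_keys = [(p["HSERIAL"], p["PERSNO"]) for p in people]

# Household reference persons: rows where PERSNO equals HRP.
hrp_keys = [(p["HSERIAL"], p["PERSNO"]) for p in people if p["PERSNO"] == p["HRP"]]

# Family unit heads: rows where FUH equals PERSNO.
fu_head_keys = [(p["HSERIAL"], p["PERSNO"]) for p in people if p["FUH"] == p["PERSNO"]]

# AFAM only numbers families within a household, so pair it with HSERIAL.
family_keys = {(p["HSERIAL"], p["AFAM"]) for p in people}

print(hrp_keys, fu_head_keys, len(family_keys))
```

Selecting on the HRP rows gives one case per household, which is a simple way to work at household level in a flat person file.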
Complex e.g. QLFS 2003
- If interested in using household information use the Household File
- Information about identifiers is in the read file
- Household identifier is Remserno; however this is not present in all LFS datasets
- To compute use:
  – Week x 10000000 +
  – W1yr x 1000000 +
  – Qrtr x 100000 +
  – Add x 1000 +
  – Wafnd x 100 +
  – Hhd
- This has to be used together with either CASEID or QUOTA (which are identical); could combine this with Remserno to derive an easier to use household ID
- To identify a person in the household use person
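The arithmetic above can be written as a small Python function. This is a sketch only: the function name and the input values are hypothetical, and the multipliers are copied from the slide, so the documentation for the specific LFS dataset should be checked before relying on them.

```python
def lfs_household_id(week, w1yr, qrtr, add, wafnd, hhd):
    """Combine the component variables into one household number,
    using the multipliers listed on the slide."""
    return (week * 10_000_000
            + w1yr * 1_000_000
            + qrtr * 100_000
            + add * 1_000
            + wafnd * 100
            + hhd)

# Hypothetical component values for one household.
hh_number = lfs_household_id(week=1, w1yr=3, qrtr=2, add=45, wafnd=1, hhd=2)

# The slide says the number must be used together with QUOTA (or CASEID),
# so a composite key is one way to hold the pair. The QUOTA value is invented.
quota = 17
household_key = (quota, hh_number)
print(household_key)
```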
Working with hierarchical data
- Which level should I analyse at?
- Manipulating data in SPSS
  – Menu-driven approach
  – Syntax
- Manipulating data in Stata
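Which level should I analyse at?
Variables: Hid = Hhd ID; Person = person number; Reltohrp = relationship to HRP; Fam = family; Inc = income p/w; Age = age; Tenure = tenure; Health = health; Reltofuh = relationship to FUH; Fuh = FUH
Hid | Person | Reltohrp | Fam | Inc | Age | Tenure | Health | Reltofuh | Fuh
1 | 1 | self | 1 | dna | 63 | Soc rent | Poor | Self | Yes
2 | 1 | self | 1 | 300 | 21 | Priv rent | Good | Self | Yes
2 | 2 | none | 2 | 400 | 28 | Priv rent | Good | Self | Yes
2 | 3 | none | 3 | 100 | 19 | Priv rent | Ok | Self | Yes
3 | 1 | self | 1 | 700 | 43 | Own occ | Good | Self | Yes
3 | 2 | partner | 1 | 500 | 40 | Own occ | Good | Partner | No
3 | 3 | child | 1 | N/a | 12 | Own occ | Good | Child | No
4 | 1 | self | 1 | 200 | 35 | Own occ | Good | Self | Yes
4 | 2 | partner | 1 | 90 | 34 | Own occ | Ok | Partner | No
5 | 1 | self | 1 | 450 | 25 | Soc rent | Ok | Self | Yes
5 | 2 | child | 1 | N/a | 4 | Soc rent | Poor | Child | No
5 | 3 | child | 1 | N/a | 2 | Soc rent | Ok | Child | No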
Understanding the data
- What is the default case/unit of analysis in the dataset?
- How many cases are in the data?
- How many households are in the data?
- How many family units are in the data?
- How many households have more than one family unit?
- How large is the largest household?
- How many lone families are in the data?
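Several of these questions come down to counting distinct identifiers at each level. The Python sketch below does this for the small example dataset on the "Which level should I analyse at?" slide. The (Hid, Person, Fam) values are an assumption: they are one reading of that table, whose columns run together in the transcript.

```python
from collections import Counter

# (Hid, Person, Fam) triples as read from the example table (assumed reading).
rows = [
    (1, 1, 1),
    (2, 1, 1), (2, 2, 2), (2, 3, 3),
    (3, 1, 1), (3, 2, 1), (3, 3, 1),
    (4, 1, 1), (4, 2, 1),
    (5, 1, 1), (5, 2, 1), (5, 3, 1),
]

n_cases = len(rows)                                   # one case per person
households = {hid for hid, _, _ in rows}              # distinct households
family_units = {(hid, fam) for hid, _, fam in rows}   # family is unique only within household

hh_size = Counter(hid for hid, _, _ in rows)
fams_per_hh = Counter(hid for hid, _ in family_units)
multi_family = sorted(hid for hid, n in fams_per_hh.items() if n > 1)
largest = max(hh_size.values())

print(n_cases, len(households), len(family_units), multi_family, largest)
```

Note that the family number restarts in each household, so a family unit has to be counted as the pair (Hid, Fam) rather than by Fam alone.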
Using the data
- What unit of analysis would you use to answer the following questions? Would you need to create variables at different levels of analysis to answer the question?
  – What is the mean income per adult?
  – What proportion of children live with 2 parents?
  – What is the mean income per adult-equivalent household member (where children count as half a household member)?
  – Does your partner's health affect your own?
  – How is total household income related to tenure?
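The adult-equivalent question needs two household-level quantities built from person records: total income and a weighted head count. A minimal Python sketch follows, under several stated assumptions: the (Hid, Age, Inc) values are one reading of the example table, a child is taken to be anyone under 16, and "dna" or "N/a" incomes are treated as missing, with households that have no recorded income left out.

```python
from collections import defaultdict

CHILD_AGE = 16  # assumed cut-off; the slide does not define "child"

# (Hid, Age, Inc) as read from the example table; None marks dna / N/a.
rows = [
    (1, 63, None),
    (2, 21, 300), (2, 28, 400), (2, 19, 100),
    (3, 43, 700), (3, 40, 500), (3, 12, None),
    (4, 35, 200), (4, 34, 90),
    (5, 25, 450), (5, 4, None), (5, 2, None),
]

def equivalised_income(records, child_weight=0.5):
    """Household income divided by adult-equivalent members.

    Adults count as 1 and children as child_weight.
    """
    total = defaultdict(float)
    units = defaultdict(float)
    has_income = set()
    for hid, age, inc in records:
        units[hid] += child_weight if age < CHILD_AGE else 1.0
        if inc is not None:
            total[hid] += inc
            has_income.add(hid)
    return {hid: total[hid] / units[hid] for hid in sorted(has_income)}

eq = equivalised_income(rows)
print(eq)
```

Published equivalence scales use different weights from the half-weight in this exercise, so the function takes the child weight as a parameter.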
Working with hierarchy in SPSS
- SPSS is not good at data manipulation!
- To generate a household variable from individual data you need to use the aggregate command
- The aggregate command creates a household-level file:
  – 1 case per household
  – Contains the household ID variable specified plus any aggregate variables defined
- Slow, memory intensive, unnecessarily complicated compared with some other packages…
Aggregation at the household level
- You can work at the level of the household
  – Use the aggregate outfile
  – Remember to carry across other household-level variables that you will need into the aggregate file as part of the aggregate procedure
- Or match the household-level variable back to the original individual-level dataset…
Aggregate and match back to individual file
- Usually it is best to match back your aggregated variable to the master file
  – The household variable is distributed to each individual
  – You can then select on household head or family head to work at the level of household or family
  – Or you can link information about the household to the individual
SPSS syntax used
*compute a variable which is a low value, but which takes the (higher) value for health when respondent is hrp.
compute hlthrep = -9.
if (reltohrp = 1) hlthrep = health.
crosstabs hlthrep by health by reltohrp.
sort cases hid.
aggregate outfile = "c:\work\esds\aggfile.sav"
  /break hid
  /nperhh = n(hid)
  /oldest = max(age)
  /hrphlth = max(hlthrep).
execute.
match files /file = *
  /table = "c:\work\esds\aggfile.sav"
  /by hid.
execute.
Working with hierarchy in Stata
- Stata is much better at data manipulation than SPSS
- Not necessary to create an additional file
- Simply run the appropriate procedure for each household separately
  – Sort the data by the household identifier first
  – Use the by option with the household identifier
The equivalent Stata commands:
sort hid
egen nperhh = count(hid), by(hid)
egen oldest = max(age), by(hid)
gen hlthrep = -9
replace hlthrep = health if (reltohrp == 1)
egen hrphlth = max(hlthrep), by(hid)
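For readers who want to check the logic outside SPSS or Stata, the same three household variables can be sketched in plain Python. The records are hypothetical; as in the SPSS and Stata code, reltohrp == 1 is assumed to mark the HRP, and health is assumed to be a numeric code greater than -9.

```python
from collections import defaultdict

# Hypothetical person records using the variable names from the syntax.
people = [
    {"hid": 1, "age": 63, "reltohrp": 1, "health": 3},
    {"hid": 2, "age": 21, "reltohrp": 1, "health": 1},
    {"hid": 2, "age": 28, "reltohrp": 0, "health": 1},
    {"hid": 3, "age": 43, "reltohrp": 1, "health": 1},
    {"hid": 3, "age": 40, "reltohrp": 2, "health": 2},
]

# Group persons by household (the "break" / "by" step).
by_hh = defaultdict(list)
for p in people:
    by_hh[p["hid"]].append(p)

for members in by_hh.values():
    nperhh = len(members)                      # number of persons in household
    oldest = max(m["age"] for m in members)    # age of oldest member
    # -9 for everyone except the HRP, so the maximum is the HRP's health code.
    hrphlth = max(m["health"] if m["reltohrp"] == 1 else -9 for m in members)
    # Match the household values back onto every person record.
    for m in members:
        m.update(nperhh=nperhh, oldest=oldest, hrphlth=hrphlth)

print(people[4])
```

The final loop plays the role of the match-back step: each person ends up carrying the household-level values, so the file can still be analysed at individual level.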
Some issues…
- Is the data representative for your choice of unit?
  – Looking at individuals in a household survey will generally omit individuals not living in households
  – Weighting may be necessary to counteract survey design
  – If the survey was not designed to analyse using the units you use, will it still be representative?
- Will there be any clustering effects?
  – Individuals within households will be more alike than individuals in general
  – This could affect the accuracy of the estimates
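One common way to gauge the clustering point, not taken from the slides, is the approximate design effect 1 + (m - 1) x rho, where m is the average number of sampled persons per household and rho is the intra-household correlation of the variable. The Python sketch below uses invented values for both; it ignores weighting and any other features of the survey design.

```python
def design_effect(mean_cluster_size, intra_cluster_corr):
    """Approximate variance inflation due to clustering: 1 + (m - 1) * rho."""
    return 1 + (mean_cluster_size - 1) * intra_cluster_corr

def effective_sample_size(n, mean_cluster_size, intra_cluster_corr):
    """Sample size of a simple random sample with the same precision."""
    return n / design_effect(mean_cluster_size, intra_cluster_corr)

# Hypothetical inputs: 2.4 persons per household, rho = 0.5, 1000 persons.
deff = design_effect(2.4, 0.5)
n_eff = effective_sample_size(1000, 2.4, 0.5)
print(round(deff, 2), round(n_eff))
```

When members of a household are very alike on a variable (rho close to 1), the effective sample size approaches the number of households rather than the number of persons.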