Adding Geographical Information to Longitudinal Studies
Vernon Gayle, Paul Boyle, Andrew Cullis, Zhiqiang Feng, Robin Flowerdew
• ESRC-funded project commissioned by the Institute for Social and Economic Research (ISER) via the National Longitudinal Strategy Committee
• To examine the introduction of local area geographical variables to longitudinal datasets
Today’s Talk is Designed to…
• Outline the project
• Briefly highlight some issues
• Present some of the recommendations
• Facilitate discussion
• Draft report on RM website
• Welcome Comments
Motivation: The recognition that the investigation of the influence of locale on individuals may currently be hampered by the lack of geographically refined information included in the major longitudinal datasets.
Focus of the Project – Large Scale Longitudinal Studies
• The British Household Panel Survey (BHPS)
• The Birth Cohort Studies
  – National Child Development Study (NCDS)
  – 1970 British Cohort Study (BCS70)
  – Millennium Cohort Study (MCS)
• The ONS Longitudinal Study (ONS-LS)
• The Scottish Longitudinal Study (SLS)
Consultation Methodology
• Views of data users (geographers & other social scientists)
• Views of Principal Investigators and research teams working on the longitudinal datasets
• Views of gatekeepers and those associated with accessing social science datasets
• Open discussions
• Interviews
• Focus Groups
(Note these are all cross-sectional data sources!)
Existing Geographical Information in the Longitudinal Datasets There is considerable variation between the longitudinal datasets regarding the geographical information that currently exists. We would expect this because these studies have different histories (including funding), designs and scientific foci.
British Household Panel Survey (BHPS)
• Local Authority Districts (n=278) – SAR Areas
1991 SAR Areas In Britain
British Household Panel Survey (BHPS)
• Small number of Neighbourhood Variables.
The Older Birth Cohort Studies
• NCDS – Regional Health Authority; Census Small Area Statistics (1971 and 1981)
• BCS – Regional Health Authority; District Health Authority; Local Education Authority
Millennium Cohort Study (MCS)
• Complex Design
Put simply… The MCS has a geographically clustered sample. Certain analyses require this clustering to be explicitly represented. Therefore an ‘anonymous’ ward-level indicator is provided. Otherwise only a limited amount of geographical information is included (e.g. Country and Region of birth).
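The idea of an ‘anonymous’ ward-level indicator can be sketched as follows. This is an illustration only and is not the procedure actually used for the MCS: the function name `anonymise_clusters` and the ward codes are hypothetical. Each distinct ward code is replaced by an arbitrary integer label, so analysts can still tell which respondents share a cluster without learning where the cluster is.

```python
import random


def anonymise_clusters(ward_codes, seed=None):
    """Replace each distinct ward code with an arbitrary integer label.

    Hypothetical helper for illustration: respondents in the same ward
    receive the same label, but the label carries no location information.
    """
    distinct = sorted(set(ward_codes))
    labels = list(range(1, len(distinct) + 1))
    # Shuffle so that label order does not mirror the order of ward codes.
    random.Random(seed).shuffle(labels)
    lookup = dict(zip(distinct, labels))
    return [lookup[code] for code in ward_codes]


# Hypothetical ward codes for four respondents, two of whom share a ward.
cluster_ids = anonymise_clusters(["W_B", "W_A", "W_B", "W_C"], seed=1)
```

In a scheme like this the lookup between ward codes and labels would have to be held securely by the data owner, since anyone holding it could reverse the anonymisation.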
ONS & Scottish Longitudinal Study
• These are UK Census-based studies, and they include a great deal of geographical information. The ONS-LS is not generally released and has its own set of access arrangements, as will the SLS.
The Headline Message
• There is overwhelming support from academic Geographers.
• There is wide support from researchers from other social science disciplines.
The Headline Message
• The Principal Investigators and research teams associated with the major longitudinal studies are also generally in support. However, they are understandably cautious because of the problems associated with confidentiality, identification and potential disclosure.
Adding Geographical Information
• Geographical Identifier (e.g. the exact location of a place)
• Geographical Variable (e.g. the type of place)
Adding Geographical Information
• Information collected by the survey (e.g. neighbourhood satisfaction)
• Information added from a source external to the survey (e.g. local unemployment)
Adding Geographical Information
• Cross-sectional measures (e.g. a measure based on 1991 Census information)
• Longitudinal measures (e.g. crime statistics [annual]; unemployment claimant count [monthly])
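The distinction between cross-sectional and longitudinal external measures can be illustrated with a minimal sketch. All data, field names and the function `attach_area_measures` are hypothetical. A cross-sectional measure is looked up by area alone, whereas a longitudinal measure is looked up by area and by the time of interview.

```python
# Hypothetical survey records: person id, area code and interview month.
survey = [
    {"pid": 1, "area": "A01", "interview": "1995-03"},
    {"pid": 1, "area": "A01", "interview": "1996-03"},
    {"pid": 2, "area": "A02", "interview": "1995-03"},
]

# Cross-sectional measure: one value per area (e.g. derived from the 1991 Census).
census_1991_unemployment = {"A01": 8.2, "A02": 4.1}

# Longitudinal measure: one value per area per month (e.g. claimant count).
claimant_count = {
    ("A01", "1995-03"): 1200,
    ("A01", "1996-03"): 1050,
    ("A02", "1995-03"): 430,
}


def attach_area_measures(records, cross_sectional, longitudinal):
    """Return copies of the records with both area-level measures attached."""
    enriched = []
    for rec in records:
        row = dict(rec)
        # Keyed on area only: the same value at every wave.
        row["census_1991_unemployment"] = cross_sectional.get(rec["area"])
        # Keyed on area and month: the value can change between waves.
        row["claimant_count"] = longitudinal.get((rec["area"], rec["interview"]))
        enriched.append(row)
    return enriched


linked = attach_area_measures(survey, census_1991_unemployment, claimant_count)
```

In this sketch person 1 keeps the same Census-based value at both interviews, while the claimant count attached to that person differs between 1995 and 1996.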
Confidentiality, Identification & Disclosure Risks Inevitable risk that individuals (or households) could potentially be identified and this could lead to information being disclosed. This is a problem for cross-sectional datasets but becomes more acute for longitudinal datasets.
Adding Geographical Variables Adding a Single Continuous Variable. Ward level unemployment rate.
[Table: UNIQUE WARDS, by rounding of the ward-level unemployment rate (two decimal places, one decimal place, no decimal places) and by level of geography (national, regional, SAR area).]
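The kind of count behind a "unique wards" table can be sketched as below. This is an assumed reading of the slide, not the authors' actual calculation, and the ward data and the function `unique_wards` are hypothetical. A ward is treated as identifiable if its rounded unemployment rate is shared by no other ward within the geography a data user already knows (the whole country, the region, or the SAR area).

```python
from collections import Counter

# Hypothetical wards: (ward, region, SAR area, unemployment rate in %).
wards = [
    ("W1", "North", "SAR1", 7.43),
    ("W2", "North", "SAR1", 7.41),
    ("W3", "North", "SAR2", 7.43),
    ("W4", "South", "SAR3", 5.12),
    ("W5", "South", "SAR3", 5.18),
    ("W6", "South", "SAR4", 9.80),
]


def unique_wards(wards, decimals, level):
    """Count wards whose rounded rate is unique within their group.

    level is "national", "regional" or "sar" and sets the grouping.
    """
    def key(ward):
        _, region, sar_area, rate = ward
        group = {"national": "all", "regional": region, "sar": sar_area}[level]
        return (group, round(rate, decimals))

    counts = Counter(key(w) for w in wards)
    return sum(1 for w in wards if counts[key(w)] == 1)


# One row per level of geography, one column per rounding precision.
table = {
    level: [unique_wards(wards, d, level) for d in (2, 1, 0)]
    for level in ("national", "regional", "sar")
}
```

With these made-up wards, coarser rounding lowers the number of unique wards, while knowing a finer geography raises it, which is the trade-off the slide points to.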
MAIN RECOMMENDATIONS
• We recommend that standard geographical identifiers below SAR-area level should not be added to the existing longitudinal data.
Confidentiality, Identification & Disclosure Risks THIS IS FICTIONAL & PURELY ILLUSTRATIVE – MIS-USE OF DATA WOULD CONTRAVENE EXISTING AGREEMENTS & RULES. To our knowledge, thankfully, no such contravention has ever occurred in the UK that has led to unauthorised information being disclosed.
Confidentiality, Identification & Disclosure Risks Consider a HOUSEHOLD PANEL STUDY (single wave) similar to existing studies.
Variables that can be downloaded
Gender: Male
Year of Birth: 1967
Month of Birth: October
Occupation (SOC): Teaching Professional
Highest Qualification: Higher Degree
Size of workplace: Over 1,000
Local Authority District: Stirling
Ethnicity: Caribbean
Variables that can be downloaded
Gender: Male
Year of Birth: 1967
Month of Birth: October
Occupation (SOC): Teaching Professional
Highest Qualification: Higher Degree
Size of workplace: Over 1,000
Local Authority District: Stirling
Type of Household: Couple: No children
Number in Household: 2
Type of Accommodation: Detached House
Number of Bedrooms: 2
Information from other household members can easily be downloaded
Gender: Female
Year of Birth: ----
Month of Birth: June
Occupation (SOC): Teaching Professional
Highest Qualification: Higher Degree
Size of workplace: Over 1,000
Local Authority District: Stirling
Confidentiality, Identification & Disclosure Risks Remember this is a SINGLE WAVE of data. Characteristically, in HOUSEHOLD PANEL STUDIES there will be multiple waves of data, which compounds this problem.
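The way combinations of ordinary variables single out a respondent can be sketched as follows. The records are entirely fictional and the function `sample_uniques` is a hypothetical helper, not part of any survey's tooling. It returns the records that are the only ones in the file with their particular combination of values.

```python
from collections import Counter

# Entirely fictional records, for illustration only.
records = [
    {"id": 1, "gender": "Male", "birth_year": 1967, "occupation": "Teaching", "household_size": 2},
    {"id": 2, "gender": "Male", "birth_year": 1967, "occupation": "Teaching", "household_size": 4},
    {"id": 3, "gender": "Female", "birth_year": 1970, "occupation": "Teaching", "household_size": 2},
    {"id": 4, "gender": "Male", "birth_year": 1967, "occupation": "Clerical", "household_size": 3},
    {"id": 5, "gender": "Female", "birth_year": 1970, "occupation": "Teaching", "household_size": 2},
]


def sample_uniques(records, keys):
    """Return the records whose combination of values on `keys` occurs once."""
    def combo(rec):
        return tuple(rec[k] for k in keys)

    counts = Counter(combo(r) for r in records)
    return [r for r in records if counts[combo(r)] == 1]


by_gender = sample_uniques(records, ["gender"])
by_person = sample_uniques(records, ["gender", "birth_year", "occupation"])
by_household = sample_uniques(
    records, ["gender", "birth_year", "occupation", "household_size"]
)
```

In this toy file nobody is unique on gender alone, one record is unique once year of birth and occupation are added, and three are unique once a household variable is added. Each further wave of a panel study adds more such variables, so the same count would be expected to keep rising.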
Sensitive Information in Survey
[Diagram: SAFE SETTING versus DOWNLOAD DATA, compared on SECURITY and ACCESS, each marked HIGHER or LOWER.]
[Diagram: SAFE SETTING versus DOWNLOAD DATA, compared on Potential Costs (£), marked HIGHER or LOWER.]
SPECIAL DOWNLOADS SECURITY & ACCESS
DOWNLOADABLE DATA SETS
• A small number of newly developed bespoke variables should be created and added to the datasets, e.g. Deprivation Score; Urban/Rural identifier
• Comparable across studies
• Undisclosed methodology
SPECIAL DOWNLOADS
• Specific substantive projects, e.g. parliamentary constituency
• New arrangements, e.g. licences, obligations, security, data destruction
• Limitation on sensitive information and data linkage
SAFE SETTINGS
• Establishment of a small number of safe settings
• New licences, agreements, obligations
• Visiting & remote job access (SLID, LUX, LS)
• Monitoring of use of sensitive information
• Collaboration with National Statistical Offices