Welcome to the conference –Rationale and timing The UKHLS and social science research The structure of the day –Morning: information and wave 1 update –Afternoon: looking ahead and getting your views
Background UKHLS is a longitudinal study based on a household panel design Basic design similar to that of British Household Panel Survey which it will replace Target sample size of 40,000 households Ethnicity strand – boost sample of 5 groups plus questions focussed on ethnicity related issues Biomedical strand Innovative data collection, data linkage etc.
Where we are now
Spring 2007: PI team starts work, consultation launched
June/July: initial meetings of topic groups and ethnicity strand consultation
September: topic groups reported
September onwards: development of Innovation Panel questionnaire (NB different from wave 1 questionnaire)
October: first meeting of Scientific Advisory Committee, proposals for topic content circulated
December: measures for wave 1 identified
January 2008: design of pilot begins; wave 1 Innovation Panel starts
Active consultation process: so far more than 30 meetings, more than 200 written comments.
Key questionnaire constraints
The following is now expected:
–12-month intervals between interviews
–Continuous fieldwork over a 24-month field period, with the second wave overlapping with the first
–Face-to-face interview at wave 1; mixed mode at wave 2, 20%+ face to face only
–Individual interview of not more than 30 minutes administered face to face, plus self-completion and consents to link data
–Household roster, plus 10-minute household questionnaire
–Some data collection by self-completion from children aged 10-15 from wave 1
Questionnaire time for first two waves is short
Potential areas of coverage: Topic consultation groups
1. Standard of living measures (income, consumption, material deprivation, expenditure, financial well-being)
2. Family, social networks and interactions, local contexts, social support, technology and social contacts
3. Attitudes and behaviours related to environmental issues (energy, transport, air quality, global warming etc.)
4. Illicit and risky behaviour (crime, drug use, anti-social behaviour etc.)
5. Lifestyle, social, political, religious and other participation, identity and related practices, dimensions of life satisfaction/happiness
6. Psychological attributes, cognitive abilities and behaviour
7. Preferences, beliefs, attitudes and expectations
8. Health outcomes and health related behaviour
9. Education, human capital and work
10. Initial conditions, life history
Key measures for scientific research Also useful to classify measures in terms of their place in longitudinal models which researchers develop –Outcomes –Preferences –Personal endowments and constraints –The wider social and spatial environment –Behaviours –Other variables (e.g. instrumental variables)
Outcomes Measures such as money income and consumption expenditure important for summarising growth in socio-economic well-being and changes in inequality and poverty. Need to be complemented by other measures, e.g.: –‘subjective’ measures of domain satisfaction and happiness, –non-financial measures of deprivation and hardship, –health (mental and physical), and –educational attainment in most general sense. These are important contributions to individuals’ ‘functionings’ (Sen)
Behaviours
Panel research has been centrally concerned with the analysis of behaviour over time, and there is a strong case for extending the focus. Key areas include:
–Work – market and non-market (including caring) – and pay
–Health and lifestyle related behaviours e.g. smoking, exercise and diet, medications, pregnancy planning
–Consumption more generally, including its social and environmental impacts
–Geographic mobility and (im)migration
–Social, cultural, and political participation
–Criminal, illicit, and anti-social behaviours
–ICT usage, other media usage
Preferences
Outcomes reflect the interplay of preferences, opportunities and constraints
‘Preferences’ include not only measures of intentions and stated preferences, but also:
–attitudes to risk and uncertainty,
–perceptions, knowledge and awareness
In a longitudinal context, focus on ex ante intentions, expectations, plans and aspirations, to see how they shape future behaviour, and to compare them ex post with outcomes.
Also a range of underlying psychological and personality predispositions (e.g. sense of control).
Social identities (e.g. related to ethnicity, religion, nationality, class, sex, age), and the behavioural norms associated with these identities.
Personal endowments and constraints
A wide range of measures summarising the ability of individuals to realise desired outcomes.
Person-level measures include physical and mental health including resilience, cognitive functioning, genetic endowments and biomarkers.
The heading also refers to measures of a person’s human, social and cultural capitals, and of their social class and family socio-economic background.
Also measures of the ability to even out resources over time as needs fluctuate, e.g. measures of access to credit, help from friends, etc.
Other measures of constraints on participation and functioning in contemporary society, e.g. access to transport, or particular forms of media and ICT.
The wider social and spatial environment What individuals can do also depends on the environment beyond the household in which they live. Data about ‘significant others’ outside the household, and interactions with them will be an important focus. Life chances may depend on resources from social networks outside the household; people maintain links with former household members after they have left. The characteristics of the local neighbourhood are arguably of substantial importance in shaping individuals’ lives, including: –quality of facilities (including housing, schooling, social services), –other environmental differences, ranging from air quality (for health) to prices of goods (for consumption).
Example model structure using different measure types
How do we fit everything we want into UKHLS? Research opportunities are enormous / time constraints are very severe –Need to be selective in what we include –Need to focus frequency of inclusion – measures may have to be collected intermittently –Maybe do not ask everyone all questions? –Note some data not collected by questionnaire, e.g. data linkage: can this substitute for questionnaire space? 10 principles for selecting measures in early waves…
Principles for selecting measures in UKHLS
1. Longitudinal survey: prioritise measures best used longitudinally, rather than just at a single point in time, or repeated cross-section.
2. Household survey: prioritise measures that benefit from understanding of the household context and measures from other family members.
3. Do not just duplicate other surveys. Prioritise new measures not covered elsewhere or where UKHLS design leads to benefits from replication.
4. Prioritise topic areas that address important and emerging long term scientific research agendas.
Principles (2)
5. Have patience! UKHLS represents every age cohort, and 1st wave is not a baseline survey. Loss from delaying introduction of measures is not a failure to collect data at a particular age for the whole sample.
6. Successful establishment for the long term with low attrition is priority now. Minimise respondent burden and avoid measures which may damage response.
7. Derives from success of BHPS, and benefits from incorporating BHPS sample. But not a replication, so BHPS questions not carried unless they address a topic of continuing importance and there is no superior alternative.
Principles (3)
8. Multi-purpose survey providing a balance of coverage meeting wide range of needs; must not focus large share of questionnaire on a few measures.
9. Resource for UK social science: prioritises social science research agendas, including policy applications and agendas crossing traditional disciplinary boundaries e.g. related aspects of biomedical research.
10. Priority for topics which most benefit from co-existence on the same survey as other included topics. In particular, it is important to ensure that the design maximises the possibilities for cross-disciplinary research.
Peter Lynn Overview of sampling and other design issues
Sample components and sizes
Component                        Interviewed Hhds (w1)   Interviewed Hhds (w2)
Innovation Panel                                 1,560                   1,300
BHPS Sample                                      8,350                   8,120
New General Population Sample                   26,780                  22,200
Ethnic Minority Boost Sample                     4,210                   3,300
Total UKHLS                                     40,900                  34,920
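The component figures on the slide above can be cross-checked with a short calculation. This is an illustrative sketch only: the names `components`, `totals` and `wave2_share` are assumptions, and the wave 2 to wave 1 ratios are simply derived from the two columns, not figures stated in the slides.

```python
# Interviewed households per component, as listed on the slide: (wave 1, wave 2).
components = {
    "Innovation Panel": (1560, 1300),
    "BHPS Sample": (8350, 8120),
    "New General Population Sample": (26780, 22200),
    "Ethnic Minority Boost Sample": (4210, 3300),
}


def totals(table):
    """Sum the wave 1 and wave 2 columns across all components."""
    w1 = sum(pair[0] for pair in table.values())
    w2 = sum(pair[1] for pair in table.values())
    return w1, w2


def wave2_share(table):
    """Wave 2 households as a fraction of wave 1, per component (derived, not from the slide)."""
    return {name: w2 / w1 for name, (w1, w2) in table.items()}


if __name__ == "__main__":
    print(totals(components))  # (40900, 34920), matching the Total UKHLS row
    for name, share in wave2_share(components).items():
        print(f"{name}: {share:.1%}")
```

The column sums reproduce the Total UKHLS row, which suggests the component figures were extracted correctly.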
Incorporating the BHPS Sample All BHPS sample households will be included in UKHLS Including Scottish & Welsh boosts and NIHPS To be administered standard UKHLS instruments, starting at wave 2 Temporal allocation not yet finalised
Sample Design New sample will be clustered in a sample of postcode sectors Equal probability sample of addresses in UK All persons resident at those addresses are sample members Subsequent to wave 1 interview: –all sample members are followed –all children born to female sample members become sample members –other members of households of sample members will be interviewed Household associates may also be interviewed
Data Collection Modes Wave 1 –face to face interviewing –Self-completion for 10-15 year olds –Telephone as last resort for refusal conversion Wave 2 –mixed modes: –telephone where possible; –face to face elsewhere Mixed mode approaches being tested on Innovation Panel Web under consideration for future
Components of the wave 1 questionnaire Annual repeating measures Initial conditions and life history, once only Rotating and intermittent measures first introduced at wave 1 Young persons questionnaire for sample members aged 10-15 The Topic Content paper presents settled plans, but some uncertainties remain (question timings and detailed question development)
Estimated timings for questionnaire components
Includes individual questionnaire and self-completion
                              Wave 1   Average for future waves
Annual repeating questions      17.8
Initial conditions              12.4                        0.0
Rotating questions               9.7                       22.1
See table on page 7 of Topic Content paper for estimated distribution of the timings by measure type and subject area
How proposals are presented
Appendix B of Topic Content paper lists potential measures, grouped by theme and in the same order as in the Initial proposals paper (25 October)
For each measure the table indicates:
1. Whether proposed for inclusion at wave 1
2. The proposed frequency of inclusion
3. When it is likely to be first asked
4. How we classify it as a measure type
Last three of these are still very provisional
Stephen Jenkins Annual repeating measures introduced at wave 1
Principles: frequency of data collection
Optimal frequency for any particular measure depends on:
(i) the frequency of significant changes in that measure, and also in associated events that might explain them; and
(ii) the quality of the information about the measure collected from any specific survey instrument in relation to its cost
Possibilities include: sub-annual, annual, biennial, less frequent
Principles (ctd.)
Annual data collection appropriate when:
1. The dynamics of change per se (e.g. duration in states, factors explaining transitions from one state to another) are themselves interesting and
2. the phenomena themselves are subject to substantive change from year to year at the individual level, at least for significant fractions of the population
Annual data collection is less appropriate where the interest is in long term impacts of earlier conditions, or where the time to impact is not of the highest priority
From principles to practice …
Multi-measure surveys like household panel studies use a mixture of data collection intervals
Mixture represents a compromise between:
–optimal collection needs for specific measures, and
–reductions in cost derived by clumping together collection in interviews (‘waves’) of regular periodicity
Much analysis uses circumstances at the time of the interview to derive measures of change or frequency.
Survey instruments also include retrospective histories covering the period between interviews for relatively high frequency measures
From principles to practice …(ctd.) Existing research from around the world, including Britain (BHPS, LFS, various administrative data), suggests that higher frequency transitions and change refer to topics such as: –labour market participation, hours and earnings –receipt of various kinds of social security benefits –household consumption and income more generally –the onset of disability, and –other topics for particular groups, e.g.: developmental progress among children biological and associated changes during puberty health service use among elderly people
UKHLS Annual Repeating content The measures proposed reflect existing research which was reflected in, and supported by, contributions to the consultation The proposed annual content is …
UKHLS Annual Repeating content Basic demographic characteristics and changes, fertility, partnering, Health status (e.g. SF12), disability, Labour market activity and employment status, job search Current job characteristics, basic employment conditions, hours of paid work, second jobs Childcare, other caring within and outside household Income and earnings Life satisfaction Political affiliation – basic measures Transport and communication access Education aspirations and expectations Consumption expenditure Housing characteristics – basic Housing expenditure Household facilities, car ownership
UKHLS Annual Repeating content NB. Some annual repeating content will be introduced at Wave 2 –particularly relevant where Wave 1 establishes circumstances at the start of the panel, and this is updated at later waves Main topic areas are: –Activity history over previous year –Training and skill acquisition, qualifications obtained –Migration attitudes and behaviour
Heather Laurie Initial conditions and life histories
UKHLS will provide longitudinal data from the point at which sample members are selected at wave 1 Need data about people’s earlier life to fully exploit panel data in analysis Initial Conditions –factual background measures e.g. place of birth, details of parental background, qualifications Life History data –record all changes in a particular domain over the whole life-course to date e.g. cohabitation, marital and fertility history; an employment history; migration history; and many others
Outcome of consultation
Best to carry these items at wave 1 if possible
Items collected once in the life of the survey
For respondents, wave 1 is the most natural place to collect this type of data
If collected at later waves will disrupt the rotating sequence of other modules
Provides some longitudinal data immediately for analysis
Allows more time for design and development of new modules/questions for wave 2 and beyond
Critical areas Initial conditions: –Place of birth, national origins, family/parental background, education and qualifications Life Histories –International migration history, partnership history, fertility and childbirth history, employment status history, key previous job Non-trivial time constraints for collection of these data Timings from wave 1 Innovation Panel for some areas May not be able to carry all areas at wave 1
Youth Questionnaire 10 minute self-completion for 10 – 15 year olds Consultation on Wave 1 content still open Main design issues: –Does age range imply two versions of the questionnaire? E.g. for 10 – 12s and for 13 – 15s? –Relationship between the content of the youth and adult questionnaires i.e. comparable measures in each? –Transition from youth to adult questionnaire to maximise longitudinal analysis E.g. carry some youth questions for 16 – 19 year olds in the adult questionnaire?
Areas of coverage Need predictors of later outcomes as well as measures of current views or circumstances Establish which questions asked annually, which rotate in and frequency Potentially wide range of areas: –Relationships with family and friends –Social networks and illicit/risky behaviour –Experience of education and aspirations –Use of leisure time, health, diet and obesity –Future aspirations for job, family, independence –Social and political attitudes and values –Experience of harassment due to race or religion
Lucinda Platt Rotating and Intermittent measures
Rationale Use of rotating modules and intermittent measures for a substantial proportion of questionnaire time increases the range and number of questions that can be asked across the survey (note also subsamples discussion). A more infrequent cycle may, anyway, be more suitable for some measures. Some rotating modules will be included at wave 1, but far more in subsequent waves, allowing time for question development and further (and ongoing) consultation
Issues Trade-off between frequency and depth – more detailed modules may be feasible less frequently, more frequent measures may need to be sparing. Given less than annual occurrence, some questions are more suitable for higher frequency rotations (e.g. behaviours and outcomes), others can sustain longer periods between (e.g. relatively stable endowments and preferences).
Suggested frequency of broad topic areas
Biennial
–Fuel consumption
–Mental health and well-being
–Tobacco, alcohol, drug use
–Physical activity, fitness, nutrition
–Financial/material well-being
–Pensions and savings behaviour
–Commuting behaviour
–Work aspirations, preferences and expectations
–Domestic work
–Voluntary work
–Family networks outside household
–Travel behaviour
–ICT usage
–Leisure participation
–Attitudes and behaviour related to the environment
5-10 yearly
–Psychological attributes / stable values or preferences
–Cognitive ability
3-5 yearly
–Housing wealth
–Ethnicity and national identity
–Fertility intentions
–Chronic health conditions
–Sleep
–Obesity and body mass
–Wealth, credit and debt
–Employment conditions
–Within household organisation
–Social relationships within the family
–Religion
–Social and friendship networks
–Social support
–Political engagement
–Social engagement, social capital
–Local neighbourhood
–Quality of life measures
–Discrimination and racism
–Cultural consumption
Issues (continued) Some modules are complementary and are best asked in the same wave – co-ordination as well as frequency then becomes an issue. Some questions might be most salient for particular sub-populations – e.g. particular age groups; or may be suitable for higher frequency at particular ages Some questions can be related to events that may occur in people’s lives and can be asked just at those points. Some questions or topic areas still need development – timing of introduction is in part dependent on that. ‘Extra five minutes’ in the ethnic minority boost could allow additional topics or greater frequency for especially salient ones (or a bit of both)
What you can tell us… What is the right frequency for particular modules and measures? (in the earlier slide and Appendix B – have we got it about right?) Why? What topics need to be asked at the same time? Why? What questions could be targeted at specific age groups? Why? What event triggered questions should we be asking (other than birth weight following a birth and questions following a move)? Why? What are the best measures for some of these topics? At what wave should the measure be introduced? Why? Some areas clearly need question development – can you contribute to that?
Rationale for use of sub-samples
For many, but not all, purposes 40,000 households is larger than needed
Could exploit this by creating random sub-samples which contain both questions asked of everyone and subsets of questions asked only in the sub-sample
This increases the effective length of the questionnaire, and potentially allows inclusion of more modules or permits modules to be included with greater frequency
Contrast this with focussed studies on small non-random sub-groups with particularly relevant characteristics: potential for attrition bias?
Key issues for sub-samples
Ensuring that right combinations of questions are on same sub-sample – major design challenge
Combine ‘light’ measurement of topics for full sample, with greater depth or higher frequency for sub-sample?
Issues for ethnic minority boost sample, and separate analyses of e.g. Scotland, Wales and Northern Ireland
Context effects?
NB: we are not proposing to introduce sub-sampling at wave 1
Different approaches to sub-samples
1. Completely distinct question groups: requires groups of questions where we can assume that there will be little demand for analysis combining data from more than one group. Implies thematically coherent groups.
2. Overlapping question groups in different sub-samples. Ensures that every pair of questions is asked in combination for a random sub-set of respondents. The time gains are less, but it might cover a higher proportion of questions.
3. Randomise the allocation of questions at the individual respondent level. Statistical benefits but complex for analysis.
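Approach 3 can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a proposed implementation: the four group labels, the choice of two groups per respondent, the seed, and the function names `allocate` and `group_shares` are all hypothetical.

```python
import random

GROUPS = ("A", "B", "C", "D")  # hypothetical question groups


def allocate(respondent_ids, groups=GROUPS, per_person=2, seed=2008):
    """Give each respondent an independent random draw of question groups."""
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    return {
        rid: tuple(sorted(rng.sample(groups, per_person)))
        for rid in respondent_ids
    }


def group_shares(allocation, groups=GROUPS):
    """Fraction of respondents who are asked each question group."""
    n = len(allocation)
    return {g: sum(g in asked for asked in allocation.values()) / n for g in groups}


if __name__ == "__main__":
    allocation = allocate(range(10_000))
    for group, share in group_shares(allocation).items():
        print(group, f"{share:.3f}")
```

Under these assumptions each group is asked of roughly half the respondents, but the set of respondents answering any given combination is only known from the allocation record, which is one reason analysis becomes more complex.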
Potential design for random sub-samples in UKHLS (approach 2)
[Figure: columns for Sample 1 to Sample 6. Top band: questions for respondents in all samples, including annual measures and rotating and intermittent measures. Lower bands: questions for specific sub-samples, Group A, Group B, Group C and Group D.]
Example question groups:
Group A: Environmental attitudes and related behaviour
Group B: Employment conditions, time-use, work-life balance
Group C: Health related, risky and illicit behaviours
Group D: Social and political engagement
Implications of using this approach 50% of whole sample eligible for every question Sub-sample eligible for each two-way combination larger than BHPS original sample (> 5,000 households) Sub-samples defined longitudinally: questions from the same group would be asked at each wave, mainly intermittent, but might be possible to have more annual questions in sub-sample Current view is that if we used this approach ethnic boost samples would also be split Potential for other overlapping patterns, e.g. 5 question groups with each 3-way combination per sub-sample
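The arithmetic behind these implications can be sketched by enumerating the combinations. The code below is illustrative and rests on assumptions that are not in the slides: that each sub-sample receives a distinct combination of question groups, that sub-samples are of equal size, and that the 40,000 target households are split evenly. The function names are hypothetical.

```python
from itertools import combinations


def overlapping_design(groups, per_subsample):
    """One sub-sample per distinct combination of question groups."""
    return list(combinations(groups, per_subsample))


def coverage(design, groups, k):
    """Share of sub-samples in which each k-combination of groups is asked together."""
    shares = {}
    for combo in combinations(groups, k):
        hits = sum(1 for sub in design if set(combo) <= set(sub))
        shares[combo] = hits / len(design)
    return shares


if __name__ == "__main__":
    # Four groups, two per sub-sample: six sub-samples, as in the figure.
    design = overlapping_design("ABCD", 2)
    print(len(design))                    # 6
    print(coverage(design, "ABCD", 1))    # every group: 0.5 of the sample
    pair_share = coverage(design, "ABCD", 2)[("A", "B")]
    print(round(40_000 * pair_share))     # about 6,667 households per pair

    # Alternative mentioned on the slide: five groups, three per sub-sample.
    design5 = overlapping_design("ABCDE", 3)
    print(len(design5))                   # 10
    print(coverage(design5, "ABCDE", 1))  # every group: 0.6
    print(coverage(design5, "ABCDE", 2))  # every pair: 0.3
```

With four groups taken two at a time, each group reaches half the sample and each pair about one sixth, which on a 40,000 household target is consistent with the "> 5,000 households" figure. The five-group variant raises single-group coverage to 60% and pair coverage to 30%, at the cost of less time saved per respondent.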
Questions for discussion
What views do you have of the advantages and disadvantages of use of sub-samples as suggested here?
Are there other approaches we should consider?
Which questions could be restricted to sub-samples?
Which questions must be asked of the whole sample?
Where do we go from here?
Group 1: LTB4 (downstairs) with Alita
Group 2: LTB4 (downstairs) with Noah
Group 3: ISER large seminar room (in ISER building) with Mark
Group 4: ISER large seminar room (in ISER building) with Gundi
Group 5: ISER seminar room foyer (in ISER building) with Birgitta
Group 6: ISER seminar room 4.08 (in ISER building) with Jon
Birgitta Rabe and Nick Buck Other forms of data collection
(1) Data Linkage Purposes of data linkage –Supplement data collected in survey –Substitute data collected in survey –Validate data collected in survey –Survey administration
Types of data linkage Individual level Organisation level Area level
Individual level Benefits, earnings, taxes, government schemes (DWP/HMRC records) Education and educational attainment (DCSF and others) Health: hospital episodes, births, deaths, cancer (HES, NHSCR)
Area level Linkage based on geo-codes, e.g. LAD, PCT, LEA, SOA, grid references Wide range of geo-coded data available, e.g. social and economic characteristics, environmental quality
Issues in data linkage
Best timing of obtaining consents for individual-level linkage given burden and sensitivity issues
User access plan
(2) Biomarkers and health indicators
The UKHLS is intended to be at the frontiers of research on social, demographic, behavioural and health sciences.
The study needs to engage with the rapid advances in the biological and life sciences
Aim is to seek funding for collecting an integrated package of biomarkers and key related indicators
Intention is collection of biomarker information by minimally-invasive methods that can be carried out by survey interviewers with appropriate training
Development of biomedical strand ESRC commissioned John Hobcraft to identify key potential issues and priorities Detailed consultation is needed on which measures have the highest salience. This will involve UKHLS Biomedical and Health Indicator Advisory Committee (chair Michael Rutter), and Warwick colleagues on the UKHLS team (Dieter Wolke and Scott Weich), along with ESRC Consultation will focus on ethical issues and acceptability of this data collection for respondents Data collection requires additional funding and will start at wave 3 at the earliest.
(3) Qualitative and other data from respondents
UKHLS has established links with qualitative longitudinal research and needs to facilitate mixed methods research
Opportunities in the following areas should be pursued:
–A range of qualitative sources, including unstructured interviews, biographies, audio and video diaries and other forms of visual evidence
–Free text from structured questions
–Platform for linked qualitative studies
–Structured diary data
–Experimental data
Plans in this area still being developed
Consultation will be continuing
We will make available drafts of wave 1 questionnaires as soon as possible
We will circulate more detailed plans for future wave content, including both more specific indication of measures to be included, and timing of inclusion
Also more detailed plans for sub-sampling to test potential problems
We will approach you for advice on the design of questions in specific areas.
Other issues for the future Data release: wave 1 completed in Autumn 2010, but potentially release dataset based on first year fieldwork, during 2010 We will consult on data structures, documentation requirements, other support users may need Some data may be more confidential than others (e.g. from data linkage). Likely to be a differentiated release policy e.g. –most data available from ESDS using normal license (like BHPS), –some using special license arrangement, –some requiring analysis in a secure setting (ESRC has plans for these)
Timetable
January 2008      Consultation on wave one content concluded; consultation on future waves continues.
January 2008      Wave one of Innovation Panel starts
Spring 2008       Consultation on Biomarker and Health indicator component starts
June 2008         Final survey pilot for wave one
June/July 2008    Design work on wave 2 Innovation Panel starts
January 2009      Start of wave one main fieldwork
Thank you for coming and keep in touch! The UKHLS Team http://www.iser.essex.ac.uk/ukhls/ Email us at: firstname.lastname@example.org