Centre for Market and Public Organisation Measuring socio-economic position in ALSPAC Liz Washbrook, CMPO Liz.Washbrook@bristol.ac.uk ESRC/ALSPAC Large Grant Meeting 5th November 2008
But first! US cohort studies
Early Childhood Longitudinal Study – Birth Cohort (ECLS-B)
- 10,000 children born 2001, nationally representative when weighted
- Over-samples of low birth weight babies, twins, some ethnic groups (e.g. Native Americans, Chinese)
- Sampled from birth certificates; follow-ups at 9 months, 2 years, Fall prior to kindergarten (~4y), Fall of kindergarten year (~5y). But no more!
- Data from parent CAPI, direct child assessments, child care providers and teachers. Some resident and non-resident father questionnaires.
Early Childhood Longitudinal Study – Kindergarten Cohort (ECLS-K)
- 20,000 children starting kindergarten in 1998 (b. 1992/3)
- Children sampled from 1277 schools in 100 counties. Target 24 children per school. Nationally representative when weighted.
- Follow-ups at Fall & Spring kindergarten year (~5-6y), Fall & Spring 1st grade (~6-7y), Spring 3rd grade (~9y), 5th grade (~11y), 8th grade (~14y)
- Data from direct child assessments, parental phone interviews, teacher and school administrator questionnaires.
Data is publicly available (on CD). See http://nces.ed.gov/ECLS/index.asp
US cohort studies
Fragile Families
- 5000 children born 1998-2000 in large US cities
- Designed to follow children born to unmarried parents but includes control sample of married parent families (~25%).
- Focus on deprived families – at baseline 44% of mothers black, 35% Hispanic, 27% teenagers, 79% high school or less
- Detailed information on fathers' roles and involvement
- Parent interviews in hospital at birth, follow-ups at ages 1, 3, 5 and 9. Includes direct in-home child assessments.
- Data publicly available: www.fragilefamilies.princeton.edu/index.asp
Aims
- Aim to stimulate discussion about the construction of an index of parental socio-economic position (SEP) from the ALSPAC data
- Talk will cover
  - The range of indicators available and their features
  - Sample selection/missingness issues (multiple imputation)
  - Combining the indicators into a single index (principal components analysis)
- Illustrated using a case study: Measures of social inequality in Key Stage 2 exam results (age 11)
- Would a standard SEP variable available to all ALSPAC researchers be useful? If so, how should it be constructed?
- Input, feedback, discussion would be appreciated!
What is SEP?
- Extensive literature on theories of social stratification (Galobardes, Lynch and Davey Smith, 2007; Bradley and Corwyn, 2002).
- Socially derived economic factors that influence what positions individuals or groups hold within the multiple-stratified structure of society (Galobardes et al.)
- In practice researchers have used a multitude of individual indicators to measure SEP, each of which captures a different aspect of stratification
- Composite SEP is a relative measure, whereas some indicators (income, education) measure absolute levels of resources. This may have implications when thinking about policy.
Why measure parental SEP?
- SEP as a summary measure of family background that defines sub-groups of the population.
  - Social mobility/life chances
  - Nature vs. nurture
  - Example: Joint CMPO project on the role of attitudes and aspirations in explaining the educational deficits of children in poverty
- SEP as a way of capturing long-term access to resources over the life course, e.g. permanent income in economics
- To classify deprived or vulnerable groups in a way that captures the idea of multiple risks
- As a control for confounding influences (e.g. studying the effects of smoking)?
  - Disaggregated sets of control variables may be more appropriate
SEP indicators in ALSPAC
Included in the index:
- Income
- Education (mother and father)
- Social class (mother and father)
- Housing tenure
- Local deprivation/affluence
- Subjective financial hardship
Excluded:
- Wealth
- Employment status
- Race/ethnicity
- Family structure
Questions for each indicator:
- How is the indicator constructed from multiple pieces of information? (High frequency of measurement in ALSPAC)
- How is the indicator distributed? (E.g. discrete/continuous)
- For whom is it available? (Differential missingness)
- How well does it distinguish between high- and low-performing children? (KS2 is an example – relationships will differ with different outcomes)
The sample
- 11,071 children with:
  - A valid Key Stage 2 score
  - Minimum of 2 (out of 10) non-missing SEP indicators (30% complete cases)
- Sample is 69% of the eligible birth cohort (15,994 in NPD)
- Key Stage 2 score derived from exam marks in English, maths and science in Year 6 (age 11). National tests compulsory in all state schools.
- Test scores are averaged and normalised to mean zero, standard deviation 1 on the full eligible population of 15,994
- The working sample is not randomly selected:
                                 Mean KS2 (SD)
  Working sample (N=11,071)       0.11 (0.95)
  <2 SEP indicators (N=4,923)    -0.26 (1.05)
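The normalisation step can be illustrated with a small sketch. The raw marks below are made up for illustration and `standardise` is a hypothetical helper, not a function from the ALSPAC files. The sketch shows why a non-random working sample has a non-zero mean when scores are standardised on the full eligible population.

```python
from statistics import mean, pstdev

def standardise(scores, reference):
    """Return z-scores of `scores` using the mean and SD of `reference`."""
    mu = mean(reference)
    sigma = pstdev(reference)  # population SD of the reference group
    return [(s - mu) / sigma for s in scores]

# Hypothetical averaged raw marks for a tiny "eligible population".
eligible = [40.0, 50.0, 60.0, 70.0, 80.0]
# A non-random working sample drawn from it (here, the three highest scorers).
working = eligible[2:]

z_all = standardise(eligible, eligible)      # mean 0, SD 1 by construction
z_working = standardise(working, eligible)   # mean above 0: selected sample
```

Standardising on the eligible population rather than on the working sample keeps the scale comparable between those included and those excluded.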
Household income
Measures: Take home weekly family income at 33, 47, 85, 97 months; 11 years
Proportion of valid responses in bands (%):
£ per week   33 mths   47 mths   85 mths   97 mths
<100             8.7       7.8       4.0       2.1
100-199         17.7      15.8      11.3       9.2
200-299         28.4      26.2      18.4      16.6
300-399      184.108.40.2061.1 (values garbled in source)
>400            24.0      28.2      44.0      50.9
N               8832      8655      7525      7037
Failure to update the bands means that the usefulness of the 85 and 97 month income measures is limited.
Household income
The age 11 income measure is better:
£ per week    Valid %
<£120             2.3
£120-189          5.2
£190-239          5.5
£240-289          7.0
£290-359         11.7
£360-429         11.0
£430-479          7.1
£480-559         15.3
£560-799         20.6
>£800            14.2
N                6552
Household income
The SEP index uses:
- Log average real equivalised weekly take home income at 33 & 47 mths
  - Median income for band imputed using FES data for households containing a child of the cohort member's age, in the relevant year and income interval
  - Adjustment made for housing benefit income if respondent reports zero housing expenditures and lives in rented accommodation (predicted value from FES for HB recipients in the Southwest, varying with year, lone parent status and number of under-16s in household)
  - Expressed in 1995 prices using All Items RPI
  - Equivalised using modified OECD scale
  - Averaged and logged
- Nominal banded income at 85 months
- Nominal continuous income at 11 years, using band midpoints
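A rough sketch of the first income measure follows. It is an illustration under stated assumptions, not the code used for the index. The band medians and price index values are invented placeholders (the talk takes medians from FES and prices from the All Items RPI), the function names are hypothetical, and the housing benefit adjustment is left out. The equivalence weights are the standard modified OECD ones: 1.0 for the first adult, 0.5 for each further person aged 14 or over, 0.3 for each child under 14.

```python
import math

# Hypothetical band medians (GBP per week); placeholders for FES-based values.
BAND_MEDIAN = {"<100": 80.0, "100-199": 150.0, "200-299": 250.0,
               "300-399": 345.0, ">400": 520.0}

# Hypothetical price index values, scaled so that 1995 = 100.
PRICE_INDEX = {1994: 97.0, 1995: 100.0, 1996: 102.0}

def oecd_modified_scale(adults, children_under_14):
    """Modified OECD scale: 1.0 first adult, 0.5 per further adult, 0.3 per child."""
    return 1.0 + 0.5 * (adults - 1) + 0.3 * children_under_14

def real_equivalised(band, year, adults, children_under_14):
    """Band median, deflated to 1995 prices, divided by the equivalence scale."""
    nominal = BAND_MEDIAN[band]
    real = nominal * PRICE_INDEX[1995] / PRICE_INDEX[year]
    return real / oecd_modified_scale(adults, children_under_14)

def log_average_income(waves):
    """Average the real equivalised incomes across waves, then take logs."""
    values = [real_equivalised(**w) for w in waves]
    return math.log(sum(values) / len(values))

# Example: one household observed at two waves (all inputs illustrative).
example = log_average_income([
    {"band": "200-299", "year": 1994, "adults": 2, "children_under_14": 1},
    {"band": "300-399", "year": 1996, "adults": 2, "children_under_14": 2},
])
```

Averaging before logging follows the order given on the slide ("Averaged and logged").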
Average KS2, by nominal income quintiles at age 11
Parental education
Measures: Mother and partner reports of both spouses' qualifications: antenatal, 61 and 97 months.
The SEP index uses maternal reports of own and partner's highest qualification at 32 weeks gestation.
Issues
- Non-response to the question is coded as no qualifications ("don't know", "no quals" and "no partner" were all possible responses)
- Possible discrepancies between own and partner report
- Possible changes in the identity of the partner over time
- Possible changes in qualifications over time
Parental social class
Measures: Mother reports of own and partner's occupation: antenatal, 8 and 97 months. Partner reports more frequent but not coded.
The SEP index uses maternal reports of own and partner's social class at 32 weeks gestation.
- Question related to occupation in current or last job
- Occupations coded according to 1991 SOC classification
- Used to derive Registrar General's Social Class – this is what is available in the datafiles. Hierarchical measure.
- No other data on occupation is currently coded
Housing tenure
Measures: Mother reports of tenure: 8, 21, 33 and 61 months.
The SEP index uses a derived variable
- Always owner-occupier – mortgaged/owned outright/buying from council at all 4 dates
- Ever in social housing – council rented/Housing Association rented at any of 4 dates
- Other – not otherwise classified and at least one valid response (other responses: private rented furnished/unfurnished, other). Includes all people with a missing value who were never observed in social housing, as well as renters.
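The derived tenure variable can be written as a small classification rule. This is a sketch of the three categories as described on the slide; the response labels and the function name `tenure_category` are illustrative and do not correspond to ALSPAC variable codes.

```python
# Illustrative response labels (not ALSPAC codes).
OWNER = {"mortgaged", "owned outright", "buying from council"}
SOCIAL = {"council rented", "housing association rented"}

def tenure_category(responses):
    """Classify four tenure reports (8, 21, 33, 61 months); None = missing."""
    valid = [r for r in responses if r is not None]
    if not valid:
        return None  # no valid response at any date: cannot be classified
    if any(r in SOCIAL for r in valid):
        return "ever in social housing"
    # Owner-occupier status must be observed at every date, so any
    # missing wave pushes the case into "other".
    if len(valid) == len(responses) and all(r in OWNER for r in valid):
        return "always owner-occupier"
    return "other"
```

For example, `tenure_category(["mortgaged", None, "mortgaged", "owned outright"])` falls into "other" because one wave is missing, which matches the slide's note that this group includes people with a missing value who were never observed in social housing.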
Local deprivation/affluence
Measures: Ward-level Index of Multiple Deprivation (IMD) currently matched at birth, age 5 and age 8, but postcodes available on an annual basis
The SEP index uses the (continuous) rank of the IMD for ward at birth
- IMD provided by government statistics. Derived from data in 6 domains: income, education, employment, housing, health, access to services
- Wards in England (approx. 5500 individuals) ranked on basis of deprivation from 1 to 8414. This allows definition of national quantiles.
- Can be matched to ALSPAC via postcode data
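As a sketch of how a ward rank translates into national quantiles: the function below maps a rank between 1 and 8414 to one of `n_groups` equal-width groups. The function name is hypothetical, and the slide does not say which end of the ranking is the most deprived, so group 1 here simply means "lowest ranks".

```python
N_WARDS = 8414  # number of ranked wards quoted in the talk

def national_quantile(rank, n_groups=5, n_wards=N_WARDS):
    """Map a ward rank (1..n_wards) to a quantile group numbered 1..n_groups."""
    if not 1 <= rank <= n_wards:
        raise ValueError("rank out of range")
    # Integer arithmetic avoids floating-point edge cases at group boundaries.
    return (rank - 1) * n_groups // n_wards + 1
```

With the default of five groups, ranks 1 to 1683 fall in group 1 and rank 8414 falls in group 5. Because the groups are defined on all wards in England, ALSPAC families need not be spread evenly across them.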
Subjective financial hardship
Measures: Mother-completed financial difficulties questionnaires at 8, 21, 33, 61 and 85 months
- Format: How difficult at the moment do you find it to afford these items: food; clothing; heating; rent/mortgage; things for child? Very (3); Fairly (2); Slightly (1); Not difficult (0)
- Responses to the 5 items at each date summed to give a score between 0 and 15
- The SEP index uses the mean score across the 5 dates
- The 61 and 85 month measures include questions on educational courses, medical care, child care and other things
- "Do not pay for this"/"DSS pays" options for rent and heating coded as 0
- The distribution of the resulting variable is highly skewed
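The scoring rule can be sketched as follows. The item names and functions are illustrative. The slide does not say how waves with no completed questionnaire are handled, so averaging over the completed waves only is an assumption made here.

```python
ITEMS = ("food", "clothing", "heating", "rent/mortgage", "things for child")

def wave_score(answers):
    """Sum the five item codes for one wave (0 = not difficult ... 3 = very)."""
    return sum(answers[item] for item in ITEMS)

def hardship_index(waves):
    """Mean wave score over completed waves; None marks a missing wave.

    Averaging over completed waves only is an assumption, not the
    documented treatment of missing questionnaires.
    """
    scores = [wave_score(w) for w in waves if w is not None]
    return sum(scores) / len(scores) if scores else None

# Illustrative respondent: maximum difficulty at one wave, none at another.
worst = {item: 3 for item in ITEMS}
none_at_all = {item: 0 for item in ITEMS}
example_index = hardship_index([worst, none_at_all, None, None, None])
```

A response of "Do not pay for this" or "DSS pays" would be passed in as 0 for the relevant item, following the coding on the slide.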
Average KS2, by quintiles of financial difficulties score
[Figure: # SEP indicators missing (out of 10)]
Multiple Imputation by Chained Regression
- Iterative multivariable regression technique – switching regression
- Stata's ice command
  1. Specify a prediction equation for each variable
  2. Randomly allocate values to missing cases
  3. Predict values for missing cases
  4. Update RHS variables and repeat cycle (10 times)
- Options allow choice of estimation method, passive imputation and substitution of RHS dummies, constrained intervals for predicted values
Current method:
- Imputation carried out using 10 SEP variables only – does not use other information
- Only a single imputed dataset created
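The four-step cycle can be sketched in plain Python. This is a simplified illustration of the idea, not a reimplementation of Stata's ice: every variable is treated as continuous and predicted by ordinary least squares from all the others, and missing values are replaced by the fitted value without adding a random draw, so it produces a single deterministic completed dataset. The names `ols_fit` and `chained_impute` are hypothetical.

```python
import random

def ols_fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def others(row, j):
    """All values in a row except variable j (the right-hand-side variables)."""
    return [v for m, v in enumerate(row) if m != j]

def chained_impute(data, cycles=10, seed=0):
    """Fill None entries by cycling through one regression per variable."""
    rng = random.Random(seed)
    n_vars = len(data[0])
    missing = [[v is None for v in row] for row in data]
    filled = [list(row) for row in data]
    # Step 2: start from random draws of each variable's observed values.
    for j in range(n_vars):
        observed = [row[j] for row in data if row[j] is not None]
        for i, row in enumerate(filled):
            if missing[i][j]:
                row[j] = rng.choice(observed)
    # Steps 3-4: predict each variable from the others, update, repeat.
    for _ in range(cycles):
        for j in range(n_vars):
            train = [r for i, r in enumerate(filled) if not missing[i][j]]
            beta = ols_fit([others(r, j) for r in train], [r[j] for r in train])
            for i, row in enumerate(filled):
                if missing[i][j]:
                    x = others(row, j)
                    row[j] = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return filled

# Toy data in which the second variable equals 2 * first + 1.
toy = [[1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, None]]
completed = chained_impute(toy)
```

Creating several completed datasets would require adding a random component to each prediction and rerunning with different seeds, which this sketch does not attempt.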
Principal components analysis
- PCA provides a way of combining (weighting) the individual components into a single index
- PCA conducted on the 10x10 polychoric correlation matrix
  - Standard PCA techniques assume continuous, normally distributed variables.
  - Polychoric correlation can be used when there are binary and categorical components (e.g. education). It assumes that ordinal variables are obtained by categorising a normally distributed underlying variable.
- PCA extracts a single component that maximises the explained proportion of the variation in the (standardised) components
- Each component is assigned a scoring coefficient that is used as a weight in the construction of the SEP index
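A minimal sketch of the weighting step, under two loudly stated simplifications: it uses ordinary Pearson correlations rather than the polychoric matrix described above, and it finds the first component by power iteration rather than a full eigendecomposition. The three indicator columns are invented, and the function names are hypothetical. Weights are returned as a unit-length eigenvector; statistical packages may scale scoring coefficients differently.

```python
from statistics import mean, pstdev

def standardise(column):
    """Rescale a column to mean 0 and (population) SD 1."""
    mu, sd = mean(column), pstdev(column)
    return [(v - mu) / sd for v in column]

def correlation_matrix(z_columns):
    """Pearson correlations of already-standardised columns."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z_columns]
            for zi in z_columns]

def first_component(R, iterations=200):
    """Leading eigenvector and eigenvalue of R by power iteration."""
    k = len(R)
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    Rv = [sum(R[i][j] * v[j] for j in range(k)) for i in range(k)]
    return v, sum(a * b for a, b in zip(v, Rv))

# Three invented indicators measured on five families.
indicators = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 5], [1, 3, 2, 5, 4]]
z = [standardise(c) for c in indicators]
weights, eigenvalue = first_component(correlation_matrix(z))

# Share of total variation explained by the first component.
share_explained = eigenvalue / len(indicators)
# Index value per family: weighted sum of its standardised indicators.
sep_index = [sum(w * col[i] for w, col in zip(weights, z))
             for i in range(len(z[0]))]
```

The share explained is the leading eigenvalue divided by the number of components, which is the quantity that a statement such as "the index explains 46% of total variation" refers to.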
Principal components analysis Scoring coefficients: SEP index explains 46% of total variation in components
Summary
- ALSPAC contains numerous indicators that can be used to construct an SEP index
- Indicators vary in
  - The type of resources they measure
  - The sections of the population they distinguish (e.g. tenure appears good at picking out the very disadvantaged, but does not discriminate at the top of the distribution)
  - The likelihood of non-response by different groups
- Issues that need to be considered when constructing an index:
  - Which components should be included? (Should education be separate?)
  - How should observations at multiple dates/by multiple respondents be treated?
  - How should missing values be dealt with?
  - How should the components be combined?