Multilevel models for family data
A presentation to the Research Methods Festival, Oxford, July 2004.
Tom O'Connor and Jon Rasbash
Work conducted for the ESRC Research Methods Programme project: Methodologies for Studying Families and Family Effects: the systematic assessment of research designs and data analytic strategies.
The presentation looks at three applications of multilevel modelling to family data:
1. Using multilevel models to explore the determinants of differential parental treatment of children.
2. Extending multilevel models to include genetic effects.
3. Applying multilevel models developed to handle social network data to family relationship data.
Application 1: Understanding the sources of differential parenting: the role of child and family level effects. Jenny Jenkins, Jon Rasbash and Tom O'Connor, Developmental Psychology 2003(1) 99-113
Background
Recent studies in developmental psychology and behavioural genetics emphasise that non-shared environment is much more important than shared environment in explaining children's adjustment. This has led to a focus on non-shared environment (Plomin et al., 1994; Turkheimer & Waldron, 2000). Has this meant that we have ignored the role of the shared family context, both empirically and conceptually?
Background
One key aspect of the non-shared environment that has been investigated is differential parental treatment of siblings. Differential treatment predicts differences in sibling adjustment. What are the sources of differential treatment?
- Child specific (non-shared): age, temperament, biological relatedness.
- Can family level (shared environment) factors also influence differential treatment?
The Stress/Resources Hypothesis
Do family contexts (shared environment) increase or decrease the extent to which children within the same family are treated differently?
Parents have a finite amount of resources in terms of time, attention, patience and support to give their children. In families in which most of these resources are devoted to coping with economic stress, depression and/or marital conflict, parents may become less consciously or intentionally equitable and more driven by preferences or child characteristics in their childrearing efforts (Henderson et al., 1996).
This is the hypothesis we wish to test. We operationalised the stress/resources hypothesis using four contextual variables: socioeconomic status, single parenthood, large family size and marital conflict.
How differential parental treatment has been analysed
Previous analyses in the literature exploring the sources of differential parental treatment ask the mother to rate the treatment (positive or negative) she gives to each of two siblings. The difference between these two treatment scores is then analysed. This approach has several major limitations.
The sibling pair difference model for exploring determinants of differential parenting
In this model the response is the difference between $y_{1i}$ and $y_{2i}$, the parental ratings for siblings 1 and 2 in family i, and $x_{1i}$ is a family level variable, for example family SES.
Problems:
- One measurement per family makes it impossible to separate shared and non-shared random effects.
- All information about the magnitude of the response is lost: the ratings (2, 4) are treated the same as (22, 24).
- It is not possible to introduce level 1 (non-shared) variables, since the data have been aggregated to level 2.
- Family sizes larger than two cannot be handled.
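The loss of magnitude information can be seen with the two rating pairs quoted on the slide. This is only a small illustration; the function name is hypothetical and not part of the original analysis.

```python
def difference_score(y1, y2):
    """Sibling-pair difference used as the response in a difference model."""
    return y1 - y2

low_pair = (2, 4)      # both siblings receive low ratings
high_pair = (22, 24)   # both siblings receive high ratings

d_low = difference_score(*low_pair)
d_high = difference_score(*high_pair)

# The family mean (overall level of parenting) differs a lot between the pairs,
# but it is discarded once only the difference is kept.
mean_low = sum(low_pair) / 2
mean_high = sum(high_pair) / 2

print(d_low, d_high)        # identical differences
print(mean_low, mean_high)  # very different levels
```

A multilevel model keeps both ratings as separate responses, so the level and the within-family spread remain available.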
With a multilevel model
Here $y_{ij}$ is the jth mother's rating of her treatment of her ith child; $x_{1ij}$ are child level (non-shared) variables; $x_{2j}$ are family level (shared) variables; $u_j$ and $e_{ij}$ are family and child (shared and non-shared environment) random effects. Note that the level 1 variance is now a measure of differential parenting.
Advantages of the multilevel approach
- Can handle more than two kids per family.
- Unconfounds family and child, allowing estimation of family and child level fixed and random effects.
- Can model parenting level and differential parenting in the same model.
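The model equation itself did not survive in the transcript. A plausible reconstruction from the symbols defined on this slide is the standard two-level random-intercept model; the coefficient names $\beta_0, \beta_1, \beta_2$ and the normality assumptions are assumptions made here, not taken from the slide.

```latex
y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2j} + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)
```

Under this form, $\sigma_u^2$ captures between-family differences in parenting level and $\sigma_e^2$, the level 1 variance, captures how differently children in the same family are treated.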
Overall survey design
- National Longitudinal Survey of Children and Youth (NLSCY), a Statistics Canada survey with a representative sample of children across the provinces.
- Nested design includes up to 4 children per family.
- PMK respondent.
- Children aged 4-11 years.
- Criteria: another sibling in the age range; living with at least one biological parent; 4 years of age or older.
- 8,474 children in 3,860 families: 4 child = 60, 3 child = 630, 2 child = 3157.
Measures of parental treatment of child
Derived from factor analyses.
- PMK report of positive parenting: frequency of praise of child, talk or play focusing on child, activities enjoyed together (reliability = .81).
- PMK report of negative parenting: frequency of disapproval, annoyance, anger, mood related punishment (reliability = .71).
We will talk today about positive parenting. PMK is the person most knowledgeable about the child.
Child specific factors
- Age
- Gender
- Child position in family
- Negative emotionality
- Biological relatedness to father and mother
Family context factors
- Socioeconomic status
- Family size
- Single parent status
- Marital dissatisfaction
Model 1: Null model
The baseline estimate of differential parenting (the level 1 variance) is 3.8. We can now add further shared and non-shared explanatory variables and judge their effect on differential parenting by the reduction in the level 1 variance.
Model 2 : expanded model
Positive parenting
Child level predictors:
- The strongest predictor of positive parenting is age: younger siblings get more attention. This relationship is moderated by family membership.
- Non-biological mother and non-biological father reduce positive parenting.
- Oldest sibling > youngest sibling > middle siblings.
Family level predictors:
- Household SES increases positive parenting.
- Marital dissatisfaction, increasing family size, and mixed or all-girl sibships all decrease positive parenting.
- Lone parenthood has no effect.
Differential parenting
Modelling age reduced the level 1 variance (our measure of differential parenting) from 3.8 to 2.3, a reduction of 40%. Other explanatory variables, both child specific and family (shared environment), provide no significant reduction in the level 1 variation. Does this mean that there is no evidence to support the stress/resources hypothesis?
Testing the stress/resources hypothesis
The mean and the variance are modelled simultaneously. So far we have modelled the mean in terms of shared environment but not the variance. We can elaborate model 2 by allowing the level 1 variance to be a function of the family level variables: household socioeconomic status, large family size and marital conflict. The reduction in the deviance, with 7 df, is 78.
Conclusion
We have found strong support for the stress/resources hypothesis. That is, although differential parenting is a child specific factor that drives differential adjustment, differential parenting itself is influenced by family as well as child specific factors. This challenges the current tendency in developmental psychology and behavioural genetics to focus on child specific factors. Multilevel models fitting complex level 1 variation need to be employed to uncover these relationships.
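The quoted reduction can be checked directly from the two level 1 variances on the slide. This is a minimal sketch; the variable names are illustrative only.

```python
# Level 1 variance (the measure of differential parenting) before and after
# adding age, as reported on the slide.
null_level1_var = 3.8
age_level1_var = 2.3

# Proportional reduction in the level 1 variance.
reduction = (null_level1_var - age_level1_var) / null_level1_var

print(f"{reduction:.1%}")  # just under 40%, which the slide rounds to 40%
```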
Application 2 Including Genetic Effects in Multilevel models
Background
We have recently been applying multilevel models to family data, in collaboration with developmental psychologists. They asked: can we include genetic effects in these models? There is a long tradition of quantitative genetics, arguably begun with Fisher's 1918 paper "The correlation between relatives on the supposition of Mendelian inheritance". This work has been developed by others and applied in animal and plant genetics, evolutionary biology, human genetics and behavioural genetics.
The basic multilevel model for kids within families
Given the standard independence assumptions of multilevel models, the covariance of two children ($i_1$ and $i_2$) in the same family is the family level variance, $\sigma_u^2$.
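The equations for this slide are missing from the transcript. Assuming the usual random-intercept form with family effect $u_j$ and child effect $e_{ij}$ (the intercept symbol $\beta_0$ is an assumption), the within-family covariance follows in one step:

```latex
\begin{align}
y_{ij} &= \beta_0 + u_j + e_{ij}, \qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2) \\
\operatorname{cov}(y_{i_1 j}, y_{i_2 j})
  &= \operatorname{cov}(u_j + e_{i_1 j},\; u_j + e_{i_2 j})
   = \operatorname{var}(u_j) = \sigma_u^2
\end{align}
```

The cross terms vanish because the child level residuals are assumed independent of each other and of $u_j$.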
Extending the basic model to include genetic effects
Here $g_{ij}$ is a genetic effect for the ith child in the jth family. For two individuals ($i_1$, $i_2$), the standard independence assumptions would set the covariance of their genetic effects to zero. BUT the genetic covariance of two individuals in the same family is clearly not zero, since there is a non-zero probability that they share the same genes. What is this covariance, F? This is where Fisher's 1918 paper comes in.
A very little genetics background
First remember, humans have 23 pairs of chromosomes. A gene is a sequence of DNA at a given location (locus) on a chromosome. In a population there might be multiple different versions of a gene. For example, with two versions of a gene, denoted by A and little a, there are 3 possible genotypes: AA, Aa and aa (note that Aa is functionally equivalent to aA). We can think of the genes as conferring values on an individual for a trait.
Given a number of (strong) assumptions:
1. A metric trait is influenced by a large number of genes at a large number of loci (effectively infinite).
2. The effects of the genes add up within and across loci.
3. The genes are transmitted independently from parents to progeny.
4. The population being studied is mating at random.
5. The population being studied is in evolutionary equilibrium, that is, gene frequencies are not changing across generations.
6. There is no correlation between genetic and environmental effects.
Corrections to the theory exist for all these assumptions, but I fear they are seldom used (in BG), are often difficult to implement and have not been thoroughly evaluated.
Then...
With a lot of complicated argument and algebra, Fisher shows that $\operatorname{cov}(g_{i_1}, g_{i_2}) = r_{(i_1,i_2)}\,\sigma_g^2$, where $r_{(i_1,i_2)}$ is the relationship coefficient between two individuals and equals 0, 0.125, 0.25, 0.5 and 1 for unrelated individuals, cousins, half-sibs, full sibs and MZ twins respectively. Thus the greater the relationship between individuals, the greater their genetic covariance and therefore their phenotypic covariance. An individual's $g_{ij}$ is the sum of the effects of all their genes. The variance of these $g_{ij}$ is the additive genetic variance ($\sigma_g^2$). The size of the additive genetic variance compared to other environmental variances is often of interest.
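As a rough sketch of how the relationship coefficient scales the genetic covariance, the snippet below uses the coefficients listed on the slide. The variance values and function names are hypothetical, chosen only for illustration, and the phenotypic covariance assumes two children who share a family effect.

```python
# Relationship coefficients as listed on the slide.
RELATIONSHIP_COEFFICIENT = {
    "unrelated": 0.0,
    "cousins": 0.125,
    "half sibs": 0.25,
    "full sibs": 0.5,
    "MZ twins": 1.0,
}

def genetic_covariance(pair_type, sigma_g2):
    """Genetic covariance of a pair: relationship coefficient times sigma_g^2."""
    return RELATIONSHIP_COEFFICIENT[pair_type] * sigma_g2

def phenotypic_covariance(pair_type, sigma_u2, sigma_g2):
    """Covariance of two children in the same family: shared family variance
    plus their genetic covariance."""
    return sigma_u2 + genetic_covariance(pair_type, sigma_g2)

sigma_u2 = 0.02  # hypothetical shared environment variance
sigma_g2 = 0.20  # hypothetical additive genetic variance

covs = {
    pair: phenotypic_covariance(pair, sigma_u2, sigma_g2)
    for pair in ("unrelated", "half sibs", "full sibs", "MZ twins")
}
for pair, cov in covs.items():
    print(f"{pair:10s} {cov:.3f}")
```

Because the covariance differs by pair type, a sample that mixes unrelated, half-sib, full-sib and twin pairs is what allows the genetic and shared environment variances to be separated.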
Data example
277 full sib pairs, 109 half sib pairs, 130 unrelated pairs, 93 DZ twins and 99 MZ twins, aged between 9 and 18 years. Analysis of depression scores:

Parameter          Model 1 Est(se)    Model 2 Est(se)
Fixed
  Intercept        0.008 (0.017)      0.02 (0.017)
Random
  Shared env       0.086 (0.011)      0.018 (0.017)
  Non-shared env   0.198 (0.011)      0.069 (0.010)
  Genetic          -                  0.209 (0.022)
Deviance           2165.88            2129.2

The total variance in the two models is effectively the same: 0.275 in model 1 and 0.285 in model 2. In model 2, which includes genetic effects, 70% of the family level variation and 60% of the child level variation are re-assigned to the genetic variance. The model is like autocorrelated time-series models, except that the covariance decays as a function of genetic distance rather than distance in time between measurements. We can use the same estimation machinery.
Adding covariates
From the fixed effects we see that depression scores increase with child age and with paternal and maternal negativity; girls and children in stepfamilies also have higher depression scores. The largest drop in the variance when these explanatory variables are introduced occurs in the genetic variance.

Parameter          Model 3 Est(se)
Fixed
  Intercept        -0.285 (0.087)
  Age              0.011 (0.006)
  Mat_neg          0.157 (0.024)
  Pat_neg          0.216 (0.026)
  Girl             0.158 (0.028)
  Stepfam          0.105 (0.029)
Random
  Shared env       0.0035 (0.014)
  Non-shared env   0.70 (0.096)
  Genetic          0.148 (0.020)
Deviance           1780.95
Why the drop in the genetic variance? The largest drop in the genetic variance occurs when paternal and maternal negativity are added to the model as covariates. Pike et al(1996) analyse the same data using a series of genetically calibrated bivariate structural equations models. Two of the models they consider are bivariate structural equations models for maternal negativity and depression and paternal negativity and depression. In each of these two models they find 15% of the genetic variance in depression is due to a shared genetic component with parental negativity. When we add paternal and maternal negativity to our model as fixed effects we are sweeping out any common genetic effects shared by parental negativity and adolescent depression. We are also taking account of any environmental correlations whereby sibling pairs of greater relatedness experience more similar parental treatment. Both these factors will reduce the remaining additive genetic variance in the model.
Complex variation and gene environment interactions
Currently our model for the variance partitions the variance into three sources: family, child and genetic. The model for the variance can be further elaborated to allow each of the three sources of variation to be modelled as functions of explanatory variables, where the variables may be measured at any level.
Gene environment interaction with paternal negativity
We now elaborate model 3 to allow all three variances to be a function of paternal negativity. This gives model 4.
Results from model 4
Including the three extra parameters reduces the deviance by 19.5. This reduction is almost entirely driven by the gene environment interaction term, that is, the coefficient of paternal negativity in the genetic variance. Removing the corresponding non-shared (e) and shared (u) terms from model 4 changes the deviance by only 1.5. The significant coefficient constitutes a gene-environment interaction because it implies that the genetic variance changes as a function of paternal negativity.

Parameter          Model 4 Est(se)
Fixed
  Intercept        -0.273 (0.080)
  Age              0.011 (0.005)
  Mat_neg          0.170 (0.024)
  Pat_neg          0.210 (0.028)
  Girl             0.159 (0.027)
  Stepfam          0.097 (0.028)
Random             Intercept term     Pat_neg term
  Shared env       0.0006 (0.014)     -0.017 (0.019)
  Non-shared env   0.073 (0.009)      0.0078 (0.010)
  Genetic          0.155 (0.021)      0.093 (0.023)
Deviance           1740.42
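To see what a variance that depends on paternal negativity looks like, the sketch below evaluates each variance component at a few values of the covariate. It rests on two assumptions that the transcript does not confirm: that each pair of random-part estimates in the model 4 results is an intercept followed by a paternal negativity coefficient, and that the variance function is linear in paternal negativity. The evaluation points are hypothetical, since the scaling of Pat_neg is not given.

```python
# (intercept term, Pat_neg term) for each variance component, read from the
# model 4 random-part estimates under the assumptions stated above.
VARIANCE_TERMS = {
    "shared env": (0.0006, -0.017),
    "non-shared env": (0.073, 0.0078),
    "genetic": (0.155, 0.093),
}

def variance_at(source, pat_neg):
    """Assumed linear variance function: intercept + slope * pat_neg."""
    intercept, slope = VARIANCE_TERMS[source]
    return intercept + slope * pat_neg

# Hypothetical evaluation points on the Pat_neg scale.
for pat_neg in (-1.0, 0.0, 1.0):
    values = {s: round(variance_at(s, pat_neg), 4) for s in VARIANCE_TERMS}
    print(pat_neg, values)
```

Under these assumptions the genetic variance rises steeply with paternal negativity while the other two components barely move. A linear variance function can go negative at extreme covariate values, so it should only be read within the observed range of the covariate.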
Graphing the gene environment interaction
One explanation of GxE interactions is in terms of conditional gene expression. Suppose we have a gene A which gets switched on when an individual is subject to persistent high levels of cortisol. If some of the population have the A gene and some don't, then this genetic variation only manifests in individuals under persistent high levels of stress.
Model extensions
The multilevel model with genetic effects is flexible and can be adapted to a variety of situations where population structures have further nested or crossed random classifications in addition to the standard behavioural genetics situation of children within families. For example:
- Time: repeated measures on kids within families
- Institutions: schools, hospitals
- Space: areas
- Multiple observers
A complex example is given in the next section.
Application 3: Applying social network models to family relationship data: some preliminary work.
Substantive focus: trait-like versus context specific behaviour
A question in personality theory is to what extent particular emotions and behaviours are trait-like, in that they are constant across different environments, and to what extent they are context sensitive, in that behaviours expressed by individuals are specific to particular environments (Magnusson, 1990; Magnusson & Endler, 1977). Clinically, trait-like behaviours are harder to change since, by definition, altering the environment will tend not to change the behaviour.
Stability of behaviours over time Studies in personality have shown that happiness and positivity are very stable over time (Costa, McCrae & Zonderman, 1987); findings for the stability of negativity vary as a function of which aspect of negativity is being considered with some aspects such as physical aggression showing high stability (Broidy et al, 2003) and other aspects such as whining showing low stability (Capaldi et al., 1994). Trait-like behaviours are often considered to be driven by genetic factors (Plomin, 1994; Tellegen et al., 1988)
Stability of behaviours across family members
In this presentation we develop multilevel models to explore the stability of an individual's behaviour across their family members. With 2 kids and 2 parents per family there are 12 relationship scores per family; a relationship is made up of an actor and a partner:
c1->c2, c1->m, c1->f, c2->c1, c2->m, c2->f, m->c1, m->c2, m->f, f->c1, f->c2, f->m
We look at the traits of positivity and negativity as responses.
- Positivity: engagement (eye contact, body language), self-disclosing, positive affect (warmth, praising), assertiveness (stating views clearly).
- Negativity: anger, coercion (dismissive or whining), negative affect (discontent).
Note that positivity and negativity are not mirror images of the same underlying construct. Clinically and statistically they show independent patterns, with evidence from neuropsychology that they are controlled by different brain systems (Cacioppo, Larsen, Smith & Berntson, 2004).
The data
- Non-Shared Environment Adolescent Development (NEAD) data set, Reiss et al. (1994).
- 2 wave longitudinal family study, designed for testing hypotheses about genetic and environmental effects.
- 277 full-sib pairs, 109 half-sib pairs, 130 unrelated pairs, 93 DZ twins and 99 MZ twins, aged between 9 and 18 years.
- Wave 2 followed 3 years after wave 1, and any families where the older sib was older than 18 were not followed up.
- A wide range of self-report, parental-report and observer variables were collected.
- All families had 2 parents and 2 kids of the same sex.
We focus here on data on relationship quality collected by observers.
Within family structure
We start with 12 relationship scores in each family. These can be classified by actor, partner and dyad.
[Figure: for Family 1, the 12 relationships (c1->c2, c1->m, c1->f, c2->c1, c2->m, c2->f, m->c1, m->c2, m->f, f->c1, f->c2, f->m) linked to the 4 actors (c1, c2, m, f), the 4 partners (c1, c2, m, f) and the 6 dyads (d1 to d6).]
Diagrams to represent the structure
The relationship scores are contained within a cross-classification of actor, dyad and partner, and all of this structure is nested within families. This structure can be shown diagrammatically with:
- a unit diagram, with one node per unit;
- a classification diagram, with one node per classification.
[Figures: unit diagram for Family 1 (actors, relationships, partners, dyads d1 to d6) and classification diagram (family; dyad, actor, partner; relationship score).]
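The counts on this slide (12 relationships, 4 actors, 4 partners, 6 dyads) follow from enumerating ordered and unordered pairs of the four family members. The sketch below does this; the assignment of labels d1 to d6 to particular pairs is an arbitrary choice made here, since the transcript does not say which dyad carries which label.

```python
from itertools import permutations

members = ["c1", "c2", "m", "f"]

# Each ordered (actor, partner) pair is one directed relationship.
relationships = list(permutations(members, 2))

# A dyad is the unordered pair, so actor->partner and partner->actor share it.
dyad_members = sorted({tuple(sorted(pair)) for pair in relationships})
dyad_label = {pair: f"d{n}" for n, pair in enumerate(dyad_members, start=1)}

# One row per relationship score, with its three cross-classified identifiers.
rows = [
    {"actor": a, "partner": p, "dyad": dyad_label[tuple(sorted((a, p)))]}
    for a, p in relationships
]

print(len(relationships), len(dyad_members))  # 12 6
for row in rows:
    print(f'{row["actor"]}->{row["partner"]}  {row["dyad"]}')
```

Each dyad label appears on exactly two rows, one for each direction of the relationship.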
The multilevel social relations model (Snijders and Kenny, 1999)
[Classification diagram: family; dyad, actor, partner; relationship score.]
The model contains:
- an effect for the mth family;
- an effect of the jth actor in the mth family;
- an effect of the kth partner in the mth family;
- an effect of the lth dyad in the mth family;
- the residual relationship effect, conditional on actor j, partner k, dyad l and family m.
Interpretation of variance components
- Family: the extent to which family level factors affect all the relationships in a family.
- Actor: the extent to which individuals act similarly across relationships with other family members (actor stability, trait-like behaviour).
- Partner: we actually have two traits operating. In addition to the trait of acting in a common way towards other family members, we also have the trait of elicitation from other family members. The greater the partner variance component, the greater the evidence for such a trait operating.
- Dyad: the extent to which relationship quality is specific to the dyad. A high dyad random effect means that the relationship score from joe->fred is similar to that from fred->joe. In social network theory this is known as reciprocity. Reciprocity is a context specific (non trait-like) effect.
- Relationship: residual variation across relationships in relationship quality.
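The equation on this slide was lost in extraction. One way to write a model with the five effects described, using symbols that are assumptions here rather than the authors' notation, is:

```latex
y_{i(jkl)m} = \beta_0 + f_m + a_{jm} + p_{km} + d_{lm} + e_{i(jkl)m}
```

```latex
f_m \sim N(0, \sigma_f^2), \quad a_{jm} \sim N(0, \sigma_a^2), \quad
p_{km} \sim N(0, \sigma_p^2), \quad d_{lm} \sim N(0, \sigma_d^2), \quad
e_{i(jkl)m} \sim N(0, \sigma_e^2)
```

Here $f_m$, $a_{jm}$, $p_{km}$ and $d_{lm}$ stand for the family, actor, partner and dyad effects, and $e_{i(jkl)m}$ for the residual relationship effect. The actor, partner and dyad classifications are crossed with one another and nested within family.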
Results of SRM in more detail

            Pos SRM    Neg SRM
Family      0.12       0.19
Actor       0.44       0.12
Partner     0.01       0.03
Dyad        0.18       0.41
Relat.      0.25       0.24
-2loglike   10225.7    17800.9

The table shows variance partition coefficients. For positivity, 44% of the variability is attributable to actors, indicating that individuals act in a consistent way across relationships with other family members. There is a strong actor trait component to positivity. For negativity, 41% of the variability is attributable to dyad, indicating that the dyad is an important structure in determining negativity in relationships. There is a strong context specific component to negativity. There is little evidence of an elicitation or partner trait for either response. At the family level there are stronger effects for negativity than positivity.
Adding fixed effects for role relationship
The basic unit, a relationship, has an actor and a partner. Actors and partners are classified into the roles of children, mothers and fathers by the two categorical variables actor_role and partner_role. We use child as the reference category for the actor_role and partner_role variables.

               Actor_role                Partner_role
Relationship   child  mother  father     child  mother  father
c1->c2         1      0       0          1      0       0
c1->m          1      0       0          0      1       0
c1->f          1      0       0          0      0       1
c2->c1         1      0       0          1      0       0
c2->m          1      0       0          0      1       0
c2->f          1      0       0          0      0       1
m->c1          0      1       0          1      0       0
m->c2          0      1       0          1      0       0
m->f           0      1       0          0      0       1
f->c1          0      0       1          1      0       0
f->c2          0      0       1          1      0       0
f->m           0      0       1          0      1       0
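The indicator coding for actor and partner roles can be generated mechanically from the member labels. This is a minimal sketch; the helper names are hypothetical and the member labels (c1, c2, m, f) follow the slides.

```python
from itertools import permutations

ROLES = ["child", "mother", "father"]

def role_of(member):
    """Map a family member label to a role; c1 and c2 are both children."""
    return {"m": "mother", "f": "father"}.get(member, "child")

def dummies(role):
    """One indicator per role, in the order child, mother, father."""
    return [1 if role == r else 0 for r in ROLES]

def design_row(actor, partner):
    """Actor role indicators followed by partner role indicators."""
    return dummies(role_of(actor)) + dummies(role_of(partner))

for actor, partner in permutations(["c1", "c2", "m", "f"], 2):
    print(f"{actor}->{partner}", design_row(actor, partner))
```

With child as the reference category, the child columns are dropped when the model is fitted, leaving the mother and father indicators for actor and for partner.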
Including actor and partner roles: positivity
First column: SRM without role effects. Second column: SRM with actor and partner role effects.

               param(se)          param(se)
Fixed
  intercept    2.834 (0.011)      2.263 (0.014)
  a_mother     -                  0.502 (0.016)
  a_father     -                  0.351 (0.016)
  p_mother     -                  0.021 (0.011)
  p_father     -                  -0.032 (0.011)
Random
  family       0.034 (0.004)      0.050 (0.004)
  actor        0.124 (0.005)      0.061 (0.004)
  partner      0.003 (0.002)      0.001 (0.002)
  dyad         0.050 (0.003)      0.051 (0.003)
  relationship 0.073 (0.002)
-2loglike      10225.7            9092.64

Modelling actor and partner role drops the -2 log-likelihood by over 1000 units with 4 df. The effect is dominated by the actor role categories, with mothers and then fathers being much more positive as actors than the reference category, child. These actor_role variables explain over 50% of the actor level variance. Adding interactions between actor_role and partner_role does not improve the model. Since we have explained actor level variance, this means actor role explains some of the trait component of relationship positivity.
Graphing actor and partner role effects for positivity
[Graph with separate series for actor child, actor m and actor f.]
The graph shows actor_role having a big effect on relationship quality and partner role having a marginal effect.
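The summary figures quoted for positivity can be recomputed from the estimates on this slide. The sketch assumes that the first estimate in each random-part pair belongs to the model without role effects (its -2loglike of 10225.7 matches the positivity SRM reported earlier); variable names are illustrative.

```python
# Random-part estimates for the model without role effects (first of each pair).
null_model = {
    "family": 0.034,
    "actor": 0.124,
    "partner": 0.003,
    "dyad": 0.050,
    "relationship": 0.073,
}

# Variance partition coefficient: each component as a share of the total.
total = sum(null_model.values())
vpc = {name: value / total for name, value in null_model.items()}

# Share of actor variance explained once actor and partner roles are added.
actor_with_roles = 0.061
actor_explained = (null_model["actor"] - actor_with_roles) / null_model["actor"]

# Drop in -2 log-likelihood between the two models.
deviance_drop = 10225.7 - 9092.64

for name, share in vpc.items():
    print(f"{name:12s} {share:.2f}")
print(f"actor variance explained: {actor_explained:.1%}")
print(f"drop in -2loglike: {deviance_drop:.2f}")
```

The actor share comes out at about 0.44, and the roles account for just over half of the actor variance, in line with the statements on the slides.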
Including actor and partner roles: negativity
First column: SRM without role effects. Second column: SRM with actor role, partner role and their interaction.

                 param(se)          param(se)
Fixed
  intercept      0.348 (0.018)      0.729 (0.027)
  a_mother       -                  -0.375 (0.030)
  a_father       -                  -0.516 (0.031)
  p_mother       -                  -0.319 (0.028)
  p_father       -                  -0.625 (0.028)
  a_moth*p_fath  -                  0.359 (0.040)
  a_fath*p_moth  -                  0.563 (0.040)
Random
  family         0.137 (0.012)      0.144 (0.012)
  actor          0.082 (0.006)      0.087 (0.006)
  partner        0.022 (0.005)      0.018 (0.004)
  dyad           0.282 (0.010)      0.239 (0.009)
  relationship   0.165 (0.005)      0.162 (0.005)
-2loglike        17800.9            17305.18

Modelling actor and partner role and the interaction drops the -2 log-likelihood by about 500 units with 6 df. Now an interaction is required between actor_role and partner_role. Note that the interaction categories a_moth*p_moth and a_fath*p_fath structurally do not exist. Note also that the main drop in the variance occurs at the dyad level, which reduces by 15%. This means modelling actor and partner roles has explained context specific variation in relationship quality for negativity.
Graphing actor and partner role effects for negativity
[Graph with separate series for actor child, actor m and actor f.]
With respect to actor and partner roles, the main context specific effects for relationship quality occur in relationships where the child is an actor. Note that parents are trait-like with respect to actor negativity effects. A possible psychological explanation for this pattern is that negativity is high stakes behaviour. The amount of negativity a child feels safe to express is determined by the power/authority of the partner. Whether the partner is another child, a mother or a father greatly affects the negativity of the predicted relationship quality.
Genetic effects Individuals exhibit some trait-like behaviour for both relationship positivity and negativity. With individuals exhibiting stronger trait-like behaviour for relationship positivity. Such trait-like behaviour may have a genetic component. The standard behavioural genetics model for children within families estimates shared environment(family), non-shared environment(individual) and genetic components of variation. Our structure is more complex in that the lowest level is not the individual but a relationship between two individuals. Also we have a dyad component of variation and the individual component of variation is split into actor and partner components. However, we can extend the basic BG model (which incorporates some questionable assumptions) to our structure. The extended model gives heritabilities (genetic variance)/(total variance) of 0.42 and 0.16 for positivity and negativity respectively.
Stability of effects over time
The data has two waves, where the same relationships were measured three years later. This allows us to explore the stability of family, actor, partner, dyad and relationship effects over time. We can operationalise the longitudinal structure by fitting a multivariate response social relations model where the first response is the time 1 relationship score and the second the time 2 relationship score.
[Figure: two classification diagrams (family; dyad, actor, partner; relationship score), one for the time 1 relationship score and one for the time 2 relationship score.]
We simultaneously estimate all variance components for each response and the correlations between the time 1 and time 2 effects at each level.
Stability: results of two bivariate SRMs

            Positivity                        Negativity
            w1 vpc   w2 vpc   corr(w1,w2)     w1 vpc   w2 vpc   corr(w1,w2)
family      0.11     0.12     0.77            0.20     0.17     0.8
actor       0.44     0.46     0.87            0.11              0.67
partner     0.01              1.5??           0.03     0.04     0.88
dyad        0.17     0.12     0.15            0.42     0.41     0.34
relat.      0.26     ...      ...             ...      ...      0.16

The basic patterns of the vpcs found in wave 1 are repeated in wave 2 for both positivity and negativity. Family effects are very stable over time for both positivity (correlation 0.77) and negativity (correlation 0.8). Family effects are a bit stronger for negativity. Actor effects are stronger for positivity than negativity, but stability across time is high for both actor behaviours (0.87 and 0.67). Dyad effects are much stronger for negativity than positivity, but the stability of dyad effects for both behaviours is lower than the actor, partner and family effect stabilities. Dyads are more stable for negativity than positivity.
A comment on family effects
Developmental psychology and behavioural genetics (Plomin et al., 1994; Turkheimer & Waldron, 2000) have suggested that, after taking account of genetic and individual level factors, there is scant evidence for family level effects. Our work shows strong family level effects that persist over time, even when genetic, actor, partner, dyad and relationship level variance components are included in the model. Part of the previous failure to find family effects may be the analytical strategy of breaking down families into series of overlapping dyads and analysing each dyad separately. This strategy is probably in part determined by the methodology available to the researchers.
A comment on dyad effects for relationship negativity For relationship negativity we saw large dyad effects and relatively low stability over time. This means that at wave 1 there is a large within family variability in dyad negativity and likewise at wave 2. However the dyads which are most and least negative within the family are to an extent switching around. The next step is to see if we can find some systematic pattern to these dyadic dynamics for relationship negativity.
In conclusion The multilevel social relations model offers a powerful framework for exploring within family dynamics and processes.