Univariate Twin Analysis- Saturated Models for Continuous and Categorical Data September 2, 2014 Elizabeth Prom-Wormley & Hermine Maes


1 Univariate Twin Analysis- Saturated Models for Continuous and Categorical Data September 2, 2014 Elizabeth Prom-Wormley & Hermine Maes

2 Overall Questions to be Answered: Does the data satisfy the assumptions of the classical twin study? Does a trait of interest cluster among related individuals?

3 Family & Twin Study Designs: Family Studies; Classical Twin Studies; Adoption Studies; Extended Twin Studies

4 The Data: Please open twinSatConECPW Fall2014.R. Australian Twin Register years old, males and females. Work from this session will focus on Body Mass Index (weight/height^2) in females only. Sample size – MZF = 534 complete pairs (zyg = 1), 569 total MZ pairs – DZF = 328 complete pairs (zyg = 3), 351 total DZ pairs

5 A Quick Look at the Data

6 Classical Twin Studies, Basic Background: The Classical Twin Study (CTS) uses MZ and DZ twins reared together – MZ twins share 100% of their genes – DZ twins share on average 50% of their genes. Expectation: genetic factors are assumed to contribute to a phenotype when MZ twins are more similar than DZ twins.

7 Classical Twin Study Assumptions: MZ twins are genetically identical; equal environments of MZ and DZ pairs

8 Basic Data Assumptions: MZ and DZ twins are sampled from the same population, therefore we expect – Equal means/variances in Twin 1 and Twin 2 – Equal means/variances in MZ and DZ twins. Further assumptions would need to be tested if we introduce male twins and opposite sex twin pairs.

9 “Old Fashioned” Data Checking: [Table: mean, variance, and covariance (T1-T2) for T1 and T2 in MZ and DZ pairs; values not preserved.] Nice, but how can we actually be sure that these means and variances are truly the same?
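The descriptive check on this slide can be sketched in base R. This is a minimal illustration, not the workshop script: the data are simulated (only the pair counts 534 and 328 come from the slides), and `simPairs` and `describePairs` are hypothetical helper names.

```r
# Simulated stand-in for twin-pair BMI data; all values are illustrative only.
set.seed(1)
simPairs <- function(n, r, mu = 21.4, s = 0.9) {
  t1 <- rnorm(n, mean = mu, sd = s)
  # Build twin 2 so that it correlates r with twin 1 and has the same variance
  t2 <- mu + r * (t1 - mu) + sqrt(1 - r^2) * rnorm(n, sd = s)
  data.frame(bmi1 = t1, bmi2 = t2)
}
mzData <- simPairs(534, r = 0.8)   # hypothetical MZ correlation
dzData <- simPairs(328, r = 0.3)   # hypothetical DZ correlation

# Means per twin plus the 2 x 2 covariance matrix for one zygosity group
describePairs <- function(d) {
  list(mean = colMeans(d, na.rm = TRUE),
       cov  = cov(d, use = "complete.obs"))
}
mzStats <- describePairs(mzData)
dzStats <- describePairs(dzData)
print(mzStats)
print(dzStats)
```

Eyeballing these summaries shows whether the groups look alike, but it gives no formal test of equality, which is what the saturated model and its constrained submodels provide.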

10 Univariate Analysis, A Roadmap: 1- Use the data to test basic assumptions (equal means & variances for twin 1/twin 2 and MZ/DZ pairs): Saturated Model. 2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype: ACE or ADE Models. 3- Test ACE (ADE) submodels to identify and report significant genetic and environmental contributions: AE or CE or E Only Models.

11 Saturated Twin Model

12 Saturated Code Deconstructed: mean MZ = 1 x 2 matrix (mMZ1, mMZ2); mean DZ = 1 x 2 matrix (mDZ1, mDZ2)
meanMZ <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=TRUE, values=meVals, labels=c("mMZ1","mMZ2"), name="meanMZ" )
meanDZ <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=TRUE, values=meVals, labels=c("mDZ1","mDZ2"), name="meanDZ" )

13 Saturated Code Deconstructed: covMZ = 2 x 2 matrix over T1, T2 (vMZ1, cMZ21, vMZ2); covDZ = 2 x 2 matrix over T1, T2 (vDZ1, cDZ21, vDZ2)
covMZ <- mxMatrix( type="Symm", nrow=ntv, ncol=ntv, free=TRUE, values=cvVals, lbound=lbVals, labels=c("vMZ1","cMZ21","vMZ2"), name="covMZ" )
covDZ <- mxMatrix( type="Symm", nrow=ntv, ncol=ntv, free=TRUE, values=cvVals, lbound=lbVals, labels=c("vDZ1","cDZ21","vDZ2"), name="covDZ" )

14 Time to Play... Continue with the file twinSatConECPW Fall2014.R

15 Estimated Values: Saturated Model [Table: means and covariances for T1 and T2 in MZ and DZ pairs; values not preserved.] 10 Total Parameters Estimated: mMZ1, mMZ2, vMZ1, vMZ2, cMZ21; mDZ1, mDZ2, vDZ1, vDZ2, cDZ21. Standardize covariance matrices for twin pair correlations (covMZ & covDZ).
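Standardizing a covariance matrix to obtain the twin pair correlation can be sketched with base R's `cov2cor`. The matrices below are hypothetical placeholders, not the fitted estimates from the session.

```r
# Hypothetical 2 x 2 covariance matrices over (T1, T2); not the fitted estimates.
covMZ <- matrix(c(0.73, 0.58,
                  0.58, 0.75), nrow = 2, byrow = TRUE)
covDZ <- matrix(c(0.77, 0.24,
                  0.24, 0.80), nrow = 2, byrow = TRUE)

# cov2cor divides each covariance by the product of the two standard deviations
corMZ <- cov2cor(covMZ)
corDZ <- cov2cor(covDZ)
rMZ <- corMZ[2, 1]
rDZ <- corDZ[2, 1]

# Free parameters in the saturated model:
# per zygosity group, 2 means + 3 unique elements of a symmetric 2 x 2 matrix
nPar <- 2 * (2 + 3)
print(c(rMZ = rMZ, rDZ = rDZ, nPar = nPar))
```

The off-diagonal element of each standardized matrix is the twin pair correlation for that zygosity group.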

16 Estimated Values: Saturated Model [Table: means and covariances for T1 and T2 in MZ and DZ pairs; only the T1 variances survive: MZ 0.73, DZ 0.77.] 10 Total Parameters Estimated: mMZ1, mMZ2, vMZ1, vMZ2, cMZ21; mDZ1, mDZ2, vDZ1, vDZ2, cDZ21. Standardize covariance matrices for twin pair correlations (covMZ & covDZ).

17 Fitting Nested Models: Saturated Model – likelihood of data without any constraints – fitting as many means and (co)variances as possible. Equality of means & variances by twin order – test if mean of twin 1 = mean of twin 2 – test if variance of twin 1 = variance of twin 2. Equality of means & variances by zygosity – test if mean of MZ = mean of DZ – test if variance of MZ = variance of DZ.

18 Estimated Values: Equate Means & Variances across Twin Order [Table: means and covariances for T1 and T2 in MZ and DZ pairs; values not preserved.] Equate Means & Variances across Twin Order & Zygosity [Table: same layout; values not preserved.]

19 Estimates: Equate Means & Variances across Twin Order: mean MZ 21.35, DZ 21.45; cov T1 MZ 0.76, DZ 0.79 [remaining values not preserved]. Equate Means & Variances across Twin Order & Zygosity: mean MZ 21.39, DZ 21.39; cov T1 MZ 0.78, DZ 0.78 [remaining values not preserved].

20 Stats: [Table: columns Model, ep, -2ll, df, AIC, diff -2ll, diff df, p; rows Saturated, mT1=mT2, mT1=mT2 & varT1=varT2, Zyg MZ=DZ; values not preserved.] No significant differences between saturated model and models where means/variances/covariances are equal by zygosity and between twins.
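The comparison behind a table like this is a likelihood ratio test: the difference in -2ll between a constrained model and the saturated model is referred to a chi-square distribution with degrees of freedom equal to the difference in estimated parameters. A minimal base R sketch follows; the fit statistics are hypothetical, since the values from the slide were not preserved, and the submodel parameter count of 4 is an assumption (1 mean, 1 variance, 2 covariances).

```r
# Hypothetical fit statistics; NOT the values from the workshop output.
satM2ll <- 4055.9; satEp <- 10   # saturated model: -2ll and estimated parameters
subM2ll <- 4059.2; subEp <- 4    # assumed submodel with equated means and variances

diffLL <- subM2ll - satM2ll      # constrained model cannot fit better, so this is >= 0
diffDf <- satEp - subEp          # number of constraints imposed
pValue <- pchisq(diffLL, df = diffDf, lower.tail = FALSE)

print(c(diffLL = diffLL, diffDf = diffDf, p = pValue))
```

A p-value above the chosen alpha means the constraints do not significantly worsen fit, so the equality assumptions are retained. In OpenMx, `mxCompare` reports this kind of comparison for fitted models.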

21 Working with Binary and Ordinal Data Elizabeth Prom-Wormley and Hermine Maes Special Thanks to Sarah Medland

22 Transitioning from Continuous Logic to Categorical Logic: Ordinal data has 1 less degree of freedom compared to continuous data: MZcov, DZcov, Prevalence; no information on the variance. Thinking about our ACE/ADE model: 4 parameters being estimated (A, C, E, mean); the ACE/ADE model is unidentified without adding a constraint.

23 Two Approaches to the Liability Threshold Model: Traditional – Maps data to a standard normal distribution – Total variance constrained to be 1. Alternate – Fixes an alternate parameter (usually E) – Estimates the remaining parameters.

24 Time to Look at the Data! Please open BinaryWarmUp.R

25 Observed Binary BMI is Imperfect Measure of Underlying Continuous Distribution: Mean (bmiB2) = 0.39, SD (bmiB2) = 0.49. Prevalence “low” BMI = 60.6%. We are interested in the liability of risk for being in the “high” BMI category.

26 It’s Helpful to Rescale: Raw Data (Unstandardized), mean=0.39, SD=0.49 - Data not mapped to a standard normal - No easy conversion to % - Difficult to compare between groups. Since the scaling is now arbitrary. Standard Normal (Standardized), mean=0, SD=1: area under the curve between two z-values is interpreted as a probability or percentage.

27 Binary Review: Threshold calculated using the cumulative normal distribution (CND) - We used frequencies and inverse CND to do our own estimation of the threshold: qnorm(0.816) = . Threshold is the Z value that corresponds with the proportion of the population having “low BMI”.
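The threshold calculation can be reproduced in base R with `qnorm` (the inverse cumulative normal) and checked with `pnorm`. The proportion 0.816 is taken from the slide; the 0/1 vector `bmiB` is a hypothetical stand-in for the observed binary variable.

```r
pLow <- 0.816             # proportion in the "low BMI" category, as on the slide
thr  <- qnorm(pLow)       # z value with pLow of the standard normal area to its left
pBack <- pnorm(thr)       # round trip: recovers the proportion below the threshold
pHigh <- 1 - pnorm(thr)   # proportion of the liability distribution above the threshold

# The same threshold from raw binary data coded 0 = low, 1 = high (hypothetical vector)
bmiB <- c(rep(0, 816), rep(1, 184))
thrData <- qnorm(mean(bmiB == 0))

print(c(threshold = thr, pHigh = pHigh, thresholdFromData = thrData))
```

The threshold is positive whenever more than half of the sample falls in the lower category.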

28 Moving to Ordinal Data!

29 Getting a Feel for the Data: Open twinSatOrd.R. Calculate the frequencies of the 5 BMI categories for the second twins of the MZ pairs: CrossTable(mzDataOrdF$bmi2)

30 Estimating MZ Twin 2 Thresholds by Hand: T1 = qnorm(0.124), T1 = ; T2 = qnorm( ), T2 = ; T3 = qnorm( ), T3 = 0.388; T4 = qnorm( ), T4 = . Estimate Twin Pair Correlations for the Liabilities Too!
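The hand calculation generalizes from one threshold to four: convert category counts to cumulative proportions, then apply `qnorm` to all but the last. In this sketch the counts are hypothetical; they are chosen only so that the first cumulative proportion equals the 0.124 shown on the slide.

```r
# Hypothetical counts for 5 ordered BMI categories (sum = 500).
# Only the first cumulative proportion (62 / 500 = 0.124) matches the slide.
counts <- c(62, 133, 131, 99, 75)

cumProp <- cumsum(counts) / sum(counts)   # cumulative proportion up to each category
# The last cumulative proportion is 1 (qnorm(1) is Inf), so 5 categories give 4 thresholds
thresholds <- qnorm(cumProp[-length(cumProp)])

print(round(cbind(cumProp = cumProp[-length(cumProp)], threshold = thresholds), 3))
```

With real data, the counts would come from a frequency table of the ordinal variable, such as the `CrossTable` output for `mzDataOrdF$bmi2`.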

31 Translating Back to the SEM Approach in OpenMx

32 Handling Ordinal Data in OpenMx: 1- Determine the 1st threshold 2- Determine displacements between 1st threshold and subsequent thresholds 3- Add the 1st threshold and the displacement to obtain the subsequent thresholds
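The three steps can be traced in base R for a single twin. The threshold values are hypothetical except that the first corresponds to a cumulative proportion of 0.124. This parameterization is useful because bounding the displacements below by a small positive number keeps the thresholds in increasing order during optimization; that reading of the role of `lbound` is an assumption about the scripts, not something stated on the slides.

```r
# Hypothetical thresholds on the liability scale for one twin
thresholds <- qnorm(c(0.124, 0.390, 0.652, 0.850))

first <- thresholds[1]             # step 1: the 1st threshold
displacements <- diff(thresholds)  # step 2: gaps between successive thresholds

# Step 3: a running sum of the 1st threshold and the displacements
# recovers the original thresholds
rebuilt <- cumsum(c(first, displacements))

print(round(rbind(thresholds, rebuilt), 3))
print(round(displacements, 3))
```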

33 Ordinal Saturated Code Deconstructed, Defining Threshold Matrices:
threM <- mxMatrix( type="Full", nrow=nth, ncol=ntv, free=TRUE, values=thVal, lbound=thLB, labels=thLabMZ, name="ThreMZ" )
threD <- mxMatrix( type="Full", nrow=nth, ncol=ntv, free=TRUE, values=thVal, lbound=thLB, labels=thLabDZ, name="ThreDZ" )
ThreMZ elements (rows = thresholds, columns = twins): t1MZ1 t1MZ2; t2MZ1 t2MZ2; t3MZ1 t3MZ2; t4MZ1 t4MZ2. ThreDZ elements: t1DZ1 t1DZ2; t2DZ1 t2DZ2; t3DZ1 t3DZ2; t4DZ1 t4DZ2.
[Path diagrams: Variance Constraint Threshold Model, with liabilities L T1 and L T2, covariances cov MZ and cov DZ, and means μ MZT1, μ MZT2, μ DZT1, μ DZT2.]

34 Ordinal Saturated Code Deconstructed, Defining Threshold Matrices (ThreMZ): 1- Determine the 1st threshold [values for Tw1 and Tw2 not preserved] 2- Determine displacements between 1st thresholds and subsequent thresholds

35 Double Check, Moving from Frequencies to Displacements: [Table: columns Frequency BMI T2, Cumulative Frequency, Z Value, Displacement; values not preserved.]

36 Ordinal Saturated Code Deconstructed, Estimating Expected Threshold Matrices:
threMZ <- mxAlgebra( expression= Inc %*% ThreMZ, name="expThreMZ" )
Inc <- mxMatrix( type="Lower", nrow=nth, ncol=nth, free=FALSE, values=1, name="Inc" )
Inc %*% ThreMZ = expThreMZ: add the 1st threshold and the displacement to obtain the subsequent thresholds
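What the `Inc %*% ThreMZ` algebra computes can be seen with ordinary base R matrices, without OpenMx. In this sketch the entries of `ThreMZ` are hypothetical starting values: row 1 holds each twin's first threshold and rows 2 to 4 hold displacements.

```r
nth <- 4   # number of thresholds
ntv <- 2   # number of twins (columns)

# Lower-triangular matrix of ones, the base R analogue of the "Lower" Inc matrix
Inc <- matrix(0, nrow = nth, ncol = nth)
Inc[lower.tri(Inc, diag = TRUE)] <- 1

# Hypothetical values: column 1 = twin 1, column 2 = twin 2
ThreMZ <- matrix(c(-1.2, 0.9, 0.7, 0.6,
                   -1.1, 0.8, 0.7, 0.5), nrow = nth, ncol = ntv)

# Row k of the product sums rows 1..k of ThreMZ, i.e. a cumulative sum per twin
expThreMZ <- Inc %*% ThreMZ
print(expThreMZ)
```

Each column of the product equals the running sum of the corresponding column of `ThreMZ`, which is the "add the 1st threshold and the displacement" step expressed as matrix algebra.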

37 Ordinal Saturated Code Deconstructed, Estimating Correlations & Fixing Variance:
corMZ <- mxMatrix( type="Stand", nrow=ntv, ncol=ntv, free=TRUE, values=corVals, lbound=lbrVal, ubound=ubrVal, labels="rMZ", name="expCorMZ" )
corDZ <- mxMatrix( type="Stand", nrow=ntv, ncol=ntv, free=TRUE, values=corVals, lbound=lbrVal, ubound=ubrVal, labels="rDZ", name="expCorDZ" )

38 How Many Parameters in this Ordinal Model? MZ correlation: rMZ. DZ correlation: rDZ. Thresholds – t1MZ1, t2MZ1, t3MZ1, t4MZ1 – t1MZ2, t2MZ2, t3MZ2, t4MZ2 – t1DZ1, t2DZ1, t3DZ1, t4DZ1 – t1DZ2, t2DZ2, t3DZ2, t4DZ2
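The count can be checked by generating the labels listed on this slide and tallying them. The `paste0` construction below is one assumed way to build `thLabMZ` and `thLabDZ`; the workshop scripts may define them differently.

```r
nth <- 4   # thresholds per twin
ntv <- 2   # twins per pair

# Labels in the order t1..t4 for twin 1, then t1..t4 for twin 2
thLabMZ <- paste0("t", 1:nth, "MZ", rep(1:ntv, each = nth))
thLabDZ <- paste0("t", 1:nth, "DZ", rep(1:ntv, each = nth))
corLabs <- c("rMZ", "rDZ")

# 16 threshold parameters plus 2 correlations
nPar <- length(c(thLabMZ, thLabDZ, corLabs))
print(thLabMZ)
print(nPar)
```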

39 Problem Set 1 Open twinSatOrdA.R and twinSatOrd.R – What do these scripts do? – Looking at the scripts only: How are they similar? How are they different? – What do these differences in the scripts reflect regarding conceptual differences in the two models? Run either script and double check against your previously hand-calculated values of thresholds. Report your results. If you can’t get it to match up, don’t panic…do . Run twinSatOrd.R – Is testing an ACE model with the usual model assumptions justified? Why or why not?

40 Univariate Analysis with Ordinal Data, A Roadmap: 1- Use the data to test basic assumptions inherent to standard ACE (ADE) models: Saturated Model. 2- Estimate contributions of genetic and environmental effects on the liability of a trait: ADE or ACE Models. 3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions: AE or E Only Models.

41 Questions?

