SEM with Measured Genotypes
NIDA Workshop, VIPBG, October 2012


Maes, H. H., Neale, M. C., Chen, X., Chen, J., Prescott, C. A., & Kendler, K. S. (2011). A twin association study of nicotine dependence with markers in the CHRNA3 and CHRNA5 genes. Behav Genet, 41(5), doi: /s z

What would ACE look like if we knew the genes and environments?

Where we’d like to be

Molecular Studies using Relatives
Association studies are often based on genotyping unrelated individuals.
Statistical models for relatives have been extended to include measured genotypes: van den Oord et al. (2002); Merlin (Abecasis, 2002).
Genotype data can be added to twin/family studies; the family design increases power.
Problem: some relatives have phenotypes but no genotypes.
The power of association studies can be increased if incompletely genotyped families are retained in the analyses (Visscher et al., 2008).

The Tobacco and Genetics Consortium Nature Genetics (2010)

Meta-Analyses of GWAS of Smoking
Three meta-analyses: TAG 2010; Thorgeirsson et al. 2010; Liu et al.
> 100,000 individuals
Several genome-wide significant results
Initiation: BDNF
Cessation: DBH
Consumption: neuronal acetylcholine receptor subunit genes; SNPs in CHRNA5 and CHRNA3 (rs…)
First identified by Saccone et al. (2007)

Goal: test whether nicotine dependence is linked to nicotinic receptor variants
printACE(AceFit)
     a^2 c^2 e^2 aS^2
[1,]
printACE(AcegFit)
     a^2 c^2 e^2 aS^2
[1,]
mxCompare(AcegFit, AceFit)
  base comparison ep minus2LL df AIC diffLL diffdf p
1 ACEg NA NA NA
2 ACEg ACEonly

Twin Association Model
Traditional twin model
Measured genotypes as covariates in the means model
Quantify the contributions of specific variants as well as background genetic and environmental factors

Twin Association Model

Expected means based on allelic effects of SNPs
Population mean = “m”
Allele at a particular locus = either “A” or “a”
SNP effect modeled as deviations from “m”
Additive (aS) or dominance (dS) SNP effect
Expected mean for AA homozygotes = m + aS
Expected mean for Aa heterozygotes = m + dS
Expected mean for aa homozygotes = m - aS
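The means model above can be sketched in a few lines of base R. This is only an illustration: the function name `expected_mean`, the 2/1/0 genotype coding, and the numeric values are assumptions for the example, not part of the workshop script.

```r
# Hypothetical helper: expected phenotype mean for a genotype coded as the
# number of A alleles (2 = AA, 1 = Aa, 0 = aa).
expected_mean <- function(n_A, m, aS, dS = 0) {
  stopifnot(all(n_A %in% c(0, 1, 2)))
  # AA: m + aS, Aa: m + dS, aa: m - aS
  ifelse(n_A == 2, m + aS, ifelse(n_A == 1, m + dS, m - aS))
}

# Illustrative values only (not estimates from the study)
expected_mean(c(2, 1, 0), m = 4, aS = 0.5)            # purely additive: dS = 0
expected_mean(c(2, 1, 0), m = 4, aS = 0.5, dS = 0.2)  # with a dominance deviation
```

With dS fixed at 0 the heterozygote mean sits exactly halfway between the two homozygote means, which is the purely additive case.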

MZs: 1 of 3 classes +aS dS -aS

DZs: 1 of 3x3 classes

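Twin Data Availability (G = genotyped, P = phenotyped)

Combination  Genotyped  Phenotyped  MZ twin 1  MZ twin 2  DZ twin 1  DZ twin 2
1            both       both        GP         GP         GP         GP
2            both       twin 1      GP         G          GP         G
3            both       twin 2      G          GP         G          GP
4            both       neither     G          G          G          G
5            one        both        GP         P          GP         P
6            one        twin 1      GP         -          GP         -
7            one        twin 2      G          P          G          P
8            one        neither     G          -          G          -
9            neither    both        P          P          P          P
10           neither    twin 1      P          -          P          -
11           neither    twin 2      -          P          -          P
12           neither    neither     -          -          -          -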

Missing Genotypes
One MZ twin genotyped, one or both twins phenotyped > assign the genotyped twin's genotype to the ungenotyped co-twin
One DZ twin genotyped, one or both twins phenotyped > use allele frequencies to assign the ungenotyped twin a probability of membership in each of the 3 possible genotype classes, conditional on the genotyped twin
Neither MZ twin genotyped, one or both twins phenotyped > assign a probability of membership in each of the 3 possible genotype classes based on allele frequencies
Neither DZ twin genotyped, one or both twins phenotyped > assign a probability of membership in each of the 9 possible genotype classes based on allele frequencies

So, how do we substitute values for our missing genotype data?
We need to know the expected proportion of each genotype when the genotype of twin 1 and/or twin 2 is missing, for MZ and for DZ pairs.
We need to code our data in such a way that we can fill in the 3 possible values for missing MZ genotypes and the 9 possible values when one or both DZ twin genotypes in a pair are missing.

Expected proportion of each genotype based on allele frequencies

Genotype    Expected proportion                  Expected mean
T1   T2     MZ     DZ                            T1        T2
AA   AA     p^2    p^4 + p^3 q + (pq)^2/4        gm + aS   gm + aS
AA   Aa     0      p^3 q + (pq)^2/2              gm + aS   gm + dS
AA   aa     0      (pq)^2/4                      gm + aS   gm - aS
Aa   AA     0      p^3 q + (pq)^2/2              gm + dS   gm + aS
Aa   Aa     2pq    p^3 q + 3(pq)^2 + p q^3       gm + dS   gm + dS
Aa   aa     0      p q^3 + (pq)^2/2              gm + dS   gm - aS
aa   AA     0      (pq)^2/4                      gm - aS   gm + aS
aa   Aa     0      p q^3 + (pq)^2/2              gm - aS   gm + dS
aa   aa     q^2    q^4 + p q^3 + (pq)^2/4        gm - aS   gm - aS

The expected proportion of each genotypic category of twin pairs is calculated from allele frequencies obtained from the total sample of genotyped individuals. With p + q = 1, each proportion column sums to 1.
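One way to check expected pair proportions like these is to derive them numerically from the allele frequency. The base R sketch below assumes random mating and Hardy-Weinberg proportions in the parents; the function names `dz_pair_probs` and `mz_pair_probs` are hypothetical. It sums over parental mating types, treating the two siblings as independent given their parents.

```r
# Joint genotype probabilities for a DZ (sibling) pair; p = frequency of A.
# Rows index twin 1, columns index twin 2, in the order AA, Aa, aa.
dz_pair_probs <- function(p) {
  q <- 1 - p
  geno <- c(AA = p^2, Aa = 2 * p * q, aa = q^2)  # parental genotype frequencies
  tA <- c(AA = 1, Aa = 0.5, aa = 0)              # P(parent transmits allele A)
  joint <- matrix(0, 3, 3, dimnames = list(T1 = names(geno), T2 = names(geno)))
  for (f in names(geno)) for (mo in names(geno)) {
    # Offspring genotype distribution for this father x mother mating type
    child <- c(AA = tA[[f]] * tA[[mo]],
               Aa = tA[[f]] * (1 - tA[[mo]]) + (1 - tA[[f]]) * tA[[mo]],
               aa = (1 - tA[[f]]) * (1 - tA[[mo]]))
    joint <- joint + geno[[f]] * geno[[mo]] * outer(child, child)
  }
  joint
}

# MZ twins share their genotype, so only the diagonal is non-zero.
mz_pair_probs <- function(p) {
  q <- 1 - p
  diag(c(AA = p^2, Aa = 2 * p * q, aa = q^2))
}

dz <- dz_pair_probs(0.3)   # 0.3 is an arbitrary illustrative allele frequency
sum(dz)                    # all nine classes together sum to 1
dz["AA", "Aa"]             # equals p^3 q + (pq)^2 / 2
```

Reading the DZ matrix row by row gives the nine classes in the order AAAA, AAAa, AAaa, AaAA, and so on.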

Let’s look at the dataset…
str(selData)
'data.frame': 850 obs. of 9 variables:
 $ zyg    : int
 $ rs10a11: int
 $ rs10a12: int
 $ rs10a13: int
 $ rs10a21: int
 $ rs10a22: int
 $ rs10a23: int
 $ ftnd1  : int NA NA NA NA 4 NA...
 $ ftnd2  : int NA NA 5 9 NA NA NA 8 NA NA...

Recode Genotypes into 3 columns (to map into the 9 genotype classes)
                        rs#11, rs#12, rs#13
if rs10a1 = 2  [AA]  →  1, 0, 0
if rs10a1 = 1  [Aa]  →  0, 1, 0
if rs10a1 = 0  [aa]  →  0, 0, 1
if rs10a1 = NA [??]  →  1, 1, 1
mzGen1 = c(rs#11, rs#12, rs#13)
Now we can multiply these 1×3 matrices to get a 9-cell vector with 1s in the “possible” co-twin genotypes:
vector[ t(Gen1) %*% Gen2 ]
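The recoding step can be sketched in base R as follows. The helper names `recode_genotype` and `pair_classes` are hypothetical, and `kronecker()` is used here instead of the slide's `t(Gen1) %*% Gen2` so that the 9 cells come out in a known order (twin 1 as the slow index: AAAA, AAAa, AAaa, AaAA, ...). That ordering is an assumption chosen to match the order of the nine expected-mean classes.

```r
# Recode a genotype (2 = AA, 1 = Aa, 0 = aa, NA = missing) into three
# indicator columns; a missing genotype is compatible with all three classes.
recode_genotype <- function(g) {
  out <- cbind(AA = as.integer(g == 2),
               Aa = as.integer(g == 1),
               aa = as.integer(g == 0))
  out[is.na(g), ] <- 1L
  out
}

# 9-cell indicator vector for one twin pair: 1 marks a pair-genotype class
# that is still possible given the observed (or missing) genotypes.
pair_classes <- function(g1, g2) {
  as.vector(kronecker(recode_genotype(g1), recode_genotype(g2)))
}

pair_classes(2, 2)    # both genotyped AA: only class AAAA is possible
pair_classes(2, NA)   # twin 1 is AA, twin 2 missing: AAAA, AAAa, AAaa
pair_classes(NA, NA)  # neither genotyped: all 9 classes are possible
```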

Individual Proportions
mzGen1 | mzGen2  >  mzGenProb
mzN x 6             mzN x 9
# mzN = number of MZ pairs
mzGenComb = vector(t(mzGen1) %*% mzGen2)
mzGenProb = mzGenComb %*% mzProb / (mzGenComb %*% (mzProb %*% U))
# note: “%*% U” sums all the probabilities (Einstein addition)
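A plain-R reading of this normalisation step: multiply each pair's 0/1 indicator vector elementwise by the expected class proportions, then rescale so that each pair's weights sum to 1. The sketch below is an interpretation of the slide's matrix expression, not the workshop code; `genotype_weights`, `mzProb`, and the example rows are illustrative assumptions.

```r
# comb: n x 9 matrix of 0/1 indicators (one row per pair)
# prob: length-9 vector of expected class proportions
genotype_weights <- function(comb, prob) {
  w <- sweep(comb, 2, prob, `*`)  # zero out the impossible classes
  w / rowSums(w)                  # rescale each row to sum to 1
}

p <- 0.3; q <- 1 - p              # arbitrary illustrative allele frequency
# MZ pairs: only the concordant classes AAAA, AaAa, aaaa have non-zero proportion
mzProb <- c(p^2, 0, 0, 0, 2 * p * q, 0, 0, 0, q^2)

mzGenComb <- rbind(
  c(1, 0, 0, 0, 0, 0, 0, 0, 0),   # both twins genotyped AA
  c(0, 0, 0, 1, 1, 1, 0, 0, 0),   # twin 1 is Aa, twin 2 not genotyped
  rep(1, 9)                       # neither twin genotyped
)
genotype_weights(mzGenComb, mzProb)
```

A row whose observed genotypes are incompatible with every class that has non-zero proportion (for example, discordant genotypes in an MZ pair) would produce NaN weights and should be checked before fitting.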

Matrices for Genotype
# Matrices to store effect of genotype
mxMatrix(name="mean", type="Full", nrow=nv, ncol=nv, free=T, values=0, label="gm"),
mxMatrix(name="addSNP", type="Full", nrow=nv, ncol=nv, free=T, values=0, label="aS"),
mxMatrix(name="domSNP", type="Full", nrow=nv, ncol=nv, free=F, values=0, label="dS"),
mxMatrix(name="pSNP", type="Full", nrow=nv, ncol=nv, free=F, values=allelep),
mxAlgebra(1-pSNP, name="qSNP"),
mxAlgebra(2 * pSNP * qSNP * addSNP^2, name="S"),
mxAlgebra(V+S, name="totalV"),
mxAlgebra((cbind(A,C,E,S)) %x% solve(totalV), name="stVarCom"),
mxAlgebra(cbind(V, A, C, E, S, stVarCom), name="allVarCom"),
mxMatrix(name="U9", type="Unit", nrow=9, ncol=1),
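The algebra named "S" is the variance contributed by a purely additive SNP effect, 2pq aS^2, and "stVarCom" divides each component by the total variance. A scalar base R sketch of the same arithmetic follows; the function names and numeric values are illustrative assumptions, and the formula only holds with the dominance deviation fixed at 0, as in the matrices above.

```r
# Variance attributable to a biallelic SNP with genotype means
# m + aS, m, m - aS and allele frequency p under Hardy-Weinberg proportions.
snp_variance <- function(p, aS) 2 * p * (1 - p) * aS^2

# Standardized components: each variance divided by the total (A + C + E + S).
standardized_components <- function(A, C, E, S) {
  total <- A + C + E + S
  c(a2 = A, c2 = C, e2 = E, aS2 = S) / total
}

# Illustrative values only (not results from the study)
S <- snp_variance(p = 0.35, aS = 0.4)
round(standardized_components(A = 2.0, C = 0.5, E = 1.5, S = S), 3)
```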

Expected Mean Vector
# Matrix & Algebra for expected means vector and expected thresholds
mxAlgebra(rbind(mean+addSNP, mean+domSNP, mean-addSNP), name="mean3"),
mxAlgebra(cbind(mean+addSNP, mean+addSNP), name="expMean_AAAA"),
mxAlgebra(cbind(mean+addSNP, mean+domSNP), name="expMean_AAAa"),
mxAlgebra(cbind(mean+addSNP, mean-addSNP), name="expMean_AAaa"),
mxAlgebra(cbind(mean+domSNP, mean+addSNP), name="expMean_AaAA"),
mxAlgebra(cbind(mean+domSNP, mean+domSNP), name="expMean_AaAa"),
mxAlgebra(cbind(mean+domSNP, mean-addSNP), name="expMean_Aaaa"),
mxAlgebra(cbind(mean-addSNP, mean+addSNP), name="expMean_aaAA"),
mxAlgebra(cbind(mean-addSNP, mean+domSNP), name="expMean_aaAa"),
mxAlgebra(cbind(mean-addSNP, mean-addSNP), name="expMean_aaaa"),

Expected Thresholds, Covariances
mxMatrix(type="Full", nrow=nth, ncol=nv, free=c(F,F,rep(T,nth-2)), values=thValues, lbound=thLBound, name="Thre"),
mxMatrix(type="Lower", nrow=nth, ncol=nth, free=FALSE, values=1, name="Inc"),
mxAlgebra(Inc %*% Thre, name="ThreInc"),
mxAlgebra(cbind(ThreInc,ThreInc), dimnames=list(thRows,selVars), name="expThre"),
# Algebra for expected variance/covariance matrices
mxAlgebra(rbind(cbind(A+C+E, A+C),
                cbind(A+C, A+C+E)), name="expCovMZ"),
mxAlgebra(rbind(cbind(A+C+E, 0.5%x%A+C),
                cbind(0.5%x%A+C, A+C+E)), name="expCovDZ")
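Two pieces of this algebra are easy to check by hand in base R. First, premultiplying by a lower-triangular matrix of ones turns a first threshold plus increments into cumulative thresholds, which stay ordered as long as the increments are positive. Second, the expected MZ and DZ covariance matrices differ only in the coefficient on A. The numbers and the helper name `exp_cov` below are illustrative assumptions for a single variable.

```r
# Thresholds: first element is the first threshold, the rest are increments.
nth  <- 3
Thre <- matrix(c(-1, 0.8, 0.5), nrow = nth)
Inc  <- lower.tri(matrix(1, nth, nth), diag = TRUE) * 1  # lower-triangular ones
ThreInc <- Inc %*% Thre   # cumulative sums: -1, -0.2, 0.3

# Expected 2 x 2 twin covariance matrix for one variable.
exp_cov <- function(A, C, E, zyg = c("MZ", "DZ")) {
  zyg <- match.arg(zyg)
  r <- if (zyg == "MZ") 1 else 0.5   # additive genetic correlation between twins
  matrix(c(A + C + E, r * A + C,
           r * A + C, A + C + E), nrow = 2, byrow = TRUE)
}

exp_cov(A = 0.5, C = 0.2, E = 0.3, zyg = "MZ")
exp_cov(A = 0.5, C = 0.2, E = 0.3, zyg = "DZ")
```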

mxModel("MZ_",
  mxData(mzData, type="raw"),
  mxModel("MZ_AA", mxFIMLObjective("ACE.expCovMZ", "ACE.expMean_AAAA", selVars, "ACE.expThre", vector=T)),
  mxModel("MZ_Aa", mxFIMLObjective("ACE.expCovMZ", "ACE.expMean_AaAa", selVars, "ACE.expThre", vector=T)),
  mxModel("MZ_aa", mxFIMLObjective("ACE.expCovMZ", "ACE.expMean_aaaa", selVars, "ACE.expThre", vector=T)),
  mxMatrix("Full", mzN, 9, F, values=mzGenProb, name="mzWeights"),
  mxMatrix("Zero", mzN, 1, name="Zero"),
  # Weighted mixture over the 9 classes; only the 3 concordant classes are possible for MZ pairs
  mxAlgebra(-2 * sum(log((mzWeights * cbind(
      MZ_AA.objective, Zero, Zero,
      Zero, MZ_Aa.objective, Zero,
      Zero, Zero, MZ_aa.objective)) %*% ACE.U9)), name="MZmix"),
  mxAlgebraObjective("MZmix")
)

mxModel("DZ_",
  mxData(dzData, type="raw"),
  mxModel("DZ_AAAA", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AAAA", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_AAAa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AAAa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_AAaa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AAaa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_AaAA", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AaAA", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_AaAa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AaAa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_Aaaa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_Aaaa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_aaAA", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_aaAA", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_aaAa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_aaAa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_aaaa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_aaaa", selVars, "ACE.expThre", vector=T)),
  mxMatrix(name="dzWeights", type="Full", nrow=dzN, ncol=9, free=F, values=dzGenProb),
  mxMatrix(name="Zero", type="Zero", nrow=dzN, ncol=1),
  # Weighted mixture over all 9 DZ pair-genotype classes
  mxAlgebra(name="DZmix", expression = -2*sum(log((dzWeights * cbind(
      DZ_AAAA.objective, DZ_AAAa.objective, DZ_AAaa.objective,
      DZ_AaAA.objective, DZ_AaAa.objective, DZ_Aaaa.objective,
      DZ_aaAA.objective, DZ_aaAa.objective, DZ_aaaa.objective)) %*% ACE.U9))),
  mxAlgebraObjective("DZmix")
)
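The MZmix and DZmix algebras compute a weighted mixture likelihood: for each pair, the likelihood under every genotype class is multiplied by that pair's class weight, the products are summed across classes, and -2 times the sum of the logs is minimised. The base R sketch below shows the same computation under simplifying assumptions that differ from the workshop model: the phenotype is treated as continuous and bivariate normal rather than ordinal with thresholds, there are no missing phenotypes, and the names `dbvnorm` and `mixture_m2ll` and all numeric values are hypothetical.

```r
# Bivariate normal density for one pair (y, mu: length-2 vectors).
dbvnorm <- function(y, mu, Sigma) {
  d <- y - mu
  as.numeric(exp(-0.5 * t(d) %*% solve(Sigma) %*% d) / (2 * pi * sqrt(det(Sigma))))
}

# Y: n x 2 phenotypes; weights: n x K class weights (rows sum to 1);
# means: K x 2 expected means per class; Sigma: shared 2 x 2 covariance.
mixture_m2ll <- function(Y, weights, means, Sigma) {
  lik <- sapply(seq_len(nrow(means)), function(k)
    apply(Y, 1, dbvnorm, mu = means[k, ], Sigma = Sigma))
  lik <- matrix(lik, nrow = nrow(Y))      # keep n x K shape even when n = 1
  -2 * sum(log(rowSums(weights * lik)))   # rowSums plays the role of "%*% U9"
}

# Illustrative MZ example with 3 classes (AA, Aa, aa) and dS = 0
m <- 4; aS <- 0.5
means <- rbind(c(m + aS, m + aS), c(m, m), c(m - aS, m - aS))
Sigma <- matrix(c(1, 0.7, 0.7, 1), 2, 2)
Y <- rbind(c(4.2, 4.6), c(3.1, 3.9))
weights <- rbind(c(1, 0, 0),              # pair genotyped AA
                 c(0.09, 0.42, 0.49))     # ungenotyped pair, p = 0.3
mixture_m2ll(Y, weights, means, Sigma)
```

For a pair with known genotypes the weight vector has a single 1, so the mixture reduces to the ordinary likelihood for that class.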

Goal: test whether nicotine dependence is linked to nicotinic receptor variants
mxCompare(AcegFit, AceFit)
  base comparison ep minus2LL df AIC diffLL diffdf p
1 ACEg NA NA NA
2 ACEg ACEonly
printACE(AceFit)
     a^2 c^2 e^2 aS^2
[1,]
printACE(AcegFit)
     a^2 c^2 e^2 aS^2
[1,]