Introduction. We consider data of ~1800 phenotype measurements. Each mouse has a given probability distribution of descending from one of 8 possible strains. We model the phenotype through coefficients β, where Y is the phenotype, X is the “genotype” matrix, π is the probability of descending from each strain, and z are the flanking markers. The objective is to estimate the β’s and to test hypotheses about them.

Approaches. Linear regression: replace the true design matrix with E(X); the estimator is then the least-squares estimator computed from E(X). This estimator: - is unbiased - is normally distributed - is linear in Y - has large variance due to collinearity
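The E(X) approach above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: each mouse is assumed to carry exactly one strain (one-hot coding, so E(X) is just the matrix of strain probabilities), the probabilities are drawn from a Dirichlet distribution as a stand-in for the HMM output, and the sample size is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 2000, 8                      # hypothetical sample size; 8 strains as in the slides

# Hypothetical strain probabilities (rows sum to 1), standing in for the HMM output.
pi = rng.dirichlet(np.full(k, 0.5), size=n)

# Draw each mouse's unobserved strain from its own distribution (inverse-CDF sampling).
u = rng.random(n)
strain = np.minimum((pi.cumsum(axis=1) < u[:, None]).sum(axis=1), k - 1)

# Phenotypes with variance 0.3 and beta = (1,0,...,0), as in the simulated example.
beta_true = np.array([1.0, 0, 0, 0, 0, 0, 0, 0])
y = beta_true[strain] + rng.normal(0.0, np.sqrt(0.3), size=n)

# Under one-hot coding E(X) equals pi, so regress Y on pi directly (no intercept).
beta_hat, *_ = np.linalg.lstsq(pi, y, rcond=None)
print(np.round(beta_hat, 2))
```

Because E(Y | π) = πβ under these assumptions, the least-squares fit on π is unbiased, but the columns of π are strongly correlated, which is the source of the large variance noted on the slide.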

Approaches (2). Maximum likelihood estimator: maximize the likelihood with respect to β. - the expression simplifies - it is easy to evaluate point-wise - its functional form is not known, hence it is difficult to optimize - the properties of the MLE are unknown
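As a sketch of the objective referred to on this slide, and not necessarily the form the authors used: assume each mouse carries exactly one of 8 strains, the errors are normal with variance σ², and x_s denotes the design row for strain s (all three are assumptions introduced here). The marginal log-likelihood, with the unobserved strain summed out, could then be written as below; π_{is} is the strain probability of mouse i, while the π under the square root is the usual constant.

```latex
\ell(\beta, \sigma^2)
  = \sum_{i=1}^{n} \log \sum_{s=1}^{8} \pi_{is}\,
    \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left( -\frac{\left(Y_i - x_s^{\top}\beta\right)^2}{2\sigma^2} \right)
```

Each term is cheap to evaluate at a given β, but the logarithm of a sum does not separate, which is consistent with the slide's remark that direct optimization is difficult.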

Approaches (3). Use a stochastic optimiser for finding the MLE: the EM algorithm. The two steps are: E-step, calculate the posterior strain probabilities for the i-th mouse (only for categorical covariates); M-step, maximize Q with respect to β. Advantages: - automatic - fast - the approximate distribution of the estimates allows testing - easily generalised to GLMs

Implementation of the EM. The M-step becomes equivalent to a weighted least squares (WLS) or a weighted GLM fit (fitting routines are available in R and Matlab), where Y and X are augmented matrices and the weights matrix is constructed using the HMM output. Only results for normally distributed Y are shown below, but the EM was also applied to the binomial and exponential cases.

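Augmenting the model with corresponding weights. Given the phenotypes Y and the weights W, we create the augmented model, in which each phenotype is paired with its candidate design rows and the corresponding weights.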
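The augmented weighted least squares step can be illustrated with the following Python sketch. It is a minimal version under assumptions that are not in the slides: one-hot strain coding, normal errors, Dirichlet-distributed strain probabilities standing in for the HMM output, a hypothetical helper name `em_wls`, and a fixed number of iterations instead of a convergence check.

```python
import numpy as np

def em_wls(y, pi, n_iter=200):
    """EM for Y_i = beta[strain_i] + noise, strain_i unobserved with prior pi[i]."""
    n, k = pi.shape
    X_aug = np.tile(np.eye(k), (n, 1))   # augmented design: row (i, s) is the indicator of strain s
    y_aug = np.repeat(y, k)              # each phenotype repeated once per candidate strain
    beta, sigma2 = np.zeros(k), y.var()
    for _ in range(n_iter):
        # E-step: posterior probability of each strain for each mouse.
        fitted = (X_aug @ beta).reshape(n, k)
        log_w = np.log(pi) - 0.5 * (y[:, None] - fitted) ** 2 / sigma2
        log_w -= log_w.max(axis=1, keepdims=True)   # stabilise before exponentiating
        w = np.exp(log_w)
        w /= w.sum(axis=1, keepdims=True)
        # M-step: weighted least squares on the augmented model.
        w_aug = w.ravel()
        XtW = X_aug.T * w_aug
        beta = np.linalg.solve(XtW @ X_aug, XtW @ y_aug)
        sigma2 = np.sum(w_aug * (y_aug - X_aug @ beta) ** 2) / n
    return beta, sigma2

# Simulated data: variance 0.3 and beta = (1,0,...,0); the sample size is hypothetical.
rng = np.random.default_rng(1)
n, k = 2000, 8
pi = rng.dirichlet(np.full(k, 0.5), size=n)
u = rng.random(n)
strain = np.minimum((pi.cumsum(axis=1) < u[:, None]).sum(axis=1), k - 1)
beta_true = np.array([1.0, 0, 0, 0, 0, 0, 0, 0])
y = beta_true[strain] + rng.normal(0.0, np.sqrt(0.3), size=n)

beta_hat, sigma2_hat = em_wls(y, pi)
print(np.round(beta_hat, 2), round(float(sigma2_hat), 3))
```

The weight vector is rebuilt from the E-step at every iteration, while the augmented matrices are built once; this is why the M-step can be handed to any standard weighted regression routine.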

Simulated example: generated phenotypes. The response was generated with the variance set to 0.3 and β = (1,0,0,0,0,0,0,0).

Running the EM. Values of the β parameters at the EM iterations. The true values are (1,0,0,0,0,0,0,0).

Running time. 10 seconds: approximate running time for the WLS case - on 1,649 mice - implemented in Matlab - with convergence achieved at 15 iterations for some starting points. 60 seconds: for 3,298 mice.

Testing under collinearity. Likelihood ratio test performed for: - the EM - linear regression with the known design matrix - linear regression with the expectation of the design matrix.

Empirical null distributions. Panels: E(X) case null distribution; EM algorithm null distribution.
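An empirical null distribution of the likelihood ratio statistic can be simulated along the following lines. This sketch covers only the two linear regression cases (E(X) and known X), not the EM, and rests on assumptions that are not in the slides: the null hypothesis is taken to be "no strain effect" (intercept-only model), strains are one-hot coded, and the strain probabilities are Dirichlet draws standing in for the HMM output.

```python
import numpy as np

def lr_stat(design, y):
    """LR statistic: intercept-only model vs. full linear model, normal errors."""
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    rss1 = np.sum((y - design @ coef) ** 2)
    rss0 = np.sum((y - y.mean()) ** 2)
    return len(y) * np.log(rss0 / rss1)

rng = np.random.default_rng(2)
n, k, n_sim = 500, 8, 500            # hypothetical sizes
pi = rng.dirichlet(np.full(k, 0.5), size=n)
cdf = pi.cumsum(axis=1)

null_ex, null_x = [], []
for _ in range(n_sim):
    # Draw the unobserved strains, then phenotypes with no strain effect.
    strain = np.minimum((cdf < rng.random((n, 1))).sum(axis=1), k - 1)
    X = np.eye(k)[strain]                        # known one-hot design
    y = rng.normal(0.0, np.sqrt(0.3), size=n)
    null_ex.append(lr_stat(pi, y))               # E(X) case
    null_x.append(lr_stat(X, y))                 # known design matrix

# Empirical 5% critical values for the two cases.
crit_ex = np.quantile(null_ex, 0.95)
crit_x = np.quantile(null_x, 0.95)
print(round(float(crit_ex), 2), round(float(crit_x), 2))
```

The rows of both designs sum to one, so the intercept-only model is nested in each full model and the statistic is non-negative up to rounding.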

Power curves. Description of the power of the LR test: - all β’s are set to 0 except the first one - simulate data sets and plot the number of rejections - for each value of β, 500 simulations were performed.
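The power simulation described above can be sketched as follows for the E(X) regression case only. The assumptions are illustrative and not taken from the slides: one-hot strain coding, Dirichlet strain probabilities in place of the HMM output, an intercept-only null model, the asymptotic chi-square critical value with 7 degrees of freedom (14.067 at the 5% level), and 200 rather than 500 simulations per value of β to keep the run short. The helper names `lr_stat` and `power` are hypothetical.

```python
import numpy as np

CHI2_7_95 = 14.067   # 0.95 quantile of the chi-square distribution with 7 df

def lr_stat(design, y):
    """LR statistic: intercept-only model vs. full linear model, normal errors."""
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    rss1 = np.sum((y - design @ coef) ** 2)
    rss0 = np.sum((y - y.mean()) ** 2)
    return len(y) * np.log(rss0 / rss1)

def power(beta1, pi, rng, n_sim=200):
    """Fraction of simulated data sets in which the LR test rejects."""
    n, k = pi.shape
    cdf = pi.cumsum(axis=1)
    beta = np.zeros(k)
    beta[0] = beta1                  # all effects zero except the first one
    rejections = 0
    for _ in range(n_sim):
        strain = np.minimum((cdf < rng.random((n, 1))).sum(axis=1), k - 1)
        y = beta[strain] + rng.normal(0.0, np.sqrt(0.3), size=n)
        rejections += int(lr_stat(pi, y) > CHI2_7_95)
    return rejections / n_sim

rng = np.random.default_rng(3)
pi = rng.dirichlet(np.full(8, 0.5), size=500)   # hypothetical design, held fixed
grid = [0.0, 0.5, 1.0]
curve = [power(b, pi, rng) for b in grid]
print(dict(zip(grid, curve)))
```

Plotting `curve` against `grid` on a finer grid gives a power curve of the kind the slide describes; at β = 0 the rejection rate estimates the size of the test.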

Simulated power curves. Panels: most likely combination of progenitor strains; randomly drawn combination of progenitor strains; least likely combination of progenitor strains.

Data. We considered the OpenArmTime phenotype. ~200 mice have zero records and were removed. Is it a mixture of normal distributions?

Future development. - Time-to-event models (censored data, Cox proportional hazards model) - Bayesian models - Implementation in R - Models for multivariate phenotypes - Multiple hypothesis testing - HMM improvement
