Mixed models: Various types of models and their relation


Mixed models
- Various types of models and their relation
- Mixed effect models: simple case
- Mixed effect models: estimation of parameters
- Tests of hypothesis
- R functions for mixed effect models

Various forms of models and the relation between them

Classical statistics (observations are random, parameters are unknown constants):

- LM (linear model): assumptions are 1) observations are independent, 2) the distribution is normal, 3) the coefficients are constant and unknown
- Repeated measures: assumptions 1) and 3) are modified
- LMM (linear mixed model): assumptions 1) and 3) are modified
- GLMM (generalised linear mixed model): assumption 2) is modified (exponential family) and assumptions 1) and 3) are modified
- GLM (generalised linear model): assumption 2) is modified (exponential family)
- Time series
- Maximum likelihood: all assumptions can be modified
- NLM (non-linear models): can be applied to all of the above

Conceptual difference: in Bayesian statistics, coefficients as well as observations are random.

Mixed effect models: motivation In linear and generalised linear models we assumed that 1) observations are independent of each other and have the same variance; 2) the distribution is normal; 3) the parameters are constant. In the linear model case: y = Xβ + ε, where ε has the normal distribution N(0, σ²I) and β is a vector of unknown constants. This type of model is called a fixed effect model. The topic of the last lecture (Lecture 10: Generalised linear models) was the effect of removing one of these assumptions, namely the assumption that the observations come from a population with a normal distribution. What happens if we remove assumptions 1) and 3)? Then the problem becomes more complicated: in general we need n(n+1)/2 parameters to describe the covariance structure of n observations (the number of free elements of a symmetric n x n matrix). Mixed effect models deal with this type of problem. In general, models of this type bring classical statistics to a new level and allow us to tackle such problems as clustered data, repeated measures and hierarchical data.

Mixed effect models: Example Let us assume that we have a clinical trial of a drug. We want to test the effect of different doses of the drug, and we are interested in these dose levels only. We randomly take n persons and give each of them one of the doses. Then the result of the experiment can be written:

y_ij = μ + α_i + ε_ij

where i labels the dose and j the person, μ is the average effect of the drug, α_i is the effect specific to the i-th dose, and ε_ij is the error. Our interest lies in the effects of these doses and these doses alone. This type of model is a fixed effect model. Now let us assume these doses were tested in 20 different clinics, and the clinics were chosen randomly. Then we can write the model:

y_ijk = μ + α_i + b_j + c_ij + ε_ijk

where i labels the dose, j the clinic and k the patient. Since the doses are the only doses we are interested in, they are fixed; the 20 clinics have been chosen randomly from the population of all clinics, so they are random. We cannot guarantee that the effects of clinic and dose are additive, which is why we add c_ij, the interaction between clinics and doses. Since clinics are random, c_ij must be random also. This is an example of a mixed effect model. To solve the problem we need to estimate the overall effect (μ), the effects of dose (α_i) and the distributions of the clinic effects (the distributions of b and c). A sketch of how such a model could be fitted in R is given below.
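As an illustration, here is a minimal sketch of fitting the dose/clinic model with the nlme library used later in this lecture; the data frame trial and its columns response, dose and clinic are hypothetical, not from the original example.

library(nlme)
# Hypothetical data frame 'trial' with columns: response, dose (factor), clinic (factor)
fit <- lme(response ~ dose,            # fixed effects: mu and alpha_i
           random = ~ 1 | clinic/dose, # random clinic effect b_j and clinic-by-dose interaction c_ij
           data = trial)
summary(fit)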

Fixed or random It is often a challenging problem to decide whether an effect should be treated as fixed or random. For example, in the drug and clinics case: if we are going to use these drugs in all clinics (in case of successful results), then we should consider the clinics as random; but if the drugs are very expensive and specialised and are going to be used only in these clinics, then we cannot consider the clinics as random, and they should be treated as fixed. Sometimes the choice between random and fixed is dictated by the amount of data and information we have. If we have enough data to make inferences about the population, then we can use mixed effect models. If we do not have enough data, then we can make inferences only about the particular levels (e.g. doses of drugs, individual clinics) of the variable of interest.

Mixed effect models: Simple model Let us consider the model:

y_ij = μ + a_i + βx_j + ε_ij

where μ is the overall intercept, β is a constant coefficient on x (it describes the dependence of y on x), a_i is a random intercept specific to group i and ε_ij is the random error. Let us assume that the distribution of ε is N(0, σ²) and that the a_i are independently and identically distributed (i.i.d.) random variables with distribution N(0, σ_a²). Now we can write the moments of the distribution of y:

E(y_ij) = μ + βx_j
Var(y_ij) = σ_a² + σ²
Cov(y_ij, y_ij') = σ_a²
Cov(y_i'j, y_ij') = 0 for i' ≠ i

We see that only two parameters are sufficient to describe the whole covariance structure of the observations. Now we can write the multivariate normal distribution for the joint probability distribution of the observations; the sketch below illustrates the implied covariance matrix.
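A minimal sketch of the compound-symmetric covariance structure this model implies, for one group of four observations (the variance values are arbitrary, chosen only for illustration):

sigma_a <- 1.5                    # standard deviation of the random intercepts a_i
sigma   <- 0.5                    # standard deviation of the errors eps_ij
n <- 4
V <- matrix(sigma_a^2, n, n)      # off-diagonal elements: Cov(y_ij, y_ij') = sigma_a^2
diag(V) <- sigma_a^2 + sigma^2    # diagonal elements: Var(y_ij) = sigma_a^2 + sigma^2
V                                 # compound-symmetric covariance matrix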

Mixed effect models: Simple model If we use the notation V for the covariance matrix of the observations, then we can write the distribution of the observations and therefore the likelihood:

L(y | μ, β, σ, σ_a) = N(μ + βx, V)

Now the problem is to estimate the parameters by maximising this likelihood function. The problem is usually solved iteratively: estimate the parameters involved in the mean assuming V is constant, then estimate the parameters involved in V, and repeat.
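In nlme this maximisation is carried out by lme(); whether the full likelihood or the restricted (REML) likelihood is maximised is chosen with the method argument. A sketch using the Orthodont data that appears later in this lecture:

library(nlme)
data(Orthodont)
fit_reml <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject, method = "REML")
fit_ml   <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject, method = "ML")
logLik(fit_reml)   # maximised restricted log-likelihood
logLik(fit_ml)     # maximised full log-likelihood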

Mixed effect models: Tests of hypothesis There are a number of hypotheses that can be tested: 1) hypotheses involving the parameters included in the mean, μ and β; 2) hypotheses about the parameters included in the covariance part V, e.g. σ_a = 0. For these tests the likelihood ratio test is used. In this particular case both tests, after some manipulation, reduce to an F statistic. A likelihood ratio comparison in R could look like the sketch below.
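A sketch of a likelihood ratio test in R: nested models are compared with anova(), and for tests of fixed effects both models should be fitted by maximum likelihood (method = "ML"):

library(nlme)
data(Orthodont)
m0 <- lme(distance ~ 1,   data = Orthodont, random = ~ 1 | Subject, method = "ML")
m1 <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject, method = "ML")
anova(m0, m1)   # likelihood ratio test of beta (coefficient on age) = 0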

General linear mixed effect models A general mixed effect model can be written:

y = Xβ + Zu + ε

where u is a random variable with distribution N(0, D), ε has distribution N(0, σ²I) and β is fixed. Then we can write:

E(y) = Xβ
V(y) = ZDZ^T + σ²I

So if the distribution is normal, we can build the joint probability distribution of all observations and therefore the likelihood function. Note that the fixed effects are involved only in the mean values (just as in the linear model); the random effects modify the covariance matrix of the observations, which is no longer diagonal, meaning that the observations are dependent on each other. The above equations are the general form of the linear mixed effect model.
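The covariance structure can be built explicitly. A minimal sketch for a toy design of six observations in two groups of three, with a single random intercept per group (values chosen only for illustration):

Z <- kronecker(diag(2), matrix(1, 3, 1))    # 6 x 2 random-effects design matrix
D <- diag(1.5^2, 2)                         # Var(u): independent group intercepts
sigma <- 0.5
V <- Z %*% D %*% t(Z) + sigma^2 * diag(6)   # V(y) = Z D Z^T + sigma^2 I
V                                           # block-diagonal: correlation only within groups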

Simpler forms of linear mixed effect models If the structure of the data is known, then it is possible to simplify the covariance of the model described above, for example when we have two groups of random effects that are not dependent on each other. Let us assume we want to analyse the performance of pupils in maths. We take n schools, in each school k classes, and in each class l boys and m girls. In the model we would include one constant parameter for boys and one for girls (since these are the only two options); then we would take the random effect of schools (we are interested in all schools) and of classes within these schools (we are interested in all classes in each school). Now it is reasonable to assume that there is no correlation between classes and schools: if a class does not belong to a school there is no obvious source of correlation, and if a class is in the school then, since the school is treated as a random effect, the correlation between the classes and the school is absorbed by the covariance of the school effect. So we have a variance-covariance for schools and another for classes. Thinking about the system in this way considerably simplifies the model we want to build; see the sketch below.
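A sketch of the pupils example with nlme; the data frame pupils and its columns score, gender, school and class are hypothetical names for illustration only.

library(nlme)
fit <- lme(score ~ gender,               # one constant parameter per gender
           random = ~ 1 | school/class,  # class nested within school, no school-class correlation
           data = pupils)
VarCorr(fit)                             # variance components: schools and classes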

Predicting random effects In mixed models we estimate the parameters of the fixed effects and the distribution of the random effects. Sometimes it is also interesting to predict the random effects themselves. The expressions for the fixed effect coefficients and for the so-called best linear unbiased prediction (BLUP) are:

β_est = (X^T V^-1 X)^-1 X^T V^-1 y
u_predict = D Z^T V^-1 (y - X β_est) = D Z^T V^-1 (I - X (X^T V^-1 X)^-1 X^T V^-1) y
var(u_predict) = D Z^T V^-1 (I - X (X^T V^-1 X)^-1 X^T V^-1) Z D

Using these facts one can design tests of hypotheses and confidence intervals for u.
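In nlme the predictions u_predict are returned by ranef() and the fixed effect estimates by fixef(); a sketch, again with the Orthodont data:

library(nlme)
data(Orthodont)
fit <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject)
fixef(fit)    # beta_est, the estimated fixed effects
ranef(fit)    # u_predict, the BLUPs of the random intercepts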

R commands for linear mixed models Commands for linear mixed models are in the library nlme:

library(nlme)
data(Orthodont)
lm1 <- lme(distance ~ age + Sex, data = Orthodont,
           random = ~ 1 | Subject)   # random intercept for each subject
lm1
summary(lm1)
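A few further commands that are often useful for a fitted lme object (applied to lm1 above):

intervals(lm1)   # approximate confidence intervals for all parameters
anova(lm1)       # F tests for the fixed effects
ranef(lm1)       # predicted (BLUP) random effects
plot(lm1)        # residual diagnostics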

References
Demidenko E (2004) Mixed Models: Theory and Applications.
McCulloch CE, Searle SR (2001) Generalized, Linear, and Mixed Models.

Exercise Take the data set esoph from R and analyse it using a generalised linear model. Hints on how to analyse this data set are at the end of its help page:

?esoph
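One possible starting point, a sketch that mirrors the example at the end of the help page (not necessarily the intended solution):

data(esoph)
model <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp * alcgp,
             data = esoph, family = binomial())   # binomial GLM for case/control counts
summary(model)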