Linear statistical models 2008. Model diagnostics: residual analysis; outliers; dependence; heteroscedasticity; violations of distributional assumptions.

1 Linear statistical models 2008. Model diagnostics: residual analysis; outliers; dependence; heteroscedasticity; violations of distributional assumptions; identification of influential observations; examination of over- and under-dispersion.

2 A simple model of water clarity. Inputs: year, temperature, salinity, station dummies. Output: Secchi depth (water clarity).

3 Sampling sites for water quality in the Stockholm archipelago. [Map; labels: Stockholm, Baltic Sea]

4 Raw residuals in generalized linear models. The predicted values are linear combinations of the observed values, i.e. $\hat{y} = Hy$, where $H$ is a symmetric idempotent matrix ($H = HH$). The vector of raw residuals can be written $e = y - \hat{y} = (I - H)y$. In contrast to residuals in general linear models, the raw residuals in generalized linear models (GLIMs) may have a variance that is strongly related to the size of the fitted value $\hat{y}$.
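
The hat-matrix properties for a general linear model can be checked numerically. The following is a minimal sketch, assuming an ordinary least-squares fit with an intercept and one covariate; the data are made up for illustration and do not come from the slides.

```python
import numpy as np

# Hypothetical data: intercept plus one covariate (illustrative values only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

# Hat matrix of the general linear model: H = X (X'X)^{-1} X'.
H = X @ np.linalg.solve(X.T @ X, X.T)

y_hat = H @ y                     # predicted values, y_hat = H y
e = (np.eye(len(y)) - H) @ y      # raw residuals, e = (I - H) y

is_symmetric = np.allclose(H, H.T)
is_idempotent = np.allclose(H, H @ H)
```

Both flags should be true, and the trace of `H` equals the number of fitted parameters (two here).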

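5 Pearson residuals. The Pearson residual is the raw residual standardized with the standard deviation estimated at the fitted value $\hat{\mu}_i$: $e_{i,P} = (y_i - \hat{\mu}_i)/\sqrt{V(\hat{\mu}_i)}$, where $V(\cdot)$ is the variance function. Special cases: for Poisson models, $e_{i,P} = (y_i - \hat{\mu}_i)/\sqrt{\hat{\mu}_i}$; for binomial models, $e_{i,P} = (y_i - n_i\hat{p}_i)/\sqrt{n_i\hat{p}_i(1 - \hat{p}_i)}$.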
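
The two special cases can be written as small helper functions. This is a sketch only: the function names and the input values are hypothetical, and the fitted means and probabilities are assumed to come from a model that has already been fitted.

```python
import numpy as np

def pearson_poisson(y, mu):
    """Pearson residuals for a Poisson model, where Var(Y) = mu."""
    y, mu = np.asarray(y, float), np.asarray(mu, float)
    return (y - mu) / np.sqrt(mu)

def pearson_binomial(y, n, p):
    """Pearson residuals for binomial counts y out of n with fitted probability p."""
    y, n, p = (np.asarray(a, float) for a in (y, n, p))
    return (y - n * p) / np.sqrt(n * p * (1.0 - p))

# Hypothetical observations and fitted values.
r_pois = pearson_poisson([3, 9], [4.0, 9.0])
r_bin = pearson_binomial([6], [10], [0.5])
```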

6 Adjusted Pearson residuals. The Pearson residual $e_{i,P}$ can be adjusted by computing $e_{i,adjP} = e_{i,P}/\sqrt{1 - h_{ii}}$, where $h_{ii}$ is the $i$th diagonal element of the 'hat' matrix $H$. The adjusted Pearson residuals can often be assumed to be approximately standard normal.
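
To make the adjustment concrete, the sketch below fits a Poisson model with a log link by iteratively reweighted least squares and then computes $h_{ii}$ and the adjusted Pearson residuals. Several things here are assumptions rather than content of the slides: the counts are invented, the helper `fit_poisson_irls` is hypothetical, and the weighted form of the hat matrix, $H = W^{1/2}X(X'WX)^{-1}X'W^{1/2}$, is the standard one for generalized linear models.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=100, tol=1e-10):
    """Fit a Poisson log-link model by iteratively reweighted least squares."""
    mu = y + 0.5                 # starting values that keep log(mu) finite
    eta = np.log(mu)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = eta + (y - mu) / mu  # working response for the log link
        XtW = X.T * mu           # working weights equal mu for the log link
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, mu

# Illustrative counts only (not the Secchi-depth data from the slides).
x = np.arange(9, dtype=float)
X = np.column_stack([np.ones_like(x), x])
y = np.array([2.0, 3.0, 6.0, 7.0, 8.0, 9.0, 10.0, 12.0, 15.0])

beta, mu = fit_poisson_irls(X, y)

# Weighted hat matrix with W = diag(mu) for the Poisson log-link model.
Xw = X * np.sqrt(mu)[:, None]
H = Xw @ np.linalg.solve(Xw.T @ Xw, Xw.T)
h = np.diag(H)

e_p = (y - mu) / np.sqrt(mu)      # Pearson residuals
e_adj = e_p / np.sqrt(1.0 - h)    # adjusted Pearson residuals
```

Because every $h_{ii}$ lies between 0 and 1, the adjusted residuals are never smaller in absolute value than the unadjusted ones.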

7 Deviance. The deviance is defined as $D = 2\,[\ell(y; y) - \ell(\hat{\mu}; y)]$, where $\ell(y; y)$ is the log likelihood of the full (saturated) model and $\ell(\hat{\mu}; y)$ is the log likelihood of the current model at the ML estimates of its parameters. The deviance is a sum of the contributions $d_i$ from each of the observations: $D = \sum_i d_i$.

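8 Deviance residuals. The (unadjusted) deviance residuals are defined as $e_{i,D} = \mathrm{sign}(y_i - \hat{\mu}_i)\sqrt{d_i}$, where $d_i$ is the contribution of observation $i$ to the deviance. The adjusted deviance residuals are defined as $e_{i,adjD} = e_{i,D}/\sqrt{1 - h_{ii}}$, where $h_{ii}$ is the $i$th diagonal element of the 'hat' matrix $H$.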
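
For a Poisson model, the deviance contribution of observation $i$ is $d_i = 2\,[y_i \log(y_i/\hat{\mu}_i) - (y_i - \hat{\mu}_i)]$, with the first term taken as zero when $y_i = 0$. The sketch below computes the contributions, the total deviance, and both kinds of deviance residuals. The function names, the data, and the leverages `h` are hypothetical, chosen only to illustrate the arithmetic.

```python
import numpy as np

def poisson_deviance_contributions(y, mu):
    """d_i = 2 * [y_i * log(y_i / mu_i) - (y_i - mu_i)], with y*log(y/mu) = 0 at y = 0."""
    y, mu = np.asarray(y, float), np.asarray(mu, float)
    ratio = np.where(y > 0, y / mu, 1.0)   # log(1) = 0 handles y = 0
    return 2.0 * (y * np.log(ratio) - (y - mu))

def deviance_residuals(y, mu):
    """Signed square roots of the deviance contributions."""
    y, mu = np.asarray(y, float), np.asarray(mu, float)
    d = poisson_deviance_contributions(y, mu)
    return np.sign(y - mu) * np.sqrt(np.maximum(d, 0.0))

# Hypothetical observations, fitted means and leverages.
y = np.array([0.0, 3.0, 9.0])
mu = np.array([1.0, 4.0, 9.0])
h = np.array([0.2, 0.3, 0.5])

d = poisson_deviance_contributions(y, mu)
D = d.sum()                              # total deviance
e_dev = deviance_residuals(y, mu)        # unadjusted deviance residuals
e_dev_adj = e_dev / np.sqrt(1.0 - h)     # adjusted deviance residuals
```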

9 Score residuals. The score equations involve sums of terms $U_i$, one for each observation. Properly standardized, these terms can be regarded as residuals.

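10 Approximate likelihood residuals. Likelihood residuals may, in principle, be computed by comparing the deviance for a model based on all observations with the deviance for a model based on all but the $i$th observation. An approximation of these residuals is given by the formula $e_{i,L} = \mathrm{sign}(y_i - \hat{\mu}_i)\sqrt{h_{ii}\,e_{i,S}^2 + (1 - h_{ii})\,e_{i,adjD}^2}$, where $e_{i,S}$ is the score residual, $e_{i,adjD}$ is the adjusted deviance residual, and $h_{ii}$ is the $i$th diagonal element of the 'hat' matrix.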
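
One commonly cited approximation combines the score and adjusted deviance residuals as a leverage-weighted average of their squares. The sketch below assumes that form; it is not necessarily the exact formula shown on the slide, and the function name and input values are hypothetical.

```python
import numpy as np

def likelihood_residuals(y, mu, e_score, e_dev_adj, h):
    """Approximate likelihood residuals: sign(y - mu) * sqrt(h*e_S^2 + (1-h)*e_D^2)."""
    y, mu, e_score, e_dev_adj, h = (
        np.asarray(a, float) for a in (y, mu, e_score, e_dev_adj, h)
    )
    magnitude = np.sqrt(h * e_score**2 + (1.0 - h) * e_dev_adj**2)
    return np.sign(y - mu) * magnitude

# Hypothetical residuals and leverages for two observations.
e_lik = likelihood_residuals(
    y=[5.0, 1.0], mu=[3.0, 2.0],
    e_score=[2.0, -1.0], e_dev_adj=[1.0, -1.0], h=[0.25, 0.5],
)
```

With low leverage the result is close to the adjusted deviance residual; with high leverage it moves towards the score residual.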

11 Choice of residuals. Types of residuals: Pearson residuals; deviance residuals; score residuals; likelihood residuals. Tests: likelihood ratio test; Wald tests; score tests.

12 Influential observations. The leverage (influence) of observation $i$ on the fitted value $\hat{y}_i$ is the derivative of this estimate with respect to $y_i$, i.e. $\partial\hat{y}_i/\partial y_i$. Because $\hat{y} = Hy$, these derivatives are given by the diagonal elements $h_{ii}$ of the 'hat' matrix $H$.
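
The derivative interpretation of leverage can be checked numerically for a general linear model. This is a minimal sketch with made-up data, in which the last observation lies far from the others in the covariate: it perturbs one response value, refits, and compares the change in the fitted value with the matching diagonal element of the hat matrix.

```python
import numpy as np

# Hypothetical data; the last x value is far from the rest, so it has high leverage.
x = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.0, 2.1, 2.9, 4.2, 10.5])

H = X @ np.linalg.solve(X.T @ X, X.T)
leverage = np.diag(H)             # h_ii

def fitted(y_vec):
    """Least-squares fitted values for the response vector y_vec."""
    beta, *_ = np.linalg.lstsq(X, y_vec, rcond=None)
    return X @ beta

i = 4                             # index of the observation to perturb
delta = 1e-3
y_pert = y.copy()
y_pert[i] += delta

# Finite-difference estimate of the derivative of fitted value i w.r.t. y_i.
numeric_derivative = (fitted(y_pert)[i] - fitted(y)[i]) / delta
```

Since the fit is linear in `y`, the finite difference agrees with `leverage[i]` up to rounding error.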

13 Cook's distance. Cook's distance measures the combined change in all parameter estimates when observation $i$ is omitted.
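
Refitting the model without each observation in turn is expensive, so a one-step approximation based on the Pearson residuals and the leverages is often used: $C_i \approx e_{i,adjP}^2\,h_{ii}/(p\,(1 - h_{ii}))$, where $p$ is the number of parameters. The sketch below assumes this approximation with the scale parameter fixed at 1. It may differ from the formula on the original slide, and the function name and inputs are hypothetical.

```python
import numpy as np

def cooks_distance(e_pearson, h, n_params):
    """One-step approximation of Cook's distance (scale parameter assumed to be 1)."""
    e_pearson, h = np.asarray(e_pearson, float), np.asarray(h, float)
    e_adj = e_pearson / np.sqrt(1.0 - h)          # adjusted Pearson residuals
    return e_adj**2 * h / (n_params * (1.0 - h))

# Two hypothetical observations with the same Pearson residual but different leverage.
c = cooks_distance([1.0, 1.0], [0.5, 0.1], n_params=2)
```

The example shows that, for equal residuals, the observation with the higher leverage gets the larger Cook's distance.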

14 Over-dispersion. Over-dispersion occurs when the variance of the response is larger than would be expected for the chosen distribution. Example: in a model involving Poisson distributions, over-dispersion is present when the estimated variance is considerably larger than the estimated mean.

15 Possible causes of over-dispersion. Lack of homogeneity (the distribution of the target variable varies within experiments that are assumed to be replicates); dependence (the response levels in experiments assumed to be replicates are actually positively correlated).
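
Lack of homogeneity can be illustrated by simulation. The sketch below compares Poisson counts with a common mean against counts whose means vary between supposed replicates. The gamma distribution for the means and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Homogeneous replicates: every count has mean 4.
y_homog = rng.poisson(lam=4.0, size=n)

# Heterogeneous replicates: the mean itself varies (gamma with mean 4, variance 8).
lam = rng.gamma(shape=2.0, scale=2.0, size=n)
y_mixed = rng.poisson(lam=lam)

# Variance-to-mean ratios; a Poisson distribution has ratio 1.
ratio_homog = y_homog.var() / y_homog.mean()
ratio_mixed = y_mixed.var() / y_mixed.mean()
```

In theory the mixed counts have variance 4 + 8 = 12 and mean 4, so their ratio should be near 3, while the homogeneous counts stay near 1.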

16 Modelling over-dispersion. Introduce an extra scale parameter $\phi$ in the variance function of the response $Y$. Note that the variance is a function of the mean for all members of the exponential family.
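
A common way to estimate such a scale parameter in a Poisson-type model is to divide the Pearson chi-square statistic by the residual degrees of freedom, so that $\mathrm{Var}(Y) = \phi\mu$ with $\hat{\phi} = \sum_i e_{i,P}^2/(n - p)$. The sketch below assumes this estimator; the function name, the counts, and the fitted means are hypothetical.

```python
import numpy as np

def pearson_dispersion(y, mu, n_params):
    """Estimate the scale parameter as Pearson chi-square / (n - p) for a Poisson-type model."""
    y, mu = np.asarray(y, float), np.asarray(mu, float)
    chi2 = np.sum((y - mu) ** 2 / mu)
    return chi2 / (len(y) - n_params)

# Hypothetical counts and fitted means from a two-parameter model.
y = np.array([0, 7, 1, 12, 2, 14])
mu = np.array([4.0, 4.0, 4.0, 9.0, 9.0, 9.0])

phi_hat = pearson_dispersion(y, mu, n_params=2)
```

A value of `phi_hat` well above 1 indicates over-dispersion. Standard errors from the unscaled model would then typically be multiplied by the square root of `phi_hat`.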

17 Software recommendations. General linear models: MINITAB. Generalized linear models: SAS, proc GENMOD.

