# If we use a logistic model, we do not have the problem of suggesting risks greater than 1 or less than 0 for some values of X: E[1{outcome = 1} ] = exp(a+bX)/


If we use a logistic model, we do not have the problem of suggesting risks greater than 1 or less than 0 for some values of X: E[1{outcome = 1} ] = exp(a+bX)/ [1 + exp(a+bX) ]

The logistic model is a linear model, but on a different scale (the log odds) than the linear risk model: log( Pr(outcome=1) / [1 – Pr(outcome=1)] ) = a + bX
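As a small illustration of the two scales, the sketch below evaluates a linear risk model and a logistic model at several values of X. The coefficients `A` and `B` are hypothetical values chosen only to make the contrast visible; they do not come from the slides.

```python
import math


def linear_risk(a, b, x):
    """Linear risk model: risk = a + b*x (can fall outside [0, 1])."""
    return a + b * x


def logistic_risk(a, b, x):
    """Logistic model: risk = exp(a + b*x) / (1 + exp(a + b*x))."""
    z = a + b * x
    return math.exp(z) / (1.0 + math.exp(z))


def log_odds(p):
    """Logit transform: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients, used for both models for illustration only.
A, B = 0.1, 0.2

for x in (-10, 0, 3, 10):
    p = logistic_risk(A, B, x)
    # The log odds of the logistic risk recover the linear predictor a + b*x.
    print(x, round(linear_risk(A, B, x), 3), round(p, 4), round(log_odds(p), 3))
```

The linear risk leaves the interval [0, 1] at the extreme values of X, while the logistic risk stays inside it and its log odds are linear in X.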

Ordinal outcomes:

Ungrouped Poisson Regression

- Typically the Poisson model is applied to aggregated units of observation (groups or strata) for which total counts, total units of observation, and group-level covariates are recorded.
- Collapsing covariates into group-level covariates can introduce bias and lose information.
- Ungrouped Poisson regression methods have been developed that use individual, time-varying covariate information to estimate the effects of covariates on rates.
- The estimates are similar to those from proportional hazards models.

Ungrouped Poisson Regression Loomis et al (2005) Poisson regression analysis of ungrouped data. Occup Environ Med 62:325-329.


Survival Analyses

- Survival analysis is a set of statistical techniques whose goal is to predict the time of (or time until) an event.
- The dependent variable is the time to occurrence of a specific event (t_e), measured from some starting time (t_0).
- An event is a qualitative change in some attribute.
- People who do not have the event during follow-up are said to be censored.
- The independent variable may be a treatment, a clinical characteristic, a demographic characteristic, or some other predictor of survival.
  - Examples include different treatments, high/low blood pressure, or treating hospital.

Survival Analyses: Examples

- Predicting the life expectancy of a group
  - Do smokers die younger than non-smokers?
  - The event is death.
- Measuring the efficacy of a treatment
  - Does AZT delay the onset of Pneumocystis carinii pneumonia in people who are HIV positive?
  - The event is pneumonia diagnosis.
- Measuring the role of multiple predictors on the time until an outcome
  - Can you predict time to conception, using many predictors, in couples undergoing fertility treatment?
  - The event is conception.

Survival Analyses: Examples (continued)

Survival analysis deals with special problems in each of the previous studies.

- The smoking study
  - The subjects most likely began smoking at different ages, and you need to account for the various years at risk.
  - Some subjects may disappear or die from trauma.
- The AZT study
  - The exact date of exposure to the virus is rarely known.
  - Time periods may exist where participants left for a different clinical trial and then returned.
- The pregnancy study
  - Some couples may never get pregnant.

Censoring

If you create a timeline for each subject, several kinds of observations can occur:

- Uncensored observation: the event under observation occurs, and the recorded time truly reflects the survival time.
- Censored observation: the event under observation does not occur, and the recorded time reflects a minimum survival time.
- Random censoring: loss to follow-up for random reasons.
- Right censoring: people who never have the event during follow-up.
- Interval censoring: people who have missing data across a chunk of time.
  - For example, a subject drops out and then rejoins.
- Left censoring: people who are lacking a good start time.
- Informative censoring: some people may leave the study for reasons that relate to failure.

Survival function S(t) = P(T > t): probability of surviving at least to time t

Hazard function h(t) = lim_{Δt → 0} P(t ≤ T < t + Δt | T ≥ t) / Δt: the instantaneous event rate at time t among those still at risk at t

Kaplan Meier Estimation of Survival Function
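A minimal sketch of the Kaplan-Meier (product-limit) estimate of S(t), assuming right-censored data given as a follow-up time and an event indicator per subject. The function name `kaplan_meier` and the six-subject dataset are hypothetical and serve only as an illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Events observed exactly at time t.
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        # Multiply by the conditional probability of surviving past t.
        surv *= 1.0 - d / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical data: 0 marks a censored subject.
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(t, round(s, 4))
```

Censored subjects contribute to the number at risk up to their censoring time but never trigger a drop in the curve, which is how the estimator uses their "minimum survival time".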

Cox Proportional Hazards model

Analogous to other regressions such as linear or logistic regression

- Based upon the hazard function: h(t | X) = h0(t) exp(a + bX)
- h(t | X=1) / h(t | X=0) = exp(b), so exp(b) is the hazard ratio for a unit increase in X
- Assumes the hazard ratio is unchanged with respect to time

Gordon et al (1984) Coronary risk factors and exercise test performance in asymptomatic hypercholesterolemic men: Application of proportional hazards analysis. Am J Epidemiol 120(2):210-214

Model Specification, Fitting, and Selection

Specification: What is the functional FORM of the relationship between Y and X? For example: E[Y | X0, X1] = a + b0*X0 + b1*X1

Fitting: Using data to estimate the various constants in the generic functional form of a model.

Selection: You may be able to specify several reasonable models. Then the task is to select which of the models to emphasize or report. You may use the data to help select a model:

* specify several models
* fit the models to your data
* examine the quality of the fit (what this means depends on how you plan to use the model and the method used to fit the model)
* select a “good” model

It may be difficult to interpret or trust p-values or effect estimates from models chosen by a selection procedure.

Model Fitting

There are many methods to fit statistical models. Consider a simple model: Y = a + bX + e, so that E[Y|X] = a + bX. We need to estimate “a” and “b” from data.

- Least Squares: find a’, b’ to minimize sum( (yi – a’ – b’xi)^2 )
  - relatively fast
  - does not depend on the distribution of the errors (e)
- Maximum Likelihood: assume a distribution for the errors (e)
  - find a’, b’ to maximize product( likelihood(yi – a’ – b’xi) )
  - equivalently, find a’, b’ so that sum( derivative of log( likelihood(yi – a’ – b’xi) ) ) = 0
- Estimating Equations: define a function similar to sum( derivative of log( likelihood() ) ) and find a’, b’ that set it equal to 0
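The fitting methods above can be sketched for the simple model Y = a + bX + e. The code below computes the closed-form least squares estimates and then checks that they also solve the score equations obtained by assuming normally distributed errors, which is the case where least squares and maximum likelihood coincide. The simulated data (true a = 1, b = 2) and all function names are assumptions made for illustration.

```python
import random


def least_squares(xs, ys):
    """Closed-form least squares estimates (a', b') for y = a + b*x + e."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rss(a, b, xs, ys):
    """Residual sum of squares: sum( (yi - a - b*xi)^2 )."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))


def normal_score(a, b, xs, ys):
    """Derivative of the normal log-likelihood with respect to (a, b),
    up to the constant factor 1/sigma^2."""
    resid = [y - a - b * x for x, y in zip(xs, ys)]
    return sum(resid), sum(r * x for r, x in zip(resid, xs))


# Simulated data under hypothetical true values a = 1, b = 2.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.5) for x in xs]

a_hat, b_hat = least_squares(xs, ys)
score_a, score_b = normal_score(a_hat, b_hat, xs, ys)
print("estimates:", round(a_hat, 3), round(b_hat, 3))
print("score at the estimates:", score_a, score_b)  # both numerically zero
```

With a different assumed error distribution the score function changes, and the maximum likelihood estimates generally no longer equal the least squares estimates.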

Model Selection

Models that fit the data well will have lower values of sum( (yi – a’ – b’xi)^2 ), called the residual sum of squares (RSS) or sum of squared errors (SSE), or greater likelihoods or log-likelihoods, relative to models that fit the data less well.

Adding covariates to a model will lower the RSS or increase the log-likelihood (at worst, it leaves them unchanged). Adding irrelevant covariates will improve the model fit to the data, but probably not by much, while decreasing the ability of the model to describe a replicate dataset or the population from which the data were collected. This is the over-fitting problem.

Model selection is a tradeoff between fitting the data well and over-fitting the data. Likelihood ratio tests can help determine whether additional covariates help (fit the data) more than they hurt (over-fitting).
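A small simulation can illustrate the over-fitting tradeoff. In the sketch below the outcome does not depend on the covariate at all, yet adding the covariate still cannot raise the in-sample RSS. The likelihood ratio statistic uses the form n * log(RSS0 / RSS1), which holds for nested linear models with normal errors, and is compared with 3.841, the 95th percentile of a chi-square distribution with 1 degree of freedom. The data, seed, and names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least squares estimates (a', b') for y = a + b*x + e."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rss(a, b, xs, ys):
    """Residual sum of squares for the line a + b*x."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))


def simulate(n, rng):
    """The covariate x is irrelevant: y is generated without using it."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [5.0 + rng.gauss(0, 1) for _ in range(n)]
    return xs, ys


rng = random.Random(1)
n = 40
xs, ys = simulate(n, rng)

a0 = sum(ys) / n               # intercept-only model (slope fixed at 0)
a1, b1 = fit_line(xs, ys)      # model with the irrelevant covariate added
rss0 = rss(a0, 0.0, xs, ys)
rss1 = rss(a1, b1, xs, ys)

lr_stat = n * math.log(rss0 / rss1)   # likelihood ratio statistic, 1 df
CHI2_95_1DF = 3.841
print("in-sample RSS:", round(rss0, 3), round(rss1, 3))
print("LR statistic:", round(lr_stat, 3), "exceeds cutoff:", lr_stat > CHI2_95_1DF)

# Apply both fitted models, unchanged, to a replicate dataset.
new_xs, new_ys = simulate(n, rng)
print("replicate RSS:", round(rss(a0, 0.0, new_xs, new_ys), 3),
      round(rss(a1, b1, new_xs, new_ys), 3))
```

The in-sample comparison always favors the larger model; the replicate comparison and the likelihood ratio test are what indicate whether the extra covariate is worth keeping.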

Model Critiques

The ultimate test of a model’s worth may be using it to make predictions about a new dataset (not the one used to fit the model). With new data, the quality of the predictions can be assessed. This may not be possible, but there are methods to approximate the ideal confirmation study for a model:

* Cross validation: fit the model using some of the data and assess predictive ability on the remaining data.
* Bootstrapping: from a dataset with n observations, draw n observations with replacement to get a “new” dataset. Analyze that dataset. Draw another “new” dataset and analyze it. Assess how similar the analyses are.
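Both approximations can be sketched for a simple linear model. The code below uses a single train/test split as the simplest form of cross validation and a nonparametric bootstrap of the fitted slope. The simulated data, the split sizes, and the number of bootstrap replicates are all hypothetical choices.

```python
import random


def fit_line(xs, ys):
    """Least squares estimates (a', b') for y = a + b*x + e."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def holdout_mse(xs, ys, n_train):
    """Fit on the first n_train observations, assess on the remainder."""
    a, b = fit_line(xs[:n_train], ys[:n_train])
    held_out = list(zip(xs[n_train:], ys[n_train:]))
    return sum((y - a - b * x) ** 2 for x, y in held_out) / len(held_out)


def bootstrap_slopes(xs, ys, n_boot, rng):
    """Refit the model on n_boot datasets drawn with replacement."""
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # n draws with replacement
        slopes.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx])[1])
    return slopes


# Simulated data under hypothetical true values a = 1, b = 2.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

a_full, b_full = fit_line(xs, ys)
mse = holdout_mse(xs, ys, n_train=40)
slopes = bootstrap_slopes(xs, ys, n_boot=200, rng=rng)
boot_mean = sum(slopes) / len(slopes)
boot_sd = (sum((s - boot_mean) ** 2 for s in slopes) / (len(slopes) - 1)) ** 0.5

print("slope from full data:", round(b_full, 3))
print("hold-out mean squared error:", round(mse, 3))
print("bootstrap mean and SD of slope:", round(boot_mean, 3), round(boot_sd, 3))
```

The spread of the bootstrap slopes is one way to "assess how similar the analyses are"; k-fold cross validation would repeat the hold-out step over several splits.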


Intervention effects and regression

Intervention effects:

- E[ Y | set(X=x1), Z=z] - E[Y | set(X=x0), Z=z]
- E[ Y | set(X=x1), Z=z] / E[Y | set(X=x0), Z=z]

where the expectation is over the target population.

In practice, what we can calculate with standard regression analysis is:

- Ave(Y | X=x1, Z=z) - Ave(Y | X=x0, Z=z')
- Ave(Y | X=x1, Z=z) / Ave(Y | X=x0, Z=z')

or equivalently:

- E[ Y | X=x1, Z=z] - E[Y | X=x0, Z=z’]
- E[ Y | X=x1, Z=z] / E[Y | X=x0, Z=z’]

where the expectation is over the sample.


Intervention effects and regression

If we want to use the regression association measures as estimates of the potential intervention effects, we need to assume:

E[ Y | X=x, Z=z] = E[ Y | set(X=x), Z=z]

This is the No Confounding Assumption: “no residual confounding of X and Y given Z”.

There are some methods we can use to push harder to remove residual confounding than with basic regression:

- regularization
- treatment models, propensity scores, IPW
- both of the above: double robust estimates

Regression standardization

In E[ Y | X=x, Z=z], different values of Z correspond to different strata in which you may consider the Y~X association. You can define an overall measure of the Y~X association by taking a weighted average over the different strata or levels of Z, resulting in a marginal or population-averaged effect:

E_W[Y | X=x] = Σ_{z in Z} ( w(z) * E[Y | X=x, Z=z] )

Different choices for the weights w(z):

- w(z) = proportion of Z=z in the source population
- or in a different target population
- or in a standard population
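The weighted average can be worked through with a two-level Z. In the sketch below, the stratum-specific risks and both sets of weights are hypothetical numbers, and `standardize` is an illustrative helper rather than a function from any library.

```python
def standardize(stratum_means, weights):
    """Weighted average of E[Y | X=x, Z=z] over the levels z of Z.

    Weights are normalized so they sum to 1 before averaging.
    """
    total = sum(weights[z] for z in stratum_means)
    return sum(weights[z] / total * stratum_means[z] for z in stratum_means)


# Hypothetical stratum-specific risks E[Y | X=x, Z=z].
risk_exposed = {"young": 0.10, "old": 0.30}     # X = 1
risk_unexposed = {"young": 0.05, "old": 0.20}   # X = 0

# Two hypothetical choices for w(z).
w_source = {"young": 0.7, "old": 0.3}      # distribution of Z in the source population
w_standard = {"young": 0.5, "old": 0.5}    # distribution of Z in a standard population

rd_source = standardize(risk_exposed, w_source) - standardize(risk_unexposed, w_source)
rd_standard = standardize(risk_exposed, w_standard) - standardize(risk_unexposed, w_standard)
print("standardized risk difference, source weights:", round(rd_source, 4))
print("standardized risk difference, standard weights:", round(rd_standard, 4))
```

The two risk differences differ (0.065 versus 0.075 here) because the stratum-specific differences are not constant, so the choice of weights determines which population the marginal effect describes.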

Exposure Scores

An outcome regression model describes the expected value of the outcome Y given the treatment or exposure of interest, X, and other covariates or confounders Z: E[ Y | X=x, Z=z]

We could instead make a model to describe the expected value of X given the other covariates Z: E[X | Z=z]

Once the second model is fit, we can calculate the probability that each subject in the study would have received a particular exposure, say X=1: Pr(X=1 | Z=z)

Exposure Scores

If we have a dichotomous exposure of interest (X=1 or X=0), then Pr(X=1 | Z=z) is called the propensity score. If we include the propensity score as a covariate in the outcome model, then we are effectively stratifying the analysis by the probability of exposure, so confounding is broken.

Alternatively, for subjects with X=1, we could calculate pi1 = Pr(X=1 | Z=z) and give them weights 1/pi1. For subjects with X=0, we could calculate pi0 = Pr(X=0 | Z=z) and give them weights 1/pi0. If we then fit a weighted regression model, we have a model that further breaks confounding by accounting for the population distribution of exposure.

If we combine these methods with standardization, we get a “double robust” estimator of the confounding-free effect of interest.
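The weighting scheme can be sketched with a single binary confounder Z, in which case the propensity score Pr(X=1 | Z=z) can be estimated by the proportion exposed in each stratum instead of a fitted model. The counts below are hypothetical and constructed so that the risk difference is 0.10 within each stratum of Z; the function names are illustrative.

```python
from collections import defaultdict


def make_records(counts):
    """Expand {(z, x): (n, n_events)} into one (z, x, y) record per subject."""
    records = []
    for (z, x), (n, n_events) in counts.items():
        records += [(z, x, 1)] * n_events + [(z, x, 0)] * (n - n_events)
    return records


def propensity_by_stratum(records):
    """Estimate Pr(X=1 | Z=z) as the proportion exposed in each stratum."""
    n_total, n_exposed = defaultdict(int), defaultdict(int)
    for z, x, _ in records:
        n_total[z] += 1
        n_exposed[z] += x
    return {z: n_exposed[z] / n_total[z] for z in n_total}


def crude_risk(records, x_level):
    """Unweighted mean outcome among subjects with X = x_level."""
    ys = [y for _, x, y in records if x == x_level]
    return sum(ys) / len(ys)


def ipw_risk(records, ps, x_level):
    """Weighted mean outcome among subjects with X = x_level,
    using weights 1/pi1 for the exposed and 1/pi0 for the unexposed."""
    num = den = 0.0
    for z, x, y in records:
        if x != x_level:
            continue
        prob_of_own_exposure = ps[z] if x == 1 else 1.0 - ps[z]
        w = 1.0 / prob_of_own_exposure
        num += w * y
        den += w
    return num / den


# Hypothetical counts: (z, x) -> (number of subjects, number of events).
counts = {
    (0, 1): (20, 4), (0, 0): (80, 8),     # stratum Z=0: risks 0.2 vs 0.1
    (1, 1): (80, 40), (1, 0): (20, 8),    # stratum Z=1: risks 0.5 vs 0.4
}
records = make_records(counts)
ps = propensity_by_stratum(records)

crude_rd = crude_risk(records, 1) - crude_risk(records, 0)
ipw_rd = ipw_risk(records, ps, 1) - ipw_risk(records, ps, 0)
print("crude risk difference:", round(crude_rd, 4))
print("IPW risk difference:", round(ipw_rd, 4))
```

Because exposure is much more common in the high-risk stratum, the crude difference (0.28) overstates the within-stratum difference, while the weighted comparison returns 0.10.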

Ecological Studies

Sample units are groups or regions rather than individuals.

- Use aggregate measures of exposure and outcome
  - rates, proportions, regional averages, representative values
- If using spatial regions, expect spatial correlations
  - use analysis methods that do not require independent observations
  - GEEs, Random Effects, Hierarchical Models
- Different sized regions or groups have different data quality or completeness
  - small regions: sparse measurements

Ecological Studies

Group-level and individual-level trends may differ. The Ecological Fallacy is the failure to appreciate this.

Ecological Bias

- Confounding by group
- Effect modification by group
- Plus, all of the opportunities for bias found in individual-level studies

Mitigation strategy: use small and well-defined groups that are homogeneous with respect to exposures.

Generalized Linear Models

A broad class of models (including linear, logistic, and Poisson regression):

- The distribution of the outcome Y has a special form: the “exponential dispersion family”.
- There is a linear model for a transformed version of the expected value of Y (a “mean function”): g( E[Y|X] ) = Xβ, where g() is a “link function”.
- The variance of Y can be expressed as a function of the expected value of Y: Var(Y|X) = V( g^{-1}(Xβ) )

There are general methods to solve many forms of these models and extensions of these models.
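As a sketch of how the three named models fit this template, the code below pairs each with its usual (canonical) link function g, the inverse link, and the variance function V, with the variance stated up to a dispersion constant. The dictionary layout, the function names, and the example coefficients are hypothetical.

```python
import math

# For each model: link g, inverse link g^{-1}, and variance function V(mu).
GLM_FAMILIES = {
    "linear": {                      # normal outcome, identity link
        "link": lambda mu: mu,
        "inverse": lambda eta: eta,
        "variance": lambda mu: 1.0,  # constant variance (times sigma^2)
    },
    "logistic": {                    # binary outcome, logit link
        "link": lambda mu: math.log(mu / (1.0 - mu)),
        "inverse": lambda eta: 1.0 / (1.0 + math.exp(-eta)),
        "variance": lambda mu: mu * (1.0 - mu),
    },
    "poisson": {                     # count outcome, log link
        "link": math.log,
        "inverse": math.exp,
        "variance": lambda mu: mu,
    },
}


def linear_predictor(x_row, beta):
    """X beta for a single row of covariates."""
    return sum(xj * bj for xj, bj in zip(x_row, beta))


def mean_and_variance(family, x_row, beta):
    """E[Y|X] = g^{-1}(X beta) and V(E[Y|X]) for the chosen model."""
    fam = GLM_FAMILIES[family]
    mu = fam["inverse"](linear_predictor(x_row, beta))
    return mu, fam["variance"](mu)


# Hypothetical covariate row (with intercept term) and coefficients.
x_row, beta = [1.0, 2.0], [0.0, 0.5]
for name in GLM_FAMILIES:
    print(name, mean_and_variance(name, x_row, beta))
```

The same linear predictor Xβ yields a different mean and variance under each model, which is the sense in which the three regressions are one class with different link and variance functions.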

Methods For Non-Independent Observations

Generalized Estimating Equations: extensions of Generalized Linear Models in which you assume that the OBSERVATIONS have a particular correlation structure.

Random Effect Models:

- (1) Y = a + bX + e: standard linear model, common intercept and common slope
- (2) Y = ai + bX + e: standard linear model, a different intercept for each group i
- (3) Y = a + gi + bX + e: each group i has its own intercept offset gi, but those offsets are drawn from a normal distribution with mean 0

Model (3) is more flexible than (1) and may have many fewer parameters than (2).
