Analysis of Complex Survey Data Day 3: Regression.

1 Analysis of Complex Survey Data Day 3: Regression

2 Today’s schedule
Part I: Basic review of common regressions and when to use them
Part II: Introduction to
– PROC REGRESS
– PROC RLOGIST
– PROC LOGLINK
– PROC MULTILOG

3 Regression
Typically in epidemiologic research, our outcomes fall into five major types:
– Continuous (normally distributed or skewed)
– Counts
– Binary
– Ordinal
– Nominal

4 Continuous outcome, normally distributed: linear regression

5 Continuous outcome, right skewed: Poisson regression

6 Counts: Poisson regression

7 Binary outcome: logistic regression

8 Ordinal
Polytomous regression, cumulative logit link function
– Likert scales
– Ordered categorical scales (age, income)
The cumulative logit link function assumes proportional odds: a covariate's effect on the odds of going from outcome level 1 to level 2 is the same as its effect on the odds of going from level 2 to level 3.
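The proportional odds assumption on this slide can be checked numerically. The following is a minimal sketch, not SUDAAN output: the cutpoints and the slope `beta` are hypothetical values chosen for illustration, and the model is written as logit P(Y <= j) = alpha_j - beta * x, which is one common parameterization.

```python
import math

def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def cumulative_probs(cutpoints, beta, x):
    """P(Y <= j | x) at each cutpoint, with a single shared slope beta."""
    return [logistic(alpha - beta * x) for alpha in cutpoints]

def category_probs(cutpoints, beta, x):
    """P(Y = j | x) for every outcome level, from differences of cumulative probabilities."""
    cum = cumulative_probs(cutpoints, beta, x) + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def odds(p):
    return p / (1.0 - p)

# Hypothetical three-level ordinal outcome (two cutpoints) and one binary covariate.
cutpoints = [-1.0, 0.5]
beta = 0.7

cum_unexposed = cumulative_probs(cutpoints, beta, 0)
cum_exposed = cumulative_probs(cutpoints, beta, 1)

# One cumulative odds ratio per cutpoint; under this model they are all equal.
cumulative_odds_ratios = [
    odds(p1) / odds(p0) for p0, p1 in zip(cum_unexposed, cum_exposed)
]
print(cumulative_odds_ratios)
print(category_probs(cutpoints, beta, 1))
```

Because the slope is shared across cutpoints, both cumulative odds ratios equal exp(-beta). If the observed data show clearly different odds ratios at different cutpoints, the assumption is in doubt.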

9 Nominal
Polytomous regression, general logit link function
– Race
– Diagnosis (depression versus anxiety versus substance use disorder)
The general logit link function estimates a separate covariate effect for each outcome level relative to a reference level, so the effect of going from 1 to 2 can differ from the effect of going from 2 to 3.
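To make the contrast with the cumulative logit concrete, here is a minimal sketch of a general (multinomial) logit with one covariate. The intercepts and slopes are hypothetical, and the first outcome level is treated as the reference category; none of these numbers come from the presentation.

```python
import math

def general_logit_probs(intercepts, slopes, x):
    """Category probabilities when each non-reference level has its own intercept and slope.

    The reference level has linear predictor 0 and comes first in the result.
    """
    scores = [0.0] + [a + b * x for a, b in zip(intercepts, slopes)]
    denom = sum(math.exp(s) for s in scores)
    return [math.exp(s) / denom for s in scores]

# Hypothetical diagnosis outcome: depression (reference), anxiety, substance use disorder.
intercepts = [-0.5, -1.0]
slopes = [0.3, 1.2]  # one slope per non-reference level

p_unexposed = general_logit_probs(intercepts, slopes, 0)
p_exposed = general_logit_probs(intercepts, slopes, 1)

# Odds ratio of each non-reference level versus the reference, per unit of x.
odds_ratios = [
    (p_exposed[k] / p_exposed[0]) / (p_unexposed[k] / p_unexposed[0])
    for k in (1, 2)
]
print(odds_ratios)
```

Each odds ratio equals exp of that level's own slope, so the two estimates are free to differ. That is the flexibility the general logit buys at the cost of extra parameters.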

10 Categorizing your exposure
Check assumptions regarding the functional form of the relationship between the exposure and the outcome
– E.g., relationship between age and alcohol use disorders. We would not want to enter age as a continuous variable because we do not think age is linearly related to risk of alcohol use disorders
If you decide to categorize a continuous variable, cutpoints are best chosen from precedent in the literature
– Relying on data-driven cutpoints will make your work hard to compare with other work in the literature
If there is no precedent:
– Use quartiles, or
– Break up the exposure into small categories, and examine the relationship with the outcome in a regression model with no other predictors (on the log-odds scale if using logistic regression).
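The two fallback options above (quartiles, then inspecting the outcome across categories on the log-odds scale) can be sketched as follows. This is an unweighted illustration on made-up data: it ignores survey weights and design, the `ages` and `has_disorder` values are hypothetical, and the 0.5 added to each cell is an assumption used only to avoid taking the log of zero.

```python
import math
import statistics

def quartile_cutpoints(values):
    """Three cutpoints that split the values into four roughly equal groups."""
    return statistics.quantiles(values, n=4)

def assign_category(value, cutpoints):
    """Index of the category the value falls in (0 = lowest)."""
    return sum(value > c for c in cutpoints)

def log_odds_by_category(exposure, outcome, cutpoints):
    """Unweighted log odds of the binary outcome within each exposure category."""
    counts = {}
    for x, y in zip(exposure, outcome):
        k = assign_category(x, cutpoints)
        cases, total = counts.get(k, (0, 0))
        counts[k] = (cases + y, total + 1)
    result = {}
    for k, (cases, total) in sorted(counts.items()):
        non_cases = total - cases
        # 0.5 continuity correction (an assumption) so empty cells do not give log(0).
        result[k] = math.log((cases + 0.5) / (non_cases + 0.5))
    return result

# Hypothetical toy data: age and a binary alcohol use disorder indicator.
ages = [18, 21, 24, 27, 30, 35, 40, 45, 50, 55, 60, 65]
has_disorder = [0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

cutpoints = quartile_cutpoints(ages)
log_odds = log_odds_by_category(ages, has_disorder, cutpoints)
for category, value in log_odds.items():
    print(category, round(value, 2))
```

If the log odds rise or fall steadily across categories, a linear term may be reasonable; a curved or stepped pattern suggests keeping the exposure categorical. With real survey data the same check would be run with design-based estimates rather than raw counts.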

11 Choosing covariates
Most important: DO NOT SKIP THE GROUNDWORK!
– Check associations with exposure and outcome
– Check associations among covariates
– Categorize the covariates appropriately
When should something be evaluated as a moderator, and when should it be a confounder/covariate?
– Most of the time, it is clear: do you think that the relationship between exposure and outcome will be the same across levels of the third variable, or do you think it will be different?
– If you do not have an a priori hypothesis and are just trying to build a solid statistical model, try it as a moderator first. If the interaction is significant, leave it in as a moderator.
– Because interaction terms are sometimes difficult to interpret on their own, consider fitting separate models within each subset (stratum) instead.
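The subset-model idea can be previewed with stratum-specific odds ratios from simple 2x2 tables. This is a rough, unweighted sketch: the counts and the stratum names are hypothetical, and a real analysis would use design-based estimates and a formal interaction test.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Cross-product odds ratio from a 2x2 table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)

# Hypothetical counts per level of the third variable:
# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
strata = {
    "men": (40, 60, 20, 80),
    "women": (30, 70, 28, 72),
}

stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}
for name, value in stratum_ors.items():
    print(name, round(value, 2))
```

Clearly different odds ratios across strata point toward treating the third variable as a moderator. Similar stratum-specific odds ratios that differ from the crude estimate point toward confounding instead.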

12 LAB 3: Regression in SUDAAN

