Slide 1: BIOST 536 Lecture 2 - Modeling
- Need to find a model that relates the outcome to the covariates in a meaningful way
  - Simplification of the "true" but unobservable relationship between the covariates and the outcome
  - "Begin to model, begin to err."
  - "All models are wrong, but some are useful" (G.E.P. Box)
- Purpose of the model may dictate the approach
  - Demonstration of a new risk factor or new effective treatment (a priori scientific hypothesis and statistical plan)
  - Find the best prediction model using known risk factors (may be more empirical) and validate it

Slide 2: Modeling
- Linear model (SBP is continuous)
  - The error term ε is assumed to follow a normal distribution with mean zero and variance σ²
  - The coefficients β are all fixed parameters to be estimated
- Logistic model
  - High blood pressure (Y=1) versus not (Y=0)
  - Y is assumed to follow a binomial distribution with mean π and variance π(1-π)
- What are the similarities and the differences between these two models?
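
As a sketch only, assuming generic covariates $X_1, \dots, X_p$ (hypothetical names; the covariates used in the lecture are not given here), the two models are conventionally written as:

```latex
\begin{align*}
\text{Linear:}\quad
  & \mathrm{SBP} = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \varepsilon,
  \qquad \varepsilon \sim N(0, \sigma^2) \\
\text{Logistic:}\quad
  & \log\frac{\pi}{1-\pi} = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p,
  \qquad \pi = \Pr(Y = 1 \mid X_1, \dots, X_p)
\end{align*}
```

In this form both models share the same linear predictor; they differ in the scale on which it acts (the mean itself versus the log odds) and in the assumed distribution of the outcome.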

Slide 3: Theoretical Models
- Based on a mathematical model for the assumed underlying biological process
  - Functional form Pr(Y=y | X=x) dictated by the model
  - Good fit of the model to data supports the model
  - Extrapolation may be justified if the model is supported
  - Model form must be specific enough to be testable

Slide 4: Theoretical Model Example 1
- Radiobiology: effect of low doses of ionizing radiation on leukemia incidence rate (Preston & Pierce, Radiation Effects Research Foundation)
  - Dose effect acts multiplicatively on the incidence rate
  - Dose effect is modeled as linear in dose
  - Covariates (z) can modify the effect of dose
  - Very small values for dose have been used to set radiation exposure standards
  - Specificity of the model form can be tested by comparing to alternative model forms
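
One form that is consistent with the bullets above is an excess relative risk model. This is a sketch under assumptions: the symbols $\lambda_0$ (background rate) and $\beta$ (dose slope) are chosen here for illustration and may not match the notation used in the lecture.

```latex
\[
\lambda(d, z) = \lambda_0(z)\,\bigl[\,1 + \beta(z)\, d\,\bigr]
\]
```

Here the bracketed term multiplies the background incidence rate, it is linear in dose $d$, and letting $\beta$ depend on $z$ allows covariates to modify the dose effect.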

Slide 5: Theoretical Model Example 2
- Multistage model of carcinogenesis (Armitage & Doll, 1954; Moolgavkar & Knudson, 1981)
  - Cancer arises in a single cell that passes through k ordered stages
  - Transition from one stage to the next does not depend on age (t), but can be affected by carcinogens
  - For t << ∞ (cancer being rare), age incidence is approximately a power function of age
  - So log incidence should be linear in log(age)
  - Theory is supported by age-incidence rates for adult epithelial cancers in the absence of birth cohort effects
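
The log-log linearity can be checked numerically. The sketch below assumes the usual Armitage-Doll approximation, incidence ≈ c · t^(k-1), which is not spelled out in this transcript; the rates are synthetic values generated from that formula, not real cancer data, and the function name is hypothetical.

```python
import math


def loglog_slope(ages, rates):
    """Ordinary least-squares slope of log(rate) regressed on log(age)."""
    xs = [math.log(a) for a in ages]
    ys = [math.log(r) for r in rates]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Synthetic (made-up) incidence rates from c * t**(k - 1) with k = 6 stages.
ages = [35, 45, 55, 65, 75]
c, k = 1e-12, 6
rates = [c * t ** (k - 1) for t in ages]

slope = loglog_slope(ages, rates)
estimated_k = slope + 1  # under the assumed model, slope = k - 1
print(round(slope, 3), round(estimated_k, 3))
```

With real age-incidence data the fitted slope would only be approximately constant, and a clear departure from linearity on the log-log scale would count against the model.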

Slide 7: Theoretical Model Example 3
- Genetic epidemiology: single-locus model for disease susceptibility
  - Two autosomal alleles (A, a)
  - Three genotypes (AA, Aa, aa)
  - Penetrance = Pr(disease | genotype) = P_genotype
  - Model: P_AA > P_aa ≥ 0 and P_Aa = (1-θ) P_aa + θ P_AA
  - Parameter θ indicates mode of inheritance
  - θ is estimated from the data and may indicate which mode of inheritance is correct
    - θ = 0: quasi-recessive, P_AA > P_Aa = P_aa
    - θ = 1: quasi-dominant, P_AA = P_Aa > P_aa
    - θ = 0.5: additive co-dominant, P_Aa = 0.5 * (P_AA + P_aa)
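
A minimal sketch of the single-locus model above, writing the mode-of-inheritance parameter as `theta` (a name chosen here). The penetrance values are hypothetical and serve only to show how the heterozygote penetrance moves between the two homozygote values.

```python
def heterozygote_penetrance(pen_aa, pen_AA, theta):
    """Return P_Aa = (1 - theta) * P_aa + theta * P_AA for 0 <= theta <= 1."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie between 0 and 1")
    return (1.0 - theta) * pen_aa + theta * pen_AA


# Hypothetical penetrances for the two homozygous genotypes.
pen_aa, pen_AA = 0.01, 0.20

recessive = heterozygote_penetrance(pen_aa, pen_AA, 0.0)  # equals P_aa
dominant = heterozygote_penetrance(pen_aa, pen_AA, 1.0)   # equals P_AA
additive = heterozygote_penetrance(pen_aa, pen_AA, 0.5)   # midpoint of the two

print(recessive, dominant, round(additive, 3))
```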

Slide 8: Empirical Models
- Model linking the outcome to covariates is chosen from several competing models on the basis of the data and hypothesis testing
- Model may provide insight into the scientific mechanisms, confounding effects, and effect modifiers
- Choice of models limited by assumed functional form (e.g. logistic model often chosen for mathematical convenience)
- Parsimonious models preferred
- Inference restricted to the observed ranges of predictor variables

Slide 9: Empirical Models
- Hypothesis driven
  - Data collection designed to answer a specific question
    - Randomized trial: which treatment is better?
    - Case-control study: assess a specific exposure
  - Scientific rationale for the study a priori
  - Statistical analysis plan done in advance
  - Results generally more convincing

Slide 10: Empirical Models
- Hypothesis generation
  - Study uses existing data collected for other purposes (e.g. administrative data, a previous cohort)
  - Scientific rationale for studying particular variables may be a priori
  - Statistical testing of more variables
  - Results may be less convincing
  - Need validation studies (especially external)

Slide 11: Empirical Models
- Disease risk prediction models
  - Prediction of the outcome is the goal; discovery of new risk factors is not the goal
    - Prediction of disease in a particular population
  - Usually prospective data; not case-control data
  - Risk factors are already known, but how they are modeled may require exploration
  - Simplicity is valued by the end user
  - Model chosen should minimize errors of prediction (assessed by sensitivity, specificity, c-statistic)
  - Internal and external validation are absolutely necessary
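
Of the accuracy measures named above, the c-statistic is the least self-explanatory. The sketch below computes it directly from its pairwise definition; the predicted risks and outcomes are hypothetical, and the function name is chosen here for illustration.

```python
def c_statistic(risks, outcomes):
    """Share of (case, non-case) pairs in which the case has the higher
    predicted risk; tied pairs count as one half."""
    case_risks = [r for r, y in zip(risks, outcomes) if y == 1]
    noncase_risks = [r for r, y in zip(risks, outcomes) if y == 0]
    if not case_risks or not noncase_risks:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for rc in case_risks:
        for rn in noncase_risks:
            if rc > rn:
                concordant += 1.0
            elif rc == rn:
                concordant += 0.5
    return concordant / (len(case_risks) * len(noncase_risks))


# Hypothetical predicted risks and observed outcomes (1 = disease).
risks = [0.8, 0.6, 0.4, 0.5, 0.3, 0.2, 0.4]
outcomes = [1, 1, 1, 0, 0, 0, 0]

c = c_statistic(risks, outcomes)
print(c)
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect separation of cases from non-cases. Computing it on the same data used to fit the model tends to be optimistic, which is one reason validation matters.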

Slide 12: Empirical Model Example 1
- Gail risk prediction model for breast cancer: http://www.cancer.gov/bcrisktool/

Slide 13: Empirical Model Example 2
- Bladder cancer incidence and smoking (Howe, 1980)
- Model is for the log OR for bladder cancer as a function of log(# cigarettes per day + 1)
- Mostly linear relationship with slope ~ 0.5 per log unit
- Effect of smoking the first 10 cigarettes > effect of the next 10 cigarettes
- Dose-response relationship may suggest causality (Bradford Hill)
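
The diminishing effect of additional cigarettes follows from the log-log form. The sketch below assumes natural logarithms and a line through the origin (OR = 1 for non-smokers), with the approximate slope of 0.5 quoted above; these are simplifying assumptions, not the fitted model from the study.

```python
import math

SLOPE = 0.5  # approximate slope quoted on the slide


def odds_ratio(cigs_per_day, slope=SLOPE):
    """OR versus non-smokers when log OR = slope * log(cigs_per_day + 1)."""
    return math.exp(slope * math.log(cigs_per_day + 1))


or_first_10 = odds_ratio(10)                   # 0 -> 10 cigarettes per day
or_next_10 = odds_ratio(20) / odds_ratio(10)   # 10 -> 20 cigarettes per day

print(round(or_first_10, 2), round(or_next_10, 2))
```

Under these assumptions the first 10 cigarettes multiply the odds by roughly 3.3, while the next 10 multiply them by only about 1.4.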

Slide 14: Empirical Model Example 3
- Case-control study of esophageal cancer and alcohol consumption in France (Breslow & Day, Volume 1)

Age group   Case status   80+ g   0-79 g
25-34       1               1        0
            0               9      106
35-44       1               4        5
            0              26      164
45-54       1              25       21
            0              29      138
55-64       1              42       34
            0              27      139
65-74       1              19       36
            0              18       88
75+         1               5        8
            0               0       31

- Mantel-Haenszel estimate of the OR relating esophageal cancer to 80+ grams of alcohol per day, stratified by age group
- Exact estimate of the OR stratified by age group using permutation distributions
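
The Mantel-Haenszel estimate can be computed directly from the stratified counts in the table above. This is a minimal sketch using the standard formula, sum(a·d/n) / sum(b·c/n) over strata; the variable and function names are chosen here for illustration.

```python
# Per age group: (exposed cases, unexposed cases,
#                 exposed controls, unexposed controls),
# where "exposed" means 80+ g of alcohol per day.
strata = {
    "25-34": (1, 0, 9, 106),
    "35-44": (4, 5, 26, 164),
    "45-54": (25, 21, 29, 138),
    "55-64": (42, 34, 27, 139),
    "65-74": (19, 36, 18, 88),
    "75+": (5, 8, 0, 31),
}


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio over a collection of 2x2 tables."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


or_mh = mantel_haenszel_or(strata.values())

# Crude (unstratified) OR for comparison: collapse the table over age.
a, b, c, d = (sum(t[i] for t in strata.values()) for i in range(4))
or_crude = (a * d) / (b * c)

print(round(or_mh, 2), round(or_crude, 2))
```

The age-adjusted estimate comes out near 5.16 and the crude estimate near 5.64, so adjusting for age moves the odds ratio only modestly in these data.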

Slide 15: Empirical Model Example 3 continued
- May be confounded with tobacco exposure
- Dose-response relationship may be more likely to suggest causality
- Full case-control study
  - 200 incident esophageal cancer cases in males found in the cancer registry
  - 774 male controls sampled from voter registrations
  - Alcohol categorized as (0-24, 25-49, 50-74, 75-99, 100+ g/day); tobacco (0, 1-4, 5-14, 15-29, 30+ g/day)
  - Linear logistic model fit to the data
  - Model assumes that alcohol/tobacco act multiplicatively
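
The multiplicative assumption follows from having no interaction term on the logit scale. As a sketch, assume numeric exposure scores $a$ (alcohol) and $t$ (tobacco) and coefficients $\beta_A$, $\beta_T$; these symbols are introduced here for illustration and are not taken from the lecture.

```latex
\begin{align*}
\operatorname{logit} \Pr(\text{case} \mid a, t) &= \alpha + \beta_A\, a + \beta_T\, t \\
\mathrm{OR}(a, t \text{ vs. } 0, 0) &= \exp(\beta_A\, a + \beta_T\, t)
  = \exp(\beta_A\, a) \times \exp(\beta_T\, t)
\end{align*}
```

The joint odds ratio is the product of the two separate odds ratios, so effects that add on the log-odds scale multiply on the odds scale.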

Slide 16: Empirical Model Example 4
- Risk prediction model: want to predict pneumonia in patients presenting with acute cough (Diehr et al, 1984)
- Create a score based on clinical characteristics
- Score ranges from -3 to 7; model probability of pneumonia
  - Could we use empirical weights instead?
  - What is the disadvantage of using empirical weights?

Slide 17: Empirical Model Example 4 continued
- Still need a cutoff for deciding on pneumonia (independent validation dataset)
- Sensitivity: probability that a patient with pneumonia has a score above the cutoff
- Specificity: probability that a patient without pneumonia has a score at or below the cutoff

Score   Sensitivity   Specificity
-3      100%            8%
-2       91%           40%
-1       74%           70%
 0       59%           88%
 1       33%           96%
 2       20%           99%
 3       11%           99%
 4        4%          100%
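
The definitions above translate directly into code. The sketch below uses the same convention (a score strictly above the cutoff counts as a positive call) on a small set of hypothetical scores; it does not reproduce the study data, and the names are chosen here for illustration.

```python
def sensitivity_specificity(scores, has_pneumonia, cutoff):
    """Sensitivity and specificity when 'score > cutoff' is a positive call."""
    with_disease = [s for s, y in zip(scores, has_pneumonia) if y]
    without_disease = [s for s, y in zip(scores, has_pneumonia) if not y]
    sensitivity = sum(s > cutoff for s in with_disease) / len(with_disease)
    specificity = sum(s <= cutoff for s in without_disease) / len(without_disease)
    return sensitivity, specificity


# Hypothetical validation data: one score and one true status per patient.
scores = [-3, -2, -1, 0, 0, 1, 2, 3, 4, 5]
has_pneumonia = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

for cutoff in (-3, 0, 1):
    sens, spec = sensitivity_specificity(scores, has_pneumonia, cutoff)
    print(cutoff, sens, spec)
```

Raising the cutoff can only lower sensitivity and raise specificity, which is the trade-off visible in the table.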

Slide 18: Modeling Summary
- Model establishes a relationship between the outcome and one or more covariates
- Models can be based on
  1. Underlying mathematical model
  2. Expected scientific relationship
  3. Discovered scientific relationship
  4. Empirical findings
- Models without a clear a priori basis may need validation

