BIOST 536 Lecture 2 - Modeling

Presentation transcript:

Slide 1: Lecture 2 - Modeling
Need to find a model that relates the outcome to the covariates in a meaningful way
  • Simplification of the "true" but unobservable relationship between the covariates and the outcome
  • "Begin to model, begin to err."
  • "All models are wrong, but some are useful" (G.E.P. Box)
Purpose of the model may dictate the approach
  • Demonstration of a new risk factor or new effective treatment (a priori scientific hypothesis and statistical plan)
  • Finding the best prediction model using known risk factors (may be more empirical), followed by validation

Slide 2: Modeling
Linear model (SBP is continuous)
  • SBP is modeled as a linear function of the covariates plus an error term ε, where ε is assumed to follow a normal distribution with mean zero and variance σ²
  • The regression coefficients (β) are all fixed parameters to be estimated
Logistic model
  • High blood pressure (Y=1) versus not (Y=0)
  • The logit of π = Pr(Y=1) is modeled as a linear function of the covariates, where Y is assumed to follow a binomial distribution with mean π and variance π(1-π)
What are the similarities and the dissimilarities between these two models?
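To make the comparison concrete, here is a minimal sketch that simulates one covariate and fits both models. Everything in it is an assumption for illustration: the covariate (a centered age), the true coefficient values, and the helper names `fit_linear` and `fit_logistic` do not come from the lecture. The linear fit uses ordinary least squares; the logistic fit uses Newton-Raphson on the Bernoulli likelihood.

```python
import math
import random

random.seed(0)
n = 2000
# Hypothetical covariate: age centered at 50 years
x = [random.uniform(-20.0, 20.0) for _ in range(n)]

def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Linear model: SBP = b0 + b1*x + eps, eps ~ Normal(0, sigma^2)
sbp = [120.0 + 0.8 * xi + random.gauss(0.0, 10.0) for xi in x]

# Logistic model: logit Pr(Y=1) = g0 + g1*x, Y ~ Bernoulli(pi)
y = [1 if random.random() < expit(-1.0 + 0.05 * xi) else 0 for xi in x]

def fit_linear(x, y):
    """Ordinary least squares for one covariate; returns (intercept, slope)."""
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def fit_logistic(x, y, n_iter=25):
    """Maximum likelihood by Newton-Raphson; returns (intercept, slope)."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        s0 = s1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = expit(a + b * xi)
            w = p * (1.0 - p)          # Bernoulli variance pi(1 - pi)
            s0 += yi - p               # score for the intercept
            s1 += (yi - p) * xi        # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * s0 - h01 * s1) / det
        b += (h00 * s1 - h01 * s0) / det
    return a, b

lin_intercept, lin_slope = fit_linear(x, sbp)
log_intercept, log_slope = fit_logistic(x, y)
print("linear fit:  ", lin_intercept, lin_slope)
print("logistic fit:", log_intercept, log_slope)
```

Both models share a linear predictor in the covariates. They differ in the scale on which it acts (the mean itself versus the log odds) and in the variance, which is a free parameter in the linear model but is fixed at π(1-π) once the mean is known in the logistic model.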

Slide 3: Theoretical Models
Based on a mathematical model for the assumed underlying biological process
  • Functional form of Pr(Y=y | X=x) is dictated by the model
  • Good fit of the model to data supports the model
  • Extrapolation may be justified if the model is supported
  • Model form must be specific enough to be testable

Slide 4: Theoretical Model Example 1
Radiobiology: effect of low-dose ionizing radiation on the leukemia incidence rate (Preston & Pierce, Radiation Effects Research Foundation)
  • Dose effect acts multiplicatively on the incidence rate
  • Dose effect is modeled as linear in dose
  • Covariates (z) can modify the effect of dose
  • Model predictions at very small doses have been used to set radiation exposure standards
  • Specificity of the model form can be tested by comparing it to alternative model forms
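The slide's formula did not survive in the transcript. One common way to write a model with these properties is the excess relative risk form, rate = baseline(z) × (1 + β(z) × dose). The sketch below assumes that form; the function names and all numeric values are hypothetical and are not taken from Preston & Pierce.

```python
def relative_risk(dose, err_per_unit_dose):
    """Relative risk under an assumed linear excess relative risk model."""
    return 1.0 + err_per_unit_dose * dose

def incidence_rate(dose, baseline_rate, err_per_unit_dose):
    """Dose acts multiplicatively on the baseline rate.

    baseline_rate and err_per_unit_dose may each depend on covariates z
    (for example sex or age at exposure); here they are passed in directly.
    """
    return baseline_rate * relative_risk(dose, err_per_unit_dose)

# Hypothetical values, for illustration only
baseline = 2e-4   # leukemia cases per person-year at zero dose
beta = 0.5        # excess relative risk per unit dose
for dose in (0.0, 0.1, 1.0, 2.0):
    print(dose, incidence_rate(dose, baseline, beta))
```

Because the relative risk is linear in dose, the model gives a definite prediction at doses far below those observed, which is what makes low-dose extrapolation possible and also what makes the model form worth testing against alternatives.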

Slide 5: Theoretical Model Example 2
Multistage model of carcinogenesis (Armitage & Doll, 1954; Moolgavkar & Knudson, 1981)
  • Cancer arises in a single cell that passes through k ordered stages
  • Transition from one stage to the next does not depend on age (t), but can be affected by carcinogens
  • For t << ∞ (cancer being rare), age-specific incidence is approximately a power function of age
  • So log incidence should be linear in log(age)
  • Theory is supported by age-incidence rates for adult epithelial cancers in the absence of birth cohort effects
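The incidence formula is missing from the transcript. The standard Armitage-Doll approximation is I(t) ≈ c·t^(k-1), which gives log I(t) = log c + (k-1)·log t. The sketch below assumes that form and checks that the log-log slope recovers k-1; the constant `c`, the choice k = 6, and the function names are illustrative assumptions.

```python
import math

def multistage_incidence(age, k, c=1e-12):
    """Approximate age-specific incidence for a k-stage model (rare disease)."""
    return c * age ** (k - 1)

def loglog_slope(ages, rates):
    """Least-squares slope of log(rate) regressed on log(age)."""
    lx = [math.log(a) for a in ages]
    ly = [math.log(r) for r in rates]
    xbar, ybar = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((u - xbar) ** 2 for u in lx)
    sxy = sum((u - xbar) * (v - ybar) for u, v in zip(lx, ly))
    return sxy / sxx

ages = [30, 40, 50, 60, 70, 80]
rates = [multistage_incidence(a, k=6) for a in ages]
slope = loglog_slope(ages, rates)
print(slope)  # close to k - 1 = 5
```

With real age-incidence data the estimated slope plus one gives a rough estimate of the number of stages, which is how the log-log linearity is used to support the theory.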


Slide 7: Theoretical Model Example 3
Genetic epidemiology: single-locus model for disease susceptibility
  • Two autosomal alleles (A, a)
  • Three genotypes (AA, Aa, aa)
  • Penetrance = Pr(disease | genotype) = P_genotype
  • Model: P_AA > P_aa ≥ 0 and P_Aa = (1-θ) P_aa + θ P_AA
  • Parameter θ indicates the mode of inheritance
  • θ is estimated from the data and may indicate which mode of inheritance is correct
      θ = 0: quasi-recessive, P_AA > P_Aa = P_aa
      θ = 1: quasi-dominant, P_AA = P_Aa > P_aa
      θ = 0.5: additive co-dominant, P_Aa = 0.5 * (P_AA + P_aa)
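The three special cases follow directly from the mixing formula for the heterozygote penetrance. A minimal sketch, writing the mode-of-inheritance parameter as `theta` (a label assumed here) and using hypothetical penetrance values:

```python
def heterozygote_penetrance(theta, p_aa, p_AA):
    """P_Aa as a weighted average of the two homozygote penetrances."""
    return (1.0 - theta) * p_aa + theta * p_AA

# Hypothetical penetrances, for illustration only
p_aa, p_AA = 0.01, 0.09

recessive = heterozygote_penetrance(0.0, p_aa, p_AA)   # equals P_aa
dominant = heterozygote_penetrance(1.0, p_aa, p_AA)    # equals P_AA
additive = heterozygote_penetrance(0.5, p_aa, p_AA)    # midpoint of P_aa and P_AA
print(recessive, dominant, additive)
```

Values of `theta` strictly between 0 and 1 give intermediate modes of inheritance, which is why estimating it from data can help discriminate among them.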

Slide 8: Empirical Models
  • Model linking the outcome to covariates is chosen from several competing models on the basis of the data and hypothesis testing
  • Model may provide insight into the scientific mechanisms, confounding effects, and effect modifiers
  • Choice of models is limited by the assumed functional form (e.g., the logistic model is often chosen for mathematical convenience)
  • Parsimonious models are preferred
  • Inference is restricted to the observed ranges of the predictor variables

Slide 9: Empirical Models
Hypothesis driven
  • Data collection designed to answer a specific question
      Randomized trial: which treatment is better?
      Case-control study: assess a specific exposure
  • Scientific rationale for the study is a priori
  • Statistical analysis plan is done in advance
  • Results are generally more convincing

Slide 10: Empirical Models
Hypothesis generation
  • Study uses existing data collected for other purposes (e.g., administrative data, a previous cohort)
  • Scientific rationale for studying particular variables may be a priori
  • Statistical testing of more variables
  • Results may be less convincing
  • Need validation studies (especially external)

Slide 11: Empirical Models
Disease risk prediction models
  • Prediction of the outcome is the goal; discovery of new risk factors is not the goal
  • Prediction of disease in a particular population
  • Usually prospective data; not case-control data
  • Risk factors are already known, but how they are modeled may require exploration
  • Simplicity is valued by the end user
  • Model chosen should minimize errors of prediction (sensitivity, specificity, c-statistic)
  • Internal and external validation are absolutely necessary
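As an illustration of one of the prediction-error measures named above, the c-statistic can be computed as the proportion of (case, non-case) pairs in which the case received the higher predicted risk, counting ties as one half. The sketch below is a direct pairwise implementation; the function name and the four-subject example are hypothetical.

```python
def c_statistic(risks, outcomes):
    """Concordance: Pr(predicted risk of a case > predicted risk of a non-case).

    risks    -- predicted probabilities, one per subject
    outcomes -- 1 for subjects with the disease, 0 otherwise
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    noncases = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = 0.0
    for rc in cases:
        for rn in noncases:
            if rc > rn:
                concordant += 1.0
            elif rc == rn:
                concordant += 0.5   # ties count as half
    return concordant / (len(cases) * len(noncases))

# Hypothetical predictions for two non-cases and two cases
example_c = c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
print(example_c)  # 3 of the 4 case/non-case pairs are concordant
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect discrimination. Computed on the data used to fit the model it tends to be optimistic, which is one reason validation is required.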

Slide 12: Empirical Model Example 1
Gail risk prediction model for breast cancer

Slide 13: Empirical Model Example 2
Bladder cancer incidence and smoking (Howe, 1980)
  • Model is for the log OR for bladder cancer as a function of log(# cigarettes per day + 1)
  • Mostly linear relationship with slope ~ 0.5 per log unit
  • Effect of smoking the first 10 cigarettes > effect of the next 10 cigarettes
  • Dose-response relationship may suggest causality (Bradford Hill)
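The diminishing effect of additional cigarettes follows from the log scale of the exposure. A worked sketch, assuming natural logarithms, a slope of exactly 0.5, and nonsmokers as the reference group (log OR = 0 at zero cigarettes); none of these details are confirmed by the slide, and the function name is hypothetical.

```python
import math

SLOPE = 0.5  # approximate slope quoted on the slide, treated as exact here (assumption)

def odds_ratio(cigs_per_day, slope=SLOPE):
    """OR versus nonsmokers when log OR = slope * ln(cigs_per_day + 1)."""
    return math.exp(slope * math.log(cigs_per_day + 1))

or_first_10 = odds_ratio(10)                    # going from 0 to 10 cigarettes/day
or_next_10 = odds_ratio(20) / odds_ratio(10)    # going from 10 to 20 cigarettes/day
print(round(or_first_10, 2), round(or_next_10, 2))  # 3.32 1.38
```

Under these assumptions the first 10 cigarettes per day multiply the odds by about 3.3, while the next 10 multiply them by only about 1.4.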

Slide 14: Empirical Model Example 3
Case-control study of esophageal cancer and alcohol consumption in France (Breslow & Day, Volume 1)
[Table: age group by case status by alcohol consumption dichotomized at 80 g/day; cell counts not preserved in the transcript]
  • Mantel-Haenszel estimate of the OR relating esophageal cancer to 80 grams of alcohol per day, stratified by age group
  • Exact estimate of the OR stratified by age group using permutation distributions
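The Mantel-Haenszel estimator combines the stratum-specific 2×2 tables as Σ(a·d/n) / Σ(b·c/n), where in each stratum a and b are exposed and unexposed cases, c and d are exposed and unexposed controls, and n is the stratum total. A minimal sketch follows; the counts are hypothetical and are not the Breslow & Day data, which did not survive in the transcript.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across strata.

    tables -- iterable of (a, b, c, d) per stratum:
              a = exposed cases,    b = unexposed cases,
              c = exposed controls, d = unexposed controls
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue                 # empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Hypothetical age strata, each with a stratum-specific OR of 4
strata = [(10, 20, 5, 40), (20, 30, 10, 60)]
mh_or = mantel_haenszel_or(strata)
print(mh_or)  # 4.0
```

When every stratum has the same odds ratio, as in this constructed example, the summary estimate equals that common value; otherwise it is a weighted average of the stratum-specific odds ratios.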

Slide 15: Empirical Model Example 3 continued
  • Association may be confounded by tobacco exposure
  • Dose-response relationship may be more likely to suggest causality
Full case-control study
  • 200 incident esophageal cancer cases in males found in the cancer registry
  • 774 male controls sampled from voter registrations
  • Alcohol categorized as 0-24, 25-49, 50-74, 75-99, 100+ g/day; tobacco as 0, 1-4, 5-14, 15-29, 30+ g/day
  • Linear logistic model fit to the data
  • Model assumes that alcohol and tobacco act multiplicatively
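The multiplicative assumption is a consequence of the logistic form: effects that add on the log-odds scale multiply on the odds scale. The sketch below illustrates this with category scores entered linearly and no interaction term; the coefficient values and the function name are hypothetical and are not estimates from the study.

```python
import math

def odds_ratio(alcohol_level, tobacco_level, beta_alcohol, beta_tobacco):
    """OR versus the lowest category of both exposures in a logistic model
    with linear terms for each exposure score and no interaction."""
    return math.exp(beta_alcohol * alcohol_level + beta_tobacco * tobacco_level)

# Hypothetical coefficients per one-category increase
b_alc, b_tob = 0.65, 0.40

or_alcohol_only = odds_ratio(2, 0, b_alc, b_tob)
or_tobacco_only = odds_ratio(0, 3, b_alc, b_tob)
or_joint = odds_ratio(2, 3, b_alc, b_tob)
print(or_alcohol_only, or_tobacco_only, or_joint)
```

The joint odds ratio equals the product of the two single-exposure odds ratios. Testing an alcohol-by-tobacco interaction term is one way to check whether the data depart from this assumption.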

Slide 16: Empirical Model Example 4
Risk prediction model: want to predict pneumonia in patients presenting with acute cough (Diehr et al., 1984)
  • Create a score based on clinical characteristics
  • Score ranges from -3 to 7; model the probability of pneumonia as a function of the score
  • Could we use empirical weights instead?
  • What is the disadvantage of using empirical weights?

Slide 17: Empirical Model Example 4 continued
Still need a cutoff for deciding on pneumonia, chosen using an independent validation dataset
  • Sensitivity: probability that a patient with pneumonia has a score above the cutoff
  • Specificity: probability that a patient without pneumonia has a score at or below the cutoff

Score   Sensitivity   Specificity
 -3        100%            8%
 -2         91%           40%
 -1         74%           70%
  0         59%           88%
  1         33%           96%
  2         20%           99%
  3         11%           99%
  4          4%          100%
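The two definitions translate directly into counts. A minimal sketch, using the slide's convention (test positive means score strictly above the cutoff) and small hypothetical score lists rather than the Diehr et al. data:

```python
def sensitivity(case_scores, cutoff):
    """Proportion of patients with pneumonia whose score is above the cutoff."""
    return sum(s > cutoff for s in case_scores) / len(case_scores)

def specificity(noncase_scores, cutoff):
    """Proportion of patients without pneumonia whose score is at or below the cutoff."""
    return sum(s <= cutoff for s in noncase_scores) / len(noncase_scores)

# Hypothetical validation data
pneumonia_scores = [-1, 0, 2, 3]
no_pneumonia_scores = [-3, -2, -1, 0, 1]

sens_at_0 = sensitivity(pneumonia_scores, cutoff=0)
spec_at_0 = specificity(no_pneumonia_scores, cutoff=0)
print(sens_at_0, spec_at_0)  # 0.5 0.8
```

Raising the cutoff can only lower sensitivity and raise specificity, which is the trade-off visible in the table.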

Slide 18: Modeling Summary
Model establishes a relationship between the outcome and one or more covariates
Models can be based on
  1. Underlying mathematical model
  2. Expected scientific relationship
  3. Discovered scientific relationship
  4. Empirical findings
Models without a clear a priori basis may need validation