Variable selection in regression modelling. Simon Thornley.

Which variables should we adjust for and why? Caution: mass confusion

Statistics and causation
Statistics:
 Assesses parameters of a distribution from samples.
 Infers associations.
 Estimates probabilities of past and future events, if experimental conditions remain the same.
Causal analysis:
 Infers probabilities under conditions that are changing, e.g. treatments or interventions.

Variable selection
 Based on the relationship with the outcome variable (p-value).
 Based on the fit of the data to the model (likelihood): the joint probability of the data given the model.
 What about the causal relationships between variables?

“I compute, therefore I am.”

Pearl and causation
Probability theory:
 Limits of probability theory: “What is the probability it rained if the grass is wet?”
 P(Rain | Grass wet)
Causal approach:
 “What is the probability it rained if we make the grass wet?”
 P(Rain | do(Grass wet)) = P(Rain)
[DAG: Rain → Grass wet]
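
The distinction is easy to check by simulation. Below is a minimal sketch (not from the slides), assuming Python with numpy and hypothetical probabilities: seeing wet grass raises the probability of rain, but making the grass wet by intervention leaves it unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process: Rain -> Grass wet <- Sprinkler
rain = rng.random(n) < 0.3
sprinkler = rng.random(n) < 0.4
grass_wet = rain | sprinkler

# Observational: seeing wet grass is evidence for rain
p_rain_given_wet = rain[grass_wet].mean()

# Interventional: do(Grass wet) cuts the arrow Rain -> Grass wet,
# so wetting the grass ourselves tells us nothing about rain
p_rain_given_do_wet = rain.mean()  # = P(Rain)

print(f"P(Rain | Grass wet)     ~ {p_rain_given_wet:.2f}")    # ~0.52
print(f"P(Rain | do(Grass wet)) ~ {p_rain_given_do_wet:.2f}")  # ~0.30
```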

Simpson’s paradox, 1899
 “Any statistical relationship between two variables may be reversed by including additional factors in the analysis.”
 Reverse regression, compare:
 Men earn more than equally qualified women.
 Men are more qualified than equally paid women.
 Which factors should be adjusted for?
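
A small worked example makes the reversal concrete. The counts below are purely illustrative (they are not from the slides): one group does better within every stratum of severity, yet worse in the pooled table.

```python
# Illustrative (hypothetical) counts: (successes, patients), split by severity
strata = {
    "mild":   {"treated": (81, 87),   "untreated": (234, 270)},
    "severe": {"treated": (192, 263), "untreated": (55, 80)},
}

# Within every stratum the treated group does better...
for name, s in strata.items():
    rt = s["treated"][0] / s["treated"][1]
    ru = s["untreated"][0] / s["untreated"][1]
    print(f"{name:>6}: treated {rt:.0%} vs untreated {ru:.0%}")

# ...but pooling the strata reverses the comparison, because severity
# influences both treatment choice and outcome.
pool = lambda g: (sum(s[g][0] for s in strata.values())
                  / sum(s[g][1] for s in strata.values()))
print(f"pooled: treated {pool('treated'):.0%} vs untreated {pool('untreated'):.0%}")
```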

A visual depiction

Gender bias at Berkeley?

A DAG explanation
[DAG: Gender → Faculty competitiveness → Admission]
Women tended to apply to competitive departments with low rates of admission, whereas men tended to apply to less competitive departments with high rates of admission among the qualified applicants.

DAG: a method for variable selection
 Graph: a picture of nodes (variables) joined by arcs or edges.
 Directed: causal effects are shown as arrows.
 Acyclic: no arrows from descendants back to ancestors.
 Notation: E = exposure, D = disease, S = stratification factors.

DAGs in the inferential process
[Flow diagram: data-generating model M (a DAG) → joint distribution → data → inference → aspects of M, Q(M)]
The question the DAG addresses: how does the natural (unknown) process assign values to the variables in the analysis?

DAG terminology
 Path: a sequence of arrows connecting two variables, ignoring direction.
 Collider: a variable with two or more arrowheads pointing into (colliding at) it, as S in E → S ← D.
 A path is blocked if it contains a collider; otherwise it is open (unblocked).
 An unblocked path transmits associations along it: E → S → D (a chain) or E ← S → D (a fork).
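
These blocking rules are mechanical enough to check by machine. As a sketch, d-separation can be tested with networkx (assuming version 3.3 or later, where the function is named is_d_separator; older releases called it d_separated):

```python
import networkx as nx  # assumes networkx >= 3.3 (is_d_separator)

# The three-node structures from the slide: chain E -> S -> D,
# fork E <- S -> D, and collider E -> S <- D.
chain    = nx.DiGraph([("E", "S"), ("S", "D")])
fork     = nx.DiGraph([("S", "E"), ("S", "D")])
collider = nx.DiGraph([("E", "S"), ("D", "S")])

for name, g in [("chain", chain), ("fork", fork), ("collider", collider)]:
    open_marginally = not nx.is_d_separator(g, {"E"}, {"D"}, set())
    open_given_S    = not nx.is_d_separator(g, {"E"}, {"D"}, {"S"})
    print(f"{name:>8}: E-D open? {open_marginally}; open given S? {open_given_S}")

# chain/fork: open marginally, blocked by conditioning on S
# collider:   blocked marginally, OPENED by conditioning on S
```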

Descendants and parents
 Any node at the end of a directed path originating at E is called a descendant of E; similarly, the nodes with arrows pointing directly into a node are its parents.
 Missing arrows encode assumptions: no arrow means no direct effect.
 Markov assumption: any node is independent of all other non-descendants, given its parents.
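
This Markov assumption is what lets a DAG stand in for a full probability model. Written out (the formula is standard, though not on the slide), the joint distribution factorises over each node's parents:

P(X_1, ..., X_n) = ∏_i P(X_i | parents(X_i))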

Why use DAGs?
 They encode expert knowledge.
 They make assumptions about the research question explicit, and open them to debate.
 They link the causal model to the statistical model, for causal inference.
 They make us think: “What could give rise to an observed association between E and Y?”

Explaining observed associations
 E causes D: Exposure → Disease.
 E and D share a common cause (confounding): Exposure ← Strata → Disease.
 The association is induced by conditioning on a common effect of E and D (selection bias, collider): Exposure → Strata ← Disease, for strata such as hospitalisation.

Danger: controlling for colliders
 Exposure: sugar. Outcome: fluoride. Collider: tooth decay.
 Among individuals with tooth decay, if we know someone was exposed to fluoride in the water, we are more likely to believe that their tooth decay is due to sugar.
 The result is a spurious association between sugar and fluoride.
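
A simulation shows how conditioning on the collider manufactures the association. In the sketch below (hypothetical mechanism and probabilities, not from the slides), sugar and fluoride are generated independently, yet become strongly associated among people with decay:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical mechanism: sugar and fluoride are independent; both feed
# into the collider, tooth decay (sugar causes decay, fluoride protects
# against the 'other causes' route).
sugar = rng.random(n) < 0.5
fluoride = rng.random(n) < 0.5
decay = ((sugar & (rng.random(n) < 0.5))
         | (rng.random(n) < np.where(fluoride, 0.05, 0.30)))

# Marginally, fluoride tells us nothing about sugar
print(f"P(sugar | fluoride)           = {sugar[fluoride].mean():.2f}")   # ~0.50
print(f"P(sugar | no fluoride)        = {sugar[~fluoride].mean():.2f}")  # ~0.50

# Conditioning on the collider induces a spurious association: among
# people WITH decay, fluoride makes the sugar explanation more likely
print(f"P(sugar | fluoride, decay)    = {sugar[fluoride & decay].mean():.2f}")   # ~0.91
print(f"P(sugar | no fluoride, decay) = {sugar[~fluoride & decay].mean():.2f}")  # ~0.68
```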

Simple rules to choose confounders (the backdoor criterion)
 Delete all arrows emanating from E, i.e. those pointing towards its descendants.
 In the new graph, determine whether any unblocked backdoor paths remain from E to D; choose stratification factors S that block them all.
 Such a set of confounders S allows one to make the assumption that P(D = d | do(E = e), S = s) = P(D = d | E = e, S = s).
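
These two rules can be coded directly. The sketch below is hypothetical (networkx >= 3.3 assumed, and the variable names anticipate the urate example on the next slide): it prunes the arrows out of E and then tests d-separation in what remains.

```python
import networkx as nx  # assumes networkx >= 3.3

def backdoor_ok(dag: nx.DiGraph, exposure, outcome, adjustment_set):
    """Apply the slide's recipe: delete the arrows out of the exposure,
    then ask whether the adjustment set blocks every remaining
    (backdoor) path from exposure to outcome. A safety check is added:
    the adjustment set may not contain descendants of the exposure."""
    adjustment_set = set(adjustment_set)
    if adjustment_set & nx.descendants(dag, exposure):
        return False  # adjusting for a descendant of E is not allowed
    pruned = dag.copy()
    pruned.remove_edges_from(list(dag.out_edges(exposure)))
    return nx.is_d_separator(pruned, {exposure}, {outcome}, adjustment_set)

# Toy DAG (hypothetical): smoking confounds the urate -> CVD question
g = nx.DiGraph([("smoking", "urate"), ("smoking", "CVD"), ("urate", "CVD")])
print(backdoor_ok(g, "urate", "CVD", set()))        # False: open backdoor via smoking
print(backdoor_ok(g, "urate", "CVD", {"smoking"}))  # True: backdoor blocked
```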

A worked example: urate and CVD
[DAG; only the labels are recoverable from the transcript: urate (exposure), CVD (outcome), smoking (adjust for), and one variable marked as a collider]

The usual (‘washing machine’) approach: put every available variable into the model and see what comes out.

Summary
 Variable selection is complex.
 We need to consider causal paths.
 Adjustment can cause more harm than good.
 Don’t adjust for variables on the causal path between exposure and disease.
 Adjust for variables that are likely to ‘cause’ both exposure and disease.
 Avoid adjusting for variables with many causes (colliders).