Causal reasoning in Biomedical Informatics


Causal reasoning in Biomedical Informatics Chitta Baral Professor Department of Computer Science and Engineering chitta@asu.edu

Causal connection versus Correlation Rain and a falling barometer usually have a 1-1 correspondence. Does the falling barometer cause rain? Does rain cause the barometer to fall? Rain and mud: does rain cause mud? Smoking and cancer: does smoking cause cancer? What causes global warming?

Simpson’s Paradox: Who should take the drug? (male, female, or sex unknown)

                        Recovery   ~Recovery   #people   Rec. rate
Male, took drug             18          12         30       60%
Male, did not take           7           3         10       70%
Female, took drug            2           8         10       20%
Female, did not take         9          21         30       30%
Total, took drug            20          20         40       50%
Total, did not take         16          24         40       40%

Simpson’s Paradox (cont.) Summary: 60% of males who took the drug recovered; 70% of males who did not take the drug recovered; 20% of females who took the drug recovered; 30% of females who did not take the drug recovered; 50% of all people who took the drug recovered; 40% of all people who did not take the drug recovered. Paradox: if we know the patient is male or female, then we should not give the drug! If we do not know the sex, then we should give the drug!

Why Simpson’s Paradox arises From the given data we can calculate the following: P(recovery | took drug, male), P(recovery | took drug, female), P(recovery | took drug). We should instead be calculating the following: P(recovery | do(drug), male), P(recovery | do(drug), female), P(recovery | do(drug)).
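
A minimal sketch in plain Python of this distinction (the helper functions, and the assumption that sex is the only confounder so that the back-door adjustment applies, are ours, not the slides'): the observational P(recovery | took drug) is read off the pooled table, while P(recovery | do(drug)) is obtained by summing P(recovery | drug, sex) weighted by P(sex).

```python
# counts[(sex, took_drug)] = (recovered, not_recovered), from the table above
counts = {
    ("M", True):  (18, 12),
    ("M", False): (7, 3),
    ("F", True):  (2, 8),
    ("F", False): (9, 21),
}

def p_recovery_given(took_drug, sex=None):
    """Observational P(recovery | drug status [, sex]) computed from the counts."""
    keys = [k for k in counts if k[1] == took_drug and (sex is None or k[0] == sex)]
    rec = sum(counts[k][0] for k in keys)
    tot = sum(sum(counts[k]) for k in keys)
    return rec / tot

def p_sex(sex):
    tot = sum(sum(v) for v in counts.values())
    return sum(sum(counts[k]) for k in counts if k[0] == sex) / tot

def p_recovery_do(took_drug):
    """Back-door adjustment over sex (assumed confounder): sum_s P(rec | drug, s) P(s)."""
    return sum(p_recovery_given(took_drug, s) * p_sex(s) for s in ("M", "F"))

print(p_recovery_given(True))    # 0.5 -- 50% of drug-takers recovered
print(p_recovery_given(False))   # 0.4 -- 40% of non-takers recovered
print(p_recovery_do(True))       # 0.4 -- P(recovery | do(drug))
print(p_recovery_do(False))      # 0.5 -- P(recovery | do(no drug))
```

Under this (assumed) causal model, the interventional numbers agree with the sex-specific rows rather than with the pooled ones.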

Causality: story and questions Story: In a group of people, 50% were given treatment for an ailment and 50% were not. Of the 50% in both the treated and the untreated group, 50% recovered and 50% did not. Joe, a patient, took the treatment and died. What is the probability that Joe’s death occurred due to the treatment? Or, what is the probability that Joe, who died under treatment, would have lived had he not been treated? Can we answer these questions from the above story? No. Why not?

The probability distribution

X   Y   Prob.
0   0   0.25
0   1   0.25
1   0   0.25
1   1   0.25

X = 1: treatment was given; X = 0: treatment was not given. Y = 1: patient died; Y = 0: patient recovered.

Causal Model 1 of the story The model: U1 is a variable that decides treatment; U2 is a variable that decides whether someone will die; X, Y stand for “treatment given” and “patient died”. U1 → X (X = U1); U2 → Y (Y = U2); P(U1 = 1) = 0.5, P(U2 = 1) = 0.5. This leads to the same probability table. Observation: Joe took the treatment and died, so X = 1 and Y = 1; thus U1 = 1 and U2 = 1. Question: what is the probability that Joe would have lived if he had not taken the treatment? We set X = 0 and find P(Y = 0 | do(X = 0)). Y = U2 = 1 (regardless of the value of X and U1), hence P(Y = 0 | do(X = 0)) = 0. (Joe would have died anyway.)

Causal Model 2 of the story The model: U2 is a genetic factor which, if present, kills people who take the treatment and, if absent, kills people who do not take the treatment; U1, X, Y are as before: decides treatment, treatment given, patient died. U1 → X (X = U1); U2 → Y and X → Y (Y = X*U2 + (1-X)(1-U2)); P(U1 = 1) = 0.5, P(U2 = 1) = 0.5. This leads to the same probability table. Observation: Joe took the treatment and died, so X = 1 and Y = 1; thus U1 = 1 and U2 = 1. Question: what is the probability that Joe would have lived if he had not taken the treatment? We set X = 0 and find P(Y = 0 | do(X = 0)). Y = 0*1 + (1-0)(1-1) = 0, hence P(Y = 0 | do(X = 0)) = 1. (Joe would not have died.)
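
The two computations above follow Pearl's three-step counterfactual recipe: abduction (infer U1, U2 from the observation X = 1, Y = 1), action (replace the mechanism for X by do(X = 0)), and prediction (recompute Y). A minimal sketch mechanizing it for both models (the function names are ours):

```python
def model1_y(x, u2):
    # Model 1: Y = U2 (the treatment has no effect on whether the patient dies)
    return u2

def model2_y(x, u2):
    # Model 2: Y = X*U2 + (1-X)*(1-U2)  (the genetic factor U2 flips the effect)
    return x * u2 + (1 - x) * (1 - u2)

for name, f in [("model 1", model1_y), ("model 2", model2_y)]:
    # Abduction: the observation X=1, Y=1 forces U1=1 and, in both models, U2=1.
    u2 = 1
    # Action + prediction: set X=0 and recompute Y with the same U2.
    y_counterfactual = f(0, u2)
    print(name, "P(Y=0 | do(X=0)) =", 1 if y_counterfactual == 0 else 0)
# model 1 -> 0  (Joe would have died anyway)
# model 2 -> 1  (Joe would have lived)
```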

Summary of the story There are at least two causal models consistent with the data (50% of …). In model 1, Joe would still have died if he had not taken the treatment; in model 2, Joe would have lived. Moral: causal models are the key. Statistical data alone is not enough, because multiple causal models may correspond to the same statistical data.

Causal relations in molecular biology Certain proteins (transcription factors) regulate the expression of genes; one protein may inhibit or activate another protein or another biochemical molecule; catalysts drive metabolic reactions.

Gene regulatory networks (GRNs) are remarkably diverse in their structure, but several basic properties are illustrated in this figure. In this example, two different signals impinge on a single target gene where the cis-regulatory elements provide for an integrated output in response to the two inputs. Signal molecule A triggers the conversion of inactive transcription factor A (green oval) into an active form that binds directly to the target gene's cis-regulatory sequence. The process for signal B is more complex. Signal B triggers the separation of inactive B (red oval) from an inhibitory factor (yellow rectangle). B is then free to form an active complex that binds to the active A transcription factor on the cis-regulatory sequence. The net output is expression of the target gene at a level determined by the action of factors A and B. In this way, cis-regulatory DNA sequences, together with the proteins that assemble on them, integrate information from multiple signaling inputs to produce an appropriately regulated readout. Source: http://www.ornl.gov/sci/techresources/Human_Genome/graphics/slides/images/REGNET.jpg

How do we get causal models? Traditional approaches: knock out genes (too slow); use temporal or time-series data (not feasible for human cells). Can we infer causal information from steady-state data? To some extent.

Suppose we have three variables A, B, and C, and we observe from the data that: A and B are dependent; B and C are dependent; A and C are independent. Think of variables A, B, and C that satisfy the above.

Example (cont.) Most likely your interpretation of A, B, and C would satisfy the causal relations A → B ← C, as shown below (figure: arrows from A and from C into B).
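
As an illustration (not part of the slides), here is a quick simulation sketch: generate A and C independently, let B depend on both, and the dependence pattern of the example emerges from the pairwise correlations.

```python
import random

random.seed(0)
n = 100_000
a = [random.randint(0, 1) for _ in range(n)]
c = [random.randint(0, 1) for _ in range(n)]
# B is a noisy OR of A and C
b = [1 if (ai or ci) and random.random() < 0.9 else 0 for ai, ci in zip(a, c)]

def corr(x, y):
    """Sample Pearson correlation of two equal-length 0/1 lists."""
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    vx = sum((xi - mx) ** 2 for xi in x) / n
    vy = sum((yi - my) ** 2 for yi in y) / n
    return cov / (vx * vy) ** 0.5

print(round(corr(a, b), 3))  # clearly nonzero: A and B dependent
print(round(corr(c, b), 3))  # clearly nonzero: C and B dependent
print(round(corr(a, c), 3))  # approximately 0: A and C independent
```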

Some necessary definitions These are needed to state when the algorithm works.

Causal model Causal structure: a directed acyclic graph (DAG). Causal model: a causal structure together with parameters (a function for each variable with parents, and probabilities for the variables without parents).

Conditional independence and d-separation X and Y are said to be conditionally independent given Z if P(x | y, z) = P(x | z) whenever P(y, z) > 0. d-separation: a path p is said to be d-separated by a set of nodes Z if p contains a chain i → m → j or a fork i ← m → j such that m is in Z, or p contains a collider i → m ← j such that neither m nor any of its descendants is in Z. Z is said to d-separate X and Y if every path between a node in X and a node in Y is d-separated by Z.
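
A self-contained sketch of a d-separation test (our code, not the slides'), using the standard equivalence: X and Y are d-separated by Z iff they are disconnected in the moralized ancestral graph of X ∪ Y ∪ Z after removing Z.

```python
from collections import deque

def d_separated(parents, X, Y, Z):
    """parents: dict node -> set of parents; X, Y, Z: sets of nodes."""
    # 1. Restrict to the ancestors of X ∪ Y ∪ Z (including these nodes themselves).
    relevant = set()
    stack = list(X | Y | Z)
    while stack:
        v = stack.pop()
        if v not in relevant:
            relevant.add(v)
            stack.extend(parents.get(v, ()))
    # 2. Moralize: undirected parent-child edges, plus edges between co-parents.
    adj = {v: set() for v in relevant}
    for v in relevant:
        ps = [p for p in parents.get(v, ()) if p in relevant]
        for p in ps:
            adj[v].add(p); adj[p].add(v)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j]); adj[ps[j]].add(ps[i])
    # 3. Delete Z and test reachability from X to Y by BFS.
    seen, queue = set(X - Z), deque(X - Z)
    while queue:
        v = queue.popleft()
        if v in Y:
            return False          # connected, hence not d-separated
        for w in adj[v]:
            if w not in Z and w not in seen:
                seen.add(w); queue.append(w)
    return True

# Collider A -> B <- C: A and C are d-separated by {} but not by {B}.
parents = {"A": set(), "C": set(), "B": {"A", "C"}}
print(d_separated(parents, {"A"}, {"C"}, set()))   # True
print(d_separated(parents, {"A"}, {"C"}, {"B"}))   # False
```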

Observationally equivalent Two directed acyclic graphs (DAGs) are observationally equivalent if they have the same set of independencies. Alternative definition: two DAGs are observationally equivalent if they have the same skeleton and the same set of v-structures. V-structures are structures of the form a → x ← b such that there is no arrow between a and b.

Observationally equivalent networks Two networks that are observationally equivalent cannot be distinguished without resorting to manipulative experimentation or temporal information.

Preference ordering between DAGs: G1 is preferable to G2 if every distribution that can be simulated using G1 (with some parameterization) can also be simulated using G2 (with some parameterization). In the absence of hidden variables, tests of preference and (observational) equivalence can be reduced to tests of induced dependencies, which can be determined directly from the topology of the DAGs without considering the parameters.

Stability/faithfulness Stability/faithfulness: A DAG and distribution are faithful to each other if they exhibit the same set of independencies. A distribution is said to be faithful if it is faithful to some DAG. With the added assumption of stability, every distribution has a unique minimal causal structure (up to d-separation equivalence), as long as there are no hidden variables.

IC algorithm and faithfulness Given a faithful distribution, the IC and IC* algorithms can find the set of DAGs that are faithful to this distribution, in the absence and in the presence of hidden variables, respectively.

IC Algorithm: Step 1 For each pair of variables a and b in V, search for a set Sab such that (a ╨ b | Sab) holds in P – in other words, a and b should be independent in P conditioned on Sab. Construct an undirected graph G such that vertices a and b are connected with an edge if and only if no such set Sab can be found.

IC Algorithm: Step 2 For each pair of nonadjacent variables a and b with a common neighbor c, check whether c ∈ Sab. If it is, then continue; else add arrowheads pointing at c, i.e., orient a → c ← b.
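
A sketch of these two steps, assuming a perfect conditional-independence oracle indep(a, b, S); the remaining orientation rules of IC are omitted, and the function names are ours.

```python
from itertools import combinations

def ic_steps_1_and_2(variables, indep, max_cond=None):
    V = list(variables)
    max_cond = len(V) - 2 if max_cond is None else max_cond
    sep = {}          # sep[(a, b)] = separating set S_ab, if one was found
    skeleton = set()  # undirected edges, stored as frozensets {a, b}
    # Step 1: connect a-b iff no separating set S_ab exists.
    for a, b in combinations(V, 2):
        others = [v for v in V if v not in (a, b)]
        found = None
        for k in range(max_cond + 1):
            for S in combinations(others, k):
                if indep(a, b, set(S)):
                    found = set(S)
                    break
            if found is not None:
                break
        if found is None:
            skeleton.add(frozenset((a, b)))
        else:
            sep[(a, b)] = sep[(b, a)] = found
    # Step 2: for nonadjacent a, b with common neighbor c, orient a -> c <- b
    # whenever c is NOT in the separating set S_ab.
    arrows = set()    # directed edges as (tail, head) pairs
    for a, b in combinations(V, 2):
        if frozenset((a, b)) in skeleton:
            continue
        for c in V:
            if (frozenset((a, c)) in skeleton and frozenset((b, c)) in skeleton
                    and c not in sep.get((a, b), set())):
                arrows.add((a, c))
                arrows.add((b, c))
    return skeleton, arrows
```

With a perfect oracle on the three-variable example above (true graph A → B ← C), this returns the skeleton {A–B, B–C} and the arrowheads A → B and C → B.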

Microarray data A matrix of genes × samples, where each entry records whether a gene is up-regulated or down-regulated.

Our work on learning causal models We developed an algorithm for learning causal relationships given knowledge of topological ordering information; it uses conditional dependencies and independencies among variables, incorporates the topological information, and learns mutual information among genes.

Steps of learning gene causal relationships: mIC algorithm and its evaluation Step 1: obtain the probability distribution, the sampled data, and the topological order of the genes; Step 2: apply the algorithms to find causal relations; Step 3: compare the original and obtained networks using the two notions of precision and recall; Step 4: repeat steps 1–3 for every random network.

We applied the learning algorithm to a melanoma dataset Melanoma: a malignant tumor occurring most commonly in the skin.

Knowledge we have The 10 genes involved in this study were chosen from the 587 genes in the melanoma data; previous studies have identified WNT5A as a gene of interest involved in melanoma; controlling the influence of WNT5A in the regulation can reduce the chance of melanoma metastasizing; partial biological prior knowledge: MMP3 is expected to be at the end of the pathway.

Important information we discovered Pirin causally influences WNT5A – “in order to maintain the level of WNT5A we need to control WNT5A directly or through pirin”. There is a causal connection between WNT5A and MART-1 – “WNT5A directly causes MART-1”.

Modeling and simulation of a causal Boolean network (BN) Proper function: a function whose value reflects the influence of every operand. Example: c = f(a,b) = (a ∧ b) ∨ (a ∧ ¬b) = a is not a proper function, because b has no influence on c. (Figure: node C with parents A and B, C = f(A,B).) Simulation process in our study: generate M BNs with up to 3 causal parents for each node; for each BN, generate a random proper function for each node; assign random probabilities for the root gene(s); given one configuration, get the probability distribution; collect 200 data points for each network; repeat the above steps 3–5 for all M networks.
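
A sketch of this simulation process, with our own (assumed) implementation choices: each gene's parents are drawn from earlier genes so the network is acyclic, a "proper" random Boolean function is one in which every parent actually affects the output, and 200 samples are drawn per network as on the slide; the gene count here is arbitrary.

```python
import random
from itertools import product

def is_proper(table, n_parents):
    """Proper = flipping each parent changes the output for some setting of the rest."""
    for i in range(n_parents):
        if all(table[inputs] == table[inputs[:i] + (1 - inputs[i],) + inputs[i + 1:]]
               for inputs in table):
            return False
    return True

def random_proper_function(n_parents):
    while True:
        table = {inputs: random.randint(0, 1)
                 for inputs in product((0, 1), repeat=n_parents)}
        if is_proper(table, n_parents):
            return table

def random_boolean_network(genes, max_parents=3):
    """Parents of gene i are drawn from genes 0..i-1, so the graph is a DAG."""
    parents, functions, root_prob = {}, {}, {}
    for i, g in enumerate(genes):
        k = min(i, random.randint(0, max_parents))
        parents[g] = random.sample(genes[:i], k) if k else []
        if parents[g]:
            functions[g] = random_proper_function(len(parents[g]))
        else:
            root_prob[g] = random.random()   # random probability for root genes
    return parents, functions, root_prob

def sample(parents, functions, root_prob, genes):
    state = {}
    for g in genes:                          # genes are already in topological order
        if not parents[g]:
            state[g] = 1 if random.random() < root_prob[g] else 0
        else:
            state[g] = functions[g][tuple(state[p] for p in parents[g])]
    return state

genes = [f"g{i}" for i in range(10)]
net = random_boolean_network(genes)
data = [sample(*net, genes) for _ in range(200)]   # 200 data points per network
```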

Comparing original and obtained networks The original graph is a DAG, while the obtained graph has both directed and undirected edges. (Figure: contingency counts FN, TP, TN, FP between the original and obtained graphs.) See page 5 of the paper for the definitions of precision and recall: Recall = ATP/(AFN + ATP), Precision = ATP/(ATP + AFP).
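
The precise accounting of undirected edges is in the paper; the sketch below uses a simplified convention (our assumption) in which an undirected learned edge counts as a true positive if the original DAG contains the edge in either direction.

```python
def precision_recall(original_edges, learned_directed, learned_undirected):
    """original_edges, learned_directed: sets of (tail, head) tuples;
    learned_undirected: set of frozensets {a, b}."""
    # True positives: directed edges with the correct orientation, plus
    # undirected edges matching an original edge in either direction.
    tp = sum(1 for e in learned_directed if e in original_edges)
    tp += sum(1 for e in learned_undirected
              if any((a, b) in original_edges for a, b in [tuple(e), tuple(e)[::-1]]))
    n_learned = len(learned_directed) + len(learned_undirected)
    fp = n_learned - tp
    # False negatives: original edges not recovered in any form.
    fn = sum(1 for (a, b) in original_edges
             if (a, b) not in learned_directed
             and frozenset((a, b)) not in learned_undirected)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return precision, recall
```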

Learning with IC algorithm

Learning with the mIC algorithm

Conclusion Causality differs from correlation: P(X | Y) differs from P(X | do(Y)). While P(X | Y) can be answered using a joint probability distribution or other representations of it (such as Bayes nets), answering P(X | do(Y)) requires a causal model. We have worked on various causal model representations and on how to reason with them. Causal models can be learned from knockout experiments and from temporal and time-series data; recent algorithms, such as the IC algorithm, have been proposed to learn causal models from steady-state data. We have improved on the IC algorithm.

References Judea Pearl: Reasoning with cause and effect. http://singapore.cs.ucla.edu/IJCAI99/index.html Xin Zhang, Chitta Baral, Seungchan Kim: An algorithm to learn causal connection between genes from steady state data: simulation and its application to melanoma dataset. Proc. of the 10th Conference on Artificial Intelligence in Medicine (AIME 05), 23–27 July 2005, Aberdeen, Scotland, pages 524–534. http://www.public.asu.edu/~cbaral/papers/AIME05_final.pdf