UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering Causal Discovery from Medical Textual Data Subramani Mani and Gregory F. Cooper.

Presentation transcript:

UNIVERSITY OF SOUTH CAROLINA, Department of Computer Science and Engineering. Causal Discovery from Medical Textual Data. Subramani Mani and Gregory F. Cooper. Proceedings of the AMIA Annual Fall Symposium, 2000, pp Hanley and Belfus Publishers, Philadelphia, PA. Available at: Presentation for the Bayesian Networks Reading Club. Marco Valtorta, January 28, 2005.

Learning from Textual Data. Text is ubiquitous. Causal knowledge aids in planning and decision making because it supports manipulation. Causal relations may represent prior and tacit knowledge. Learning from textual data is a new and difficult area of research. “[T]he present paper reports the first investigation of causal knowledge discovery from text.”

Causal BNs. A causal BN is a BN in which each arc is interpreted as a direct causal influence of the parent node on the child node. Causal (why?) Markov condition: a variable is independent of its non-descendants given its parents. Causal (why?) faithfulness condition: variables are independent only if their independence is implied by the causal Markov condition. Statistical testing assumption: independence tests on a finite dataset are correct with respect to the underlying causal process that generated the dataset.
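
As a minimal sketch of the causal Markov condition, consider a hypothetical three-variable chain X → Y → Z. The probabilities below are made up for illustration and do not come from the paper. The joint distribution is built from the BN factorization, and the code checks numerically that Z is independent of its non-descendant X given its parent Y, while X and Z remain marginally dependent.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain X -> Y -> Z.
p_x = {0: 0.7, 1: 0.3}
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
p_z_given_y = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}

# Joint distribution from the BN factorization P(x) P(y|x) P(z|y).
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

NAMES = ("x", "y", "z")


def prob(**fixed):
    """Probability of the event that fixes the named variables."""
    return sum(
        p for values, p in joint.items()
        if all(values[NAMES.index(k)] == v for k, v in fixed.items())
    )


def z_given(z, **cond):
    """Conditional probability P(Z = z | cond)."""
    return prob(z=z, **cond) / prob(**cond)


# Markov condition for Z: P(z | x, y) equals P(z | y) for every x, y, z.
markov_holds = all(
    abs(z_given(z, x=x, y=y) - z_given(z, y=y)) < 1e-12
    for x, y, z in product((0, 1), repeat=3)
)

# X and Z are still marginally dependent, as faithfulness would require here.
marginally_dependent = abs(prob(x=1, z=1) - prob(x=1) * prob(z=1)) > 1e-6
```

With these numbers both `markov_holds` and `marginally_dependent` come out true: conditioning on Y screens X off from Z, but ignoring Y leaves them associated.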

LCD (Local Causal Discovery) Algorithm. An algorithm for finding causal links between pairs of variables. It assumes the Markov condition, the faithfulness condition, the statistical testing assumption, and one additional assumption.

Assumption 4. Given measured variables X, Y, and Z, if Y causes Z, and Y and Z are not confounded (i.e., they do not have a common unmeasured cause), then one of four causal networks (shown in a figure on the slide) must hold. In case (1), X and Y are independent; in case (2), they are dependent because X causes Y; and in cases (3) and (4), they are confounded.

Local Causal Discovery. Consider three measured variables W, X, and Y. The ways in which each pair can be causally related are modeled in tables shown on the slide. The H variables are unmeasured (latent, hidden) variables. There are 96 ways in which W, X, and Y can interact. This is not a complete list, but Cooper argues that very little is lost. Based on: Cooper, Gregory F. “A Simple Constraint-Based Algorithm for Efficiently Mining Observational Databases for Causal Relationships.” Data Mining and Knowledge Discovery 1, (1997).

Three Tests. Each test is written first in the notation of Cooper's paper (W, X, Y) and then in the notation of Mani and Cooper's paper (X, Y, Z): D(W,X), or Dependent(X,Y); D(X,Y), or Dependent(Y,Z); I(W,Y|X), or Independent(X, Z given Y).

D-Separation Conditions for the 96 Causal Graphs. Graphs 18, 19, and 20 are the only ones for which all three tests, D(W,X), D(X,Y), and I(W,Y|X), hold. In each of these three graphs, X causes Y.

The Three Graphs for Which All Tests Are Positive

Example: No Causal Link. There is no causal link from Y to Z: the test Independent(X, Z given Y) fails.

LCD Pseudocode
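
The slide's pseudocode is not reproduced in the transcript. The following is a minimal sketch of the three-test loop in the X, Y, Z notation, not the authors' code. It assumes binary variables, and it uses Pearson chi-square tests as a stand-in for whatever tests the paper used. All function names and the synthetic counts (an exact X → Y → Z chain) are hypothetical.

```python
import math
from itertools import permutations


def chi2_2x2(rows, a, b):
    """Pearson chi-square statistic for two binary columns a and b."""
    n = len(rows)
    if n == 0:
        return 0.0
    table = [[0, 0], [0, 0]]
    for row in rows:
        table[row[a]][row[b]] += 1
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = sum(table[i]) * (table[0][j] + table[1][j]) / n
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def dependent(rows, a, b, alpha=0.05):
    # Chi-square with 1 degree of freedom: p = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(chi2_2x2(rows, a, b) / 2)) < alpha


def independent_given(rows, a, b, c, alpha=0.05):
    # Sum the statistic over both strata of c (2 degrees of freedom),
    # for which p = exp(-stat / 2). Failing to reject is read as
    # independence, which leans on the statistical testing assumption.
    stat = sum(chi2_2x2([r for r in rows if r[c] == v], a, b) for v in (0, 1))
    return math.exp(-stat / 2) >= alpha


def lcd(rows, acausal, candidates, alpha=0.05):
    """Return pairs (y, z) for which the three tests support 'y causes z'."""
    found = set()
    for x in acausal:
        for y, z in permutations(candidates, 2):
            if (dependent(rows, x, y, alpha)
                    and dependent(rows, y, z, alpha)
                    and independent_given(rows, x, z, y, alpha)):
                found.add((y, z))
    return found


# Synthetic counts for (x, y, z) that follow an exact chain X -> Y -> Z.
counts = {
    (1, 1, 1): 320, (1, 1, 0): 80, (1, 0, 1): 20, (1, 0, 0): 80,
    (0, 1, 1): 80, (0, 1, 0): 20, (0, 0, 1): 80, (0, 0, 0): 320,
}
rows = [
    {"X": x, "Y": y, "Z": z}
    for (x, y, z), k in counts.items()
    for _ in range(k)
]

result = lcd(rows, acausal=["X"], candidates=["Y", "Z"])
print(result)  # {('Y', 'Z')}
```

On this constructed data the only pair returned is (Y, Z). The reverse pair (Z, Y) is rejected because X and Y stay dependent after conditioning on Z.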

Extra Test. D(W,Y), or Dependent(X,Z) in the X, Y, Z notation. Why? Redundancy, if I understand Cooper correctly.

Limitations of LCD. Many causal networks are missed (the slide gives an example network). LCD only returns separate pairwise causal relationships, which may need to be assembled.

Time Complexity. Not too bad, because only three variables are considered at a time: O(mn²r), where m is the number of cases in the database, n is the number of variables, and r is the number of variables such as X (W in Cooper’s paper), i.e., variables that have no cause (“acausal”). This reduces to O(mn) if there are “few” acausal variables and potential effects (like Z, or Y in Cooper’s paper). Space complexity: O(mn), which is the size of the database.
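
To make the bound concrete, here is a rough count of the triples examined. It assumes that LCD tries every ordered pair of non-acausal variables for each acausal variable, and that each test costs one pass over the m cases. The function name is hypothetical. The example numbers are the ones the talk reports for the text dataset, and treating the 18 candidate effects as a subset of the word variables is an assumption.

```python
def lcd_triples(n, r, effects=None):
    """Number of (X, Y, Z) triples examined under the stated assumptions.

    n: total number of variables.
    r: number of acausal variables (candidates for X).
    effects: number of candidate effects Z, or None to allow every
        non-acausal variable as Z.
    """
    others = n - r  # variables that may play the role of Y or Z
    if effects is None:
        return r * others * (others - 1)
    # Each chosen Z is paired with every other non-acausal variable as Y.
    return r * effects * (others - 1)


m, n, r = 1611, 1811, 3
full = lcd_triples(n, r)                    # grows like r * n^2
restricted = lcd_triples(n, r, effects=18)  # grows like r * n

work_full = m * full              # proportional to m * n^2 * r
work_restricted = m * restricted  # proportional to m * n for fixed r and effects
```

Under these assumptions the unrestricted search examines 9,801,168 triples, while limiting Z to 18 candidate effects brings that down to 97,578, roughly a hundredfold reduction.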

Text Dataset. 2060 ICU discharge summaries (documents). 1808 unique words appeared in the documents. Age, gender, and race appeared in 1611 documents and were considered causeless (“acausal”). Each of the 1808 words was coded as present (1) or absent (0), for a total of 1811 variables. m = 1611, n = 1811, r = 3. Only 18 variables of type Z (possible effects) were considered: nausea, cirrhosis, dyspnea, …
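
The present/absent coding can be sketched as follows. This is an illustration only: the tokenizer (lowercase, split on non-letters) is an assumption, since the talk does not say how words were extracted, and the two example sentences are invented.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def encode(documents):
    """Return (vocabulary, rows) with one 0/1 row per document."""
    token_sets = [tokenize(doc) for doc in documents]
    vocabulary = sorted(set().union(*token_sets))
    rows = [[int(word in tokens) for word in vocabulary] for tokens in token_sets]
    return vocabulary, rows


docs = [
    "Patient reports nausea and dyspnea.",
    "Dyspnea on exertion, some nausea.",
]
vocabulary, rows = encode(docs)
```

Each column of `rows` is one binary word variable. In the study, the three acausal variables (age, gender, race) would be added alongside these columns.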

Results. One good causal relation was recovered. One bad causal relation was also obtained. A study of infant birth and death records led to more causal relations. Ref.: Subramani Mani and Gregory F. Cooper, “A Study in Causal Discovery from Population-Based Infant Birth and Death Records.” Proceedings of the AMIA Annual Fall Symposium, 1999, p Hanley and Belfus Publishers, Philadelphia, PA.

Suggested Improvements: multi-word phrases; more records; multivariate causes or effects (?); encoding variable-value pairs (e.g., serum sodium = high) (?); the number of occurrences of phrases in a document; the location of phrases in a document.

Limitations of the Text Study. Words, not phrases. Present or absent only. The context of a word is not considered: “hypertensive” and “not hypertensive” are treated the same way! Synonyms are treated as different words. More generally: no linguistic analysis and no domain (semantic, ontological?) information is used.
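
Two of these limitations are easy to see in a small, hypothetical example. Assuming a simple whole-word presence check (the helper and the example notes are invented), a negated mention gets the same code as a positive one, and a synonym phrase is not matched at all.

```python
import re


def word_present(word, text):
    """Return 1 if the word occurs in the text as a whole word, else 0."""
    return int(word in re.findall(r"[a-z]+", text.lower()))


# Negation is invisible to presence/absence coding.
notes = ["Patient is hypertensive.", "Patient is not hypertensive."]
negation_codes = [word_present("hypertensive", note) for note in notes]

# A synonym phrase does not trigger the single-word variable.
synonym_notes = ["Dyspnea on exertion.", "Shortness of breath on exertion."]
synonym_codes = [word_present("dyspnea", note) for note in synonym_notes]
```

Here `negation_codes` is `[1, 1]` and `synonym_codes` is `[1, 0]`, so the two opposite clinical statements look identical while the two equivalent ones look different.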