Propagation Algorithm in Bayesian Networks


Propagation Algorithm in Bayesian Networks
Saturday, August 20, 2016. Farrokh Alemi, PhD.
This lecture shows how to predict an event given a Bayesian network and its specifications. It describes propagation in Bayes belief networks and is based in part on the information in https://www.youtube.com/watch?v=oYKAfYFmsoM and in http://www.cse.unsw.edu.au/~cs9417ml/Bayes/Pages/PearlPropagation.html

Bayesian Network
[Figure: a node X with parents P1, P2, …, Pm and children C1, C2, …, Cn]
A Bayesian network is a directed acyclic graph. Each node in the graph is related to its parents and children by arrows. Here we show a simple directed graph with three types of nodes: for the node X, all P nodes are parents and all C nodes are children. There is always a directed arc from each parent node to X, and from X to each of its children. The graph is called directed because whenever two nodes are associated, there is also a direction of influence: the parent node P influences node X, and node X influences the child node C. Every relationship in the graph starts at one node and ends at another.

Bayesian Network
[Figure: the same graph with two disallowed edges, an undirected line between X and C1 and an undirected line between P2 and X]
No two connected nodes may lack a direction of influence. In a directed graph we cannot leave the edge between X and C1 undirected. Likewise, P2 and X cannot be associated without a direction of influence; this too is not allowed.

Bayesian Network
[Figure: the same graph with a blue arrow from C1 back to P1, forming a cycle]
In an acyclic graph, it is not possible to start at one node, follow the directions of the arrows, and end up back where you started. The blue arrow between C1 and P1 creates a cycle in this graph, which is not allowed in acyclic graphs.

Bayesian Network
The fact that we work with directed acyclic graphs limits the range of applications of our tools, but it simplifies the mathematics of what we need to take into account.

Fidelity
In a Bayesian network, the joint probability of all events can be read directly from the structure of the graph: for each node, one identifies its parents in the graph, and the joint probability follows as the product of each node's probability given its parents. We can move from a graph, to independence assumptions, to an equation, without loss of information.
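This idea can be written compactly (the formula itself is not in the transcript; this is the standard factorization it describes). For variables X1, …, Xn,

    P(X1, …, Xn) = ∏i P(Xi | parents(Xi))

where parents(Xi) denotes the parents of Xi in the graph.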

Formula & Graph Correspondence S: Severity of Illness R: Do Not Resuscitate T: Treatment O: Outcome For example, here we have a possible network that relates patients severity to clinicians choice of treatment and outcomes. In this graph Severity is a parent to patient preferences on resuscitation, treatment choice and outcome. The parents of treatment node are severity, resuscitation and provider’s decision. The parent’s of outcome’s are the treatment received and the severity of illness. Each of these parent child relationships indicate a dependency in the data. Perhaps, more important, the absence of any link indicates independency. M: Provider’s Decision

Formula & Graph Correspondence S: Severity of Illness R: Do Not Resuscitate T: Treatment O: Outcome M: Provider’s Decision Formula & Graph Correspondence For example, the link between provider’s decision and health outcomes is not present. The model assumes that the provider influences health outcomes primarily through the choice of treatment. Once the treatment is known, the provider’s choice is irrelevant. Both the lines drawn in the Bayes net and the lines not drawn informs us of interdependencies among the events. Furthermore, these interdependencies allows us to estimate the joint probability distribution.

Formula?
S: Severity of Illness, R: Do Not Resuscitate, T: Treatment, M: Provider's Decision, O: Outcome
Let us see if we can write the equation for estimating the joint probability of the events in the Bayes net. In general, we know that only the parents of a node matter. We have abbreviated outcome by O, severity by S, treatment by T, the physician's decision by M, and the patient's do-not-resuscitate preference by R. For simplicity we assume these are all binary events; in other words, there are only two options each: the treatment is either given or not, the patient either lives longer or dies, and so on. Now we can write the equivalent statement about this graph in terms of equations.

Probability of an Event
Let us start at the end node, which is the patient's outcome. This is called an end node because it has no children. The equation for this node is conditioned on its parents; non-parents are not relevant.
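The slide's equation is not included in the transcript; from the graph, the outcome node's term is presumably

    P(O | S, T)

since severity and treatment are the outcome's only parents.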

Probability of an Event
Next we can calculate the probability of treatment, which is the conditional probability of treatment given its parents. Note that the rest of the graph is irrelevant to this calculation.
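Again reconstructing the slide's equation from the graph: the treatment node has severity, the resuscitation preference, and the provider's decision as parents, so its term is

    P(T | S, R, M)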

Probability of an Event
Finally, we need the probabilities of severity, the do-not-resuscitate preference, and the physician's choice of treatment. These events have no parents, so their probabilities are simply the marginal probabilities.
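In symbols, the three root nodes contribute the marginals

    P(S), P(R), P(M)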

Probability of an Event
Now we can put all five calculated probabilities together to estimate the joint probability of all events in the network. The availability of the graph structure, and its embedded assumptions of independence, has radically simplified what data we need and how we can calculate the joint distribution of the events.
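Putting the five pieces together, the slide's formula, reconstructed from the graph, is

    P(O, T, S, R, M) = P(O | S, T) · P(T | S, R, M) · P(S) · P(R) · P(M)

To see the simplification concretely: with five binary variables, the unrestricted joint distribution has 2^5 − 1 = 31 free parameters, while this factorization needs only 4 (for P(O | S, T)) + 8 (for P(T | S, R, M)) + 1 + 1 + 1 = 15.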

What If
Once we have the joint distribution of the events, we can use it to calculate conditional probabilities: the probability of some events given that other events have already occurred. This is a relatively simple operation. The conditional probability is the joint probability of the events divided by the probability of the conditioning event. So for the probability of a particular outcome in a patient who is severely ill, we divide the joint probability of that outcome and severe illness by the probability of observing severe patients. One way to think of conditional probability is that we have selected all patients who are severely ill, and within these patients we are looking at the frequency of the various outcomes.
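For instance, reconstructing the slide's equation:

    P(O | S) = P(O, S) / P(S)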

What If: Sum Out Missing Variables
Note that this calculation needs the joint distribution of only two variables, outcome and severity, but earlier we calculated the joint distribution of all five variables. To move from the joint distribution of all five variables to fewer variables, we have to sum out the missing variables and calculate marginal tables. In this case, treatment, the physician's decision, and the resuscitation preference are missing from the desired joint distribution.
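In symbols, the two-variable table is obtained by summing over the absent variables:

    P(O, S) = Σ_T Σ_R Σ_M P(O, T, S, R, M)

A minimal sketch of the whole pipeline in Python follows. The probability numbers are hypothetical, invented purely for illustration; the lecture gives no data:

```python
import itertools

# Hypothetical conditional probability tables (all events binary).
p_S = {True: 0.3, False: 0.7}   # P(S): patient is severely ill
p_R = {True: 0.2, False: 0.8}   # P(R): do-not-resuscitate preference
p_M = {True: 0.6, False: 0.4}   # P(M): provider recommends treatment

def p_T(s, r, m):
    """P(T=True | S, R, M): chance treatment is given (illustrative numbers)."""
    return 0.9 if (m and not r) else 0.2

def p_O(s, t):
    """P(O=True | S, T): chance of a good outcome (illustrative numbers)."""
    return 0.8 if (t and not s) else 0.4

def joint(o, t, s, r, m):
    """Joint probability read off the graph:
    P(O,T,S,R,M) = P(O|S,T) * P(T|S,R,M) * P(S) * P(R) * P(M)."""
    pt = p_T(s, r, m) if t else 1 - p_T(s, r, m)
    po = p_O(s, t) if o else 1 - p_O(s, t)
    return po * pt * p_S[s] * p_R[r] * p_M[m]

def p_O_S(o, s):
    """Sum out T, R, M to get the marginal table P(O, S)."""
    return sum(joint(o, t, s, r, m)
               for t, r, m in itertools.product([True, False], repeat=3))

# Condition: P(O = good | S = severe) = P(O, S) / P(S).
print(p_O_S(True, True) / p_S[True])
```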

What If: Calculate from Marginal Tables after Removing Missing Variables
Conditional questions are asked often, and Bayes nets provide a useful way to answer them. For example, we may be asked about the comparative value of treatments: what are the likely outcomes for severe patients who received a particular treatment? If the joint distribution is known, then conditional probabilities can be easily calculated. As before, note that this calculation needs the joint distribution of three variables: outcome, severity, and treatment. The Bayes calculations provide the joint distribution of all five variables; to move from the joint distribution of all five variables to fewer, we again sum out the missing variables and calculate marginal tables.
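In symbols, again reconstructing the equations the slide presumably showed:

    P(O, S, T) = Σ_R Σ_M P(O, T, S, R, M)
    P(O | S, T) = P(O, S, T) / P(S, T)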

Independence & Graph Structure The following shows a number of graph structures and their corresponding independence structure

Independence & Graph Structure The following shows a number of graph structures and their corresponding independence structure

Independence & Graph Structure The following shows a number of graph structures and their corresponding independence structure