For Monday: Read chapter 18, sections 1-2. Homework: Chapter 14, exercise 8 a-d.

Program 3 Any questions?

Noisy-Or Nodes To avoid specifying the complete CPT, special nodes that make assumptions about the style of interaction can be used. A noisy-or node assumes that the parents are independent causes that are noisy, i.e. there is some probability that they will not cause the effect. The noise parameter for each cause indicates the probability that it will not cause the effect. The probability that the effect is absent is the product of the noise parameters of all the parent nodes that are true (since independence is assumed). Example: P(Fever|Cold) = 0.4, P(Fever|Flu) = 0.8, P(Fever|Malaria) = 0.9, so the noise parameters are 0.6, 0.2, and 0.1. Then P(Fever | Cold ∧ Flu ∧ ¬Malaria) = 1 − 0.6 × 0.2 = 0.88. The number of parameters needed is linear in the fan-in rather than exponential.
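
A minimal sketch of the noisy-or computation described above (the helper name `noisy_or` is hypothetical; the numbers are the Fever example on this slide):

```python
def noisy_or(noise, present):
    """noise: dict mapping cause -> P(effect absent | only that cause present).
    present: set of causes that are currently true."""
    p_no_effect = 1.0
    for cause, q in noise.items():
        if cause in present:
            p_no_effect *= q          # independent causes: noise parameters multiply
    return 1.0 - p_no_effect

# Causal strengths 0.4, 0.8, 0.9 give noise parameters 0.6, 0.2, 0.1.
noise = {"Cold": 0.6, "Flu": 0.2, "Malaria": 0.1}
print(noisy_or(noise, {"Cold", "Flu"}))   # 1 - 0.6 * 0.2 = 0.88
```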

Independencies If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S). However, this is too strict a criterion for conditional independence, since two nodes should still be considered independent even if there simply exists some variable that depends on both (e.g. Burglary and Earthquake should be considered independent even though they both cause Alarm).

Unless we know something about a common effect of two “independent causes,” or a descendant of a common effect, the causes can be considered independent. For example, if we know nothing else, Earthquake and Burglary are independent. However, if we have information about a common effect (or a descendant thereof), then the two “independent” causes become probabilistically linked, since evidence for one cause can “explain away” the other. If we know the alarm went off, then Earthquake and Burglary become dependent, since evidence for an earthquake decreases belief in a burglary and vice versa.

Types of Connections Given a triplet of variables x, y, z where x is connected to z via y, there are 3 possible connection types: –tail-to-tail: x ← y → z –head-to-tail: x → y → z, or x ← y ← z –head-to-head: x → y ← z For tail-to-tail and head-to-tail connections, x and z are independent given y. For head-to-head connections, x and z are “marginally independent” but may become dependent given the value of y or one of its descendants (through “explaining away”).

Separation A subset of variables S is said to separate X from Y if all (undirected) paths between X and Y are separated by S. A path P is separated by a subset of variables S if at least one pair of successive links along P is blocked by S. Two links meeting head-to-tail or tail-to-tail at a node Z are blocked by S if Z is in S. Two links meeting head-to-head at a node Z are blocked by S if neither Z nor any of its descendants are in S.
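
A minimal sketch of this path-blocking test, assuming a simple dict encoding of the burglary network (the helper names `path_blocked` and `descendants` are hypothetical, not from the slides):

```python
def descendants(node, children):
    """All descendants of `node`, given a dict mapping node -> set of children."""
    found, stack = set(), [node]
    while stack:
        for c in children.get(stack.pop(), set()):
            if c not in found:
                found.add(c)
                stack.append(c)
    return found

def path_blocked(path, parents, children, S):
    """path: nodes X, ..., Y along an undirected path; S: the separating set."""
    for a, z, b in zip(path, path[1:], path[2:]):
        head_to_head = a in parents.get(z, set()) and b in parents.get(z, set())
        if head_to_head:
            # blocked unless Z or one of its descendants is in S
            if z not in S and not (descendants(z, children) & S):
                return True
        elif z in S:
            # head-to-tail or tail-to-tail: blocked when Z is in S
            return True
    return False

# Burglary -> Alarm <- Earthquake, Alarm -> JohnCalls, Alarm -> MaryCalls
parents  = {"Alarm": {"Burglary", "Earthquake"}, "JohnCalls": {"Alarm"}, "MaryCalls": {"Alarm"}}
children = {"Burglary": {"Alarm"}, "Earthquake": {"Alarm"}, "Alarm": {"JohnCalls", "MaryCalls"}}
path = ["Burglary", "Alarm", "Earthquake"]
print(path_blocked(path, parents, children, set()))          # True: marginally independent
print(path_blocked(path, parents, children, {"JohnCalls"}))  # False: a descendant of Alarm is in S
```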

Probabilistic Inference Given known values for some evidence variables, we want to determine the posterior probability of some query variables. Example: Given that John calls, what is the probability that there is a Burglary? John calls 90% of the time there is an alarm, and the alarm detects 94% of burglaries, so people generally think it should be fairly high (80-90%). But this ignores the prior probability of John calling. John also calls 5% of the time when there is no alarm. So over the course of 1,000 days we expect one burglary, and John will probably call then. But John will also call with a false report about 50 times during those 1,000 days on average. So the call is about 50 times more likely to be a false report: P(Burglary | JohnCalls) ≈ 0.02. The actual probability is 0.016, since the alarm is not perfect (an earthquake could have set it off, or it could have just gone off on its own). Of course, even if there was no alarm and John called incorrectly, there could have been an undetected burglary anyway, but this is very unlikely.
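
The counting argument spelled out as a rough estimate (numbers from this slide; the exact network answer is 0.016):

```python
burglaries  = 1    # expected burglaries per 1,000 days
false_calls = 50   # days per 1,000 on which John calls with no alarm (5%)
print(burglaries / (burglaries + false_calls))   # ~0.0196, i.e. roughly 0.02
```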

Types of Inference Diagnostic (evidential, abductive): from effect to cause. P(Burglary | JohnCalls) = 0.016, P(Burglary | JohnCalls ∧ MaryCalls) = 0.29, P(Alarm | JohnCalls ∧ MaryCalls) = 0.76, P(Earthquake | JohnCalls ∧ MaryCalls) = 0.18. Causal (predictive): from cause to effect. P(JohnCalls | Burglary) = 0.86, P(MaryCalls | Burglary) = 0.67.

More Types of Inference Intercausal (explaining away): between causes of a common effect. P(Burglary | Alarm) = 0.376, P(Burglary | Alarm ∧ Earthquake) = 0.003. Mixed: two or more of the above combined. (diagnostic and causal) P(Alarm | JohnCalls ∧ ¬Earthquake) = 0.03, (diagnostic and intercausal) P(Burglary | JohnCalls ∧ ¬Earthquake) = 0.017.

Inference Algorithms Most inference algorithms for Bayes nets are not goal-directed and calculate posterior probabilities for all other variables. In general, the problem of Bayes net inference is NP-hard (exponential in the size of the graph).
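
As a concrete illustration of the exponential cost, here is a brute-force sketch that enumerates the full joint distribution of the standard burglary network; the CPT values (P(B)=0.001, P(E)=0.002, etc.) are assumed from the textbook example rather than stated on these slides. It reproduces the query values quoted earlier, e.g. P(Burglary | JohnCalls ∧ MaryCalls) ≈ 0.284.

```python
from itertools import product

P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm | Burglary, Earthquake)
P_J = {True: 0.90, False: 0.05}                      # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}                      # P(MaryCalls | Alarm)

def joint(b, e, a, j, m):
    """Probability of one complete assignment, via the chain rule over the network."""
    p  = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

def query(target, evidence):
    """P(target = True | evidence); evidence is a dict like {'J': True, 'M': True}."""
    names = ["B", "E", "A", "J", "M"]
    num = den = 0.0
    for vals in product([True, False], repeat=len(names)):
        world = dict(zip(names, vals))
        if any(world[v] != val for v, val in evidence.items()):
            continue                   # skip assignments inconsistent with the evidence
        p = joint(*vals)
        den += p
        if world[target]:
            num += p
    return num / den

print(query("B", {"J": True, "M": True}))   # ~0.284 (the 0.29 quoted on the earlier slide)
```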

Polytree Inference For singly-connected networks or polytrees, in which there are no undirected loops (there is at most one undirected path between any two nodes), polynomial (linear) time algorithms are known. Details of inference algorithms are somewhat mathematically complex, but algorithms for polytrees are structurally quite simple and employ simple propagation of values through the graph.

Belief Propagation Belief propagation and updating involves transmitting two types of messages between neighboring nodes: –λ messages are sent from children to parents and involve the strength of evidential support for a node. –π messages are sent from parents to children and involve the strength of causal support.

Propagation Details Each node B acts as a simple processor which maintains a vector λ(B) for the total evidential support for each value of the corresponding variable and an analogous vector π(B) for the total causal support. The belief vector BEL(B) for a node, which maintains the probability for each value, is calculated as the normalized product: BEL(B) = α λ(B) π(B), where α is a normalizing constant.

Propagation Details (cont.) Computation at each node involves the λ and π message vectors sent between nodes and consists of simple matrix calculations using the CPT to update belief (the λ and π node vectors) for each node based on new evidence. Assumes the CPT for each node is a matrix (M) with a column for each value of the variable and a row for each conditioning case (all rows must sum to 1).
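
A toy numeric sketch of one such update for a single node with one parent and one child (the message values and the numpy encoding are illustrative assumptions, not taken from the slides):

```python
import numpy as np

# CPT as a matrix M: one row per conditioning case (parent value),
# one column per value of B; each row sums to 1.
M = np.array([[0.9, 0.1],    # P(B | parent = false)
              [0.3, 0.7]])   # P(B | parent = true)

pi_msg  = np.array([0.6, 0.4])   # pi message from the parent (causal support)
lam_msg = np.array([0.2, 0.8])   # lambda message from a child (evidential support)

pi_B  = pi_msg @ M               # pi(B): predicted distribution over B's values
lam_B = lam_msg                  # lambda(B): product of children's messages (one child here)

bel = pi_B * lam_B               # BEL(B) = alpha * lambda(B) * pi(B)
bel /= bel.sum()                 # normalization plays the role of alpha
print(bel)                       # posterior belief over B's two values
```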

Basic Solution Approaches Clustering: Merge nodes to eliminate loops. Cutset Conditioning: Create several trees, one for each possible condition of a set of nodes whose instantiation breaks all loops. Stochastic simulation: Approximate posterior probabilities by running repeated random trials testing various conditions.
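
A small sketch of the stochastic-simulation idea, using rejection sampling on the same burglary network as above (CPT values again assumed from the textbook example, not these slides):

```python
import random

def sample_world():
    """Draw one complete assignment by sampling each node given its parents."""
    b = random.random() < 0.001
    e = random.random() < 0.002
    p_a = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}[(b, e)]
    a = random.random() < p_a
    j = random.random() < (0.90 if a else 0.05)
    m = random.random() < (0.70 if a else 0.01)
    return b, j, m

def estimate(n=1_000_000):
    """Approximate P(Burglary | JohnCalls, MaryCalls) by keeping consistent samples."""
    hits = kept = 0
    for _ in range(n):
        b, j, m = sample_world()
        if j and m:              # reject samples inconsistent with the evidence
            kept += 1
            hits += b
    return hits / kept if kept else float("nan")

print(estimate())                # converges toward ~0.28, but most samples are rejected
```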

Applications of Bayes Nets Medical diagnosis (Pathfinder, outperforms leading experts in diagnosis of lymph-node diseases) Device diagnosis (diagnosis of printer problems in Microsoft Windows) Information retrieval (prediction of relevant documents) Computer vision (object recognition)