1 CMSC 671 Fall 2001 Class #20 – Thursday, November 8.


2 Today’s class
Conditional independence
Bayesian networks
  – Network structure
  – Conditional probability tables
  – Conditional independence
  – Inference in Bayesian networks

3 Conditional Independence and Bayesian Networks Chapters 13/14 (2/e)

4 Conditional independence
Last time we talked about absolute independence:
  – A and B are independent if P(A ∧ B) = P(A) P(B); equivalently, P(A) = P(A | B) and P(B) = P(B | A)
A and B are conditionally independent given C if
  – P(A ∧ B | C) = P(A | C) P(B | C)
This lets us decompose the joint distribution:
  – P(A ∧ B ∧ C) = P(A | C) P(B | C) P(C)
In the example from last time, Moon-Phase and Burglary are conditionally independent given Light-Level
Conditional independence is weaker than absolute independence, but still useful in decomposing the full joint probability distribution
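The decomposition above can be checked numerically. The following is a minimal sketch, assuming three binary variables and made-up probabilities (none of these numbers come from the lecture): it builds the joint from P(A | C) P(B | C) P(C) and then tests both kinds of independence directly on that joint.

```python
from itertools import product

# Hypothetical parameters (illustrative only): C is a common cause of A and B.
p_c = {True: 0.3, False: 0.7}            # P(C = c)
p_a_given_c = {True: 0.9, False: 0.2}    # P(A = true | C = c)
p_b_given_c = {True: 0.6, False: 0.1}    # P(B = true | C = c)

def bern(p_true, value):
    """Probability of a binary value, given the probability of True."""
    return p_true if value else 1.0 - p_true

# Joint built from the decomposition P(A, B, C) = P(A | C) P(B | C) P(C).
joint = {
    (a, b, c): bern(p_a_given_c[c], a) * bern(p_b_given_c[c], b) * p_c[c]
    for a, b, c in product([True, False], repeat=3)
}

def prob(pred):
    """Sum the joint over all (a, b, c) entries that satisfy pred."""
    return sum(p for (a, b, c), p in joint.items() if pred(a, b, c))

def conditionally_independent(tol=1e-12):
    """Check P(A, B | C) = P(A | C) P(B | C) for every value of A, B, C."""
    for c in (True, False):
        pc = prob(lambda x, y, z: z == c)
        for a, b in product([True, False], repeat=2):
            p_ab = prob(lambda x, y, z: x == a and y == b and z == c) / pc
            p_a = prob(lambda x, y, z: x == a and z == c) / pc
            p_b = prob(lambda x, y, z: y == b and z == c) / pc
            if abs(p_ab - p_a * p_b) > tol:
                return False
    return True

def absolutely_independent(tol=1e-12):
    """Check P(A, B) = P(A) P(B); one cell suffices for binary variables."""
    p_ab = prob(lambda x, y, z: x and y)
    p_a = prob(lambda x, y, z: x)
    p_b = prob(lambda x, y, z: y)
    return abs(p_ab - p_a * p_b) <= tol

print(conditionally_independent(), absolutely_independent())
```

With these particular numbers, A and B pass the conditional test given C but fail the absolute one, because both depend on the shared cause C.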

5 Bayesian Belief Networks (BNs)
Definition: BN = (DAG, CPD)
  – DAG: directed acyclic graph (BN’s structure)
      Nodes: random variables (typically binary or discrete, but methods also exist to handle continuous variables)
      Arcs: indicate probabilistic dependencies between nodes (lack of link signifies conditional independence)
  – CPD: conditional probability distribution (BN’s parameters)
      Conditional probabilities P(x_i | π_i) at each node, where π_i denotes the parents of x_i, usually stored as a table (conditional probability table, or CPT)
  – Root nodes are a special case: no parents, so just use priors in CPD: π_i = ∅, so P(x_i | π_i) = P(x_i)

7 Topological semantics
A node is conditionally independent of its non-descendants given its parents
A node is conditionally independent of all other nodes in the network given its parents, children, and children’s parents (also known as its Markov blanket)
The method called d-separation can be applied to decide whether a set of nodes X is independent of another set Y, given a third set Z
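The Markov blanket definition above translates directly into a graph computation. This is a small sketch, assuming the DAG is stored as a dictionary from each node to its parents; the five-node structure (including where e attaches) is a hypothetical example, not taken from the slides.

```python
# Hypothetical DAG, stored as node -> list of parents (an assumed structure).
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["c"],
}

def children(node):
    """All nodes that list `node` among their parents."""
    return [n for n, ps in parents.items() if node in ps]

def markov_blanket(node):
    """Parents, children, and children's other parents of `node`."""
    blanket = set(parents[node])
    for child in children(node):
        blanket.add(child)
        blanket.update(parents[child])
    blanket.discard(node)  # the node itself is not part of its own blanket
    return blanket

print(sorted(markov_blanket("b")))  # parents {a}, child {d}, co-parent {c}
```

In this assumed graph, b and c share the child d, so each appears in the other's blanket even though no arc connects them.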

8 Independence and chaining
Independence assumption
  – P(x_i | π_i, q) = P(x_i | π_i), where π_i is the set of parents of x_i and q is any set of variables (nodes) other than x_i and its successors
  – π_i blocks the influence of other nodes on x_i and its successors (q influences x_i only through variables in π_i)
  – With this assumption, the complete joint probability distribution of all variables in the network can be represented by (recovered from) the local CPDs by chaining them:
    P(x_1, ..., x_n) = Π_i P(x_i | π_i)
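Chaining the local CPDs amounts to multiplying one CPT entry per node. The sketch below assumes binary variables, a hypothetical five-node structure, and invented CPT values; it only illustrates how a full joint entry is recovered from local tables.

```python
from itertools import product

# Hypothetical structure and CPTs (all numbers are illustrative assumptions).
parents = {"a": (), "b": ("a",), "c": ("a",), "d": ("b", "c"), "e": ("c",)}

# cpt[node][tuple of parent values] = P(node = True | parents)
cpt = {
    "a": {(): 0.2},
    "b": {(True,): 0.8, (False,): 0.1},
    "c": {(True,): 0.5, (False,): 0.3},
    "d": {(True, True): 0.95, (True, False): 0.7,
          (False, True): 0.6, (False, False): 0.05},
    "e": {(True,): 0.4, (False,): 0.1},
}

def joint(assignment):
    """P(x_1, ..., x_n) as the product over nodes of P(x_i | parents)."""
    p = 1.0
    for node, ps in parents.items():
        p_true = cpt[node][tuple(assignment[q] for q in ps)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

# The 2^5 joint entries recovered this way should sum to 1.
total = sum(
    joint(dict(zip(parents, values)))
    for values in product([True, False], repeat=len(parents))
)
print(joint({n: True for n in parents}), total)
```

The network stores 1 + 2 + 2 + 4 + 2 = 11 numbers here, while an explicit joint table over five binary variables would need 31.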

10 Direct inference with BNs
Now suppose we just want the probability for one variable
Belief update method
Original belief (no variables are instantiated): use the prior probability P(x_i)
If x_i is a root, then P(x_i) is given directly in the BN (CPT at X_i)
Otherwise,
  – P(x_i) = Σ_{π_i} P(x_i | π_i) P(π_i)
In this equation, P(x_i | π_i) is given in the CPT, but computing P(π_i) is complicated

11 Computing P(π_i): Example
P(d) = Σ_{b,c} P(d | b, c) P(b, c), where for each assignment of b and c
P(b, c) = P(a, b, c) + P(¬a, b, c)   (marginalizing)
        = P(b | a, c) P(a, c) + P(b | ¬a, c) P(¬a, c)   (product rule)
        = P(b | a) P(c | a) P(a) + P(b | ¬a) P(c | ¬a) P(¬a)   (b and c are conditionally independent given a)
If some variables are instantiated, can “plug that in” and reduce amount of marginalization
Still have to marginalize over all values of uninstantiated parents, which is not computationally feasible with large networks
[Figure: example network over nodes a, b, c, d, e]
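The marginalization in this example can be carried out step by step in code. This is a sketch under assumptions: a is the parent of b and c, d has parents b and c (as the derivation implies), all variables are binary, and every CPT value is invented for illustration.

```python
from itertools import product

# Hypothetical CPT values (illustrative assumptions, not from the lecture).
p_a = 0.2                                   # P(a)
p_b_given_a = {True: 0.8, False: 0.1}       # P(b | a), P(b | not a)
p_c_given_a = {True: 0.5, False: 0.3}       # P(c | a), P(c | not a)
p_d_given_bc = {(True, True): 0.95, (True, False): 0.7,
                (False, True): 0.6, (False, False): 0.05}  # P(d | b, c)

def bern(p_true, value):
    """Probability of a binary value, given the probability of True."""
    return p_true if value else 1.0 - p_true

def p_bc(b, c):
    """P(b, c) = sum over a of P(b | a) P(c | a) P(a)."""
    return sum(
        bern(p_b_given_a[a], b) * bern(p_c_given_a[a], c) * bern(p_a, a)
        for a in (True, False)
    )

def p_d(d=True):
    """P(d) = sum over b, c of P(d | b, c) P(b, c)."""
    return sum(
        bern(p_d_given_bc[(b, c)], d) * p_bc(b, c)
        for b, c in product((True, False), repeat=2)
    )

print(p_bc(True, True), p_d(True))
```

Even in this four-node case, P(d) needs a sum over every joint value of the parents, and each of those terms needs its own sum over a, which is the growth the slide warns about.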

12 Representational extensions
Compactly representing CPTs
  – Noisy-OR
  – Noisy-MAX
Adding continuous variables
  – Discretization
  – Use density functions (usually mixtures of Gaussians) to build hybrid Bayesian networks (with discrete and continuous variables)
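To make the compactness of Noisy-OR concrete, here is a sketch of the standard noisy-OR construction: each present cause independently fails to produce the effect with probability 1 − p_i, so the full CPT follows from one parameter per cause. The cause names, strengths, and the optional `leak` parameter are assumptions made for the example.

```python
from itertools import product

def noisy_or_cpt(strengths, leak=0.0):
    """Expand noisy-OR parameters into a full CPT.

    strengths: dict cause name -> probability that the cause alone
               produces the effect.
    leak:      probability of the effect when no cause is present
               (assumed 0 unless given).
    Returns:   dict mapping a tuple of cause values (in the order of
               `strengths`) to P(effect = True | causes).
    """
    names = list(strengths)
    table = {}
    for values in product([False, True], repeat=len(names)):
        p_effect_absent = 1.0 - leak
        for name, present in zip(names, values):
            if present:
                # Each active cause independently fails with prob 1 - p_i.
                p_effect_absent *= 1.0 - strengths[name]
        table[values] = 1.0 - p_effect_absent
    return table

# Illustrative strengths only.
fever = noisy_or_cpt({"cold": 0.4, "flu": 0.8, "malaria": 0.9})
print(fever[(True, True, False)])
```

Three parameters generate all eight rows of the table; in general the parameter count grows linearly with the number of parents instead of exponentially.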

13 Inference tasks
Simple queries: Compute posterior marginal P(X_i | E=e)
  – E.g., P(NoGas | Gauge=empty, Lights=on, Starts=false)
Conjunctive queries:
  – P(X_i, X_j | E=e) = P(X_i | E=e) P(X_j | X_i, E=e)
Optimal decisions: Decision networks include utility information; probabilistic inference is required to find P(outcome | action, evidence)
Value of information: Which evidence should we seek next?
Sensitivity analysis: Which probability values are most critical?
Explanation: Why do I need a new starter motor?
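A simple query of the form P(X_i | E=e) can be answered on a small network by enumeration: sum the joint over the hidden variables for each value of the query variable, then normalize. The sketch below assumes a three-node fragment loosely inspired by the car example, with NoGas as the only parent of the other two; the structure and all numbers are hypothetical.

```python
from itertools import product

# Hypothetical three-node network (structure and numbers are assumptions).
variables = ["no_gas", "gauge_empty", "starts"]
parents = {"no_gas": (), "gauge_empty": ("no_gas",), "starts": ("no_gas",)}
cpt = {  # cpt[node][parent values] = P(node = True | parents)
    "no_gas": {(): 0.05},
    "gauge_empty": {(True,): 0.95, (False,): 0.02},
    "starts": {(True,): 0.01, (False,): 0.9},
}

def joint(assignment):
    """Product of the local CPT entries for a full assignment."""
    p = 1.0
    for node in variables:
        p_true = cpt[node][tuple(assignment[q] for q in parents[node])]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

def query(target, evidence):
    """Posterior distribution P(target | evidence) by enumeration."""
    hidden = [v for v in variables if v != target and v not in evidence]
    dist = {}
    for value in (True, False):
        total = 0.0
        for hidden_values in product((True, False), repeat=len(hidden)):
            assignment = dict(evidence)
            assignment.update(zip(hidden, hidden_values))
            assignment[target] = value
            total += joint(assignment)
        dist[value] = total
    norm = sum(dist.values())  # equals P(evidence)
    return {value: p / norm for value, p in dist.items()}

posterior = query("no_gas", {"gauge_empty": True, "starts": False})
print(posterior[True])
```

The normalizing constant is P(E=e), so the same routine also yields the probability of the evidence as a by-product.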

14 Approaches to inference
Exact inference
  – Enumeration
  – Variable elimination
  – Clustering / join tree algorithms
Approximate inference
  – Stochastic simulation / sampling methods
  – Markov chain Monte Carlo methods
  – Genetic algorithms
  – Neural networks
  – Simulated annealing
  – Mean field theory
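As one concrete instance of the stochastic simulation family, the sketch below uses rejection sampling: draw complete samples from the network in topological order, discard those that disagree with the evidence, and estimate the posterior from the rest. The four-node network and its CPT values are hypothetical, and rejection sampling is chosen here only because it is the simplest sampler to show.

```python
import random

# Hypothetical network, listed in topological order (numbers are assumptions).
order = ["a", "b", "c", "d"]
parents = {"a": (), "b": ("a",), "c": ("a",), "d": ("b", "c")}
cpt = {  # cpt[node][parent values] = P(node = True | parents)
    "a": {(): 0.2},
    "b": {(True,): 0.8, (False,): 0.1},
    "c": {(True,): 0.5, (False,): 0.3},
    "d": {(True, True): 0.95, (True, False): 0.7,
          (False, True): 0.6, (False, False): 0.05},
}

def sample_network(rng):
    """Draw one full assignment, sampling each node after its parents."""
    sample = {}
    for node in order:
        p_true = cpt[node][tuple(sample[p] for p in parents[node])]
        sample[node] = rng.random() < p_true
    return sample

def rejection_sample(target, evidence, n, rng):
    """Estimate P(target = True | evidence) from n prior samples."""
    kept = hits = 0
    for _ in range(n):
        sample = sample_network(rng)
        if all(sample[k] == v for k, v in evidence.items()):
            kept += 1
            hits += sample[target]
    return hits / kept if kept else float("nan")

rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = rejection_sample("a", {"d": True}, 100_000, rng)
print(estimate)
```

For these assumed numbers the exact posterior P(a | d) is 0.145 / 0.3618, so the estimate should land near 0.40; the method wastes every sample that contradicts the evidence, which is why it degrades when the evidence is unlikely.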
