
Probabilistic Networks. Chapter 14 of Dechter's CP textbook. Speaker: Daniel Geschwender. April 1 & 3, 2013.

Motivation. Hard and soft constraints are known with certainty; how do we model uncertainty? Probabilistic networks (also called belief networks or Bayesian networks) handle uncertainty. They are not a 'pure' CSP, but CSP techniques such as bucket elimination can be adapted to work on them.

Overview. Background on probability; probabilistic networks defined (Section 14); belief assessment with bucket elimination (Section 14.1); most probable explanation with bucket elimination (Section 14.2); maximum a posteriori hypothesis [Dechter 96]; complexity (Section 14.3); hybrids of elimination and conditioning (Section 14.4); summary.

Probability: Background. Single-variable probability: P(b), the probability of b. Joint probability: P(a,b), the probability of a and b. Conditional probability: P(a|b), the probability of a given b.

Chaining Conditional Probabilities. A joint probability of any size may be broken into a product of conditional probabilities.
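In symbols, the chain rule the slide refers to is the standard identity

P(x_1, x_2, \ldots, x_n) \;=\; P(x_1)\, P(x_2 \mid x_1)\, P(x_3 \mid x_1, x_2) \cdots P(x_n \mid x_1, \ldots, x_{n-1}).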

Graphical Representation (Section 14). A probabilistic network is represented by a directed acyclic graph; edges denote the causal influence of one variable on another. Direct influence: a single edge. Indirect influence: a path of length ≥ 2.

Example (Section 14). A belief network over variables A, B, C, D, F, G. A has domain {w, sp, su, f}; the other variables are binary {0, 1}. The conditional probability tables (CPTs) are P(A) (uniform, 0.25 for each value), P(B|A), P(C|A), P(D|A,B), P(F|B,C), and P(G|F). [The numeric CPT entries were lost in the transcript.]

Belief Network Defined (Section 14). A belief network consists of a set of random variables X, the variables' domains D, a directed acyclic graph G over X, and the conditional probability tables P, one per variable conditioned on its parents. The evidence set e is a subset of instantiated variables.

Belief Network Defined (Section 14). A belief network gives a probability distribution over all variables in X. An assignment (x_1, ..., x_n) is abbreviated x, and x_S is the restriction of x to a subset of variables S.
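In the standard notation of Chapter 14, the distribution represented by the network factors as

P(x_1, \ldots, x_n) \;=\; \prod_{i=1}^{n} P(x_i \mid x_{pa(i)}),

where pa(i) denotes the parents of X_i in the DAG.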

Example (Section 14). The same network and CPTs as before (P(A), P(B|A), P(C|A), P(D|A,B), P(F|B,C), P(G|F)), repeated for the joint-probability computation on the next slide.

Example (Section 14). P(A=sp, B=1, C=0, D=0, F=0, G=0) = P(A=sp) ∙ P(B=1|A=sp) ∙ P(C=0|A=sp) ∙ P(D=0|A=sp,B=1) ∙ P(F=0|B=1,C=0) ∙ P(G=0|F=0) = 0.25 ∙ 0.1 ∙ 0.7 ∙ 1.0 ∙ 0.4 ∙ 1.0 = 0.007.

Probabilistic Network: Queries (Section 14). Belief assessment: given a set of evidence, determine how the probabilities of all other variables are affected. Most probable explanation (MPE): given a set of evidence, find the most probable assignment to all other variables. Maximum a posteriori hypothesis (MAP): assign a subset of unobserved hypothesis variables so as to maximize their conditional probability.

Belief Assessment: Bucket Elimination (Section 14.1). Belief assessment: given a set of evidence, determine how the probabilities of all other variables are affected. Evidence eliminates some possibilities, so the probabilities of the unknowns can be updated; this is also known as belief updating. It is solved by a modification of bucket elimination.

Derivation (Section 14.1). Similar to ELIM-OPT, but summation (the combination step) is replaced by a product, and maximization (the elimination step) is replaced by summation. x_1 = a is the proposition we are considering and E = e is our evidence; we compute P(x_1 = a | e).
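A standard reconstruction of the computed quantity (with α a normalizing constant):

Bel(x_1{=}a) \;=\; P(x_1{=}a \mid e) \;=\; \alpha \sum_{x_2, \ldots, x_n} \prod_{i=1}^{n} P(x_i \mid x_{pa(i)}) \Big|_{x_1 = a,\; E = e}.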

ELIM-BEL Algorithm (Section 14.1). Takes as input a belief network along with an ordering on the variables; all known variable values are also provided as "evidence".

ELIM-BEL Algorithm (Section 14.1). Outputs a matrix with probabilities for all values of x1 (the first variable in the given ordering) given the evidence.

ELIM-BEL Algorithm (Section 14.1). Sets up the buckets, one per variable. As with other bucket-elimination algorithms, each function starts at the last bucket and moves up until it is "caught" by the first bucket whose variable is in its scope.

ELIM-BEL Algorithm (Section 14.1). Go through all the buckets, from last to first.

ELIM-BEL Algorithm (Section 14.1). If a bucket contains a piece of the input evidence, ignore all probabilities not associated with that variable assignment.

ELIM-BEL Algorithm (Section 14.1). The scope of the generated matrix is the union of the scopes of the contained matrices, minus the bucket variable, which is projected out. For each tuple over that scope, multiply the matching entries of the contained matrices; when projecting out the bucket variable, sum the probabilities.

ELIM-BEL Algorithm (Section 14.1). To arrive at the desired output, a normalizing constant is applied so that the probabilities of all values of x1 sum to 1.
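The algorithm figures themselves are not reproduced on these slides. The following is a minimal Python sketch of ELIM-BEL, assuming CPTs are supplied as (scope, table) pairs with tables keyed by value tuples; the data layout and function names are illustrative, not the textbook's:

from itertools import product

def elim_bel(cpts, domains, order, evidence, normalize=True):
    # cpts:     list of (scope, table) pairs; scope is a tuple of variable names,
    #           table maps a tuple of values (in scope order) to a probability.
    # domains:  dict mapping each variable to its list of values.
    # order:    variable ordering; order[0] is the query variable x1.
    # evidence: dict mapping observed variables to their observed values.
    # Returns Bel(x1) = P(x1 | evidence), or the unnormalized P(x1, evidence).
    buckets = {v: [] for v in order}
    for scope, table in cpts:
        # Each function is "caught" by the latest of its variables in the ordering.
        buckets[max(scope, key=order.index)].append((scope, table))

    for var in reversed(order[1:]):           # process buckets last to first
        funcs = buckets[var]
        if not funcs:
            continue
        new_scope = tuple(sorted({v for s, _ in funcs for v in s if v != var},
                                 key=order.index))
        # Evidence bucket: keep only entries consistent with the observation.
        values = [evidence[var]] if var in evidence else domains[var]
        new_table = {}
        for assignment in product(*(domains[v] for v in new_scope)):
            env = dict(zip(new_scope, assignment))
            total = 0.0
            for val in values:                # sum out the bucket variable
                env[var] = val
                p = 1.0
                for scope, table in funcs:    # multiply matching entries
                    p *= table[tuple(env[v] for v in scope)]
                total += p
            new_table[assignment] = total
        # Place the generated lambda function as far "up" as its scope allows.
        dest = max(new_scope, key=order.index) if new_scope else order[0]
        buckets[dest].append((new_scope, new_table))

    # First bucket: every remaining function mentions only x1 (or nothing).
    x1 = order[0]
    belief = {}
    for val in domains[x1]:
        p = 1.0
        for scope, table in buckets[x1]:
            p *= table[tuple(val for _ in scope)]
        belief[val] = p
    if not normalize:
        return belief
    z = sum(belief.values())
    return {val: p / z for val, p in belief.items()}

For the running example one would use order = ['A', 'C', 'B', 'F', 'D', 'G'] and evidence = {'G': 1, 'D': 1}, with the six CPTs P(A), P(B|A), P(C|A), P(D|A,B), P(F|B,C), P(G|F) encoded as dictionaries.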

Example (Section 14.1). The same network and CPTs (P(A), P(B|A), P(C|A), P(D|A,B), P(F|B,C), P(G|F)), now processed with ELIM-BEL.

Example (Section 14.1). Ordering A, C, B, F, D, G, with evidence g=1 (in bucket G) and d=1 (in bucket D). Initial bucket contents: bucket G: P(g|f); bucket D: P(d|b,a); bucket F: P(f|b,c); bucket B: P(b|a); bucket C: P(c|a); bucket A: P(a). Processing the buckets from last to first generates the messages λ_G(f), λ_D(b,a), λ_F(b,c), λ_B(a,c), and λ_C(a).

Example (Section 14.1), bucket G: the evidence g=1 selects the column P(g=1|f) of the CPT P(g|f), producing λ_G(f) = P(g=1|f), which is placed in bucket F.

Example (Section 14.1), bucket D: the evidence d=1 selects P(d=1|b,a) from the CPT P(d|a,b), producing λ_D(b,a) = P(d=1|b,a), which is placed in bucket B.

Example (Section 14.1), bucket F: multiply P(f|b,c) by λ_G(f) and sum out F, giving λ_F(b,c) = Σ_f P(f|b,c) λ_G(f), which is placed in bucket B.

Example (Section 14.1), bucket B: multiply P(b|a), λ_D(b,a), and λ_F(b,c) and sum out B, giving λ_B(a,c) = Σ_b P(b|a) λ_D(b,a) λ_F(b,c), which is placed in bucket C.

Example (Section 14.1), bucket C: multiply P(c|a) by λ_B(a,c) and sum out C, giving λ_C(a) = Σ_c P(c|a) λ_B(a,c), which is placed in bucket A.

Example (Section 14.1), bucket A: multiply P(a) by λ_C(a) and normalize over the values of A (divide by Σ_a P(a) λ_C(a)) to obtain the belief Bel(A) = P(A | e).

Derivation (Section 14.1). Evidence: g=1. We need to compute Bel(a) = P(a | e). Processing the bucket of G generates the function λ_G(f) = P(g=1|f).

Derivation (Section 14.1). Each generated function is placed as far left as possible, i.e., in the bucket of the latest remaining variable in its scope; processing that bucket generates the next function, and so on up the ordering.

Derivation (Section 14.1). The last function is generated and placed in bucket A; combining it with P(a) and normalizing gives the final answer, Bel(a) = P(a | e).
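Reconstructed under the worked example's ordering A, C, B, F, D, G and evidence g = 1, d = 1 (α the normalizing constant), the derivation has the form

Bel(a) \;=\; \alpha\, P(a) \sum_{c} P(c \mid a) \sum_{b} P(b \mid a)\, P(d{=}1 \mid b, a) \sum_{f} P(f \mid b, c)\, P(g{=}1 \mid f),

with λ_G(f) = P(g=1|f), λ_D(b,a) = P(d=1|b,a), λ_F(b,c) = Σ_f P(f|b,c) λ_G(f), λ_B(a,c) = Σ_b P(b|a) λ_D(b,a) λ_F(b,c), λ_C(a) = Σ_c P(c|a) λ_B(a,c), and finally Bel(a) = α P(a) λ_C(a).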

ELIM-MPE Algorithm (Section 14.2). As before, takes as input a belief network along with an ordering on the variables; all known variable values are also provided as "evidence".

ELIM-MPE Algorithm (Section 14.2). The output is the most probable configuration of the variables given the evidence, together with the probability of that configuration.

ELIM-MPE Algorithm (Section 14.2). Buckets are initialized as before.

ELIM-MPE Algorithm (Section 14.2). Iterate over the buckets from last to first. (Note that the generated functions are referred to by h rather than λ.)

ELIM-MPE Algorithm (Section 14.2). If a bucket contains evidence, ignore all assignments that go against that evidence.

ELIM-MPE Algorithm (Section 14.2). The scope of the generated function is the union of the scopes of the contained functions, minus the bucket variable. The function is generated by multiplying corresponding entries of the contained matrices and then projecting out the bucket variable by taking the maximum probability.

ELIM-MPE Algorithm (Section 14.2). The probability of the MPE is obtained when the final bucket is processed.

ELIM-MPE Algorithm (Section 14.2). Then return through the buckets in the forward direction of the ordering d, assigning each variable the value that maximizes the product of the functions in its bucket given the values already assigned.
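A companion Python sketch of ELIM-MPE in the same illustrative data layout as the elim_bel sketch above (again a sketch, not the textbook's pseudocode): combination is still a product, but the bucket variable is maximized out, and a forward pass recovers the maximizing assignment.

from itertools import product

def elim_mpe(cpts, domains, order, evidence):
    # Same inputs as elim_bel above; returns (mpe_probability, assignment dict).
    buckets = {v: [] for v in order}
    for scope, table in cpts:
        buckets[max(scope, key=order.index)].append((scope, table))

    const = 1.0
    for var in reversed(order):               # backward pass, last to first
        funcs = buckets[var]
        if not funcs:
            continue
        new_scope = tuple(sorted({v for s, _ in funcs for v in s if v != var},
                                 key=order.index))
        values = [evidence[var]] if var in evidence else domains[var]
        new_table = {}
        for assignment in product(*(domains[v] for v in new_scope)):
            env = dict(zip(new_scope, assignment))
            best = 0.0
            for val in values:                # maximize out the bucket variable
                env[var] = val
                p = 1.0
                for scope, table in funcs:
                    p *= table[tuple(env[v] for v in scope)]
                best = max(best, p)
            new_table[assignment] = best
        if new_scope:
            buckets[max(new_scope, key=order.index)].append((new_scope, new_table))
        else:
            const *= new_table[()]            # constant factors, incl. x1's bucket
    mpe_prob = const

    # Forward pass: pick, for each variable in order, the value that maximizes
    # the product of the functions sitting in its bucket, given earlier choices.
    chosen = {}
    for var in order:
        values = [evidence[var]] if var in evidence else domains[var]
        best_val, best_p = values[0], -1.0
        for val in values:
            chosen[var] = val
            p = 1.0
            for scope, table in buckets[var]:
                p *= table[tuple(chosen[v] for v in scope)]
            if p > best_p:
                best_val, best_p = val, p
        chosen[var] = best_val
    return mpe_prob, chosen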

Example (Section 14.2). The same network and CPTs, now processed with ELIM-MPE.

Example (Section 14.2). Ordering A, C, B, F, D, G, with evidence f=1. Initial bucket contents: bucket G: P(g|f); bucket D: P(d|b,a); bucket F: P(f|b,c); bucket B: P(b|a); bucket C: P(c|a); bucket A: P(a). Processing the buckets from last to first generates the functions h_G(f), h_D(b,a), h_F(b,c), h_B(a,c), and h_C(a).

Example (Section 14.2), bucket G: maximize P(g|f) over G, giving h_G(f) = max_g P(g|f), which is placed in bucket F.

Example (Section 14.2), bucket D: maximize P(d|b,a) over D, giving h_D(b,a) = max_d P(d|b,a), which is placed in bucket B.

Example (Section 14.2), bucket F: the evidence f=1 selects P(f=1|b,c) and h_G(f=1), giving h_F(b,c) = P(f=1|b,c) ∙ h_G(f=1), which is placed in bucket B.

Example (Section 14.2), bucket B: h_B(a,c) = max_b P(b|a) h_D(b,a) h_F(b,c), which is placed in bucket C.

Example (Section 14.2), bucket C: h_C(a) = max_c P(c|a) h_B(a,c), which is placed in bucket A.

Example (Section 14.2), bucket A: h_A(a) = P(a) ∙ h_C(a) (e.g., h_A(su) = 0.25 ∙ 0.048 = 0.012); the maximum over a is the MPE probability.

Example (Section 14.2). Forward pass: the MPE probability was obtained at bucket A; the maximizing assignment is now recovered by consulting the recorded functions bucket by bucket, in order: A = sp, then C = 1, then B = 0, then F = 1 (the evidence value), then D = 0, and finally G = 0 or 1 (a tie; either value attains the maximum). The MPE assignment is therefore A=sp, C=1, B=0, F=1, D=0, G=0/1.

MPE vs MAP. MPE gives the most probable assignment to the entire set of variables given the evidence; MAP gives the most probable assignment to a subset of variables given the evidence. The two assignments may differ. [Dechter 96]: the paper "Bucket elimination: A unifying framework for probabilistic inference".
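In symbols (a standard formulation consistent with [Dechter 96]; take the hypothesis subset for MAP to be A = {X_1, ..., X_k}):

x^{mpe} \;=\; \arg\max_{x}\; P(x, e), \qquad a^{map} \;=\; \arg\max_{x_1, \ldots, x_k}\; \sum_{x_{k+1}, \ldots, x_n} P(x, e).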

MPE vs MAP [Dechter 96]. Example over variables W, X, Y, Z with a joint table P(w,x,y,z) and evidence Z=0: the MPE is W=1, X=0, Y=1, Z=0, while the MAP for the subset {W,X}, obtained by summing out Y and Z, is W=1, X=1. The MAP assignment therefore need not agree with the MPE restricted to {W,X}.

ELIM-MAP Algorithm [Dechter 96]. Takes as input a probabilistic network, evidence (not mentioned explicitly), a subset of hypothesis variables, and an ordering in which those variables come first.

ELIM-MAP Algorithm [Dechter 96]. Outputs the assignment to the given variable subset that has the highest probability.

ELIM-MAP Algorithm [Dechter 96]. Initialize the buckets as normal.

ELIM-MAP Algorithm [Dechter 96]. Process the buckets from last to first as normal.

ELIM-MAP Algorithm [Dechter 96]. If the bucket contains a variable assignment from the evidence, apply that assignment and generate the corresponding function.

ELIM-MAP Algorithm [Dechter 96]. Otherwise, if the bucket variable is not a member of the subset, take the product of all contained functions, then project out the bucket variable by summing over it.

ELIM-MAP Algorithm [Dechter 96]. Otherwise, if the bucket variable is a member of the subset, take the product of all contained functions, then project out the bucket variable by maximizing over it.

ELIM-MAP Algorithm [Dechter 96]. After all buckets have been processed, move in the forward direction and consult the generated functions to obtain the most probable assignment to the subset.
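A tiny helper illustrating just this per-bucket projection rule (hypothetical names and data layout, in the spirit of the sketches above):

def project_bucket_map(var, combined, hypothesis_vars, evidence):
    # combined: dict mapping each value of `var` (for one fixed assignment to
    #           the rest of the bucket's scope) to the product of the bucket's
    #           functions at that value.
    if var in evidence:                       # evidence: fix the observed value
        return combined[evidence[var]]
    if var in hypothesis_vars:                # MAP variable: maximize it out
        return max(combined.values())
    return sum(combined.values())             # other variables: sum them out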

Complexity (Section 14.3). As with all bucket elimination, complexity is dominated by the time and space needed to process a bucket, both exponential in the number of variables appearing in the bucket. The induced width of the ordering bounds the scope of the generated functions.
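In the notation used on the hybrid-complexity slide later in the deck, for n variables along ordering d this gives

\text{time, space} \;=\; O\!\big(n \cdot \exp(w^{*}(d))\big),

where w*(d) is the induced width of d (or w*(d, E) once the evidence is taken into account).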

Complexity: adjusted induced width (Section 14.3). The adjusted induced width of G relative to E along d, written w*(d,E), is the induced width along ordering d when the nodes of the variables in E are removed. (Illustration: bucket B containing P(b|a), λ_D(b,a), λ_F(b,c) and generating λ_B(a,c), before the evidence B=1 is taken into account.)

Complexity: adjusted induced width (Section 14.3). With the evidence B=1 instantiated, the functions in bucket B reduce to separate smaller-scope functions λ_B1(a), λ_B2(a), λ_B3(c): the evidence variable no longer connects its neighbors, which is why evidence nodes are removed when computing w*(d,E).

Complexity: orderings (Section 14.3). [Figure: the example belief network and its moral graph over A, B, C, D, F, G.] Different orderings yield different adjusted induced widths, here w*(d_1, B=1) = 2 versus w*(d_2, B=1) = 3.

Hybrids of Elimination and Conditioning (Section 14.4). Elimination algorithms require significant memory to store the generated functions, whereas search takes only linear space. By combining the two approaches, the space complexity can be reduced and made manageable.

Full Search in Probabilistic Networks (Section 14.4). Traverse a search tree of variable assignments; when a leaf is reached, calculate the joint probability of that combination of values, then sum over the values that are not of interest. (Example: using search to find P(a, G=0, D=1).)

Hybrid Search (Section 14.4). Take a subset of variables Y to search over; all other variables are handled with elimination. First search for an assignment to the variables in Y, treat that assignment as evidence, and then perform elimination as usual (a code sketch follows below).
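A sketch of the static-Y hybrid for belief updating, reusing the elim_bel sketch from earlier (an illustration under the same assumed data layout, not the book's algorithm verbatim):

from itertools import product

def hybrid_bel(cpts, domains, order, evidence, cond_vars):
    # cond_vars: the conditioning (search) set Y; everything else is eliminated.
    # Enumerate assignments to Y, treat each as extra evidence, run bucket
    # elimination unnormalized, and sum: sum_y P(x1, y, e) = P(x1, e).
    query = order[0]
    totals = {val: 0.0 for val in domains[query]}
    for assignment in product(*(domains[y] for y in cond_vars)):
        ev = dict(evidence)
        ev.update(zip(cond_vars, assignment))
        partial = elim_bel(cpts, domains, order, ev, normalize=False)
        for val in domains[query]:
            totals[val] += partial[val]
    z = sum(totals.values())                  # normalize only at the very end
    return {val: p / z for val, p in totals.items()}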

Hybrid Search (Section 14.4). [Algorithm figures: hybrid search with static selection of the set Y, and hybrid search with dynamic selection of the set Y.]

Hybrid Complexity (Section 14.4). Space: O(n ∙ exp(w*(d, Y ∪ E))). Time: O(n ∙ exp(w*(d, Y ∪ E) + |Y|)). If Y ∪ E is a cycle-cutset of the moral graph, the graph breaks into trees and the adjusted induced width may drop to 1.

Summary. Probabilistic networks are used to express problems with uncertainty. The most common queries are belief assessment, most probable explanation, and maximum a posteriori hypothesis. Bucket elimination can handle all three queries, and a hybrid of search and elimination can cut down on the space requirement.

Questions?