Bayesian Networks. Lecture 9. Edited from Nir Friedman's slides by Dan Geiger.

2 Bayesian Network. A Bayesian network is a Directed Acyclic Graph (DAG) annotated with conditional probability distributions. [Figure: the Visit-to-Asia DAG over nodes V, S, T, L, A, B, X, D, with local distributions p(v), p(s), p(t|v), p(l|s), p(b|s), p(a|t,l), p(x|a), p(d|a,b).]

3 Bayesian Network (cont.) Each Directed Acyclic Graph defines a factorization of the joint distribution into a product of local conditional distributions, p(x_1, …, x_n) = ∏_i p(x_i | pa_i), where pa_i are the parents of X_i. For the network above: p(v, s, t, l, b, a, x, d) = p(v) p(s) p(t|v) p(l|s) p(b|s) p(a|t,l) p(x|a) p(d|a,b).
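To make the factorization concrete, here is a minimal Python sketch for the Visit-to-Asia network. All CPT numbers except the p(A|T,L) entries (taken from the next slide) are illustrative placeholders, not the lecture's values:

```python
def bern(p_true, value):
    """Probability that a yes/no variable takes `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true

# Each entry maps a full assignment `a` (a dict of booleans) to the local
# factor p(variable | parents) evaluated at `a`. Numbers are placeholders,
# except p(a|t,l), which uses the table from slide 4.
CPTS = {
    "V": lambda a: bern(0.01, a["V"]),                              # p(v)
    "S": lambda a: bern(0.50, a["S"]),                              # p(s)
    "T": lambda a: bern(0.05 if a["V"] else 0.01, a["T"]),          # p(t|v)
    "L": lambda a: bern(0.10 if a["S"] else 0.01, a["L"]),          # p(l|s)
    "B": lambda a: bern(0.60 if a["S"] else 0.30, a["B"]),          # p(b|s)
    "A": lambda a: bern({(False, False): 0.02, (False, True): 0.60,
                         (True, False): 0.99, (True, True): 0.99}
                        [(a["L"], a["T"])], a["A"]),                # p(a|t,l)
    "X": lambda a: bern(0.98 if a["A"] else 0.05, a["X"]),          # p(x|a)
    "D": lambda a: bern(0.90 if (a["A"] or a["B"]) else 0.10, a["D"]),  # p(d|a,b)
}

def joint(a):
    """p(v,s,t,l,b,a,x,d) as the product of the eight local factors."""
    result = 1.0
    for factor in CPTS.values():
        result *= factor(a)
    return result

print(joint(dict(V=False, S=True, T=False, L=False,
                 B=True, A=False, X=False, D=True)))
```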

4 Local distributions. The local distribution p(A|T,L) of Abnormality in Chest (Yes/No) given its parents Tuberculosis (Yes/No) and Lung Cancer (Yes/No) is a Conditional Probability Table: p(A=y|L=n, T=n) = 0.02; p(A=y|L=n, T=y) = 0.60; p(A=y|L=y, T=n) = 0.99; p(A=y|L=y, T=y) = 0.99.
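The slide's table drops directly into code; a small sketch (the function name is ours) showing that each row of the CPT normalizes over the values of A:

```python
# The slide's CPT for p(A = yes | T, L), keyed by (L, T) as booleans.
P_A_YES = {
    (False, False): 0.02,  # p(A=y | L=n, T=n)
    (False, True):  0.60,  # p(A=y | L=n, T=y)
    (True,  False): 0.99,  # p(A=y | L=y, T=n)
    (True,  True):  0.99,  # p(A=y | L=y, T=y)
}

def p_abnormality(a, l, t):
    """Return p(A = a | L = l, T = t); each row sums to 1 over the values of a."""
    p_yes = P_A_YES[(l, t)]
    return p_yes if a else 1.0 - p_yes

print(p_abnormality(True, l=False, t=True))   # 0.60
```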

5 “Example” This model depicts the qualitative relations between the variables. We will now specify the joint distribution over these variables.

6 The “Visit-to-Asia” Example. [Figure: the DAG over the variables Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Bronchitis, Abnormality in Chest, X-Ray, and Dyspnea.]

7 Queries. There are many types of queries, and most involve evidence. Evidence e is an assignment of values to a set E of variables in the domain. Examples: P(Dyspnea = Yes | Visit_to_Asia = Yes, Smoking = Yes); P(Smoking = Yes | Dyspnea = Yes).

8 Queries: A posteriori belief. The conditional probability P(x | e) of a variable given the evidence is the a posteriori belief in x, given evidence e. Often we compute the term P(x, e), from which we can recover the a posteriori belief by P(x | e) = P(x, e) / Σ_x' P(x', e). Examples were given in the previous slide.
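A brute-force rendering of this recipe, reusing the `joint` function and toy CPTs from the earlier sketch (the enumeration is exponential and only meant to mirror the definition, not to be an efficient algorithm):

```python
from itertools import product

VARS = ["V", "S", "T", "L", "B", "A", "X", "D"]

def prob_joint_event(fixed):
    """P(fixed): sum joint(a) over all completions of the partial assignment."""
    free = [v for v in VARS if v not in fixed]
    total = 0.0
    for values in product([False, True], repeat=len(free)):
        a = dict(fixed, **dict(zip(free, values)))
        total += joint(a)
    return total

def posterior(var, val, evidence):
    """P(var = val | e) = P(var = val, e) / sum over both values of var."""
    num = prob_joint_event(dict(evidence, **{var: val}))
    den = num + prob_joint_event(dict(evidence, **{var: not val}))
    return num / den

# The first query from slide 7, under the toy CPTs:
print(posterior("D", True, {"V": True, "S": True}))
```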

9 A posteriori belief. This query is useful in many other cases:
- Prediction: what is the probability of an outcome given the starting condition? The target is a descendant of the evidence (e.g., does a visit to Asia lead to Tuberculosis?).
- Diagnosis: what is the probability of a disease/fault given symptoms? The target is an ancestor of the evidence (e.g., do the X-ray results indicate a higher probability of Tuberculosis?).

10 Example: Predictive + Diagnostic. P(T = Yes | Visit_to_Asia = Yes, Dyspnea = Yes). Probabilistic inference can combine evidence from all parts of the network, diagnostic and predictive, regardless of the directions of edges in the model.

11 Queries: MAP. Find the maximum a posteriori assignment for some variables of interest (say H_1, …, H_l); that is, find h_1, …, h_l maximizing the conditional probability P(h_1, …, h_l | e). This is equivalent to maximizing the joint P(h_1, …, h_l, e), since P(h_1, …, h_l | e) = P(h_1, …, h_l, e) / P(e) and P(e) does not depend on the h_i.
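The same enumeration machinery gives a brute-force MAP query (a sketch, not an efficient algorithm); it maximizes P(h, e) and therefore also P(h | e):

```python
from itertools import product

def map_query(hyp_vars, evidence):
    """Return the assignment h maximizing P(h, e); this also maximizes
    P(h | e) = P(h, e) / P(e), because P(e) is the same for every h."""
    best_h, best_p = None, -1.0
    for values in product([False, True], repeat=len(hyp_vars)):
        h = dict(zip(hyp_vars, values))
        p = prob_joint_event(dict(evidence, **h))  # marginalizes the rest
        if p > best_p:
            best_h, best_p = h, p
    return best_h, best_p

# Most likely (Tuberculosis, Lung Cancer) explanation of X-ray + dyspnea:
print(map_query(["T", "L"], {"X": True, "D": True}))
```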

12 Queries: MAP. We can use MAP for:
- Explanation:
  - What is the most likely joint event, given the evidence (e.g., a set of likely diseases given the symptoms)?
  - What is the most likely scenario, given the evidence (e.g., a series of likely malfunctions that trigger a fault)?
[Figure: a car-diagnosis network with disease nodes D1–D4, symptom nodes S1–S4, and faults such as dead battery, not charging, bad battery, bad magneto, bad alternator.]

13 Complexity of Inference. Thm: Computing P(X = x) in a Bayesian network is NP-hard. This is not surprising, since we can simulate Boolean gates.

14 Proof. We reduce 3-SAT to Bayesian network computation. Assume we are given a 3-SAT problem: let Q_1, …, Q_n be propositions and φ_1, …, φ_k be clauses, such that φ_i = l_i1 ∨ l_i2 ∨ l_i3, where each l_ij is a literal over Q_1, …, Q_n (e.g., Q_1 = true), and let φ = φ_1 ∧ … ∧ φ_k. We will construct a Bayesian network such that P(X = t) > 0 iff φ is satisfiable.

15 The construction:
- P(Q_i = true) = 0.5 for each proposition.
- P(φ_I = true | Q_i, Q_j, Q_l) = 1 iff the values of Q_i, Q_j, Q_l satisfy the clause φ_I.
- A_1, A_2, … are simple binary AND gates.
[Figure: Q_1, …, Q_n feed the clause nodes φ_1, …, φ_k, which are combined through the AND gates A_1, …, A_{k-2} into the output node X.]

16 It is easy to check:
- The network has a polynomial number of variables.
- Each Conditional Probability Table can be described by a small table (8 parameters at most).
- P(X = true) > 0 if and only if there exists a satisfying assignment to Q_1, …, Q_n.
Conclusion: this is a polynomial reduction of 3-SAT.
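A sketch of the reduction on a toy formula (the instance is made up for illustration). By the construction's semantics, evaluating P(X = t) in the network amounts to the following computation:

```python
from itertools import product

# A toy 3-SAT instance (hypothetical): each clause lists (variable, sign)
# literals, so (0, False) means "not Q_0".
CLAUSES = [[(0, True), (1, False), (2, True)],
           [(1, True), (2, False), (3, True)]]
N = 4

def satisfies(assignment, clause):
    """A clause node is true iff at least one of its literals holds."""
    return any(assignment[i] == sign for i, sign in clause)

def p_x_true(clauses, n):
    """P(X = t) in the reduction network: the Q_i are fair coins (prior 2^-n
    per assignment), clause nodes and AND gates are deterministic, so P(X = t)
    equals the fraction of satisfying assignments. Brute force over 2^n here."""
    hits = sum(all(satisfies(q, c) for c in clauses)
               for q in product([False, True], repeat=n))
    return hits / 2 ** n

p = p_x_true(CLAUSES, N)
print(p > 0)  # True iff the toy formula is satisfiable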

17 Inference is even #P-hard. P(X = t) is the fraction of satisfying assignments to φ; hence 2^n P(X = t) is the number of satisfying assignments to φ. Thus, if we know how to compute P(X = t), we know how to count the number of satisfying assignments to φ. Consequently, computing P(X = t) is #P-hard.
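Continuing the toy reduction sketch above (same `p` and `N`), the counting identity can be checked directly:

```python
# 2^n * P(X = t) recovers the exact number of satisfying assignments.
num_satisfying = round(2 ** N * p)
print(num_satisfying)
```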

18 Hardness - Notes.
- We used deterministic relations in our construction. The same construction works if we use (1-ε, ε) instead of (1, 0) in each gate, for any ε < 0.5. Homework: Prove it.
- Hardness does not mean we cannot solve inference:
  - It implies that we cannot find a general procedure that works efficiently for all networks.
  - For particular families of networks, we can have provably efficient procedures (e.g., trees, HMMs).
  - Variable elimination algorithms.

19 Approximation.
- Until now, we examined exact computation.
- In many applications, approximations are sufficient. Example: the exact value might be P(X = x | e) = …, but maybe P(X = x | e) ≈ 0.3 is a good enough approximation, e.g., if we take action only when P(X = x | e) > 0.5.
- Can we find good approximation algorithms?

20 Types of Approximations: Absolute error. An estimate q of P(X = x | e) has absolute error ε if P(X = x | e) - ε ≤ q ≤ P(X = x | e) + ε, or equivalently q - ε ≤ P(X = x | e) ≤ q + ε. Absolute error is not always what we want: if P(X = x | e) is very small, an absolute error of comparable magnitude is unacceptable; if P(X = x | e) = 0.3, the same absolute error is overly precise.

21 Types of Approximations: Relative error. An estimate q of P(X = x | e) has relative error ε if P(X = x | e)(1 - ε) ≤ q ≤ P(X = x | e)(1 + ε), or equivalently q/(1 + ε) ≤ P(X = x | e) ≤ q/(1 - ε). The sensitivity of the approximation depends on the actual value of the desired result.
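A tiny side-by-side sketch of the two error notions (the numeric example is ours, not the lecture's):

```python
def has_absolute_error(q, p, eps):
    """p - eps <= q <= p + eps (slide 20)."""
    return p - eps <= q <= p + eps

def has_relative_error(q, p, eps):
    """p(1 - eps) <= q <= p(1 + eps) (this slide)."""
    return p * (1 - eps) <= q <= p * (1 + eps)

# Relative error scales with the target value; absolute error does not.
print(has_absolute_error(0.001, 0.0001, eps=0.001))  # True, although q is 10x p
print(has_relative_error(0.001, 0.0001, eps=0.5))    # False
```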

22 Complexity. Exact inference is hard. Is approximate inference any easier?

23 Complexity: Relative Error. Suppose that q is a relative error ε estimate of P(X = t). If φ is not satisfiable, then P(X = t) = 0; hence 0 = P(X = t)(1 - ε) ≤ q ≤ P(X = t)(1 + ε) = 0, namely q = 0. Thus, if q > 0, then φ is satisfiable. An immediate consequence: Thm: Given ε, finding an ε-relative error approximation is NP-hard.

24 Complexity: Absolute error. We can find absolute error approximations to P(X = x) with high probability (via sampling); we will see such algorithms next class. However, once we have evidence, the problem is harder. Thm: If ε < 0.5, then finding an estimate of P(X = x | e) with absolute error ε is NP-hard.

25 Proof. Recall our construction. [Figure: the reduction network from slide 15, with Q_1, …, Q_n feeding the clause nodes φ_1, …, φ_k, combined through AND gates into X.]

26 Proof (cont.) Suppose we can estimate with absolute error ε. Let p_1 ≈ P(Q_1 = t | X = t); assign q_1 = t if p_1 > 0.5, else q_1 = f. Let p_2 ≈ P(Q_2 = t | X = t, Q_1 = q_1); assign q_2 = t if p_2 > 0.5, else q_2 = f. … Let p_n ≈ P(Q_n = t | X = t, Q_1 = q_1, …, Q_{n-1} = q_{n-1}); assign q_n = t if p_n > 0.5, else q_n = f.

27 Proof (cont.) Claim: if φ is satisfiable, then q_1, …, q_n is a satisfying assignment. Suppose φ is satisfiable. By induction on i, there is a satisfying assignment with Q_1 = q_1, …, Q_i = q_i. Base case: if Q_1 = t in all satisfying assignments, then P(Q_1 = t | X = t) = 1, so p_1 ≥ 1 - ε > 0.5 and hence q_1 = t. If Q_1 = f in all satisfying assignments, then likewise q_1 = f. Otherwise, the statement holds for either choice of q_1.

28 Proof (cont.) Induction argument: if Q_{i+1} = t in all satisfying assignments with Q_1 = q_1, …, Q_i = q_i, then P(Q_{i+1} = t | X = t, Q_1 = q_1, …, Q_i = q_i) = 1, so p_{i+1} ≥ 1 - ε > 0.5 and hence q_{i+1} = t. If Q_{i+1} = f in all such satisfying assignments, then likewise q_{i+1} = f. Otherwise, the statement holds for either choice of q_{i+1}.

29 Proof (cont.) We can efficiently check whether q_1, …, q_n is a satisfying assignment (in linear time). If it is, then φ is satisfiable; if it is not, then by the claim φ is not satisfiable. So if we had an approximation procedure with absolute error ε < 0.5, we could decide 3-SAT with n procedure calls. Hence ε-absolute-error approximation is NP-hard.
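The whole argument amounts to a decoding procedure; a minimal sketch, assuming a hypothetical `estimate` oracle with absolute error ε < 0.5, and reusing `satisfies`, `CLAUSES`, and `N` from the reduction sketch above:

```python
def decode_assignment(estimate, n, eps=0.4):
    """Sketch of the procedure from slides 26-29. `estimate(i, fixed)` stands
    in for a hypothetical oracle returning P(Q_i = t | X = t, fixed) to within
    absolute error eps < 0.5; rounding each estimate at 0.5 yields q_1, …, q_n,
    a satisfying assignment whenever the formula is satisfiable."""
    fixed = {}
    for i in range(n):
        p_i = estimate(i, dict(fixed))  # eps-accurate conditional estimate
        fixed[i] = p_i > 0.5            # q_i = t iff p_i > 0.5
    return fixed

# Final linear-time check against the toy formula, given some oracle:
# q = decode_assignment(oracle, N)
# print(all(satisfies(q, c) for c in CLAUSES))  # True iff q satisfies φ
```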