
1 Bayesian Networks. Lecture 9. Edited from Nir Friedman's slides by Dan Geiger.

2 Bayesian Network. A Bayesian network is a Directed Acyclic Graph (DAG), annotated with conditional probability distributions. [Figure: a DAG over the variables V, S, T, L, A, B, X, D with local distributions p(v), p(s), p(t|v), p(l|s), p(b|s), p(a|t,l), p(x|a), p(d|a,b).]

3 Bayesian Network (cont.) Each Directed Acyclic Graph defines a factorization of the joint, each variable conditioned on its parents in the DAG. For the network above: p(v,s,t,l,a,b,x,d) = p(v) p(s) p(t|v) p(l|s) p(b|s) p(a|t,l) p(x|a) p(d|a,b).

4 Local distributions. Conditional Probability Table p(A|T,L), where T = Tuberculosis (yes/no), L = Lung Cancer (yes/no), A = Abnormality in Chest (yes/no):
p(A=y | L=n, T=n) = 0.02
p(A=y | L=n, T=y) = 0.60
p(A=y | L=y, T=n) = 0.99
p(A=y | L=y, T=y) = 0.99
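Such a table maps each combination of parent values to a probability; a minimal sketch in Python (the dictionary layout and function names are illustrative, only the four probabilities come from the slide):

```python
# CPT for p(A | T, L), indexed by the parent values (T, L); 'y'/'n' encode yes/no.
# Probabilities for A = y are taken from the slide; A = n is the complement.
cpt_A = {
    ('n', 'n'): 0.02,   # p(A=y | T=n, L=n)
    ('y', 'n'): 0.60,   # p(A=y | T=y, L=n)
    ('n', 'y'): 0.99,   # p(A=y | T=n, L=y)
    ('y', 'y'): 0.99,   # p(A=y | T=y, L=y)
}

def p_A(a, t, l):
    """Return p(A = a | T = t, L = l)."""
    p_yes = cpt_A[(t, l)]
    return p_yes if a == 'y' else 1.0 - p_yes

print(p_A('y', 'n', 'y'))  # 0.99, i.e. p(A=y | L=y, T=n)
```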

5 "Example". This model depicts the qualitative relations between the variables. We will now specify the joint distribution over these variables.

6 The "Visit-to-Asia" Example. Variables: Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Abnormality in Chest, Bronchitis, X-Ray, Dyspnea. [Figure: the network over these eight variables.]

7 Queries. There are many types of queries. Most queries involve evidence. Evidence e is an assignment of values to a set E of variables in the domain. Examples: P(Dyspnea = Yes | Visit_to_Asia = Yes, Smoking = Yes); P(Smoking = Yes | Dyspnea = Yes).

8 Queries: A posteriori belief. The conditional probability of a variable given the evidence, P(x|e); this is the a posteriori belief in x, given evidence e. Often we compute the term P(x, e), from which we can recover the a posteriori belief by normalizing: P(x|e) = P(x, e) / P(e) = P(x, e) / Σ_x' P(x', e). Examples were given on the previous slide.
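On a small network this can be computed exactly by brute-force enumeration; a sketch, assuming a `joint` function that multiplies the network's local distributions over a complete assignment (names and layout are illustrative):

```python
from itertools import product

def posterior(query_var, query_val, evidence, domains, joint):
    """P(query_var = query_val | evidence), by summing the joint over all
    variables that are neither queried nor observed, then normalizing.

    domains: dict mapping each variable name to its list of values.
    joint:   function taking a complete assignment (dict) to its probability,
             i.e. the product of the local CPDs of the Bayesian network.
    """
    hidden = [v for v in domains if v != query_var and v not in evidence]

    def p_joint_with(qval):  # computes P(query_var = qval, evidence)
        total = 0.0
        for combo in product(*(domains[h] for h in hidden)):
            assignment = dict(evidence)
            assignment[query_var] = qval
            assignment.update(zip(hidden, combo))
            total += joint(assignment)
        return total

    # Recover P(x | e) = P(x, e) / sum over x' of P(x', e).
    return p_joint_with(query_val) / sum(p_joint_with(v) for v in domains[query_var])
```

With the Visit-to-Asia network this would answer, e.g., P(Dyspnea = Yes | Visit_to_Asia = Yes, Smoking = Yes); note the running time is exponential in the number of hidden variables, which is exactly why the later slides study complexity.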

9 A posteriori belief. This query is useful in many other cases:
Prediction: what is the probability of an outcome given the starting condition? The target is a descendant of the evidence (e.g., does a visit to Asia lead to Tuberculosis?).
Diagnosis: what is the probability of a disease/fault given symptoms? The target is an ancestor of the evidence (e.g., do the X-ray results indicate a higher probability of Tuberculosis?).

10 Example: Predictive + Diagnostic. P(T = Yes | Visit_to_Asia = Yes, Dyspnea = Yes). Probabilistic inference can combine evidence from all parts of the network, diagnostic and predictive, regardless of the directions of edges in the model.

11 Queries: MAP. Find the maximum a posteriori assignment for some variables of interest (say H_1,…,H_l). That is, find h_1,…,h_l maximizing the conditional probability P(h_1,…,h_l | e). This is equivalent to maximizing the joint P(h_1,…,h_l, e), since P(h_1,…,h_l | e) = P(h_1,…,h_l, e) / P(e) and P(e) does not depend on h_1,…,h_l.
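A brute-force sketch of such a query (illustrative names; it assumes the hypothesis variables together with the evidence cover all network variables, i.e. an MPE-style query, so `joint` can be applied directly):

```python
from itertools import product

def map_assignment(hypothesis_vars, evidence, domains, joint):
    """Return the assignment h maximizing P(h, e) (equivalently P(h | e))."""
    best, best_p = None, -1.0
    for combo in product(*(domains[h] for h in hypothesis_vars)):
        assignment = dict(evidence)
        assignment.update(zip(hypothesis_vars, combo))
        p = joint(assignment)  # P(h, e): P(e) is a constant factor, so it can be ignored
        if p > best_p:
            best, best_p = dict(zip(hypothesis_vars, combo)), p
    return best, best_p
```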

12 Queries: MAP. We can use MAP for explanation:
What is the most likely joint event, given the evidence (e.g., a set of likely diseases given the symptoms)?
What is the most likely scenario, given the evidence (e.g., a series of likely malfunctions that trigger a fault)?
[Figure: a diagnosis network with diseases D1-D4 and symptoms S1-S4, e.g. bad battery, bad magneto, bad alternator, not charging, dead battery.]

13 Complexity of Inference. Thm: Computing P(X = x) in a Bayesian network is NP-hard. This is not surprising, since we can simulate Boolean gates.
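The gate simulation works because a deterministic relation is just a degenerate CPT; for instance, an AND node (a hedged illustration, not taken from the slides):

```python
# Deterministic CPT simulating an AND gate: P(Out = t | In1, In2) is 1
# exactly when both parents are t, and 0 otherwise.
def p_and(out, in1, in2):
    result = 't' if (in1 == 't' and in2 == 't') else 'f'
    return 1.0 if out == result else 0.0
```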

14 Proof. We reduce 3-SAT to Bayesian network computation. Assume we are given a 3-SAT problem: let Q_1,…,Q_n be propositions and φ_1,…,φ_k be clauses, such that φ_i = ℓ_i1 ∨ ℓ_i2 ∨ ℓ_i3, where each ℓ_ij is a literal over Q_1,…,Q_n (e.g., Q_1 = true), and φ = φ_1 ∧ … ∧ φ_k. We will construct a Bayesian network s.t. P(X = t) > 0 iff φ is satisfiable.

15 The construction: P(Q_i = true) = 0.5; P(φ_i = true | Q_i, Q_j, Q_l) = 1 iff Q_i, Q_j, Q_l satisfy the clause φ_i; A_1, A_2, … are simple binary AND gates. [Figure: Q_1,…,Q_n feed the clause nodes φ_1,…,φ_k, which are conjoined by a chain of AND gates A_1,…,A_{k-2} whose final output is X.]

16 It is easy to check:
Polynomial number of variables.
Each Conditional Probability Table can be described by a small table (at most 8 parameters).
P(X = true) > 0 if and only if there exists a satisfying assignment to Q_1,…,Q_n.
Conclusion: a polynomial reduction from 3-SAT.
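The last property can be sanity-checked on tiny instances by brute force (a sketch; enumerating the Q_i is only for checking the claim, it is of course not part of the polynomial reduction):

```python
from itertools import product

def p_X_true(n, clauses):
    """P(X = t) in the reduction network: each Q_i is a fair coin, each clause
    node is true iff its clause is satisfied, and X is the AND of all clauses.

    clauses: list of 3-tuples of signed literals, e.g. (1, -2, 3) = Q1 v ~Q2 v Q3.
    """
    total = 0.0
    for bits in product([False, True], repeat=n):
        sat = all(any(bits[abs(l) - 1] if l > 0 else not bits[abs(l) - 1]
                      for l in clause)
                  for clause in clauses)
        if sat:
            total += 0.5 ** n  # each complete assignment has prior 2^-n
    return total

phi = [(1, 2, 3), (-1, 2, -3)]
print(p_X_true(3, phi) > 0)  # True exactly when phi is satisfiable
```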

17 Inference is even #P-hard. P(X = t) is the fraction of satisfying assignments to φ, i.e., P(X = t) = #SAT(φ) / 2^n. Hence 2^n · P(X = t) is the number of satisfying assignments to φ. Thus, if we know how to compute P(X = t), we know how to count the number of satisfying assignments to φ. Consequently, computing P(X = t) is #P-hard.

18 Hardness - Notes. We used deterministic relations in our construction. The same construction works if we use (1 − ε, ε) instead of (1, 0) in each gate, for any ε < 0.5. Homework: prove it. Hardness does not mean we cannot solve inference: it implies that we cannot find a general procedure that works efficiently for all networks. For particular families of networks, we can have provably efficient procedures (e.g., trees, HMMs); variable elimination algorithms are one example.

19 Approximation. Until now, we examined exact computation. In many applications, approximations are sufficient. Example: P(X = x|e) = 0.3183098861838. Maybe P(X = x|e) ≈ 0.3 is a good enough approximation, e.g., if we take action only when P(X = x|e) > 0.5. Can we find good approximation algorithms?

20 Types of Approximations: Absolute error. An estimate q of P(X = x | e) has absolute error ε if P(X = x|e) − ε ≤ q ≤ P(X = x|e) + ε, or equivalently q − ε ≤ P(X = x|e) ≤ q + ε. Absolute error is not always what we want: if P(X = x | e) = 0.0001, then an absolute error of 0.001 is unacceptable; if P(X = x | e) = 0.3, then an absolute error of 0.001 is overly precise. [Figure: an interval of width 2ε around q on the [0, 1] line.]

21 Types of Approximations: Relative error. An estimate q of P(X = x | e) has relative error ε if P(X = x|e)(1 − ε) ≤ q ≤ P(X = x|e)(1 + ε), or equivalently q/(1 + ε) ≤ P(X = x|e) ≤ q/(1 − ε). The sensitivity of the approximation depends on the actual value of the desired result. [Figure: the interval [q/(1 + ε), q/(1 − ε)] on the [0, 1] line.]
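A small numeric illustration of the two notions (invented numbers, chosen to mirror the previous slide's point):

```python
def within_absolute(q, p, eps):
    # q approximates p with absolute error eps: |q - p| <= eps
    return abs(q - p) <= eps

def within_relative(q, p, eps):
    # q approximates p with relative error eps: p(1 - eps) <= q <= p(1 + eps)
    return p * (1 - eps) <= q <= p * (1 + eps)

p_small, p_mid = 0.0001, 0.3
print(within_absolute(p_small + 0.001, p_small, 0.001))  # True, yet q is 11x too large
print(within_relative(p_small + 0.001, p_small, 0.01))   # False: relative error rejects it
print(within_absolute(p_mid + 0.001, p_mid, 0.001))      # True, and q is off by only ~0.3%
```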

22 Complexity. Exact inference is hard. Is approximate inference any easier?

23 Complexity: Relative Error. Suppose that q is a relative-error estimate of P(X = t). If φ is not satisfiable, then P(X = t) = 0; hence 0 = P(X = t)(1 − ε) ≤ q ≤ P(X = t)(1 + ε) = 0, namely q = 0. Thus, if q > 0, then φ is satisfiable. An immediate consequence, Thm: Given ε, finding an ε-relative-error approximation is NP-hard.

24 Complexity: Absolute error. We can find absolute-error approximations to P(X = x) with high probability (via sampling); we will see such algorithms next class. However, once we have evidence, the problem is harder. Thm: If ε < 0.5, then finding an estimate of P(X = x|e) with absolute error ε is NP-hard.

25 Proof. Recall our construction. [Figure: the reduction network from slide 15, with Q_1,…,Q_n, clause nodes φ_1,…,φ_k, and the AND-gate chain ending in X.]

26 Proof (cont.) Suppose we can estimate with absolute error ε.
Let p_1 ≈ P(Q_1 = t | X = t). Assign q_1 = t if p_1 > 0.5, else q_1 = f.
Let p_2 ≈ P(Q_2 = t | X = t, Q_1 = q_1). Assign q_2 = t if p_2 > 0.5, else q_2 = f.
…
Let p_n ≈ P(Q_n = t | X = t, Q_1 = q_1, …, Q_{n−1} = q_{n−1}). Assign q_n = t if p_n > 0.5, else q_n = f.
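The same procedure as a Python sketch (`estimate_posterior` is a hypothetical oracle returning an estimate of P(Q_i = t | X = t, Q_1 = q_1, …, Q_{i−1} = q_{i−1}) with absolute error ε < 0.5):

```python
def decode_assignment(n, estimate_posterior):
    """Greedily fix Q_1,...,Q_n by rounding eps-absolute-error posterior estimates.

    estimate_posterior(i, fixed): estimate of P(Q_i = t | X = t, fixed),
    where fixed maps already-decided indices j < i to their values q_j.
    """
    fixed = {}
    for i in range(1, n + 1):
        p_i = estimate_posterior(i, dict(fixed))
        fixed[i] = (p_i > 0.5)  # q_i = t iff the estimate exceeds 1/2
    return fixed  # then check in linear time whether it satisfies phi
```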

27 Proof (cont.) Claim: if φ is satisfiable, then q_1,…,q_n is a satisfying assignment. Suppose φ is satisfiable; by induction on i, there is a satisfying assignment with Q_1 = q_1, …, Q_i = q_i. Base case: if Q_1 = t in all satisfying assignments, then P(Q_1 = t | X = t) = 1, so p_1 ≥ 1 − ε > 0.5, hence q_1 = t. If Q_1 = f in all satisfying assignments, then symmetrically q_1 = f. Otherwise, the statement holds for either choice of q_1.

28 Proof (cont.) Induction step: if Q_{i+1} = t in all satisfying assignments s.t. Q_1 = q_1, …, Q_i = q_i, then P(Q_{i+1} = t | X = t, Q_1 = q_1, …, Q_i = q_i) = 1, so p_{i+1} ≥ 1 − ε > 0.5, hence q_{i+1} = t. If Q_{i+1} = f in all such satisfying assignments, then q_{i+1} = f. Otherwise, the statement holds for either choice of q_{i+1}.

29 Proof (cont.) We can efficiently check whether q_1,…,q_n is a satisfying assignment (linear time). If it is, then φ is satisfiable; if it is not, then by the claim φ is not satisfiable. So given an approximation procedure with absolute error ε < 0.5, we can decide 3-SAT with n procedure calls. Hence finding an ε-absolute-error approximation is NP-hard.

