
1 Chapter 12. Probability Reasoning. Fall 2013, Comp3710 Artificial Intelligence, Computing Science, Thompson Rivers University

2 Course Outline. Part I – Introduction to Artificial Intelligence. Part II – Classical Artificial Intelligence. Part III – Machine Learning: Introduction to Machine Learning; Neural Networks; Probabilistic Reasoning and Bayesian Belief Networks; Artificial Life: Learning through Emergent Behavior. Part IV – Advanced Topics.

3 Learning Objectives

4 Unit Outline: 1. Introduction 2. Bayesian networks 3. Summary

5 1. Introduction. Bayesian networks are a systematic way to represent the world probabilistically, capturing independence and conditional independence relationships. Application areas: consumer help desks, nuclear reactor diagnosis, tissue pathology, pattern recognition, credit assessment, computer network diagnosis, ... Topics: representing knowledge in an uncertain domain; semantics of Bayesian networks; an example.

6 2. Bayesian networks: 1. Definitions 2. Example – Weather, Cavity, Catch, and Toothache 3. Example – Burglary net 4. How to answer queries in the Burglary net 5. How to construct Bayesian networks

7 Definitions. Bayesian network, belief network, probabilistic network, causal network, or knowledge map: a simple, graphical notation for conditional independence assertions, and hence for compact specification of full joint distributions. Syntax: a set of nodes, one per random variable; a directed, acyclic graph (a link means "a parent node directly influences its child nodes"); a conditional probability distribution for each node given its parents: P(Xi | Parents(Xi)). In the simplest case, the conditional probability distribution is represented as a conditional probability table (CPT) giving the distribution over Xi for each combination of parent values.
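To make this syntax concrete, here is a minimal sketch (not from the slides) of a node with a CPT; the variable names and the numbers 0.2, 0.6, and 0.1 are illustrative assumptions only:

```python
# A minimal sketch of the syntax above: each node stores its parents
# and a CPT mapping each combination of parent values to
# P(X = true | parents). Names and numbers are illustrative only.
cavity = {"parents": [], "cpt": {(): 0.2}}            # P(cavity) = 0.2 (assumed)
toothache = {
    "parents": ["Cavity"],
    "cpt": {(True,): 0.6,    # P(toothache | cavity)   (assumed)
            (False,): 0.1},  # P(toothache | ~cavity)  (assumed)
}

def p(node, value, parent_values):
    """P(X = value | parent values); the false case is 1 - p(true)."""
    pt = node["cpt"][tuple(parent_values)]
    return pt if value else 1.0 - pt

print(p(cavity, True, []))          # 0.2
print(p(toothache, False, [True]))  # 0.4
```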

8 Example – Weather, Toothache, Cavity, Catch. The topology of the network encodes conditional independence assertions: considering Weather, Toothache, Cavity, and Catch, Weather is independent of the other variables. [Q] What else does not get any influence from others? Cavity influences Toothache and Catch. Toothache and Catch are conditionally independent given Cavity.
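As a numeric check of the last assertion, the sketch below verifies that Toothache and Catch are independent given Cavity. The eight joint probabilities used here are the standard textbook values for this example and are an assumption, not given on the slide:

```python
# Numeric check that Toothache and Catch are independent given Cavity.
joint = {  # (cavity, toothache, catch) -> probability (assumed values)
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def marg(pred):
    """Sum the joint over the worlds selected by pred(cavity, t, k)."""
    return sum(pr for world, pr in joint.items() if pred(*world))

p_cav = marg(lambda c, t, k: c)                     # P(cavity) = 0.2
p_tk = marg(lambda c, t, k: c and t and k) / p_cav  # P(t, k | cavity)
p_t = marg(lambda c, t, k: c and t) / p_cav         # P(t | cavity)
p_k = marg(lambda c, t, k: c and k) / p_cav         # P(k | cavity)
print(p_tk, p_t * p_k)  # both ~0.54, so the factorization holds
```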

9 Example – Burglary. I'm at work; neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes the alarm is set off by minor earthquakes. Is there a burglar? [Q] Random variables? Burglary, Earthquake, Alarm, JohnCalls, MaryCalls. The network topology reflects "causal" or "influence" knowledge. [Q] In what way? A burglar can set the alarm off. An earthquake can set the alarm off. The alarm can cause Mary to call. The alarm can cause John to call. [Q] Can you draw the network then?
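One way to "draw" the network in code is to list each node's parents. This sketch also attaches the CPT numbers standardly used with this example; treat them as assumed here, though the deck's own arithmetic (e.g. P(¬b) = .999 and P(¬e) = .998 on slide 12) is consistent with them:

```python
# The burglary network as parent lists: B -> A <- E, A -> J, A -> M.
# CPT entries give P(X = true | parent values); numbers assumed.
parents = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}
cpt = {
    "B": {(): 0.001},
    "E": {(): 0.002},
    "A": {(True, True): 0.95, (True, False): 0.94,
          (False, True): 0.29, (False, False): 0.001},
    "J": {(True,): 0.90, (False,): 0.05},
    "M": {(True,): 0.70, (False,): 0.01},
}
```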

10 What is P(JohnCalls = T)? P(j) = P(j ∧ a) + ??? = ??? + ??? = ??? What is P(Alarm = T)? P(a) = P(a ∧ b ∧ e) + P(a ∧ b ∧ ¬e) + P(a ∧ ¬b ∧ e) + P(a ∧ ¬b ∧ ¬e) = P(a | b ∧ e) P(b ∧ e) + P(a | b ∧ ¬e) P(b ∧ ¬e) + P(a | ¬b ∧ e) P(¬b ∧ e) + P(a | ¬b ∧ ¬e) P(¬b ∧ ¬e) = ??? John calls to say my alarm is ringing. [Q] What does P(a | b ∧ ¬e) mean?
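A sketch that fills in the ??? by brute force, reusing the parents and cpt tables from the sketch above:

```python
# Brute-force answers for the ??? above, reusing parents/cpt.
from itertools import product

def prob(var, value, parent_values):
    """P(var = value | parent values) from the CPT."""
    pt = cpt[var][tuple(parent_values)]
    return pt if value else 1.0 - pt

# P(a): sum over the four (b, e) combinations.
p_a = sum(prob("B", b, []) * prob("E", e, []) * prob("A", True, [b, e])
          for b, e in product([True, False], repeat=2))
# P(j) = P(j | a) P(a) + P(j | ~a) P(~a).
p_j = prob("J", True, [True]) * p_a + prob("J", True, [False]) * (1.0 - p_a)
print(p_a)  # ~0.002516
print(p_j)  # ~0.052139
```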

11 Compactness. A CPT for Boolean Xi with k Boolean parents has 2^k rows for the combinations of parent values. Each row requires one number p for Xi = true (the number for Xi = false is just 1 − p). If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers, i.e., it grows linearly with n, vs. O(2^n) for the full joint distribution. For the burglary net: 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 − 1 = 31).
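The row count can be checked mechanically against the topology sketched earlier:

```python
# CPT rows per node: 2^k for k Boolean parents (one number per row).
rows = {v: 2 ** len(ps) for v, ps in parents.items()}
print(rows)                # {'B': 1, 'E': 1, 'A': 4, 'J': 2, 'M': 2}
print(sum(rows.values()))  # 10, vs. 2**5 - 1 = 31 for the full joint
```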

12 How to answer queries in the Burglary net. The full joint distribution is defined as the product of the local conditional distributions: P(X1, …, Xn) = ∏i=1..n P(Xi | Parents(Xi)). [Q] P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = ??? = P(j | a) P(m | a) P(a | ¬b ∧ ¬e) P(¬b) P(¬e) = .9 × .7 × .001 × .999 × .998 ≈ .00063
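The same five-factor product, computed with the prob helper defined above:

```python
# The worked query as a five-factor product of local probabilities.
p = (prob("J", True, [True]) * prob("M", True, [True])
     * prob("A", True, [False, False])   # P(a | ~b, ~e)
     * prob("B", False, []) * prob("E", False, []))
print(p)  # ~0.000628
```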

13 I'm at work; neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes the alarm is set off by minor earthquakes. What are the chances that the alarm was ringing and there is a burglar? P(j ∧ ¬m ∧ a ∧ b) = ??? [Q] How to deal with Earthquake? = P(j ∧ ¬m ∧ a ∧ b ∧ e) + P(j ∧ ¬m ∧ a ∧ b ∧ ¬e) = P(j | a) P(¬m | a) P(a | b ∧ e) P(b) P(e) + P(j | a) P(¬m | a) P(a | b ∧ ¬e) P(b) P(¬e) = P(j | a) (1 − P(m | a)) P(a | b ∧ e) P(b) P(e) + P(j | a) (1 − P(m | a)) P(a | b ∧ ¬e) P(b) P(¬e) = ??? Exercise: Mary calls to say the alarm is ringing; what are the chances there is an earthquake? Recall: P(X1, …, Xn) = ∏i=1..n P(Xi | Parents(Xi)).
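The marginalization over Earthquake, again reusing the earlier helper:

```python
# Marginalize out Earthquake: sum the joint over e in {T, F}.
p = sum(prob("J", True, [True]) * prob("M", False, [True])
        * prob("A", True, [True, e])     # P(a | b, e)
        * prob("B", True, []) * prob("E", e, [])
        for e in (True, False))
print(p)  # ~0.000254
```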

14 P(a) = ??? = P(a ∧ b ∧ e) + P(a ∧ b ∧ ¬e) + P(a ∧ ¬b ∧ e) + P(a ∧ ¬b ∧ ¬e) = P(a | b ∧ e) P(b) P(e) + P(a | b ∧ ¬e) P(b) P(¬e) + P(a | ¬b ∧ e) P(¬b) P(e) + P(a | ¬b ∧ ¬e) P(¬b) P(¬e) = … P(j) = ??? Recall: P(X1, …, Xn) = ∏i=1..n P(Xi | Parents(Xi)).
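A more generic sketch: enumerate all 2^5 worlds once, then answer any conjunctive query by summing the matching worlds (reusing parents, cpt, and prob from above):

```python
# Generic enumeration over all worlds of the burglary net.
from itertools import product

order = ["B", "E", "A", "J", "M"]

def joint_p(world):
    """P(world) = product of P(Xi | Parents(Xi)) over all variables."""
    total = 1.0
    for v in order:
        total *= prob(v, world[v], [world[u] for u in parents[v]])
    return total

worlds = [dict(zip(order, vals))
          for vals in product([True, False], repeat=len(order))]
print(sum(joint_p(w) for w in worlds if w["A"]))  # P(a) ~0.002516
print(sum(joint_p(w) for w in worlds if w["J"]))  # P(j) ~0.052139
```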

15 How to construct Bayesian networks. 1. Choose an ordering of the random variables X1, …, Xn. 2. For i = 1 to n: add Xi to the network and select parents from X1, …, Xi−1 such that P(Xi | Parents(Xi)) = P(Xi | X1, …, Xi−1), i.e., Xi is conditionally independent of its other predecessors given Parents(Xi). This choice of parents guarantees: P(X1, …, Xn) = ∏i=1..n P(Xi | X1, …, Xi−1) (by the chain rule) = ∏i=1..n P(Xi | Parents(Xi)) (by construction). [Q] Which random variables first? (A sketch of this loop appears below.)
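A sketch of the construction loop, assuming a hypothetical conditional-independence oracle indep(x, rest, given); in practice that judgment comes from a domain expert, not a function:

```python
from itertools import combinations

def build(variables, indep):
    """Construction loop sketch. indep(x, rest, given) is a hypothetical
    oracle answering: does P(x | rest, given) = P(x | given)?"""
    net = {}
    for i, x in enumerate(variables):
        preds = variables[:i]
        for r in range(len(preds) + 1):          # try smallest sets first
            for cand in combinations(preds, r):
                rest = [u for u in preds if u not in cand]
                if indep(x, rest, list(cand)):   # cand shields x from rest
                    net[x] = list(cand)
                    break
            if x in net:
                break
    return net
```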

16 Suppose we choose the ordering M, J, A, B, E: an example of a wrong order.

17 Suppose we choose the ordering M, J, A, B, E. P(J | M) = P(J)?

18 Suppose we choose the ordering M, J, A, B, E. P(J | M) = P(J)? No. P(A | J, M) = P(A | J)? P(A | J, M) = P(A)?

19 Suppose we choose the ordering M, J, A, B, E. P(J | M) = P(J)? No. P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No. P(B | A, J, M) = P(B | A)? P(B | A, J, M) = P(B)?

20 Suppose we choose the ordering M, J, A, B, E. P(J | M) = P(J)? No. P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No. P(B | A, J, M) = P(B | A)? Yes. P(B | A, J, M) = P(B)? No. P(E | B, A, J, M) = P(E | A)? P(E | B, A, J, M) = P(E | A, B)?

21 Suppose we choose the ordering M, J, A, B, E. P(J | M) = P(J)? No. P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No. P(B | A, J, M) = P(B | A)? Yes. P(B | A, J, M) = P(B)? No. P(E | B, A, J, M) = P(E | A)? No. P(E | B, A, J, M) = P(E | A, B)? Yes.

22 Deciding conditional independence is hard in noncausal directions. (Causal models and conditional independence seem hardwired for humans!) Assessing conditional probabilities is hard in noncausal directions. The network is also less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed.
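Recounting CPT rows for the ordering M, J, A, B, E, with the parent sets implied by the Yes/No answers on the preceding slides:

```python
# Parent sets implied by the answers for the ordering M, J, A, B, E.
bad = {"M": [], "J": ["M"], "A": ["J", "M"], "B": ["A"], "E": ["A", "B"]}
print(sum(2 ** len(ps) for ps in bad.values()))  # 1 + 2 + 4 + 2 + 4 = 13
```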

23 How to choose a correct order? "Root causes" first ([Q] What can be a root?), then the variables they directly influence, and so on, until the leaves. Can we use causal relations? For each node, find its child nodes (i.e., the nodes caused by it). Example: Weather, Toothache, Catch, and Cavity.

24 Summary. Bayesian networks provide a natural representation for (causally induced) conditional independence. Topology + CPTs = a compact representation of the joint distribution. Generally easy for domain experts to construct.

