
1 Probabilistic Reasoning ECE457 Applied Artificial Intelligence Spring 2007 Lecture #9

2 Outline
- Bayesian networks
- D-separation and independence
- Inference
Reading: Russell & Norvig, sections 14.1 to 14.4

3 Recall the Story from FOL
Anyone passing their 457 exam and winning the lottery is happy. Anyone who studies or is lucky can pass all their exams. Bob did not study but is lucky. Anyone who's lucky can win the lottery. Is Bob happy?

4 Add Probabilities
Anyone passing their 457 exam and winning the lottery has a 99% chance of being happy. Anyone only passing their 457 exam has an 80% chance of being happy, someone only winning the lottery has a 60% chance, and someone who does neither has a 20% chance. Anyone who studies has a 90% chance of passing their exams. Anyone who's lucky has a 50% chance of passing their exams. Anyone who's both lucky and who studied has a 99% chance of passing, but someone who didn't study and is unlucky has a 1% chance of passing. There's a 20% chance that Bob studied, but a 75% chance that he'll be lucky. Anyone who's lucky has a 40% chance of winning the lottery, while an unlucky person only has a 1% chance of winning. What's the probability of Bob being happy?

5 Probabilities in the Story
Examples of probabilities in the story:
- P(Lucky) = 0.75
- P(Study) = 0.2
- P(PassExam|Study) = 0.9
- P(PassExam|Lucky) = 0.5
- P(Win|Lucky) = 0.4
- P(Happy|PassExam,Win) = 0.99
Some variables directly affect others! Can we build a graphical representation of the dependencies and conditional independencies between variables?

6 Bayesian Network
Also called a belief network:
- Directed acyclic graph
- Nodes represent variables
- Edges represent conditional relationships
- Concise representation of any full joint probability distribution
[Network diagram: Study → PassExam, Lucky → PassExam, Lucky → Win, PassExam → Happy, Win → Happy]

7 Bayesian Network
- Nodes with no parents have prior probabilities
- Nodes with parents have conditional probability tables, with one entry for each truth-value combination of their parents
[Network diagram as on the previous slide]

8 Bayesian Network
[Network diagram annotated with the probability tables below]
P(L) = 0.75, P(S) = 0.2

P(W|L):
L   P(W)
F   0.01   (P(W|¬L))
T   0.4    (P(W|L))

P(E|L,S):
L  S   P(E)
F  F   0.01   (P(E|¬L,¬S))
T  F   0.5    (P(E|L,¬S))
F  T   0.9    (P(E|¬L,S))
T  T   0.99   (P(E|L,S))

P(H|W,E):
W  E   P(H)   P(¬H)
F  F   0.2    0.8
T  F   0.6    0.4
F  T   0.8    0.2
T  T   0.99   0.01
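To make these tables concrete, here is one possible encoding in Python (a minimal sketch; the variable names L, S, E, W, H and the dictionary layout are our own choices, not from the lecture). Each CPT maps a tuple of parent truth values to P(node = true); the probability of false is the complement.

```python
# L = Lucky, S = Study, E = PassExam, W = Win, H = Happy
parents = {'L': [], 'S': [], 'E': ['L', 'S'], 'W': ['L'], 'H': ['W', 'E']}
cpt = {
    'L': {(): 0.75},
    'S': {(): 0.2},
    'W': {(False,): 0.01, (True,): 0.4},
    'E': {(False, False): 0.01, (True, False): 0.5,
          (False, True): 0.9,  (True, True): 0.99},
    'H': {(False, False): 0.2, (True, False): 0.6,
          (False, True): 0.8,  (True, True): 0.99},
}

def prob(node, value, assignment):
    """P(node = value | its parents' values in the full assignment)."""
    p_true = cpt[node][tuple(assignment[par] for par in parents[node])]
    return p_true if value else 1.0 - p_true

print(prob('E', True, {'L': True, 'S': False}))  # P(E|L,¬S) = 0.5
```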

9 Bayesian Network
[Figure: a much larger example network, with nodes labelled a through z]

10 Chain Rule
Recall the chain rule:
P(A,B) = P(A|B)P(B)
P(A,B,C) = P(A|B,C)P(B,C) = P(A|B,C)P(B|C)P(C)
P(A_1,A_2,…,A_n) = P(A_1|A_2,…,A_n)P(A_2|A_3,…,A_n)…P(A_{n-1}|A_n)P(A_n)
P(A_1,A_2,…,A_n) = ∏_{i=1}^{n} P(A_i|A_{i+1},…,A_n)

11 Chain Rule
If we know the value of a node's parents, we don't care about more distant ancestors; their influence is included through the parents. A node is conditionally independent of its predecessors given its parents, or, more generally, a node is conditionally independent of its non-descendants given its parents. This lets us update the chain rule:
P(A_1,A_2,…,A_n) = ∏_{i=1}^{n} P(A_i|parents(A_i))

12 Chain Rule Example
Probability that Bob is happy because he won the lottery and passed his exam, because he's lucky but did not study:
P(H,W,E,L,¬S) = P(H|W∧E) · P(W|L) · P(E|L∧¬S) · P(L) · P(¬S)
= 0.99 · 0.4 · 0.5 · 0.75 · 0.8
= 0.1188 ≈ 0.12
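This product can be checked directly; a short sketch with the table entries written out as plain floats:

```python
# P(H,W,E,L,¬S) = P(H|W,E) * P(W|L) * P(E|L,¬S) * P(L) * P(¬S)
p = 0.99 * 0.4 * 0.5 * 0.75 * (1 - 0.2)
print(p)  # 0.1188, which the slide rounds to 0.12
```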

13 Constructing Bayesian Nets
Build from the top down: start with the root nodes, add their children, and continue down to the leaves.
[Network diagram built up in this order]

14 Constructing Bayesian Nets
What happens if we build with the wrong order? The network becomes needlessly complicated. Node ordering is important!
[Network diagram showing a needlessly complicated ordering of the same variables]

15 Connections
We can understand dependence in a network by considering how evidence is transmitted through it: information entered at one node propagates to descendants and ancestors through connected nodes, provided no node on the path already has evidence (in which case the propagation stops there).

16 Serial Connection
Study and Happy are dependent. Study and Happy are independent given PassExam. Intuitively, the only way Study can affect Happy is through PassExam.
[Network diagram; the serial path Study → PassExam → Happy is highlighted]

17 Converging Connection
Lucky and Study are independent. Lucky and Study are dependent given PassExam. Intuitively, Lucky can be used to explain away Study.
[Network diagram; the converging connection Lucky → PassExam ← Study is highlighted]

18 Diverging Connection
Win and PassExam are dependent. Win and PassExam are independent given Lucky. Intuitively, Lucky can explain both Win and PassExam, and Win and PassExam can affect each other by changing the belief in Lucky.
[Network diagram; the diverging connection Win ← Lucky → PassExam is highlighted]

19 D-Separation
Determines whether two variables are independent given some other variables: X is independent of Y given Z if X and Y are d-separated given Z. X is d-separated from Y if, for every (undirected) path between X and Y, there exists a node Z on the path for which either:
- the connection is serial or diverging and there is evidence for Z, or
- the connection is converging and there is no evidence for Z or any of its descendants.

20 D-Separation
[Figure: the three connection types]
- Serial, X → Z → Y: Z blocks the path if it is in evidence.
- Diverging, X ← Z → Y: Z blocks the path if it is in evidence.
- Converging, X → Z ← Y: Z blocks the path if neither it nor any of its descendants is in evidence.

21 D-Separation
Can be computed in linear time using a depth-first search algorithm, giving a fast way to know whether two nodes are independent. It lets us infer whether learning the value of one variable might give us information about another, given what we already know. All d-separated variables are independent, but not all independent variables are d-separated.
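The path-based definition from slide 19 translates directly into code. Below is a small sketch (our own, not the lecture's linear-time DFS: it enumerates every simple undirected path, which is easy to follow but exponential in the worst case), reusing the parents dictionary from the earlier snippets:

```python
def undirected_paths(x, y, parents):
    """All simple paths between x and y, ignoring edge direction."""
    neigh = {n: set(ps) for n, ps in parents.items()}
    for n, ps in parents.items():
        for p in ps:
            neigh[p].add(n)
    paths, stack = [], [[x]]
    while stack:
        path = stack.pop()
        for n in neigh[path[-1]]:
            if n == y:
                paths.append(path + [n])
            elif n not in path:
                stack.append(path + [n])
    return paths

def descendants(node, parents):
    kids = {n: [c for c, ps in parents.items() if n in ps] for n in parents}
    seen, stack = set(), [node]
    while stack:
        for c in kids[stack.pop()]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def blocks(a, z, b, evidence, parents):
    """Does z block the path segment a - z - b, given the evidence set?"""
    if a in parents[z] and b in parents[z]:      # converging: a -> z <- b
        return z not in evidence and not (descendants(z, parents) & evidence)
    return z in evidence                          # serial or diverging

def d_separated(x, y, evidence, parents):
    evidence = set(evidence)
    return all(any(blocks(p[i - 1], p[i], p[i + 1], evidence, parents)
                   for i in range(1, len(p) - 1))
               for p in undirected_paths(x, y, parents))

net = {'L': [], 'S': [], 'E': ['L', 'S'], 'W': ['L'], 'H': ['W', 'E']}
print(d_separated('L', 'S', set(), net))  # True  (slide 17, no evidence)
print(d_separated('L', 'S', {'E'}, net))  # False (slide 17, given PassExam)
print(d_separated('W', 'E', {'L'}, net))  # True  (slide 18, given Lucky)
```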

22 D-Separation Exercise
If we observe a value for node g, what other nodes are updated? Nodes f, h and i.
If we observe a value for node a, what other nodes are updated? Nodes b, c, d, e, f.
[Figure: exercise network with nodes a through j]

23 D-Separation Exercise
Given an observation of c, are nodes a and f independent? Yes.
Given an observation of i, are nodes g and j independent? No.
[Figure: the same exercise network with nodes a through j]

24 Other Independence Criteria
Recall from the updated chain rule: a node is conditionally independent of its non-descendants given its parents.
[Figure: the large example network; the parents of node m shield it from its non-descendants]

25 Other Independence Criteria
A node is conditionally independent of all others in the network given its parents, children, and children's parents: its Markov blanket.
[Figure: the large example network; the Markov blanket of node m is highlighted]
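The Markov blanket can be computed from the parent lists alone; a small sketch (our own helper, assuming the parents dictionary used in the earlier snippets):

```python
def markov_blanket(node, parents):
    """Parents, children, and children's other parents of a node."""
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for c in children:
        blanket |= set(parents[c])
    blanket.discard(node)
    return blanket

parents = {'L': [], 'S': [], 'E': ['L', 'S'], 'W': ['L'], 'H': ['W', 'E']}
print(markov_blanket('E', parents))  # the set {'L', 'S', 'W', 'H'}
```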

26 Inference in Bayesian Network
Compute the posterior probability of a query variable given an observed event, using
P(A_1,A_2,…,A_n) = ∏_{i=1}^{n} P(A_i|parents(A_i))
- Observed evidence variables E = E_1,…,E_m
- Query variable X
- Between them: nonevidence (hidden) variables Y = Y_1,…,Y_l
- The belief network's variables are X ∪ E ∪ Y

27 Inference in Bayesian Network
We want P(X|E).
Recall Bayes' theorem: P(A|B) = P(A,B) / P(B), so P(X|E) = α P(X,E)
Recall marginalization: P(A_i) = Σ_j P(A_i,B_j), so P(X|E) = α Σ_Y P(X,E,Y)
Recall the chain rule: P(A_1,A_2,…,A_n) = ∏_{i=1}^{n} P(A_i|parents(A_i)), so
P(X|E) = α Σ_Y ∏_{A ∈ X∪E∪Y} P(A|parents(A))
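This last line is exactly inference by enumeration. A self-contained sketch (our own code, with hypothetical names; the network is the one from the slides):

```python
from itertools import product

parents = {'L': [], 'S': [], 'E': ['L', 'S'], 'W': ['L'], 'H': ['W', 'E']}
cpt = {
    'L': {(): 0.75}, 'S': {(): 0.2},
    'W': {(False,): 0.01, (True,): 0.4},
    'E': {(False, False): 0.01, (True, False): 0.5,
          (False, True): 0.9,  (True, True): 0.99},
    'H': {(False, False): 0.2, (True, False): 0.6,
          (False, True): 0.8,  (True, True): 0.99},
}
order = ['L', 'S', 'E', 'W', 'H']  # a topological order of the network

def prob(node, value, a):
    p_true = cpt[node][tuple(a[par] for par in parents[node])]
    return p_true if value else 1.0 - p_true

def joint(a):
    """P(a) = product over all nodes of P(A_i | parents(A_i))."""
    result = 1.0
    for n in order:
        result *= prob(n, a[n], a)
    return result

def query(x, evidence):
    """P(x | evidence) = alpha * sum over the hidden variables Y."""
    hidden = [n for n in order if n != x and n not in evidence]
    unnorm = {}
    for xv in (True, False):
        total = 0.0
        for vals in product((True, False), repeat=len(hidden)):
            a = dict(evidence)
            a[x] = xv
            a.update(zip(hidden, vals))
            total += joint(a)
        unnorm[xv] = total
    alpha = 1.0 / sum(unnorm.values())
    return {v: alpha * t for v, t in unnorm.items()}
```

For example, query('W', {}) gives P(W) = 0.3025 and query('W', {'H': True}) gives P(W|H) ≈ 0.4335, matching the two worked examples on the following slides.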

28 Inference Example
[Network diagram annotated with the probability tables below]
P(L) = 0.75, P(S) = 0.2

P(W|L):
L   P(W)
F   0.01
T   0.4

P(E|L,S):
L  S   P(E)
F  F   0.01
T  F   0.5
F  T   0.9
T  T   0.99

P(H|W,E):
W  E   P(H)
F  F   0.2
T  F   0.6
F  T   0.8
T  T   0.99

29 Inference Example #1
With only the information from the network (and no observations), what's the probability that Bob won the lottery?
P(W) = Σ_l P(W,l)
     = Σ_l P(W|l)P(l)
     = P(W|L)P(L) + P(W|¬L)P(¬L)
     = 0.4 · 0.75 + 0.01 · 0.25
     = 0.3025
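The same sum in two lines of Python (numbers straight from the tables):

```python
p_l, p_w_given_l, p_w_given_not_l = 0.75, 0.4, 0.01
print(p_w_given_l * p_l + p_w_given_not_l * (1 - p_l))  # 0.3025
```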

30 Inference Example #2
Given that we know that Bob is happy, what's the probability that Bob won the lottery?
From the network, we know: P(h,e,w,s,l) = P(l)P(s)P(e|l,s)P(w|l)P(h|w,e)
We want to find: P(W|H) = α Σ_l Σ_s Σ_e P(l)P(s)P(e|l,s)P(W|l)P(H|W,e)
P(¬W|H) is also needed, to normalize.

31 Inference Example #2
l  s  e   P(s)  P(l)  P(e|l,s)  P(W|l)  P(H|W,e)  product
F  F  F   0.8   0.25  0.99      0.01    0.6       0.001188
T  F  F   0.8   0.75  0.5       0.4     0.6       0.072
F  T  F   0.2   0.25  0.1       0.01    0.6       0.00003
T  T  F   0.2   0.75  0.01      0.4     0.6       0.00036
F  F  T   0.8   0.25  0.01      0.01    0.99      0.0000198
T  F  T   0.8   0.75  0.5       0.4     0.99      0.1188
F  T  T   0.2   0.25  0.9       0.01    0.99      0.0004455
T  T  T   0.2   0.75  0.99      0.4     0.99      0.058806
(Each probability is for the truth value shown in its row; e.g. in the first row P(s) = P(¬S) = 0.8 and P(e|l,s) = P(¬E|¬L,¬S) = 0.99. The last column is the product of the row.)
P(W|H) = α · 0.2516493

32 Inference Example #2
l  s  e   P(s)  P(l)  P(e|l,s)  P(¬W|l)  P(H|¬W,e)  product
F  F  F   0.8   0.25  0.99      0.99     0.2        0.039204
T  F  F   0.8   0.75  0.5       0.6      0.2        0.036
F  T  F   0.2   0.25  0.1       0.99     0.2        0.00099
T  T  F   0.2   0.75  0.01      0.6      0.2        0.00018
F  F  T   0.8   0.25  0.01      0.99     0.8        0.001584
T  F  T   0.8   0.75  0.5       0.6      0.8        0.144
F  T  T   0.2   0.25  0.9       0.99     0.8        0.03564
T  T  T   0.2   0.75  0.99      0.6      0.8        0.07128
P(¬W|H) = α · 0.328878

33 Inference Example #2
P(W|H) = α · 0.2516493 ≈ 0.4335
P(¬W|H) = α · 0.328878 ≈ 0.5665
(α = 1 / (0.2516493 + 0.328878) ≈ 1.7226)
Note that P(¬W|H) > P(W|H) because P(¬W|¬L) ≫ P(W|¬L).
Still, the probability of Bob having won the lottery has increased by 13.1 percentage points (from 0.3025 to 0.4335) thanks to our knowledge that he is happy!
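The normalization step, spelled out (the two unnormalized sums come from the tables on the previous two slides):

```python
unnorm_w, unnorm_not_w = 0.2516493, 0.328878
alpha = 1 / (unnorm_w + unnorm_not_w)
print(alpha * unnorm_w, alpha * unnorm_not_w)  # ≈ 0.4335 and 0.5665
```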

34 Expert Systems
Bayesian networks are used to implement expert systems: diagnostic systems that contain subject-specific knowledge. The knowledge (nodes, relationships, probabilities) is typically provided by human experts. The system observes evidence by asking the user questions, then infers the most likely conclusion.

35 Pathfinder
An expert system for medical diagnosis of lymph-node diseases, built on a very large Bayesian network: over 60 diseases, over 100 features of lymph nodes, and over 30 features for clinical information. It took a lot of work from medical experts: 8 hours to define the features and diseases, 35 hours to build the network topology, and 40 hours to assess the probabilities.

36 Pathfinder
One node for each disease; this assumes the diseases are mutually exclusive and exhaustive. The domain is large and hard to handle, so several small networks for individual diagnostic tasks were built separately, then combined into a single large network.

37 Pathfinder
Testing the network: on 53 test cases (real diagnoses), its diagnostic accuracy was as good as a medical expert's.

38 Assumptions
For a learning agent, the environment may be:
- Fully observable / Partially observable
- Deterministic / Strategic / Stochastic
- Sequential
- Static / Semi-dynamic
- Discrete / Continuous
- Single agent / Multi-agent

39 Assumptions Updated
We can handle a new combination!
- Fully observable & deterministic: no uncertainty (map of Romania)
- Fully observable & stochastic: games of chance (Monopoly, Backgammon)
- Partially observable & deterministic: logic (Wumpus World)
- Partially observable & stochastic: probabilistic reasoning (this lecture)

