
Slide 1: Reasoning under Uncertainty
Department of Computer Science & Engineering, Indian Institute of Technology Kharagpur

Slide 2: Handling uncertain knowledge

∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity)
- Not correct, since a toothache can have many other causes
∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity) ∨ Disease(p, GumDisease) ∨ Disease(p, ImpactedWisdom) ∨ …
∀p Disease(p, Cavity) ⇒ Symptom(p, Toothache)
- Not correct either, since not all cavities cause toothache

Slide 3: Reasons for using probability
- Laziness: it is too much work to list the complete set of antecedents or consequents needed to ensure an exception-less rule
- Theoretical ignorance: the complete set of antecedents is not known
- Practical ignorance: the truth of the antecedents is not known, but we still wish to reason

Slide 4: Axioms of probability
1. All probabilities are between 0 and 1: 0 ≤ P(A) ≤ 1
2. P(True) = 1 and P(False) = 0
3. P(A ∨ B) = P(A) + P(B) − P(A ∧ B)

Bayes' rule follows from the two forms of the product rule:
P(A ∧ B) = P(A | B) P(B)
P(A ∧ B) = P(B | A) P(A)
Equating the right-hand sides gives P(A | B) = P(B | A) P(A) / P(B)
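A quick numeric sanity check of Bayes' rule in Python; the toothache/cavity numbers below are made up purely for illustration:

```python
# Bayes' rule from the product rule: P(A ∧ B) = P(A|B) P(B) = P(B|A) P(A).
# Illustrative (made-up) numbers for the toothache/cavity domain.
p_cavity = 0.1                  # prior P(Cavity)
p_tooth_given_cavity = 0.8      # P(Toothache | Cavity)
p_toothache = 0.15              # marginal P(Toothache)

# P(Cavity | Toothache) = P(Toothache | Cavity) P(Cavity) / P(Toothache)
p_cavity_given_tooth = p_tooth_given_cavity * p_cavity / p_toothache
print(round(p_cavity_given_tooth, 4))  # 0.5333
```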

Slide 5: Belief networks
A belief network is a graph in which the following holds:
1. A set of random variables makes up the nodes of the network
2. A set of directed links or arrows connects pairs of nodes. The intuitive meaning of an arrow from node X to node Y is that X has a direct influence on Y
3. Each node has a conditional probability table that quantifies the effects the parents have on the node
4. The graph has no directed cycles (it is a DAG)

Slide 6: Example
Burglar alarm at your home:
- Fairly reliable at detecting a burglary
- Also responds on occasion to minor earthquakes
Two neighbors who, on hearing the alarm, call you at the office:
- John always calls when he hears the alarm, but sometimes confuses the telephone ringing with the alarm and calls then, too
- Mary likes loud music and sometimes misses the alarm altogether

Slide 7: Belief network for the example
Structure: Burglary → Alarm ← Earthquake; Alarm → JohnCalls; Alarm → MaryCalls

P(B) = 0.001    P(E) = 0.002

B  E  | P(A)        A | P(J)        A | P(M)
T  T  | 0.95        T | 0.90        T | 0.70
T  F  |             F | 0.05        F | 0.01
F  T  | 0.29
F  F  | 0.001

Slide 8: Representing the joint probability distribution
A generic entry in the joint probability distribution P(x_1, …, x_n) is given by:
P(x_1, …, x_n) = Π_i P(x_i | Parents(X_i))
Probability of the event that the alarm has sounded but neither a burglary nor an earthquake has occurred, and both Mary and John call:
P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) = P(J | A) P(M | A) P(A | ¬B ∧ ¬E) P(¬B) P(¬E)
= 0.9 × 0.7 × 0.001 × 0.999 × 0.998 ≈ 0.00062
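The product above can be checked directly, using the CPT values given on slide 7:

```python
# Chain-rule product for P(J ∧ M ∧ A ∧ ¬B ∧ ¬E):
# P(J|A) · P(M|A) · P(A|¬B,¬E) · P(¬B) · P(¬E)
p = 0.9 * 0.7 * 0.001 * 0.999 * 0.998
print(round(p, 5))  # 0.00063
```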

Slide 9: Conditional independence
The belief network represents conditional independence: each node is conditionally independent of its non-descendants, given its parents.

Slide 10: Incremental network construction
1. Choose the set of relevant variables X_i that describe the domain
2. Choose an ordering for the variables (a very important step)
3. While there are variables left:
   a) Pick a variable X and add a node to the network for it
   b) Set Parents(X) to some minimal set of nodes already in the network such that the conditional independence property is satisfied
   c) Define the conditional probability table for X
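A minimal sketch of the construction loop, with the burglary example's causal ordering hard-coded; the minimal parent sets are assumptions matching the network on slide 7 (a different ordering, e.g. MaryCalls first, would force larger parent sets):

```python
# Add variables in a chosen order; for each, record a minimal parent set
# drawn only from nodes already present in the network.
order = ["Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls"]
minimal_parents = {                 # assumed, matching the slide-7 network
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}

network = {}                        # node -> parents, built incrementally
for x in order:
    parents = [p for p in minimal_parents[x] if p in network]
    network[x] = parents            # step (c), the CPT, is omitted here

print(network["Alarm"])  # ['Burglary', 'Earthquake']
```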

Slide 11: Conditional independence relations
If every undirected path from a node in X to a node in Y is d-separated by a given set of evidence nodes E, then X and Y are conditionally independent given E.
A set of nodes E d-separates two sets of nodes X and Y if every undirected path from a node in X to a node in Y is blocked given E.
A path is blocked given a set of nodes E if there is a node Z on the path for which one of three conditions holds:
1. Z is in E and Z has one arrow on the path leading in and one arrow leading out (a chain)
2. Z is in E and Z has both path arrows leading out (a common cause)
3. Both path arrows lead in to Z (a common effect), and neither Z nor any descendant of Z is in E

Slide 12: Conditional independence in belief networks
Network: Battery → Radio, Battery → Ignition, Ignition → Starts, Petrol → Starts
- Whether there is petrol and whether the radio plays are independent given evidence about whether the ignition takes place
- Petrol and Radio are independent if it is known whether the battery works
- Petrol and Radio are independent given no evidence at all, but they are dependent given evidence about whether the car starts: if the car does not start, then the radio playing is increased evidence that we are out of petrol
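The blocking conditions from slide 11 can be checked mechanically. A minimal sketch on the car network, with the single undirected Petrol–Radio path written out by hand (a full implementation would enumerate all undirected paths):

```python
# Edges of the car network: parent -> list of children.
children = {"Battery": ["Radio", "Ignition"], "Ignition": ["Starts"],
            "Petrol": ["Starts"], "Radio": [], "Starts": []}

def descendants(node):
    """All nodes reachable from `node` along directed edges."""
    out, stack = set(), list(children[node])
    while stack:
        n = stack.pop()
        if n not in out:
            out.add(n)
            stack.extend(children[n])
    return out

def blocked(path, evidence):
    """Apply the three blocking conditions to each interior node Z of an
    undirected path (given as a list of node names)."""
    for i in range(1, len(path) - 1):
        prev, z, nxt = path[i - 1], path[i], path[i + 1]
        in_from_prev = z in children[prev]     # arrow prev -> z on the path
        in_from_next = z in children[nxt]      # arrow nxt -> z on the path
        if in_from_prev and in_from_next:
            # Common effect: blocks unless Z or a descendant is in evidence
            if z not in evidence and not (descendants(z) & evidence):
                return True
        else:
            # Chain or common cause: blocks if Z is in evidence
            if z in evidence:
                return True
    return False

# The only undirected path between Petrol and Radio:
path = ["Petrol", "Starts", "Ignition", "Battery", "Radio"]
print(blocked(path, set()))          # True  (independent given no evidence)
print(blocked(path, {"Starts"}))     # False (dependent given Starts)
print(blocked(path, {"Ignition"}))   # True  (independent given Ignition)
```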

Slide 13: Inferences using belief networks
- Diagnostic inferences (from effects to causes): given that JohnCalls, infer P(Burglary | JohnCalls) = 0.016
- Causal inferences (from causes to effects): given Burglary, P(JohnCalls | Burglary) = 0.86 and P(MaryCalls | Burglary) = 0.67
- Intercausal inferences (between causes of a common effect): given Alarm, we have P(Burglary | Alarm) = 0.376. But if we add the evidence that Earthquake is true, then P(Burglary | Alarm ∧ Earthquake) goes down to 0.003
- Mixed inferences (combining two or more of the above): setting the effect JohnCalls to true and the cause Earthquake to false gives P(Alarm | JohnCalls ∧ ¬Earthquake) = 0.003
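These posteriors can be reproduced by enumerating the full joint distribution. One CPT entry, P(Alarm | Burglary, ¬Earthquake), is missing from slide 7; the sketch below assumes 0.94 (the value in the standard textbook version of this network), so the results agree only approximately with the figures above:

```python
from itertools import product

# CPTs from the example network. P(A | B, ¬E) is assumed to be 0.94.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def joint(b, e, a, j, m):
    """Chain-rule product for one complete assignment."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

def query(target, evidence):
    """P(target = True | evidence) by enumerating the full joint."""
    num = den = 0.0
    for vals in product([True, False], repeat=5):
        world = dict(zip("beajm", vals))
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(*vals)
            den += p
            if world[target]:
                num += p
    return num / den

print(round(query("b", {"j": True}), 3))             # diagnostic: 0.016
print(round(query("b", {"a": True}), 3))             # intercausal: ~0.37
print(round(query("b", {"a": True, "e": True}), 3))  # explaining away: ~0.003
```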

Slide 14: The four patterns
[Diagram of the four evidence/query configurations: Diagnostic (evidence below the query), Causal (evidence above the query), Intercausal (evidence at a sibling cause of a shared effect), Mixed (evidence both above and below the query)]

Slide 15: An algorithm for answering queries
We consider cases where the belief network is a poly-tree: there is at most one undirected path between any two nodes.
- U = U_1, …, U_m are the parents of node X
- Y = Y_1, …, Y_n are the children of node X (the Z_ij in the figure are the parents of the Y_i other than X)
- X is the query variable
- E is a set of evidence variables
The aim is to compute P(X | E)

Slide 16: Definitions
- E_X^+ is the causal support for X: the evidence variables "above" X that are connected to X through its parents
- E_X^- is the evidential support for X: the evidence variables "below" X that are connected to X through its children
- E_{Ui\X} refers to all the evidence connected to node U_i except via the path from X
- E_{Yi\X}^+ refers to all the evidence connected to node Y_i through its parents, other than via X

Slide 17: The computation of P(X | E)
By Bayes' rule,
P(X | E) = P(X | E_X^+, E_X^-) = P(E_X^- | X, E_X^+) P(X | E_X^+) / P(E_X^- | E_X^+)
Since X d-separates E_X^+ from E_X^- in the network, we can use conditional independence to simplify the first term in the numerator, and we can treat the denominator as a constant α:
P(X | E) = α P(E_X^- | X) P(X | E_X^+)

Slide 18: The computation of P(X | E_X^+)
We consider all possible configurations of the parents of X and how likely they are given E_X^+. Let U be the vector of parents U_1, …, U_m, and let u be an assignment of values to them:
P(X | E_X^+) = Σ_u P(X | u, E_X^+) P(u | E_X^+)
Now U d-separates X from E_X^+, so we can simplify the first term to P(X | u). We can simplify the second term by noting that E_X^+ d-separates each U_i from the others, and that the probability of a conjunction of independent variables equals the product of their individual probabilities:
P(X | E_X^+) = Σ_u P(X | u) Π_i P(u_i | E_X^+)

Slide 19: The computation of P(X | E_X^+), contd.
The last term can be simplified by partitioning E_X^+ into E_{U1\X}, …, E_{Um\X} and using the fact that E_{Ui\X} d-separates U_i from all the other evidence in E_X^+:
P(X | E_X^+) = Σ_u P(X | u) Π_i P(u_i | E_{Ui\X})
- P(X | u) is a lookup in the conditional probability table of X
- P(u_i | E_{Ui\X}) is a recursive (smaller) instance of the original problem

Slide 20: The computation of P(E_X^- | X)
Let Z_i be the parents of Y_i other than X, and let z_i be an assignment of values to them. The evidence in each Y_i box is conditionally independent of the others given X:
P(E_X^- | X) = Π_i P(E_{Yi\X} | X)
Averaging over y_i and z_i yields:
P(E_X^- | X) = Π_i Σ_{yi} Σ_{zi} P(E_{Yi\X} | X, y_i, z_i) P(y_i, z_i | X)
Breaking E_{Yi\X} into the two independent components E_{Yi}^- and E_{Yi\X}^+:
P(E_X^- | X) = Π_i Σ_{yi} Σ_{zi} P(E_{Yi}^-, E_{Yi\X}^+ | X, y_i, z_i) P(y_i, z_i | X)

Slide 21: The computation of P(E_X^- | X), contd.
E_{Yi}^- is independent of X and z_i given y_i, and E_{Yi\X}^+ is independent of X and y_i, so:
P(E_X^- | X) = Π_i Σ_{yi} Σ_{zi} P(E_{Yi}^- | y_i) P(E_{Yi\X}^+ | z_i) P(y_i, z_i | X)
Apply Bayes' rule to P(E_{Yi\X}^+ | z_i):
P(E_{Yi\X}^+ | z_i) = P(z_i | E_{Yi\X}^+) P(E_{Yi\X}^+) / P(z_i)
Rewriting the conjunction of y_i and z_i:
P(y_i, z_i | X) = P(y_i | X, z_i) P(z_i | X)

Slide 22: The computation of P(E_X^- | X), contd.
The parents of Y_i (the Z_ij) are independent of each other, so P(z_i | E_{Yi\X}^+) = Π_j P(z_ij | E_{Zij\Yi}). We also combine the constants into a single β, and note that P(z_i | X) = P(z_i) because Z_i and X are d-separated, and that P(E_{Yi\X}^+) is a constant. This gives:
P(E_X^- | X) = β Π_i Σ_{yi} P(E_{Yi}^- | y_i) Σ_{zi} P(y_i | X, z_i) Π_j P(z_ij | E_{Zij\Yi})

Slide 23: The computation of P(E_X^- | X), contd.
- P(E_{Yi}^- | y_i) is a recursive instance of P(E_X^- | X)
- P(y_i | X, z_i) is a conditional probability table entry for Y_i
- P(z_ij | E_{Zij\Yi}) is a recursive instance of the P(X | E) calculation

Slide 24: Inference in multiply connected belief networks
- Clustering methods: transform the network into a probabilistically equivalent (but topologically different) poly-tree by merging offending nodes
- Conditioning methods: instantiate variables to definite values, and then evaluate a poly-tree for each possible instantiation
- Stochastic simulation methods: use the network to generate a large number of concrete models of the domain that are consistent with the network distribution; they give an approximation of the exact evaluation
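The stochastic simulation idea can be sketched as forward (logic) sampling on the burglary network: draw complete worlds parents-first, then count. As before, the missing CPT entry P(Alarm | Burglary, ¬Earthquake) is assumed to be 0.94:

```python
import random

random.seed(0)

# CPTs from slide 7; P(A | B, ¬E) = 0.94 is an assumed value.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}

def sample_world():
    """Sample one complete assignment, parents before children."""
    b = random.random() < P_B
    e = random.random() < P_E
    a = random.random() < P_A[(b, e)]
    j = random.random() < P_J[a]
    return b, e, a, j

n = 100_000
hits = sum(sample_world()[3] for _ in range(n))
print(hits / n)   # estimate of P(JohnCalls), close to the exact ~0.052
```

Conditional queries follow the same idea (e.g. rejection sampling keeps only the samples consistent with the evidence), at the cost of wasting samples when the evidence is rare.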

Slide 25: Default reasoning
Some conclusions are drawn by default unless counter-evidence is obtained (non-monotonic reasoning).
Points to ponder:
- What is the semantic status of default rules?
- What happens when the evidence matches the premises of two default rules with conflicting conclusions?
- Sometimes a system may draw a number of conclusions on the basis of a belief that is later retracted. How can a system keep track of which conclusions need to be retracted as a result?

Slide 26: Rule-based methods for uncertain reasoning
Issues:
- Locality: in logical reasoning systems, if we have A ⇒ B, then we can conclude B given evidence A, without worrying about any other rules. In probabilistic systems, we need to consider all available evidence.
- Detachment: once a logical proof is found for proposition B, we can use it regardless of how it was derived (it can be detached from its justification). In probabilistic reasoning, the source of the evidence is important for subsequent reasoning.
- Truth functionality: in logic, the truth of complex sentences can be computed from the truth of the components. Probability combination does not work this way, except under strong independence assumptions.
The most famous example of a truth-functional system for uncertain reasoning is the certainty factors model, developed for the Mycin medical diagnosis program.

Slide 27: Dempster-Shafer theory
Designed to deal with the distinction between uncertainty and ignorance. We use a belief function Bel(X): the probability that the evidence supports the proposition.
When we do not have any evidence about X, we assign Bel(X) = 0 as well as Bel(¬X) = 0.
For example, if we do not know whether a coin is fair, then:
Bel(Heads) = Bel(¬Heads) = 0
If we are given that the coin is fair with 90% certainty, then:
Bel(Heads) = 0.9 × 0.5 = 0.45
Bel(¬Heads) = 0.9 × 0.5 = 0.45
Note that we still have a gap of 0.1 that is not accounted for by the evidence.
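The coin example as a short computation; the 0.9 mass is the slide's "fair with 90% certainty", with the remainder left uncommitted:

```python
# Belief for the coin example: 0.9 of the probability mass is committed
# to "the coin is fair"; the remaining 0.1 is committed to nothing.
mass_fair = 0.9
bel_heads = mass_fair * 0.5       # fair coin: heads with probability 0.5
bel_tails = mass_fair * 0.5
gap = 1 - (bel_heads + bel_tails) # mass not accounted for by the evidence
print(bel_heads, bel_tails, round(gap, 2))  # 0.45 0.45 0.1
```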

Slide 28: Fuzzy logic
Fuzzy set theory is a means of specifying how well an object satisfies a vague description.
- Truth is a value between 0 and 1
- The uncertainty of probability stems from lack of evidence; in contrast, given the dimensions of a man, concluding whether he is fat involves no uncertainty, only vagueness
The rules for evaluating the fuzzy truth, T, of a complex sentence are:
T(A ∧ B) = min(T(A), T(B))
T(A ∨ B) = max(T(A), T(B))
T(¬A) = 1 − T(A)
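The three fuzzy connectives are directly implementable; the membership degrees below are made up for illustration:

```python
# Truth-functional fuzzy connectives from the slide.
def t_and(a, b): return min(a, b)
def t_or(a, b):  return max(a, b)
def t_not(a):    return 1 - a

tall, heavy = 0.7, 0.4            # illustrative membership degrees
print(t_and(tall, heavy))         # 0.4
print(t_or(tall, t_not(heavy)))   # 0.7
```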
