
1 BAYESIAN NETWORKS Ivan Bratko Faculty of Computer and Information Science, University of Ljubljana

2 BAYESIAN NETWORKS
- Bayesian networks, or belief networks: an approach to handling uncertainty in knowledge-based systems
- Mathematically well-founded in probability theory, unlike many other, earlier approaches to representing uncertain knowledge
- The type of problem belief nets are intended for: given that some things are known to be true, how likely are some other events?

3 BURGLARY EXAMPLE
- We have an alarm system to warn about burglary.
- We have received an automatic alarm phone call; how likely is it that there actually was a burglary?
- We cannot tell for sure whether there was a burglary, so we characterize it probabilistically instead.

4 BURGLARY EXAMPLE There are a number of events involved:
- burglary
- a sensor that may be triggered by a burglar
- lightning that may also trigger the sensor
- an alarm that may be triggered by the sensor
- a call that may be triggered by the sensor

5 BAYES NET REPRESENTATION
- There are variables (e.g. burglary, alarm) that can take values (e.g. alarm = true, burglary = false).
- There are probabilistic relations among variables, e.g.: if burglary = true then it is more likely that alarm = true.

6 EXAMPLE BAYES NET [network diagram: burglary → sensor ← lightning; sensor → alarm; sensor → call]

7 PROBABILISTIC DEPENDENCIES AND CAUSALITY
- Belief networks define probabilistic dependencies (and independencies) among the variables
- They may also reflect causality (a burglar triggers the sensor)

8 EXAMPLE OF REASONING IN A BELIEF NETWORK
- In a normal situation, burglary is not very likely.
- We receive an automatic warning call; since the sensor causes the warning call, the probability of the sensor being on increases; since burglary is a cause for triggering the sensor, the probability of burglary increases.
- Then we learn there was a storm. Lightning may also trigger the sensor. Since lightning now also explains how the call happened, the probability of burglary decreases.

9 TERMINOLOGY Bayes network = belief network = probabilistic network = causal network

10 BAYES NETWORKS, DEFINITION
- A Bayes net is a DAG (directed acyclic graph)
- Nodes ~ random variables
- A link X → Y intuitively means: “X has a direct influence on Y”
- For each node: a conditional probability table quantifying the effects of the parent nodes

11 MAJOR PROBLEM IN HANDLING UNCERTAINTY
- In general, with uncertainty, the problem is the handling of dependencies between events.
- In principle, this can be handled by specifying the complete probability distribution over all possible combinations of variable values.
- However, this is impractical or impossible: for n binary variables, 2^n - 1 probabilities - too many!
- Belief networks usually allow this number to be reduced substantially in practice.

12 BURGLARY DOMAIN
- Five events: B, L, S, A, C
- Complete probability distribution:
p( B L S A C) = ...
p( ~B L S A C) = ...
p( ~B ~L S A C) = ...
p( ~B L ~S A C) = ...
...
- Total: 32 probabilities

13 WHY DID BELIEF NETS BECOME SO POPULAR?
- If some things are mutually independent then not all conditional probabilities are needed.
- p(XY) = p(X) p(Y|X); p(Y|X) needed
- If X and Y are independent: p(XY) = p(X) p(Y); p(Y|X) not needed!
- Belief networks provide an elegant way of stating independences (see the worked count below).
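To see the potential savings: if all five binary variables of the burglary domain happened to be mutually independent, the joint distribution would factor completely,

p( B L S A C) = p(B) * p(L) * p(S) * p(A) * p(C),

so 5 numbers would suffice instead of the 2^5 - 1 = 31 of the full table. The variables here are of course not all independent; the point of a belief net is to state exactly which independences do hold.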

14 EXAMPLE FROM J. PEARL [network diagram: Burglary → Alarm ← Earthquake; Alarm → John calls; Alarm → Mary calls]
- Burglary causes alarm
- Earthquake causes alarm
- When they hear the alarm, neighbours John and Mary phone
- Occasionally John mistakes the phone ringing for the alarm
- Occasionally Mary fails to hear the alarm

15 PROBABILITIES
P(B) = 0.001, P(E) = 0.002

A | P(J | A)      A | P(M | A)
T | 0.90          T | 0.70
F | 0.05          F | 0.01

B E | P(A | BE)
T T | 0.95
T F | 0.95
F T | 0.29
F F | 0.001
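As an example of what these tables determine (using the factorization the network defines, made precise on slides 21-23), the probability that both John and Mary call, the alarm sounds, and there is neither burglary nor earthquake is:

p(J M A ~B ~E) = p(J | A) * p(M | A) * p(A | ~B ~E) * p(~B) * p(~E)
= 0.90 * 0.70 * 0.001 * 0.999 * 0.998 ≈ 0.00063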

16 HOW ARE INDEPENDENCIES STATED IN BELIEF NETS [diagram: a network over nodes A, B, C, D in which C is the parent of D] If C is known to be true, then the probability of D is independent of A, B: p( D | A B C) = p( D | C)

17 [diagram legend: A1, A2, ... non-descendants of C; B1, B2, ... parents of C; D1, D2, ... descendants of C]
C is independent of C's non-descendants given C's parents:
p( C | A1,..., B1,..., D1,...) = p( C | B1,..., D1,...)

18 INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE: EXAMPLE [diagram: nodes a, b, c, d, e, f; b is the parent of c; c → d; e → d; e → f; a, e and f are nondescendants of c; d is a descendant of c] By applying the rule about nondescendants: p(c|ab) = p(c|b), because c is independent of its nondescendant a given its parent (node b).

19 INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE But, for this Bayesian network: p(c|bdf) ≠ p(c|bd) Although f is c's nondescendant, it cannot be ignored: knowing f, e becomes more likely; e may also cause d, so when e becomes more likely, c becomes less likely. The problem is that the descendant d is given.

20 SAFER FORMULATION OF INDEPENDENCE C is independent of C's nondescendants given C's parents only, that is, provided no descendant of C is given as well.

21 STATING PROBABILITIES IN BELIEF NETS For each node X with parents Y1, Y2, ..., specify conditional probabilities of the form p( X | Y1 ∧ Y2 ∧ ...) for all possible states of Y1, Y2, ... [diagram: Y1 → X ← Y2] Specify:
p( X | Y1, Y2)
p( X | ~Y1, Y2)
p( X | Y1, ~Y2)
p( X | ~Y1, ~Y2)

22 BURGLARY EXAMPLE
p(burglary) = 0.001
p(lightning) = 0.02
p(sensor | burglary ∧ lightning) = 0.9
p(sensor | burglary ∧ ~lightning) = 0.9
p(sensor | ~burglary ∧ lightning) = 0.1
p(sensor | ~burglary ∧ ~lightning) = 0.001
p(alarm | sensor) = 0.95
p(alarm | ~sensor) = 0.001
p(call | sensor) = 0.9
p(call | ~sensor) = 0.0
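For concreteness, the network and the numbers above could be encoded as Prolog facts in the style of the program referenced on slide 29; the p( Variable, ParentsState, Probability) representation below is a plausible sketch, not necessarily the book's exact one:

% Structure of the network: parent( X, Y) means X → Y.
parent( burglary, sensor).
parent( lightning, sensor).
parent( sensor, alarm).
parent( sensor, call).

% Prior probabilities of the parentless nodes.
p( burglary, 0.001).
p( lightning, 0.02).

% Conditional probabilities, one entry per state of the parents.
p( sensor, [burglary, lightning], 0.9).
p( sensor, [burglary, not lightning], 0.9).
p( sensor, [not burglary, lightning], 0.1).
p( sensor, [not burglary, not lightning], 0.001).
p( alarm, [sensor], 0.95).
p( alarm, [not sensor], 0.001).
p( call, [sensor], 0.9).
p( call, [not sensor], 0.0).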

23 BURGLARY EXAMPLE These 10 numbers plus the structure of the network are equivalent to the 2^5 - 1 = 31 numbers required to specify the complete probability distribution (without structure information).
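Where the 10 comes from: the network structure factors the joint distribution as

p( B L S A C) = p(B) * p(L) * p(S | B L) * p(A | S) * p(C | S),

which takes 1 + 1 + 4 + 2 + 2 = 10 numbers, exactly those listed on the previous slide.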

24 EXAMPLE QUERIES FOR BELIEF NETWORKS
- p( burglary | alarm) = ?
- p( burglary ∧ lightning) = ?
- p( burglary | alarm ∧ ~lightning) = ?
- p( alarm ∧ ~call | burglary) = ?

25 Probabilistic reasoning in belief nets Easy in the forward direction, from ancestors to descendants, e.g.: p( alarm | burglary ∧ lightning) = ? In the backward direction, from descendants to ancestors, apply Bayes' formula: p( B | A) = p(B) * p(A | B) / p(A)
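For example, the forward query above is answered by summing over the states of sensor, the only parent of alarm (alarm is independent of burglary and lightning given sensor):

p( alarm | burglary ∧ lightning)
= p( alarm | sensor) * p( sensor | burglary ∧ lightning) + p( alarm | ~sensor) * p( ~sensor | burglary ∧ lightning)
= 0.95 * 0.9 + 0.001 * 0.1 = 0.8551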

26 BAYES' FORMULA A variant of Bayes' formula to reason about the probability of hypothesis H given evidence E in the presence of background knowledge B:
p( H | E ∧ B) = p( H | B) * p( E | H ∧ B) / p( E | B)

27 REASONING RULES
1. Probability of conjunction: p( X1 ∧ X2 | Cond) = p( X1 | Cond) * p( X2 | X1 ∧ Cond)
2. Probability of a certain event: p( X | Y1 ∧ ... ∧ X ∧ ...) = 1
3. Probability of an impossible event: p( X | Y1 ∧ ... ∧ ~X ∧ ...) = 0
4. Probability of negation: p( ~X | Cond) = 1 - p( X | Cond)
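As a sketch of how rules 1-4 might be written in code, here are corresponding clauses of a prob( Event, Cond, P) predicate in the spirit of the program referenced on slide 29 (a reconstruction, not necessarily the book's exact clauses; conditions are lists of literals, with not X denoting negation):

prob( [], _, 1) :- !.                  % empty conjunction is certain
prob( [X | Xs], Cond, P) :- !,         % rule 1: conjunction
    prob( X, Cond, PX),
    prob( Xs, [X | Cond], PRest),
    P is PX * PRest.
prob( X, Cond, 1) :-                   % rule 2: X itself is given in Cond
    member( X, Cond), !.
prob( X, Cond, 0) :-                   % rule 3: not X is given in Cond
    member( not X, Cond), !.
prob( not X, Cond, P) :- !,            % rule 4: negation
    prob( X, Cond, P0),
    P is 1 - P0.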

28
5. If the condition involves a descendant of X then use Bayes' theorem: if Cond0 = Y ∧ Cond where Y is a descendant of X in the belief net, then p(X | Cond0) = p(X | Cond) * p(Y | X ∧ Cond) / p(Y | Cond)
6. Cases when condition Cond does not involve a descendant of X:
(a) If X has no parents then p(X | Cond) = p(X), with p(X) given
(b) If X has parents then p(X | Cond) = sum over all states S of the parents of p(X | S) * p(S | Cond)
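Continuing the sketch, rules 5 and 6 could be written as below; predecessor/2, the transitive closure of parent/2, tests the descendant relation. Together with the clauses after slide 27 and the facts after slide 22, this gives a small working interpreter that reproduces the queries on the next slide:

prob( X, Cond0, P) :-                  % rule 5: use Bayes' theorem
    select( Y, Cond0, Cond),           % Cond0 = Y and the rest, Cond
    predecessor( X, Y), !,             % Y is a descendant of X
    prob( X, Cond, PX),
    prob( Y, [X | Cond], PYgivenX),
    prob( Y, Cond, PY),
    P is PX * PYgivenX / PY.
prob( X, _, P) :-                      % rule 6a: X has no parents
    p( X, P), !.
prob( X, Cond, P) :-                   % rule 6b: sum over parents' states
    findall( (S, PXgivenS), p( X, S, PXgivenS), Entries),
    sum_over_states( Entries, Cond, P).

sum_over_states( [], _, 0).
sum_over_states( [(S, PXgivenS) | Rest], Cond, P) :-
    prob( S, Cond, PS),                % S is a list of literals, rule 1 applies
    sum_over_states( Rest, Cond, PRest),
    P is PXgivenS * PS + PRest.

predecessor( X, Y) :- parent( X, Y).
predecessor( X, Z) :- parent( X, Y), predecessor( Y, Z).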

29 A SIMPLE IMPLEMENTATION IN PROLOG In: I. Bratko, Prolog Programming for Artificial Intelligence, Third edition, Pearson Education, 2001 (Chapter 15). An interaction with this program:
?- prob( burglary, [call], P).
P = 0.232137
Now we learn there was a heavy storm, so:
?- prob( burglary, [call, lightning], P).
P = 0.00892857

30 Lightning explains the call, so burglary seems less likely. However, if the weather was fine then burglary becomes more likely:
?- prob( burglary, [call, not lightning], P).
P = 0.473934

31 COMMENTS
- The complexity of reasoning in belief networks grows exponentially with the number of nodes.
- Substantial algorithmic improvements are required for reasoning efficiently in large networks.

32 d-SEPARATION
- Follows from the basic independence assumption of Bayes networks
- d-separation = direction-dependent separation
- Let E = a set of “evidence nodes” (a subset of the variables in the Bayes network)
- Let Vi, Vj be two variables in the network

33 d-SEPARATION
- E d-separates Vi and Vj if all (undirected) paths between Vi and Vj are “blocked” by E
- If E d-separates Vi and Vj, then Vi and Vj are conditionally independent given E; we write I(Vi, Vj | E)
- This means: p(Vi, Vj | E) = p(Vi | E) * p(Vj | E)

34 BLOCKING A PATH A path between Vi and Vj is blocked by nodes E if there is a “blocking node” Vb on the path. Vb blocks the path if one of the following holds:
1. Vb is in E and both arcs on the path lead out of Vb, or
2. Vb is in E and one arc on the path leads into Vb and one out, or
3. neither Vb nor any descendant of Vb is in E, and both arcs on the path lead into Vb

35 CONDITION 1 Vb is a common cause: [diagram: Vi ← Vb → Vj]

36 CONDITION 2 Vb is a “closer, more direct cause” of Vj than Vi is: [diagram: Vi → Vb → Vj]

37 CONDITION 3 Vb is a common consequence of Vi and Vj, but neither Vb nor its descendant Vd is in E: [diagram: Vi → Vb ← Vj; Vb → Vd; Vb not in E, Vd not in E]
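As a worked example in the burglary network: the only path between burglary and lightning is burglary → sensor ← lightning, and sensor is a common consequence of the two. With E = {}, neither sensor nor its descendants (alarm, call) are in E, so by condition 3 the path is blocked and I(burglary, lightning | {}): burglary and lightning are independent a priori. With E = {call}, however, a descendant of sensor is in E, condition 3 no longer applies, the path is unblocked, and burglary and lightning become dependent: the “explaining away” effect seen on slide 8.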

