Topics in Artificial Intelligence 1/1 Dr hab. inż. Joanna Józefowska, prof. PP Modelling uncertainty.


1/2 Probability of an event
Classical method: if an experiment has n equally likely outcomes, assign a probability of 1/n to each outcome.
Relative frequency method: the probability of an event is the fraction of observed trials in which the event occurs.
Subjective method: probability is a number characterising the likelihood of an event, i.e. a degree of belief.

1/3 Axioms of the probability theory
Axiom I: The probability value assigned to each experimental outcome must be between 0 and 1.
Axiom II: The sum of all the experimental outcome probabilities must be 1.

1/4 Conditional probability
The conditional probability, denoted by P(A|B), expresses the belief that event A is true assuming that event B is true (events A and B are dependent).
Definition: Let the probability of event B be positive. The conditional probability of event A under condition B is calculated as follows:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

1/5 Joint (total) probability theorem
If events $A_1, A_2, \ldots$ are mutually exclusive and cover the sample space, and $P(A_i) > 0$ for $i = 1, 2, \ldots$, then for any event B the following equality holds:
$$P(B) = \sum_i P(A_i)\, P(B \mid A_i)$$

1/6 Bayes' Theorem
If the events $A_1, A_2, \ldots$ fulfil the assumptions of the joint (total) probability theorem, and $P(B) > 0$, then for $i = 1, 2, \ldots$ the following equality holds:
$$P(A_i \mid B) = \frac{P(B \mid A_i)\, P(A_i)}{\sum_j P(A_j)\, P(B \mid A_j)}$$
Thomas Bayes (1701-1761)

1/7 Bayes' Theorem
Let us denote: H, a hypothesis; E, evidence. Bayes' rule has the form:
$$P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}$$
Prior probabilities and new information are combined by Bayes' theorem into posterior probabilities.
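The update from prior to posterior can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `posterior` and the numbers in the usage lines are hypothetical, and the hypotheses are assumed to be mutually exclusive and exhaustive so that P(E) follows from the total probability theorem.

```python
def posterior(priors, likelihoods):
    """Return P(H_i | E) for mutually exclusive, exhaustive hypotheses H_i.

    priors[i] is P(H_i); likelihoods[i] is P(E | H_i).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(E), by the total probability theorem
    if evidence == 0:
        raise ValueError("P(E) must be positive")
    return [j / evidence for j in joint]


# Hypothetical numbers: P(H) = 0.01, P(E | H) = 0.9, P(E | not H) = 0.05.
post = posterior([0.01, 0.99], [0.9, 0.05])
print(post)
```

With these made-up inputs the posterior of H stays well below 1 even though the evidence is nine times more likely under H than the prior odds suggest, because the prior is small.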

1/8 Difficulties with joint probability distribution (tabular approach)
- the joint probability distribution has to be defined and stored in memory
- high computational effort is required to calculate marginal and conditional probabilities

1/9 n sample points, $2^n$ probabilities; example table: P(B, M)

1/10 Requirements for an uncertainty model in rule-based systems
- In logical inference systems, a rule of the form A → B allows B to be concluded whenever A holds, regardless of other facts. In probabilistic systems, all available evidence has to be taken into account.
- Once a thesis has been proved, it can be used in subsequent proofs without being proved again. In probabilistic systems, the premises used in the proof may change.
- In logic, the truth of compound sentences can be derived from the truth values of their terms. Probabilistic reasoning does not preserve this property unless strong independence constraints are imposed.

1/11 Certainty factor
Buchanan, Shortliffe 1975
Model developed for the rule-based expert system MYCIN.
Rule form: If E then H, where E is the evidence (observation) and H is the hypothesis.

1/12 Belief
MB[H, E]: measure of the increase in belief that H is true based on observation E.

1/13 Disbelief
MD[H, E]: measure of the increase in disbelief that H is true based on observation E.

1/14 Certainty factor
$CF \in [-1, 1]$

1/15 Interpretation of the certainty factor
A certainty factor is associated with a rule "If evidence then hypothesis" (E → H) and denotes the change in belief that H is true after observing E. It is written CF(H, E).

1/16 Uncertainty propagation: parallel rules
Two rules support the same hypothesis: E1 → H with CF(H, E1) and E2 → H with CF(H, E2). They are combined into a single rule E1, E2 → H with CF(H, E1 & E2).

1/17 Uncertainty propagation: serial rules
Two chained rules, E1 → E2 with CF(E2, E1) and E2 → H with CF(H, E2), are combined into a single rule E1 → H with CF(H, E1).
If CF(H, ¬E2) is not defined, it is assumed to be 0.
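The combination formulas themselves did not survive in this transcript. As a rough sketch only, the code below uses the combination functions commonly cited for MYCIN/EMYCIN-style certainty factors; treating them as the ones shown on the slides is an assumption, and the function names `combine_parallel` and `combine_serial` are hypothetical.

```python
def combine_parallel(cf1, cf2):
    """Combine CF(H, E1) and CF(H, E2) from two rules supporting the same H.

    Assumed (commonly cited) EMYCIN-style combination, not recovered from the slides.
    """
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 - cf1 * cf2
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 + cf1 * cf2
    denom = 1 - min(abs(cf1), abs(cf2))
    if denom == 0:
        raise ValueError("cannot combine total belief with total disbelief")
    return (cf1 + cf2) / denom


def combine_serial(cf_e2_e1, cf_h_e2, cf_h_not_e2=0.0):
    """Chain E1 -> E2 and E2 -> H into CF(H, E1).

    cf_h_not_e2 is CF(H, not E2); it defaults to 0 when undefined, as the slide states.
    """
    if cf_e2_e1 >= 0:
        return cf_e2_e1 * cf_h_e2
    return -cf_e2_e1 * cf_h_not_e2


# Hypothetical values, for illustration only.
print(combine_parallel(0.6, 0.4))
print(combine_serial(0.5, 0.8))
print(combine_serial(-0.5, 0.8))
```

The serial case shows why the default matters: when E1 speaks against E2 and no rule for ¬E2 exists, nothing is propagated to H.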

1/18 Certainty factor: probabilistic definition
Heckerman 1986

1/19 Certainty measure
For a rule E → H with certainty factor CF(H, E), certainty measures C(E) and C(H) are attached to the evidence and to the hypothesis.
Grzymała-Busse 1991

1/20 Example 1
Rule: s1 ∧ s2 → h with CF(h, s1 ∧ s2) = 0.4
C(s1) = 0.2, C(s2) = -0.1, C(h) = 0.3
C(s1 ∧ s2) = min(0.2, -0.1) = -0.1
CF'(h, s1 ∧ s2) = 0.4 * 0 = 0
C'(h) = 0.3 + (1 - 0.3) * 0 = 0.3 + 0 = 0.3

1/21 Example 2
Rule: s1 ∧ s2 → h with CF(h, s1 ∧ s2) = 0.4
C(s1) = 0.2, C(s2) = 0.8, C(h) = 0.3
C(s1 ∧ s2) = min(0.2, 0.8) = 0.2
CF'(h, s1 ∧ s2) = 0.4 * 0.2 = 0.08
C'(h) = 0.3 + (1 - 0.3) * 0.08 = 0.3 + 0.7 * 0.08 = 0.356
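The two examples follow the same three steps: take the minimum certainty over the conjunction of premises, scale the rule's certainty factor by that value (clipped at 0), and move C(h) towards 1 by the result. A minimal Python sketch of those steps follows; the function name `update_certainty` is hypothetical, and the update is only checked against the non-negative C(h) used in the examples.

```python
def update_certainty(c_h, cf_rule, premise_certainties):
    """Update C(h) after firing a rule whose premises form a conjunction.

    Mirrors the arithmetic of Examples 1 and 2; behaviour for negative C(h)
    is not covered by the examples and is not modelled here.
    """
    c_premise = min(premise_certainties)          # C(s1 AND s2)
    cf_effective = cf_rule * max(0.0, c_premise)  # CF'(h, s1 AND s2)
    return c_h + (1 - c_h) * cf_effective         # C'(h)


example_1 = update_certainty(0.3, 0.4, [0.2, -0.1])
example_2 = update_certainty(0.3, 0.4, [0.2, 0.8])
print(example_1, example_2)
```

In Example 1 the negative certainty of s2 switches the rule off entirely, so C(h) is unchanged.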

1/22 Dempster-Shafer theory
Each hypothesis is characterised by two values: belief and plausibility.
The theory models not only belief, but also the amount of acquired information.

1/23 Probability density function
A function $m$ assigns a value from $[0, 1]$ to each subset of the set of hypotheses $\Theta$, with $m(\emptyset) = 0$ and $\sum_{A \subseteq \Theta} m(A) = 1$.

1/24 Belief
Belief $Bel \in [0, 1]$ measures the value of acquired information supporting the belief that the considered set of hypotheses is true:
$$Bel(A) = \sum_{B \subseteq A} m(B)$$

1/25 Plausibility
Plausibility $Pl \in [0, 1]$ measures how much the belief that A is true is limited by evidence supporting $\neg A$:
$$Pl(A) = 1 - Bel(\neg A)$$

1/26 Combining various sources of evidence
Assume two sources of evidence, X and Y, represented by respective subsets of $\Theta$: $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$. Probability density functions $m_1$ and $m_2$ are defined on X and Y respectively. Combining observations from the two sources, a new value $m_3(Z)$ is calculated for each non-empty subset Z of $\Theta$ as follows:
$$m_3(Z) = \frac{\sum_{X_i \cap Y_j = Z} m_1(X_i)\, m_2(Y_j)}{1 - \sum_{X_i \cap Y_j = \emptyset} m_1(X_i)\, m_2(Y_j)}$$

1/27 Example
$\Theta$ = {A, F, C, P}, where A: allergy, F: flu, C: cold, P: pneumonia
Initially: $m_1(\Theta) = 1$
Observation 1: $m_2(\{A, F, C\}) = 0.6$, $m_2(\Theta) = 0.4$
Combination of $m_1$ and $m_2$:
  $m_1(\Theta) = 1$ with $m_2(\{A, F, C\}) = 0.6$ gives $m_3(\{A, F, C\}) = 0.6$
  $m_1(\Theta) = 1$ with $m_2(\Theta) = 0.4$ gives $m_3(\Theta) = 0.4$

1/28 Example
Observation 2: $m_4(\{F, C, P\}) = 0.8$, $m_4(\Theta) = 0.2$
Combination of $m_3$ and $m_4$:
                            m_4({F,C,P}) = 0.8       m_4(Θ) = 0.2
  m_3({A,F,C}) = 0.6        m_5({F,C}) = 0.48        m_5({A,F,C}) = 0.12
  m_3(Θ) = 0.4              m_5({F,C,P}) = 0.32      m_5(Θ) = 0.08

1/29 Example
Observation 3: $m_6(\{A\}) = 0.75$, $m_6(\Theta) = 0.25$
Combination of $m_5$ and $m_6$:
                            m_6({A}) = 0.75          m_6(Θ) = 0.25
  m_5({F,C}) = 0.48         m_7(∅) = 0.36            m_7({F,C}) = 0.12
  m_5({A,F,C}) = 0.12       m_7({A}) = 0.09          m_7({A,F,C}) = 0.03
  m_5({F,C,P}) = 0.32       m_7(∅) = 0.24            m_7({F,C,P}) = 0.08
  m_5(Θ) = 0.08             m_7({A}) = 0.06          m_7(Θ) = 0.02
Summing the two entries for {A}: $m_7(\{A\}) = 0.09 + 0.06 = 0.15$

1/30 Example
Observation 3, continued: in the same combination of $m_5$ and $m_6$, the products that fall on the empty set add up to the conflicting mass
$m_7(\emptyset) = 0.36 + 0.24 = 0.6$
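The tables for Observations 2 and 3 can be reproduced with a short Python sketch of the combination step. The function name `combine` and the use of `frozenset` keys are illustrative choices, not something taken from the lecture; the function returns both the normalised masses and the conflicting mass that fell on the empty set.

```python
from itertools import product

THETA = frozenset("AFCP")  # allergy, flu, cold, pneumonia


def combine(m1, m2):
    """Combine two mass functions given as {frozenset: mass} dictionaries.

    Returns (normalised combined masses, mass assigned to the empty set).
    """
    raw = {}
    conflict = 0.0
    for (x, px), (y, py) in product(m1.items(), m2.items()):
        z = x & y
        if z:
            raw[z] = raw.get(z, 0.0) + px * py
        else:
            conflict += px * py  # product falls on the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {z: v / (1.0 - conflict) for z, v in raw.items()}, conflict


m3 = {frozenset("AFC"): 0.6, THETA: 0.4}
m4 = {frozenset("FCP"): 0.8, THETA: 0.2}
m6 = {frozenset("A"): 0.75, THETA: 0.25}

m5, conflict_5 = combine(m3, m4)
m7, conflict_7 = combine(m5, m6)
print(conflict_7, m7[frozenset("A")])
```

The second combination is the first one in the example where normalisation changes anything, because only there do two focal sets have an empty intersection.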

1/31 Example
Values of $m_7$ before and after normalisation (division by $1 - 0.6 = 0.4$):
  m_7({F,C}) = 0.12      becomes  m_7({F,C}) = 0.3
  m_7({A}) = 0.15        becomes  m_7({A}) = 0.375
  m_7({A,F,C}) = 0.03    becomes  m_7({A,F,C}) = 0.075
  m_7({F,C,P}) = 0.08    becomes  m_7({F,C,P}) = 0.2
  m_7(Θ) = 0.02          becomes  m_7(Θ) = 0.05
Belief and plausibility intervals:
  {A}: [0.375, 0.500]   (upper bound: 1 - 0.3 - 0.2)
  {F}: [0, 0.625]       (upper bound: 1 - 0.375)
  {C}: [0, 0.625]       (upper bound: 1 - 0.375)
  {P}: [0, 0.250]       (upper bound: 1 - 0.375 - 0.3 - 0.075)
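The intervals can be checked by summing masses directly. In the sketch below (function names `belief` and `plausibility` are hypothetical), belief adds the masses of all focal sets contained in the queried set, and plausibility adds the masses of all focal sets that intersect it; the input is the normalised $m_7$ from this slide.

```python
THETA = frozenset("AFCP")

# Normalised masses m_7 as listed on the slide.
m7 = {
    frozenset("FC"): 0.3,
    frozenset("A"): 0.375,
    frozenset("AFC"): 0.075,
    frozenset("FCP"): 0.2,
    THETA: 0.05,
}


def belief(m, a):
    """Sum of masses of focal sets that are subsets of a."""
    return sum(v for s, v in m.items() if s <= a)


def plausibility(m, a):
    """Sum of masses of focal sets that share at least one element with a."""
    return sum(v for s, v in m.items() if s & a)


intervals = {
    h: (belief(m7, frozenset(h)), plausibility(m7, frozenset(h)))
    for h in "AFCP"
}
print(intervals)
```

Only {A} has non-zero belief, since it is the only singleton that appears as a focal set.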

1/32 Fuzzy sets (Zadeh)
Rough sets (Pawlak)

1/33 Probabilistic reasoning
[Diagram: burglary and earthquake both point to alarm; alarm points to John and to Mary.]

1/34 Probabilistic reasoning
B: burglary
E: earthquake
A: alarm
J: John calls
M: Mary calls
Joint probability distribution: P(B, E, A, J, M) ?

1/35 Joint probability distribution

1/36 Probabilistic reasoning
What is the probability of a burglary if Mary called? P(B=y | M=y) ?
Marginal probability:
$$P(B{=}y, M{=}y) = \sum_{E, A, J} P(B{=}y, E, A, J, M{=}y)$$
Conditional probability:
$$P(B{=}y \mid M{=}y) = \frac{P(B{=}y, M{=}y)}{P(M{=}y)}$$

1/37 Advantages of probabilistic reasoning
- Sound mathematical theory.
- On the basis of the joint probability distribution one can reason about:
  - the causes on the basis of the observed consequences,
  - the consequences on the basis of given evidence,
  - any combination of the above.
- Clear semantics based on the interpretation of probability.
- The model can be learned from statistical data.

1/38 Complexity of probabilistic reasoning
In the "alarm" example:
- $2^5 - 1 = 31$ values,
- direct access to unimportant information, e.g. P(B=1, E=1, A=1, J=1, M=1),
- calculating any practical value, e.g. P(B=1 | M=1), requires 29 elementary operations.
In general:
- $P(X_1, \ldots, X_n)$ requires storing $2^n - 1$ values,
- difficult knowledge acquisition (not natural),
- exponential complexity.

1/39 Bayes' theorem

1/40 Bayes' theorem
[Diagram: A → B]
B depends on A; the arc is described by P(B|A).

1/41 The chain rule
$$P(X_1, X_2) = P(X_1)\, P(X_2 \mid X_1)$$
$$P(X_1, X_2, X_3) = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_1, X_2)$$
$$\vdots$$
$$P(X_1, X_2, \ldots, X_n) = P(X_1)\, P(X_2 \mid X_1) \cdots P(X_n \mid X_1, \ldots, X_{n-1})$$

1/42 Conditional independence of variables in a domain
In any domain one can define a set of variables $pa(X_i) \subseteq \{X_1, \ldots, X_{i-1}\}$ such that $X_i$ is independent of the variables from the set $\{X_1, \ldots, X_{i-1}\} \setminus pa(X_i)$. Thus
$$P(X_i \mid X_1, \ldots, X_{i-1}) = P(X_i \mid pa(X_i))$$
and
$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid pa(X_i))$$
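As a worked instance, the factorisation can be written out for the alarm domain. The ordering B, E, A, J, M and the parent sets $pa(B) = pa(E) = \emptyset$, $pa(A) = \{B, E\}$, $pa(J) = pa(M) = \{A\}$ are assumed here to match the structure used in the alarm example; the first line is the plain chain rule and the second drops every conditioning variable outside the parent set.

```latex
\begin{align*}
P(B, E, A, J, M)
  &= P(B)\, P(E \mid B)\, P(A \mid B, E)\, P(J \mid B, E, A)\, P(M \mid B, E, A, J) \\
  &= P(B)\, P(E)\, P(A \mid B, E)\, P(J \mid A)\, P(M \mid A)
\end{align*}
```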

1/43 Bayesian network
[Diagram: nodes B1, B2, ..., Bn point to node A; node A points to nodes C1, ..., Cm.]
$B_i$ directly influences A. Node A carries the table $P(A \mid B_1, \ldots, B_n)$.

1/44 Example
[Diagram: burglary and earthquake point to alarm; alarm points to "John calls" and "Mary calls".]
P(alarm | burglary, earthquake):
  burglary   earthquake   alarm = true   alarm = false
  true       true         0.950          0.050
  true       false        0.940          0.060
  false      true         0.290          0.710
  false      false        0.001          0.999

1/45 Example
[Diagram: B and E point to A; A points to J and M.]
P(B) = 0.001
P(E) = 0.002
P(A | B, E):
  B   E   P(A)
  T   T   0.950
  T   F   0.940
  F   T   0.290
  F   F   0.001
P(M | A):
  A   P(M)
  T   0.70
  F   0.01
P(J | A):
  A   P(J)
  T   0.90
  F   0.05

1/46 Complexity of the representation
- Instead of 31 values it is enough to store 10.
- Easy construction of the model:
  - fewer parameters,
  - more intuitive parameters.
- Easy reasoning.
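The counts 31 and 10 can be reproduced with a small Python sketch. It assumes all variables are binary, so a node with k parents needs $2^k$ stored numbers (one per parent configuration); the function names and the `ALARM_PARENTS` dictionary are illustrative.

```python
def joint_table_size(n_variables):
    """Independent entries in a full joint table over binary variables."""
    return 2 ** n_variables - 1


def network_size(parents):
    """Stored probabilities in a Bayesian network over binary variables.

    parents maps each node to the list of its parent nodes.
    """
    return sum(2 ** len(p) for p in parents.values())


# Structure of the alarm example: B and E are roots, A depends on both,
# J and M depend on A only.
ALARM_PARENTS = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}

full = joint_table_size(len(ALARM_PARENTS))
compact = network_size(ALARM_PARENTS)
print(full, compact)
```

The gap widens quickly: the joint table doubles with every added variable, while the network grows only with the number of parents per node.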

1/47 Bayesian networks
A Bayesian network is a directed acyclic graph in which:
- nodes represent formulas or variables in the considered domain,
- arcs represent dependence relations between variables, with related probability distributions.

1/48 Bayesian networks
- A variable A with parent nodes $pa(A) = \{B_1, \ldots, B_n\}$ has a conditional probability table $P(A \mid B_1, \ldots, B_n)$, or $P(A \mid pa(A))$.
- If $pa(A) = \emptyset$, the a priori probability equals $P(A)$.

1/49 Bayesian networks
[Diagram: nodes B1, B2, B3, ..., Bn, forming pa(A), point to node A.]
Event $B_i$ has no predecessors ($pa(B_i) = \emptyset$), so it carries the a priori probability $P(B_i)$.
Node A carries the table $P(A \mid B_1, B_2, \ldots, B_n)$:
  B1   ...   Bn   P(A | B1, ..., Bn)
  T    ...   T    0.18
  T    ...   F    0.12
  ...
  F    ...   F    0.28

1/50 Local semantics of Bayesian network
- Only direct dependence relations between variables.
- Local conditional probability distributions.
- Assumption of conditional independence of variables that are not connected in the graph.

1/51 Global semantics of Bayesian network
The joint probability distribution is given implicitly. It can be calculated using the following rule:
$$P(X_1, \ldots, X_n) = P(X_1)\, P(X_2 \mid X_1) \cdots P(X_n \mid X_1, \ldots, X_{n-1})$$

1/52 Global semantics of Bayesian network
Node numbering: the index of a node is larger than the indices of its predecessors, so that $pa(X_i) \subseteq \{X_1, \ldots, X_{i-1}\}$.
Finally:
$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid pa(X_i))$$
A Bayesian network is a complete probabilistic model.

1/53 Global probability distribution
[Diagram: B1, ..., Bn form pa(A1) and point to A1, with table P(A1 | B1, ..., Bn); B3, ..., Bn form pa(A2) and point to A2, with table P(A2 | B3, ..., Bn).]
Joint table:
  B1   ...   Bn   A1   A2   P(A1, A2, B1, ..., Bn)
  T    ...   T    T    T
  T    ...   T    T    F
  ...
  F    ...   F    F    F

1/54 Global probability distribution
[Diagram: B1, ..., Bn form pa(A1) and point to A1; B3, ..., Bn form pa(A2) and point to A2.]
P(A1 | B1, ..., Bn):
  B1   ...   Bn   P(A1)
  T    ...   T    0.25
  T    ...   F
  ...
  F    ...   F
P(A2 | B3, ..., Bn):
  B3   ...   Bn   P(A2)
  T    ...   T    0.30
  T    ...   F
  ...
  F    ...   F
Joint table:
  B1   ...   Bn   A1   A2   P(A1, A2, B1, ..., Bn)
  T    ...   T    T    T    0.075
  T    ...   T    T    F
  ...
  F    ...   F    F    F

1/55 Reasoning in Bayesian networks
Updating the belief that a hypothesis H is true given some evidence E, i.e. determining the conditional probability distribution P(H|E).
Two types of reasoning:
- probability of a single hypothesis,
- probability of all hypotheses.

1/56 Example
[Diagram: B and E point to A; A points to J and M.]
P(B) = 0.001
P(E) = 0.002
P(A | B, E):
  B   E   P(A)
  T   T   0.950
  T   F   0.940
  F   T   0.290
  F   F   0.001
P(M | A):
  A   P(M)
  T   0.70
  F   0.01
P(J | A):
  A   P(J)
  T   0.90
  F   0.05
John calls (J) and Mary calls (M). What is the probability that neither burglary nor earthquake occurred if the alarm rang?
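One way to read this question is as a request for a single entry of the joint distribution: John calls, Mary calls, the alarm rang, and there was neither a burglary nor an earthquake. That reading is an assumption. Under it, the entry is simply the product of one number from each table, as the following Python sketch shows (the helper names `bern` and `joint` are hypothetical).

```python
from itertools import product

P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.950, (True, False): 0.940,
       (False, True): 0.290, (False, False): 0.001}  # keyed by (B, E)
P_J = {True: 0.90, False: 0.05}                      # keyed by A
P_M = {True: 0.70, False: 0.01}                      # keyed by A


def bern(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) as a product of local table entries."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


# J and M true, alarm rang, no burglary, no earthquake.
p = joint(b=False, e=False, a=True, j=True, m=True)
total = sum(joint(*w) for w in product((True, False), repeat=5))
print(p, total)
```

Summing `joint` over all 32 assignments gives 1, which is a quick sanity check that the tables define a proper distribution.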

1/64 Types of reasoning in Bayesian networks: causal
[Diagram: B → A → J]
Evidence B occurs and we would like to update the probability of hypothesis J.
Interpretation: there was a burglary; what is the probability that John will call?
P(B) = 0.001
P(A | B):  B = T: 0.95,  B = F: 0.01
P(J | A):  A = T: 0.90,  A = F: 0.05
$P(J \mid B) = P(J \mid A)\, P(A \mid B) + P(J \mid \neg A)\, P(\neg A \mid B) = 0.9 \cdot 0.95 + 0.05 \cdot 0.05 \approx 0.86$

1/65 Types of reasoning in Bayesian networks: diagnostic
[Diagram: B → A → J]
We observe J. What is the probability that B is true?
Diagnosis: John calls. What is the probability of a burglary?
P(B) = 0.001
P(A | B):  B = T: 0.95,  B = F: 0.01
P(J | A):  A = T: 0.90,  A = F: 0.05
$P(B \mid J) = P(J \mid B)\, P(B) / P(J)$, where
$P(J \mid B) = 0.9 \cdot 0.95 + 0.05 \cdot 0.05 = 0.8575$,
$P(A) = 0.95 \cdot 0.001 + 0.01 \cdot 0.999 \approx 0.0109$,
$P(J) = 0.9 \cdot 0.0109 + 0.05 \cdot 0.9891 \approx 0.0593$,
so $P(B \mid J) \approx 0.8575 \cdot 0.001 / 0.0593 \approx 0.0145$.
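Both directions of reasoning on the chain B → A → J can be computed from the three small tables by summing over the unobserved alarm node. The sketch below is an illustration under that three-node simplification; the function name `p_j_given_b` is hypothetical. Because it keeps the branch in which the alarm does not ring, its results need not coincide with a shortcut that multiplies only the "alarm rang" entries.

```python
P_B = 0.001
P_A_GIVEN_B = {True: 0.95, False: 0.01}  # P(A = T | B)
P_J_GIVEN_A = {True: 0.90, False: 0.05}  # P(J = T | A)


def p_j_given_b(b):
    """P(J = T | B = b), summing over both states of the alarm A."""
    p_a = P_A_GIVEN_B[b]
    return P_J_GIVEN_A[True] * p_a + P_J_GIVEN_A[False] * (1.0 - p_a)


# Causal direction: from the cause B to the effect J.
causal = p_j_given_b(True)

# Diagnostic direction: Bayes' rule, with P(J) from total probability.
p_j = p_j_given_b(True) * P_B + p_j_given_b(False) * (1.0 - P_B)
diagnostic = causal * P_B / p_j
print(causal, p_j, diagnostic)
```

The diagnostic value stays small because John also calls in about 5% of the cases in which no alarm rang, and those cases vastly outnumber burglaries.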

1/66 Types of reasoning in Bayesian networks: intercausal
[Diagram: B → A ← E]
We observe E. What is the probability that B is true?
The alarm rang, so P(B|A) = 0.376, but if an earthquake is observed as well, then P(B|A,E) = 0.003.

1/67 Types of reasoning in Bayesian networks: mixed
[Diagram: E → A → J]
We observe E and J. What is the probability of A?
John calls and we know that there was no earthquake. What is the probability that the alarm rang?
$P(A \mid J, \neg E) = 0.03$
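Queries of the intercausal and mixed kind can be answered by brute-force enumeration over the joint distribution. The sketch below assumes the tables of the alarm example (slide 1/45); the names `joint` and `query` are hypothetical. With those tables the enumeration gives roughly 0.37 for P(B|A), close to the value quoted on the slides, about 0.003 for P(B|A,E), and about 0.03 for the alarm given that John calls and no earthquake occurred; with an earthquake present the last value is far larger.

```python
from itertools import product

VARS = ("B", "E", "A", "J", "M")
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.950, (True, False): 0.940,
       (False, True): 0.290, (False, False): 0.001}  # keyed by (B, E)
P_J = {True: 0.90, False: 0.05}                      # keyed by A
P_M = {True: 0.70, False: 0.01}                      # keyed by A


def bern(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(w):
    """Joint probability of a full assignment w (dict: variable -> bool)."""
    return (bern(P_B, w["B"]) * bern(P_E, w["E"])
            * bern(P_A[(w["B"], w["E"])], w["A"])
            * bern(P_J[w["A"]], w["J"]) * bern(P_M[w["A"]], w["M"]))


def query(target, evidence):
    """P(target = True | evidence) by summing matching joint entries."""
    num = den = 0.0
    for values in product((True, False), repeat=len(VARS)):
        w = dict(zip(VARS, values))
        if any(w[name] != val for name, val in evidence.items()):
            continue
        p = joint(w)
        den += p
        if w[target]:
            num += p
    return num / den


p_b_given_a = query("B", {"A": True})
p_b_given_a_e = query("B", {"A": True, "E": True})
p_a_given_j_not_e = query("A", {"J": True, "E": False})
print(p_b_given_a, p_b_given_a_e, p_a_given_j_not_e)
```

Enumeration touches all 32 assignments for every query, which is fine here but is exactly the exponential cost the network structure is meant to avoid in larger models.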

1/68 Types of reasoning in Bayesian networks
a) causal: evidence E is a cause, hypothesis H is its effect
b) diagnostic: evidence E is an effect, hypothesis H is its cause
c) intercausal: H and E are causes of a common effect
d) mixed: evidence is available both above and below H

1/69 Multiply connected Bayesian network
[Diagram: layers of nodes B1, B2, ..., Bn; A1, A2, ..., Ak; C1, ..., Cm.]

1/70 Summary
Models of uncertainty:
- Certainty factor, certainty measure
- Dempster-Shafer theory
- Bayesian networks
- Fuzzy sets
- Rough sets

1/71 Summary
Bayesian networks represent the joint probability distribution.
Reasoning in multiply connected Bayesian networks is NP-hard.
Exponential complexity may be avoided by:
- constructing the network as a polytree,
- transforming the network into a polytree,
- approximate reasoning.
