Download presentation

Presentation is loading. Please wait.

Published byKaiden Sill Modified over 2 years ago

1
Bayesian Networks CSE 473

2
© Daniel S. Weld 2 Last Time Basic notions Atomic events Probabilities Joint distribution Inference by enumeration Independence & conditional independence Bayes’ rule Bayesian networks Statistical learning Dynamic Bayesian networks (DBNs) Markov decision processes (MDPs)

3
© Daniel S. Weld 3 Axioms of Probability Theory All probabilities between 0 and 1 0 ≤ P(A) ≤ 1 P(true) = 1 P(false) = 0. The probability of disjunction is: A B A B True

4
© Daniel S. Weld 4 Conditional Probability P(A | B) is the probability of A given B Assumes that B is the only info known. Defined by: A B ABAB True

5
© Daniel S. Weld 5 Inference by Enumeration P(toothache)=.108+.012+.016+.064 =.20 or 20%

6
© Daniel S. Weld 6 Inference by Enumeration

7
© Daniel S. Weld 7 Problems ?? Worst case time: O(n d ) Where d = max arity And n = number of random variables Space complexity also O(n d ) Size of joint distribution How get O(n d ) entries for table?? Value of cavity & catch irrelevant - When computing P(toothache)

8
© Daniel S. Weld 8 Independence A and B are independent iff: These two constraints are logically equivalent Therefore, if A and B are independent:

9
© Daniel S. Weld 9 Independence True B AA B

10
© Daniel S. Weld 10 Independence Complete independence is powerful but rare What to do if it doesn’t hold?

11
© Daniel S. Weld 11 Conditional Independence True B AA B A&B not independent, since P(A|B) < P(A)

12
© Daniel S. Weld 12 Conditional Independence True B AA B C B C ACAC But: A&B are made independent by C P(A| C) = P(A|B, C)

13
© Daniel S. Weld 13 Conditional Independence Instead of 7 entries, only need 5

14
© Daniel S. Weld 14 Conditional Independence II P(catch | toothache, cavity) = P(catch | cavity) P(catch | toothache, cavity) = P(catch | cavity) Why only 5 entries in table?

15
© Daniel S. Weld 15 Power of Cond. Independence Often, using conditional independence reduces the storage complexity of the joint distribution from exponential to linear!! Conditional independence is the most basic & robust form of knowledge about uncertain environments.

16
© Daniel S. Weld 16 Bayes Rule Simple proof from def of conditional probability: QED: (Def. cond. prob.) (Mult by P(H) in line 1) (Substitute #3 in #2)

17
© Daniel S. Weld 17 Use to Compute Diagnostic Probability from Causal Probability E.g. let M be meningitis, S be stiff neck P(M) = 0.0001, P(S) = 0.1, P(S|M)= 0.8 P(M|S) =

18
© Daniel S. Weld 18 Bayes’ Rule & Cond. Independence

19
© Daniel S. Weld 19 Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation & inference BNs provide a graphical representation of conditional independence relations in P usually quite compact requires assessment of fewer parameters, those being quite natural (e.g., causal) efficient (usually) inference: query answering and belief update

20
© Daniel S. Weld 20 Independence (in the extreme) If X 1, X 2,... X n are mutually independent, then P(X 1, X 2,... X n ) = P(X 1 )P(X 2 )... P(X n ) Joint can be specified with n parameters cf. the usual 2 n -1 parameters required While extreme independence is unusual, Conditional independence is common BNs exploit this conditional independence

21
© Daniel S. Weld 21 An Example Bayes Net Earthquake BurglaryAlarm Nbr2CallsNbr1Calls Pr(B=t) Pr(B=f) 0.05 0.95 Pr(A|E,B) e,b 0.9 (0.1) e,b 0.2 (0.8) e,b 0.85 (0.15) e,b 0.01 (0.99) Radio

22
© Daniel S. Weld 22 Earthquake Example (con’t) If I know if Alarm, no other evidence influences my degree of belief in Nbr1Calls P(N1|N2,A,E,B) = P(N1|A) also: P(N2|N2,A,E,B) = P(N2|A) and P(E|B) = P(E) By the chain rule we have P(N1,N2,A,E,B) = P(N1|N2,A,E,B) ·P(N2|A,E,B)· P(A|E,B) ·P(E|B) ·P(B) = P(N1|A) ·P(N2|A) ·P(A|B,E) ·P(E) ·P(B) Full joint requires only 10 parameters (cf. 32) Earthquake Burglary Alarm Nbr2CallsNbr1Calls Radio

23
© Daniel S. Weld 23 BNs: Qualitative Structure Graphical structure of BN reflects conditional independence among variables Each variable X is a node in the DAG Edges denote direct probabilistic influence usually interpreted causally parents of X are denoted Par(X) X is conditionally independent of all nondescendents given its parents Graphical test exists for more general independence “Markov Blanket”

24
© Daniel S. Weld 24 Given Parents, X is Independent of Non-Descendants

25
© Daniel S. Weld 25 For Example EarthquakeBurglary Alarm Nbr2CallsNbr1Calls Radio

26
© Daniel S. Weld 26 Given Markov Blanket, X is Independent of All Other Nodes MB(X) = Par(X) Childs(X) Par(Childs(X))

27
© Daniel S. Weld 27 Conditional Probability Tables Earthquake BurglaryAlarm Nbr2CallsNbr1Calls Pr(B=t) Pr(B=f) 0.05 0.95 Pr(A|E,B) e,b 0.9 (0.1) e,b 0.2 (0.8) e,b 0.85 (0.15) e,b 0.01 (0.99) Radio

28
© Daniel S. Weld 28 Conditional Probability Tables For complete spec. of joint dist., quantify BN For each variable X, specify CPT: P(X | Par(X)) number of params locally exponential in |Par(X)| If X 1, X 2,... X n is any topological sort of the network, then we are assured: P(X n,X n-1,...X 1 ) = P(X n | X n-1,...X 1 ) · P(X n-1 | X n-2,… X 1 ) … P(X 2 | X 1 ) · P(X 1 ) = P(X n | Par(X n )) · P(X n-1 | Par(X n-1 )) … P(X 1 )

29
© Daniel S. Weld 29 Inference in BNs The graphical independence representation yields efficient inference schemes We generally want to compute Pr(X), or Pr(X|E) where E is (conjunctive) evidence Computations organized by network topology One simple algorithm: variable elimination (VE)

30
© Daniel S. Weld 30 P(B | J=true, M=true) EarthquakeBurglary Alarm MaryJohn Radio P(b|j,m) = P(b) P(e) P(a|b,e)P(j|a)P(m,a) e a

31
© Daniel S. Weld 31 Structure of Computation Dynamic Programming

32
© Daniel S. Weld 32 Variable Elimination A factor is a function from some set of variables into a specific value: e.g., f(E,A,N1) CPTs are factors, e.g., P(A|E,B) function of A,E,B VE works by eliminating all variables in turn until there is a factor with only query variable To eliminate a variable: join all factors containing that variable (like DB) sum out the influence of the variable on new factor exploits product form of joint distribution

33
© Daniel S. Weld 33 Example of VE: P(N1) Earthqk Burgl Alarm N2N1 P(N1) = N2,A,B,E P(N1,N2,A,B,E) = N2,A,B,E P(N1|A)P(N2|A) P(B)P(A|B,E)P(E) = A P(N1|A) N2 P(N2|A) B P(B) E P(A|B,E)P(E) = A P(N1|A) N2 P(N2|A) B P(B) f1(A,B) = A P(N1|A) N2 P(N2|A) f2(A) = A P(N1|A) f3(A) = f4(N1)

34
© Daniel S. Weld 34 Notes on VE Each operation is a simply multiplication of factors and summing out a variable Complexity determined by size of largest factor e.g., in example, 3 vars (not 5) linear in number of vars, exponential in largest factorelimination ordering greatly impacts factor size optimal elimination orderings: NP-hard heuristics, special structure (e.g., polytrees) Practically, inference is much more tractable using structure of this sort

Similar presentations

Presentation is loading. Please wait....

OK

Quiz 4: Mean: 7.0/8.0 (= 88%) Median: 7.5/8.0 (= 94%)

Quiz 4: Mean: 7.0/8.0 (= 88%) Median: 7.5/8.0 (= 94%)

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google

Ppt on power steering Ppt on sources of energy Ppt on forward contract definition Ppt on automobile related topics about economics Ppt on condition monitoring One act play ppt on ipad Ppt on different types of computer softwares for pc Ppt on disk formatting fails Uses of plants for kids ppt on batteries Ppt on information technology and values