Published by Kaiden Sill
Modified about 1 year ago
Bayesian Networks CSE 473
© Daniel S. Weld
Last Time: basic notions, atomic events, probabilities, joint distribution, inference by enumeration, independence & conditional independence, Bayes' rule, Bayesian networks, statistical learning, dynamic Bayesian networks (DBNs), Markov decision processes (MDPs).
Axioms of Probability Theory: All probabilities lie between 0 and 1: 0 ≤ P(A) ≤ 1. P(true) = 1 and P(false) = 0. The probability of a disjunction is:
    P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
    [Figure: Venn diagram of A, B, and A ∧ B inside True]
Conditional Probability: P(A | B) is the probability of A given B. It assumes that B is the only information known. Defined by:
    P(A | B) = P(A ∧ B) / P(B)
    [Figure: Venn diagram of A, B, and A ∧ B inside True]
Inference by Enumeration: P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.20, or 20%.
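The sum above can be reproduced with a small sketch of inference by enumeration. The four toothache entries come from the slide; the four no-toothache entries are an assumption, taken from the standard textbook dentist example, and the helper name `prob` is hypothetical.

```python
# Full joint over (toothache, cavity, catch).
# The toothache=True rows are the numbers on the slide; the toothache=False
# rows are ASSUMED from the standard textbook dentist example.
JOINT = {
    (True, True, True): 0.108,
    (True, True, False): 0.012,
    (True, False, True): 0.016,
    (True, False, False): 0.064,
    (False, True, True): 0.072,
    (False, True, False): 0.008,
    (False, False, True): 0.144,
    (False, False, False): 0.576,
}
VARS = ("toothache", "cavity", "catch")


def prob(**fixed):
    """Sum every joint entry that agrees with the fixed variable values."""
    total = 0.0
    for world, p in JOINT.items():
        assignment = dict(zip(VARS, world))
        if all(assignment[name] == value for name, value in fixed.items()):
            total += p
    return total


# Marginal: cavity and catch are summed out.
p_toothache = prob(toothache=True)

# Conditional: P(cavity | toothache) = P(cavity, toothache) / P(toothache).
p_cavity_given_toothache = prob(cavity=True, toothache=True) / p_toothache
```

Every query touches the whole table, which is why the cost grows with the size of the joint distribution.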
Inference by Enumeration (continued)
Problems: Worst-case time is O(d^n), where d = max arity and n = number of random variables. Space complexity is also O(d^n), the size of the joint distribution. How do we obtain the O(d^n) entries for the table? Also, the values of cavity and catch are irrelevant when computing P(toothache).
Independence: A and B are independent iff:
    P(A | B) = P(A)
    P(B | A) = P(B)
These two constraints are logically equivalent. Therefore, if A and B are independent:
    P(A ∧ B) = P(A) P(B)
Independence: [Figure: Venn diagram of A, B, and A ∧ B inside True]
Independence: Complete independence is powerful but rare. What to do if it doesn't hold?
Conditional Independence: A and B are not independent, since P(A | B) < P(A). [Figure: Venn diagram of A, B, and A ∧ B inside True]
Conditional Independence: But A and B are made independent by C: P(A | C) = P(A | B, C). [Figure: Venn diagram of A, B, and C inside True, with regions A ∧ C and B ∧ C]
Conditional Independence: Instead of 7 entries, only 5 are needed.
Conditional Independence II:
    P(catch | toothache, cavity) = P(catch | cavity)
    P(catch | toothache, ¬cavity) = P(catch | ¬cavity)
Why only 5 entries in the table?
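One way to see why five numbers suffice: if toothache and catch are conditionally independent given cavity, the joint factors as P(cavity) · P(toothache | cavity) · P(catch | cavity). The sketch below assumes the five values implied by the standard textbook dentist table (only four joint entries appear on the slides), and the function names are hypothetical.

```python
# Five parameters, ASSUMED from the standard textbook dentist example.
P_CAVITY = 0.2
P_TOOTHACHE_GIVEN = {True: 0.6, False: 0.1}  # P(toothache | cavity = key)
P_CATCH_GIVEN = {True: 0.9, False: 0.2}      # P(catch | cavity = key)


def pick(p_true, value):
    """Return P(X = value) for a Boolean X with P(X = true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(toothache, cavity, catch):
    """P(toothache, cavity, catch) under conditional independence given cavity."""
    return (pick(P_CAVITY, cavity)
            * pick(P_TOOTHACHE_GIVEN[cavity], toothache)
            * pick(P_CATCH_GIVEN[cavity], catch))


# The four toothache=True entries, in the order (cavity, catch) =
# (T, T), (T, F), (F, T), (F, F).
toothache_entries = [joint(True, cavity, catch)
                     for cavity in (True, False)
                     for catch in (True, False)]
```

Under these assumed parameters the four products match the terms summed for P(toothache) in the enumeration slide.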
Power of Cond. Independence: Using conditional independence often reduces the storage complexity of the joint distribution from exponential to linear. Conditional independence is the most basic and robust form of knowledge about uncertain environments.
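Bayes Rule:
    P(H | E) = P(E | H) P(H) / P(E)
Simple proof from the definition of conditional probability:
    1. P(E | H) = P(H ∧ E) / P(H)          (Def. cond. prob.)
    2. P(H | E) = P(H ∧ E) / P(E)          (Def. cond. prob.)
    3. P(H ∧ E) = P(E | H) P(H)            (Mult by P(H) in line 1)
    QED: P(H | E) = P(E | H) P(H) / P(E)   (Substitute #3 in #2)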
Use to Compute Diagnostic Probability from Causal Probability: E.g., let M be meningitis and S be stiff neck, with P(M) = 0.0001, P(S) = 0.1, P(S | M) = 0.8. Then:
    P(M | S) = P(S | M) P(M) / P(S) = 0.8 × 0.0001 / 0.1 = 0.0008
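As a minimal sketch, the same diagnostic computation can be wrapped in a function so other causal/diagnostic pairs can be plugged in. The function name `bayes` and its parameter names are hypothetical; the three inputs are the numbers from the slide.

```python
def bayes(p_effect_given_cause, p_cause, p_effect):
    """Diagnostic probability P(cause | effect) via Bayes' rule."""
    return p_effect_given_cause * p_cause / p_effect


# Meningitis (cause) and stiff neck (effect), using the slide's numbers.
p_m_given_s = bayes(p_effect_given_cause=0.8, p_cause=0.0001, p_effect=0.1)
```

Even with a strong causal link P(S | M), the diagnostic probability stays small because the prior P(M) is tiny.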
Bayes' Rule & Cond. Independence
Bayes Nets: In general, a joint distribution P over a set of variables (X_1 × ... × X_n) requires exponential space for representation and inference. BNs provide a graphical representation of the conditional independence relations in P. The representation is usually quite compact; it requires assessment of fewer parameters, those being quite natural (e.g., causal); and inference (query answering and belief update) is usually efficient.
Independence (in the extreme): If X_1, X_2, ..., X_n are mutually independent, then P(X_1, X_2, ..., X_n) = P(X_1) P(X_2) ... P(X_n). The joint can be specified with n parameters, cf. the usual 2^n − 1 parameters required. While extreme independence is unusual, conditional independence is common, and BNs exploit this conditional independence.
An Example Bayes Net: [Figure: network over Earthquake, Burglary, Alarm, Nbr1Calls, Nbr2Calls, Radio]
    Pr(B=t)   Pr(B=f)
    0.05      0.95
    Pr(A | E, B)
    e, b      0.9   (0.1)
    e, ¬b     0.2   (0.8)
    ¬e, b     0.85  (0.15)
    ¬e, ¬b    0.01  (0.99)
Earthquake Example (cont'd): If I know whether Alarm holds, no other evidence influences my degree of belief in Nbr1Calls:
    P(N1 | N2, A, E, B) = P(N1 | A)
also:
    P(N2 | A, E, B) = P(N2 | A) and P(E | B) = P(E)
By the chain rule we have:
    P(N1, N2, A, E, B) = P(N1 | N2, A, E, B) · P(N2 | A, E, B) · P(A | E, B) · P(E | B) · P(B)
                       = P(N1 | A) · P(N2 | A) · P(A | B, E) · P(E) · P(B)
The full joint requires only 10 parameters (1 + 1 + 4 + 2 + 2 for B, E, A, N1, N2), cf. 32 for the explicit joint table.
    [Figure: network over Earthquake, Burglary, Alarm, Nbr1Calls, Nbr2Calls, Radio]
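The parameter count can be checked mechanically. This sketch assumes all variables are Boolean and reads the parent sets off the factorization above; Radio is left out because the factorization does not include it. The names `PARENTS` and `cpt_parameters` are hypothetical.

```python
# Parent sets read off P(N1|A) · P(N2|A) · P(A|B,E) · P(E) · P(B).
PARENTS = {
    "B": [],
    "E": [],
    "A": ["E", "B"],
    "N1": ["A"],
    "N2": ["A"],
}


def cpt_parameters(parents):
    """One number P(X = true | parent values) per assignment to a node's parents."""
    return sum(2 ** len(ps) for ps in parents.values())


bn_params = cpt_parameters(PARENTS)        # 1 + 1 + 4 + 2 + 2
full_joint_entries = 2 ** len(PARENTS)     # one entry per assignment to 5 Booleans
```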
BNs: Qualitative Structure: The graphical structure of a BN reflects conditional independence among variables. Each variable X is a node in the DAG. Edges denote direct probabilistic influence, usually interpreted causally; the parents of X are denoted Par(X). X is conditionally independent of all non-descendants given its parents. A graphical test exists for more general independence ("Markov blanket").
Given Parents, X is Independent of Non-Descendants
For Example: [Figure: network over Earthquake, Burglary, Alarm, Nbr1Calls, Nbr2Calls, Radio]
Given Markov Blanket, X is Independent of All Other Nodes:
    MB(X) = Par(X) ∪ Childs(X) ∪ Par(Childs(X))
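The Markov blanket definition translates directly into set operations on the DAG. In this sketch the Alarm and neighbor edges follow the factorization used in the earthquake example; the Earthquake → Radio edge is an assumption, since the slide text does not spell it out. The function names are hypothetical.

```python
# Parent sets for the example network.
PARENTS = {
    "Earthquake": set(),
    "Burglary": set(),
    "Alarm": {"Earthquake", "Burglary"},
    "Nbr1Calls": {"Alarm"},
    "Nbr2Calls": {"Alarm"},
    "Radio": {"Earthquake"},  # ASSUMED edge
}


def children(node):
    """All nodes that list `node` among their parents."""
    return {child for child, ps in PARENTS.items() if node in ps}


def markov_blanket(node):
    """Par(X) ∪ Childs(X) ∪ Par(Childs(X)), excluding X itself."""
    kids = children(node)
    co_parents = set().union(*(PARENTS[kid] for kid in kids))
    return (PARENTS[node] | kids | co_parents) - {node}


mb_alarm = markov_blanket("Alarm")
mb_earthquake = markov_blanket("Earthquake")
```

Note that Burglary enters the blanket of Earthquake only as a co-parent of Alarm.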
Conditional Probability Tables: [Figure: network over Earthquake, Burglary, Alarm, Nbr1Calls, Nbr2Calls, Radio]
    Pr(B=t)   Pr(B=f)
    0.05      0.95
    Pr(A | E, B)
    e, b      0.9   (0.1)
    e, ¬b     0.2   (0.8)
    ¬e, b     0.85  (0.15)
    ¬e, ¬b    0.01  (0.99)
Conditional Probability Tables: For a complete specification of the joint distribution, quantify the BN. For each variable X, specify a CPT: P(X | Par(X)). The number of parameters is locally exponential in |Par(X)|. If X_1, X_2, ..., X_n is any topological sort of the network, then we are assured:
    P(X_n, X_{n-1}, ..., X_1) = P(X_n | X_{n-1}, ..., X_1) · P(X_{n-1} | X_{n-2}, ..., X_1) · ... · P(X_2 | X_1) · P(X_1)
                              = P(X_n | Par(X_n)) · P(X_{n-1} | Par(X_{n-1})) · ... · P(X_1)
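A sketch of how the CPT product defines a full joint for the earthquake network (Radio omitted). P(B) and P(A | E, B) use the slide's numbers; P(E), P(N1 | A), and P(N2 | A) are hypothetical values chosen only for illustration, since the slides do not give them.

```python
from itertools import product

P_B = 0.05                                    # from the slide
P_E = 0.01                                    # HYPOTHETICAL
P_A = {                                       # P(A = true | E, B), keyed (e, b); from the slide
    (True, True): 0.9,
    (True, False): 0.2,
    (False, True): 0.85,
    (False, False): 0.01,
}
P_N1 = {True: 0.9, False: 0.05}               # HYPOTHETICAL P(N1 = true | A)
P_N2 = {True: 0.7, False: 0.01}               # HYPOTHETICAL P(N2 = true | A)


def pick(p_true, value):
    """Return P(X = value) for a Boolean X with P(X = true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(n1, n2, a, e, b):
    """P(N1, N2, A, E, B) as the product of one CPT entry per variable."""
    return (pick(P_N1[a], n1) * pick(P_N2[a], n2)
            * pick(P_A[(e, b)], a) * pick(P_E, e) * pick(P_B, b))


# Summing the product over all 32 assignments should give 1 (up to rounding).
total = sum(joint(*world) for world in product((True, False), repeat=5))
```

Because each CPT row is a proper distribution, the product is normalized without any extra constant.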
Inference in BNs: The graphical independence representation yields efficient inference schemes. We generally want to compute Pr(X), or Pr(X | E) where E is (conjunctive) evidence. Computations are organized by network topology. One simple algorithm: variable elimination (VE).
P(B | J=true, M=true):
    P(b | j, m) = α P(b) Σ_e P(e) Σ_a P(a | b, e) P(j | a) P(m | a)
    [Figure: network over Earthquake, Burglary, Alarm, John, Mary, Radio]
Structure of Computation: Dynamic Programming
Variable Elimination: A factor is a function from some set of variables into a specific value, e.g., f(E, A, N1). CPTs are factors, e.g., P(A | E, B) is a function of A, E, B. VE works by eliminating all variables in turn until there is a factor with only the query variable. To eliminate a variable: join all factors containing that variable (like a database join), then sum out the influence of the variable on the new factor. This exploits the product form of the joint distribution.
Example of VE: P(N1) [Figure: network over Earthqk, Burgl, Alarm, N1, N2]
    P(N1) = Σ_{N2,A,B,E} P(N1, N2, A, B, E)
          = Σ_{N2,A,B,E} P(N1 | A) P(N2 | A) P(B) P(A | B, E) P(E)
          = Σ_A P(N1 | A) Σ_{N2} P(N2 | A) Σ_B P(B) Σ_E P(A | B, E) P(E)
          = Σ_A P(N1 | A) Σ_{N2} P(N2 | A) Σ_B P(B) f1(A, B)
          = Σ_A P(N1 | A) Σ_{N2} P(N2 | A) f2(A)
          = Σ_A P(N1 | A) f3(A)
          = f4(N1)
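The derivation above can be sketched as a small variable-elimination routine over Boolean factors. This is an illustrative implementation, not the lecture's code: the `Factor` class and helper names are hypothetical, and the values for P(E), P(N1 | A), and P(N2 | A) are made up because the slides do not give them. One difference from the slide: summing out N2 leaves its own factor over A, which is only multiplied with f2(A) when A is eliminated, rather than forming f3(A) immediately.

```python
from itertools import product


class Factor:
    """A table mapping Boolean assignments of `variables` to numbers."""

    def __init__(self, variables, table):
        self.variables = tuple(variables)
        self.table = dict(table)


def multiply(f, g):
    """Join two factors on their shared variables (pointwise product)."""
    variables = f.variables + tuple(v for v in g.variables if v not in f.variables)
    table = {}
    for values in product((True, False), repeat=len(variables)):
        env = dict(zip(variables, values))
        fv = f.table[tuple(env[v] for v in f.variables)]
        gv = g.table[tuple(env[v] for v in g.variables)]
        table[values] = fv * gv
    return Factor(variables, table)


def sum_out(var, f):
    """Marginalize `var` out of factor f."""
    keep = tuple(v for v in f.variables if v != var)
    table = {}
    for values, p in f.table.items():
        env = dict(zip(f.variables, values))
        key = tuple(env[v] for v in keep)
        table[key] = table.get(key, 0.0) + p
    return Factor(keep, table)


def eliminate(var, factors):
    """Join every factor mentioning `var`, then sum `var` out of the result."""
    touching = [f for f in factors if var in f.variables]
    rest = [f for f in factors if var not in f.variables]
    if not touching:
        return rest
    joined = touching[0]
    for f in touching[1:]:
        joined = multiply(joined, f)
    return rest + [sum_out(var, joined)]


def cpt(child, parents, p_true):
    """Factor for P(child | parents), given P(child = true | parent values)."""
    table = {}
    for pv in product((True, False), repeat=len(parents)):
        table[(True,) + pv] = p_true[pv]
        table[(False,) + pv] = 1.0 - p_true[pv]
    return Factor((child,) + tuple(parents), table)


factors = [
    cpt("B", (), {(): 0.05}),                                   # from the slides
    cpt("E", (), {(): 0.01}),                                   # HYPOTHETICAL
    cpt("A", ("E", "B"), {(True, True): 0.9, (True, False): 0.2,
                          (False, True): 0.85, (False, False): 0.01}),  # from the slides
    cpt("N1", ("A",), {(True,): 0.9, (False,): 0.05}),          # HYPOTHETICAL
    cpt("N2", ("A",), {(True,): 0.7, (False,): 0.01}),          # HYPOTHETICAL
]

# Same elimination order as the derivation: E, then B, then N2, then A.
for var in ("E", "B", "N2", "A"):
    factors = eliminate(var, factors)

result = factors[0]
for f in factors[1:]:
    result = multiply(result, f)
p_n1 = result.table[(True,)]   # P(N1 = true) under the assumed CPTs
```

With this order, the largest intermediate factor is the join of P(E) and P(A | E, B) over three variables, in line with the observation that factor size, not the total number of variables, drives the cost.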
Notes on VE: Each operation is simply a multiplication of factors followed by summing out a variable. Complexity is determined by the size of the largest factor (e.g., 3 variables in the example, not 5): linear in the number of variables, exponential in the size of the largest factor. The elimination ordering greatly impacts factor size. Finding an optimal elimination ordering is NP-hard, so use heuristics or special structure (e.g., polytrees). Practically, inference is much more tractable using structure of this sort.