
1 EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS
Lecture 16, 6/1/2005
University of Washington, Department of Electrical Engineering, Spring 2005
Instructor: Professor Jeff A. Bilmes

2 Uncertainty & Bayesian Networks (Chapters 13/14)

3 Outline
Inference
Independence and Bayes' Rule
Chapter 14:
– Syntax
– Semantics
– Parameterized Distributions
– Inference in Bayesian Networks

4 On the final
Same format as the midterm: closed book, closed notes.
Might test on all material of the quarter, including today (i.e., Chapters 1-9, 13, 14)
– but will not test on fuzzy logic.
Will be weighted towards the latter half of the course, though.

5 Homework
Last HW of the quarter. Due next Wed, June 1st, in class:
– Chapter 13: 13.3, 13.7, 13.16
– Chapter 14: 14.2, 14.3, 14.10

6 Bayesian Networks (Chapter 14)

7 Bayesian networks
A simple, graphical notation for conditional independence assertions, and hence for compact specification of full joint distributions.
Syntax:
– a set of nodes, one per variable
– a directed, acyclic graph (link ≈ "directly influences")
– a conditional distribution for each node given its parents: P(X_i | Parents(X_i))
In the simplest case, the conditional distribution is represented as a conditional probability table (CPT) giving the distribution over X_i for each combination of parent values.
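
To make the syntax concrete, here is a minimal Python sketch of one possible encoding (illustrative, not from the lecture): each boolean variable maps to its parent list and a CPT keyed by tuples of parent values. The network and numbers are the textbook's burglary example.

```python
# A tiny Bayesian-network encoding: var -> (parents, CPT).
# Each CPT maps a tuple of parent values to P(var = True | parents).
# Numbers are the standard AIMA burglary-network values.
network = {
    'Burglary':   ([], {(): 0.001}),
    'Earthquake': ([], {(): 0.002}),
    'Alarm':      (['Burglary', 'Earthquake'],
                   {(True, True): 0.95, (True, False): 0.94,
                    (False, True): 0.29, (False, False): 0.001}),
    'JohnCalls':  (['Alarm'], {(True,): 0.90, (False,): 0.05}),
    'MaryCalls':  (['Alarm'], {(True,): 0.70, (False,): 0.01}),
}

def prob(var, value, event):
    """P(var = value | parent values taken from event)."""
    parents, cpt = network[var]
    p_true = cpt[tuple(event[p] for p in parents)]
    return p_true if value else 1.0 - p_true
```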

8 Example contd.

9 Semantics
The full joint distribution is defined as the product of the local conditional distributions:
P(X_1, …, X_n) = ∏_{i=1}^{n} P(X_i | Parents(X_i))
e.g., P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(j | a) P(m | a) P(a | ¬b, ¬e) P(¬b) P(¬e)
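
The product formula translates directly into code. A sketch using the network/prob encoding above (again illustrative, not the lecture's):

```python
def joint_probability(event):
    """Global semantics: P(x_1,...,x_n) = prod_i P(x_i | parents(X_i))."""
    p = 1.0
    for var in network:
        p *= prob(var, event[var], event)
    return p

# The slide's example, P(j ∧ m ∧ a ∧ ¬b ∧ ¬e):
event = {'JohnCalls': True, 'MaryCalls': True, 'Alarm': True,
         'Burglary': False, 'Earthquake': False}
print(joint_probability(event))
# = 0.90 * 0.70 * 0.001 * 0.999 * 0.998 ≈ 0.000628
```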

10 Local Semantics
Local semantics: each node is conditionally independent of its nondescendants given its parents.
Thm: local semantics ⇔ global semantics

11 Example: car diagnosis
Initial evidence: car won't start.
Testable variables (green); "broken, so fix it" variables (orange).
Hidden variables (gray) ensure sparse structure, reduce parameters.

12 Example: car insurance

13 Compact conditional distributions
CPT grows exponentially with the number of parents.
CPT becomes infinite with a continuous-valued parent or child.
Solution: canonical distributions that are defined compactly.
Deterministic nodes are the simplest case:
– X = f(Parents(X)), for some deterministic function f (could be a logical form)
E.g., boolean functions:
– NorthAmerican ⇔ Canadian ∨ US ∨ Mexican
E.g., numerical relationships among continuous variables.

14 Compact conditional distributions (contd.)
"Noisy-OR" distributions model multiple noninteracting causes:
– 1) Parents U_1, …, U_k include all possible causes
– 2) Independent failure probability q_i for each cause alone
– ⇒ ¬X ⇔ ¬U_1 ∧ ¬U_2 ∧ … ∧ ¬U_k (in the noise-free limit)
– ⇒ P(X | U_1, …, U_j, ¬U_{j+1}, …, ¬U_k) = 1 − ∏_{i=1}^{j} q_i
Number of parameters is linear in the number of parents.
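
The noisy-OR rule builds a full k-parent CPT from only k numbers. A sketch (the helper function is hypothetical; the Fever example with causes Cold, Flu, Malaria and its q values are the textbook's):

```python
from itertools import product
from math import prod

def noisy_or_cpt(failure_probs):
    """Expand noisy-OR parameters q_i into a full CPT:
    P(X = True | u_1..u_k) = 1 - product of q_i over the causes present."""
    k = len(failure_probs)
    cpt = {}
    for assignment in product((True, False), repeat=k):
        qs = (q for present, q in zip(assignment, failure_probs) if present)
        cpt[assignment] = 1.0 - prod(qs)  # empty product = 1, so P = 0 with no causes
    return cpt

# Fever with causes (Cold, Flu, Malaria) and q = (0.6, 0.2, 0.1):
cpt = noisy_or_cpt([0.6, 0.2, 0.1])
print(cpt[(True, True, False)])  # 1 - 0.6*0.2 = 0.88
```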

15 Hybrid (discrete + continuous) networks
Discrete (Subsidy? and Buys?); continuous (Harvest and Cost).
Option 1: discretization – large errors and large CPTs.
Option 2: finitely parameterized canonical families:
– Gaussians, logistic distributions (as used in neural networks)
Continuous variable, discrete + continuous parents (e.g., Cost).
Discrete variable, continuous parents (e.g., Buys?).
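
For the "discrete child, continuous parent" case, a logistic (soft-threshold) CPD needs only two parameters. A sketch with made-up parameter values (mu and sigma are illustrative, not from the lecture):

```python
from math import exp

def p_buys_given_cost(cost, mu=5.0, sigma=1.0):
    """Logistic CPD sketch: P(Buys = True | Cost = cost).
    The probability of buying falls smoothly as cost rises past mu."""
    return 1.0 / (1.0 + exp((cost - mu) / sigma))

print(p_buys_given_cost(3.0))  # cheap: close to 1
print(p_buys_given_cost(8.0))  # expensive: close to 0
```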

16 Inference
– by enumeration
– by variable elimination
– by stochastic simulation
– by Markov chain Monte Carlo

17 Inference Tasks
Simple queries: compute the posterior marginal P(X_i | E = e)
– e.g., P(NoGas | Gauge = empty, Lights = on, Starts = false)
Conjunctive queries:
– P(X_i, X_j | E = e) = P(X_i | E = e) P(X_j | X_i, E = e)
Optimal decisions: decision networks include utility information; probabilistic inference is required for P(outcome | action, evidence).
Value of information: which evidence to seek next?
Sensitivity analysis: which probability values are most critical?
Explanation: why do I need a new starter motor?

18 Inference By Enumeration

19 Enumeration Algorithm
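
A minimal sketch of the standard enumeration procedure (in the spirit of AIMA's ENUMERATION-ASK), assuming the boolean network/prob encoding sketched earlier; not the lecture's own code:

```python
def enumeration_ask(X, evidence, variables):
    """Posterior P(X | evidence) by summing the joint over all hidden
    variables. variables must be in topological order, e.g. list(network)."""
    def enumerate_all(vars_left, event):
        if not vars_left:
            return 1.0
        Y, rest = vars_left[0], vars_left[1:]
        if Y in event:                      # evidence (or query) variable: fixed
            return prob(Y, event[Y], event) * enumerate_all(rest, event)
        return sum(prob(Y, y, event) * enumerate_all(rest, {**event, Y: y})
                   for y in (True, False))  # hidden variable: sum it out
    dist = {x: enumerate_all(variables, {**evidence, X: x}) for x in (True, False)}
    z = sum(dist.values())
    return {x: p / z for x, p in dist.items()}

# e.g. enumeration_ask('Burglary', {'JohnCalls': True, 'MaryCalls': True},
#                      list(network))  ->  roughly {True: 0.284, False: 0.716}
```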

20 Evaluation Tree

21 Inference by Variable Elimination

22 Variable Elimination: Basic operations

23 Variable Elimination: Algorithm
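
Again a hedged sketch rather than the lecture's version: factors as (variables, table) pairs with pointwise product and summing out, reusing the earlier network/prob encoding. No elimination-ordering heuristic and no pruning of irrelevant variables.

```python
from itertools import product

def make_factor(var, evidence):
    """One factor per CPT, with evidence variables fixed."""
    parents, _ = network[var]
    vars_ = tuple(v for v in parents + [var] if v not in evidence)
    table = {}
    for vals in product((True, False), repeat=len(vars_)):
        event = {**evidence, **dict(zip(vars_, vals))}
        table[vals] = prob(var, event[var], event)
    return (vars_, table)

def multiply(f, g):
    """Pointwise product of two factors."""
    fv, ft = f
    gv, gt = g
    vars_ = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for vals in product((True, False), repeat=len(vars_)):
        e = dict(zip(vars_, vals))
        table[vals] = ft[tuple(e[v] for v in fv)] * gt[tuple(e[v] for v in gv)]
    return (vars_, table)

def sum_out(var, f):
    """Marginalize var out of a factor."""
    fv, ft = f
    vars_ = tuple(v for v in fv if v != var)
    table = {}
    for vals, p in ft.items():
        key = tuple(v for name, v in zip(fv, vals) if name != var)
        table[key] = table.get(key, 0.0) + p
    return (vars_, table)

def elimination_ask(X, evidence):
    factors = [make_factor(v, evidence) for v in network]
    for var in reversed(list(network)):      # naive ordering: leaves first
        if var == X or var in evidence:
            continue
        touching = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        f = touching[0]
        for g in touching[1:]:
            f = multiply(f, g)
        factors.append(sum_out(var, f))
    f = factors[0]
    for g in factors[1:]:
        f = multiply(f, g)
    vars_, table = f                          # now over X only
    dist = {vals[vars_.index(X)]: p for vals, p in table.items()}
    z = sum(dist.values())
    return {x: p / z for x, p in dist.items()}

# elimination_ask('Burglary', {'JohnCalls': True, 'MaryCalls': True})
# gives the same answer as enumeration, while reusing intermediate factors.
```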

24 Irrelevant variables

25 Irrelevant variables (contd.)

26 Complexity of exact inference

27 Inference by stochastic simulation

28 Sampling from empty network
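
The standard "sample each variable in topological order" idea, sketched against the earlier network/prob encoding (illustrative):

```python
import random

def prior_sample():
    """Draw one complete event from the joint distribution: sample each
    variable in topological order, conditioning on already-sampled parents."""
    event = {}
    for var in network:                  # dict order = topological order here
        event[var] = random.random() < prob(var, True, event)
    return event
```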

29 Example

30 Rejection Sampling
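
A sketch reusing prior_sample above (not the lecture's code): samples that contradict the evidence are thrown away, which is why the method degrades as evidence becomes unlikely.

```python
def rejection_sampling(X, evidence, n=100_000):
    """Estimate P(X | evidence): keep only prior samples that agree with
    the evidence, then tally the query variable among the survivors."""
    counts = {True: 0, False: 0}
    for _ in range(n):
        s = prior_sample()
        if all(s[v] == val for v, val in evidence.items()):
            counts[s[X]] += 1
    total = counts[True] + counts[False]
    return {x: c / total for x, c in counts.items()} if total else None
```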

31 Analysis of rejection sampling

32 Likelihood weighting
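
A sketch of the standard likelihood-weighting scheme over the same encoding (illustrative): no sample is ever rejected, but each carries a weight.

```python
def likelihood_weighting(X, evidence, n=100_000):
    """Fix the evidence variables instead of sampling them, and weight each
    sample by the likelihood of that evidence given the sampled parents."""
    weights = {True: 0.0, False: 0.0}
    for _ in range(n):
        event, w = dict(evidence), 1.0
        for var in network:              # topological order
            if var in evidence:
                w *= prob(var, evidence[var], event)
            else:
                event[var] = random.random() < prob(var, True, event)
        weights[event[X]] += w
    total = weights[True] + weights[False]
    return {x: v / total for x, v in weights.items()}
```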

33 MCMC
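
A Gibbs-sampling sketch of MCMC over the same encoding (one common instance of MCMC for Bayesian networks; assumed here, not taken from the slide):

```python
def gibbs_ask(X, evidence, n=100_000):
    """MCMC via Gibbs sampling: random-walk the space of non-evidence
    assignments, resampling one variable at a time from P(var | Markov
    blanket), and count how often the query variable is true."""
    children = {v: [c for c, (ps, _) in network.items() if v in ps]
                for v in network}
    hidden = [v for v in network if v not in evidence]
    state = {**evidence, **{v: random.random() < 0.5 for v in hidden}}
    counts = {True: 0, False: 0}
    for _ in range(n):
        for var in hidden:
            # P(var | mb(var)) ∝ P(var | parents) * prod over children c
            # of P(c | parents(c)), evaluated in the current state.
            post = {}
            for val in (True, False):
                state[var] = val
                p = prob(var, val, state)
                for c in children[var]:
                    p *= prob(c, state[c], state)
                post[val] = p
            state[var] = random.random() < post[True] / (post[True] + post[False])
        counts[state[X]] += 1
    return {x: c / n for x, c in counts.items()}
```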

34 Summary
Bayesian networks provide a natural representation for (causally induced) conditional independence.
Topology + CPTs = compact representation of the joint distribution.
Generally easy for domain experts to construct.
Exact inference by variable elimination:
– polytime on polytrees, NP-hard on general graphs
– space can be exponential as well
– sampling approaches can help, as they only do approximate inference
Take my Graphical Models class if you want more (much more theoretical depth).

