CHAPTER 5 Probability Theory (continued) Introduction to Bayesian Networks.


1 CHAPTER 5 Probability Theory (continued) Introduction to Bayesian Networks

2 Joint Probability

3 Marginal Probability

4 Conditional Probability

5 The Chain Rule I

6 Bayes’ Rule

7 More Bayes’ Rule

8 The Chain Rule II

9 Independence

10 Example: Independence

11 Example: Independence?

12 Conditional Independence

13

14 The Chain Rule III

15 Expectations

16 Expectations

17 Estimation

18 Estimation
Problems with maximum likelihood estimates:
- If I flip a coin once and it's heads, what's the estimate for P(heads)?
- What if I flip it 50 times with 27 heads?
- What if I flip it 10M times with 8M heads?
Basic idea:
- We have some prior expectation about parameters (here, the probability of heads).
- Given little evidence, we should skew toward the prior.
- Given lots of evidence, we should listen to the data.
How can we accomplish this? Stay tuned!
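To make the estimates above concrete, here is a minimal Python sketch (ours, not from the slides) comparing the maximum likelihood estimate with a pseudocount-smoothed one; the smoothed() function and its prior_heads/prior_strength parameters are illustrative names, and pseudocount smoothing is just one simple way to blend a prior with data:

    # Maximum likelihood vs. a prior-smoothed estimate of P(heads).
    def mle(heads, flips):
        """Maximum likelihood: the raw empirical fraction."""
        return heads / flips

    def smoothed(heads, flips, prior_heads=0.5, prior_strength=2):
        """Blend a prior with the data via pseudocounts: act as if we had
        already seen prior_strength imaginary flips, a fraction prior_heads
        of which came up heads."""
        return (heads + prior_heads * prior_strength) / (flips + prior_strength)

    for heads, flips in [(1, 1), (27, 50), (8_000_000, 10_000_000)]:
        print(f"{heads}/{flips}: MLE = {mle(heads, flips):.3f}, "
              f"smoothed = {smoothed(heads, flips):.3f}")

With one flip, the smoothed estimate stays near the prior (0.667 instead of 1.0); with 10M flips, the prior washes out and the two estimates agree, which is exactly the behavior the slide asks for.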

19 Lewis Carroll's Pillow Problem

20

21 Bayesian Networks: Big Picture
Two big problems with joint probability distributions:
- Unless there are only a few variables, the distribution is too big to represent explicitly (Why?)
- It's hard to estimate anything empirically about more than a few variables at a time (Why?)
It's also hard to compute answers to queries of the form P(y | a) (Why?)
Bayesian networks are a technique for describing complex joint distributions (models) using a collection of simple, local distributions:
- A Bayes net describes how variables interact locally.
- Local interactions chain together to give global, indirect interactions.
- For about 10 minutes, we'll be very vague about how these interactions are specified.
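To see why explicit joint tables hurt, here is a small Python sketch (ours, with made-up probabilities) that answers a query of the form P(y | a) by brute-force summation over a joint table; with N variables the table has 2^N rows, so this approach dies quickly as N grows:

    # A made-up joint distribution over three Boolean variables
    # (rain, late, traffic), stored as one row per assignment.
    joint = {
        (True,  True,  True):  0.12, (True,  True,  False): 0.02,
        (True,  False, True):  0.10, (True,  False, False): 0.06,
        (False, True,  True):  0.05, (False, True,  False): 0.10,
        (False, False, True):  0.08, (False, False, False): 0.47,
    }
    assert abs(sum(joint.values()) - 1.0) < 1e-9  # a proper distribution

    RAIN, LATE, TRAFFIC = 0, 1, 2  # index of each variable in the tuples

    def conditional(query, evidence):
        """P(query | evidence): sum the matching rows, then normalize."""
        def matches(assignment, constraints):
            return all(assignment[i] == v for i, v in constraints.items())
        numer = sum(p for a, p in joint.items() if matches(a, {**evidence, **query}))
        denom = sum(p for a, p in joint.items() if matches(a, evidence))
        return numer / denom

    print(conditional({RAIN: True}, {TRAFFIC: True}))  # ≈ 0.629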

22 Graphical Model Notation

23 Example: Coin Flips

24 Example: Traffic

25 Example: Traffic II

26 Example: Alarm Network

27 Bayesian Network Semantics

28 Example: Alarm Network

29 Size of a Bayes' Net
How big is a joint distribution over N Boolean variables? 2^N entries.
How big is a Bayes net if each node has at most k parents? About N · 2^k entries.
Both give you the power to calculate P(X1, X2, …, XN).
Bayesian networks = huge space savings!
Also easier to elicit local CPTs.
Also turns out to be faster to answer queries (future class).
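The two counts above are easy to sanity-check; a quick sketch (ours) tabulating the sizes for Boolean variables:

    # Entries in a full joint over N Boolean variables vs. a Bayes net
    # whose N nodes each have at most k parents (one CPT of ~2**k rows per node).
    def joint_size(n):
        return 2 ** n

    def bayes_net_size(n, k):
        return n * 2 ** k

    for n in (10, 20, 30):
        print(f"N={n:2d}: joint = {joint_size(n):>13,}   "
              f"Bayes net (k=3) = {bayes_net_size(n, 3):,}")
    # N=30: 1,073,741,824 joint entries vs. 240 local CPT entries.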

30 Building the (Entire) Joint

31 Example: Traffic

32 Example: Reverse Traffic

33 Causality?
When Bayes' nets reflect the true causal patterns:
- Often simpler (nodes have fewer parents)
- Often easier to think about
- Often easier to elicit from experts
BNs need not actually be causal:
- Sometimes no causal net exists over the domain (e.g., consider the variables Traffic and RoofDrips)
- We end up with arrows that reflect correlation, not causation
What do the arrows really mean?
- Topology may happen to encode causal structure
- Topology really encodes conditional independencies
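A small numeric check (ours, with made-up CPTs) of the last bullet: in the chain X → Y → Z, the topology itself asserts that X is independent of Z given Y, no matter what numbers fill the tables:

    # Chain network X -> Y -> Z with made-up conditional probability tables.
    # The factorization P(x, y, z) = P(x) P(y|x) P(z|y) is exactly what the
    # topology encodes; from it, P(z | x, y) must equal P(z | y) for every x.
    p_x = {True: 0.3, False: 0.7}
    p_y_given_x = {True: {True: 0.8, False: 0.2},
                   False: {True: 0.1, False: 0.9}}
    p_z_given_y = {True: {True: 0.6, False: 0.4},
                   False: {True: 0.25, False: 0.75}}

    def joint(x, y, z):
        return p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]

    for x in (True, False):
        for y in (True, False):
            # P(z=True | x, y) computed from the joint...
            pz_xy = joint(x, y, True) / (joint(x, y, True) + joint(x, y, False))
            # ...equals P(z=True | y), independent of x.
            assert abs(pz_xy - p_z_given_y[y][True]) < 1e-12
    print("Conditional independence X ⊥ Z | Y holds in the chain network.")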

34 Creating Bayes' Nets
So far, we've talked about how any fixed Bayes' net encodes a joint distribution.
Next: how to represent a fixed distribution as a Bayes' net.
Key ingredient: conditional independence. The exercise we did in "causal" assembly of BNs was a kind of intuitive use of conditional independence; now we have to formalize the process.
After that: how to answer queries (inference).

35 Conditional Independence

