Introduction to Bayesian Networks
Joint Probability Distribution Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states in total? How many free parameters do we need? How many table entries should we sum over to get P(R=r)?
Joint Probability Distribution: The full joint distribution is sufficient to represent the complete domain and to answer any probabilistic inference query. Problems (n = number of random variables, d = number of values per variable): Space complexity: a full joint distribution requires storing O(d^n) numbers. Inference complexity: computing a query requires O(d^n) steps. Acquisition problem: who is going to define all the probabilities?
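The counting questions above can be checked with a few lines of Python. This is a minimal sketch for the five binary variables B, E, R, A, N; the variable names in the code are illustrative only.

```python
# Counting for a full joint table over n variables with d values each.
n, d = 5, 2  # five binary variables: B, E, R, A, N

num_states = d ** n            # rows in the full joint table
free_params = num_states - 1   # one entry is fixed because the table sums to 1
# Getting P(R = r) means summing out the other n - 1 variables.
entries_to_sum = d ** (n - 1)

print(num_states, free_params, entries_to_sum)  # 32 31 16
```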
Bayesian Networks: A Bayesian network is a graph in which: (1) a set of random variables makes up the nodes of the network; (2) a set of directed links (arrows) connects pairs of nodes; (3) each node has a conditional probability table that quantifies the effects its parents have on the node; (4) the graph is a directed acyclic graph (DAG), i.e. it has no directed cycles. Figure: a Bayesian network modeling the joint probability distribution P(B, E, R, A, N), with node tables P(B), P(E), P(A|B,E), P(R|E), P(N|A).
Bayesian Networks: Conditional probability tables (CPTs): P(B), P(E), P(A|B,E), P(R|E), P(N|A).
Independences in BNs: Three basic independence structures: indirect cause, common cause, and common effect.
Independences in BNs: Indirect cause: Burglary is independent of NeighborCall given Alarm: P(N|A, B) = P(N|A), or equivalently P(N, B|A) = P(N|A) P(B|A).
Independences in BNs: Common cause: Alarm is independent of RadioAnnounce given Earthquake: P(A|E, R) = P(A|E), or equivalently P(A, R|E) = P(A|E) P(R|E).
Independences in BNs: Common effect: Burglary is independent of Earthquake when Alarm is not known, but Burglary and Earthquake become dependent given Alarm.
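Path blocking: A path between X and Y through an intermediate node C is blocked by the evidence set Z in these cases: linear substructure (X → C → Y): C is in Z; common cause structure (X ← C → Y): C is in Z; common effect structure (X → C ← Y): neither C nor any of its descendants is in Z.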
Independences in BNs: Which of these hold? Earthquake ⊥ Burglary | NeighborCall? Burglary ⊥ RadioAnnounce | Earthquake? Burglary ⊥ RadioAnnounce | NeighborCall?
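The three questions above can be checked mechanically. The sketch below does not apply the path-blocking rules one path at a time. It uses an equivalent, standard test instead: restrict the graph to the ancestors of the variables involved, moralize it (connect co-parents and drop edge directions), remove the evidence nodes, and test connectivity. The names `PARENTS`, `ancestors`, and `d_separated` are illustrative, and the graph is the five-node burglary network described earlier.

```python
from collections import deque

# Parents of each node in the burglary network.
PARENTS = {
    "B": [], "E": [],
    "A": ["B", "E"],
    "R": ["E"],
    "N": ["A"],
}

def ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return seen

def d_separated(x, y, z, parents=PARENTS):
    """True if x and y are d-separated given the evidence set z."""
    keep = ancestors({x, y} | set(z), parents)
    # Moralize: link each node to its parents and marry co-parents.
    adj = {v: set() for v in keep}
    for v in keep:
        for p in parents[v]:
            adj[v].add(p)
            adj[p].add(v)
            for q in parents[v]:
                if p != q:
                    adj[p].add(q)
    # Breadth-first search from x to y that never enters an evidence node.
    blocked = set(z)
    queue, seen = deque([x]), {x}
    while queue:
        v = queue.popleft()
        if v == y:
            return False
        for w in adj[v]:
            if w not in seen and w not in blocked:
                seen.add(w)
                queue.append(w)
    return True

print(d_separated("E", "B", {"N"}))  # False
print(d_separated("B", "R", {"E"}))  # True
print(d_separated("B", "R", {"N"}))  # False
```

Observing NeighborCall activates the common effect Alarm through its descendant, so the first and third independences fail. Observing Earthquake blocks the only path from Burglary to RadioAnnounce, so the second holds.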
Markov Assumption: In a Bayesian network, each variable is independent of its non-descendants given its parents. Examples: N ⊥ B, E, R | A; R ⊥ B, A, N | E; …
Full Joint Distribution in BNs: P(B, E, R, A, N) = P(B) P(E) P(A|B,E) P(R|E) P(N|A)
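As a rough illustration of this factorization, the sketch below builds the joint distribution from the five tables and checks that it sums to 1. All numeric CPT values are hypothetical, chosen only for the example; the helper names `bern` and `joint` are likewise illustrative.

```python
from itertools import product

# Hypothetical CPT values, for illustration only.
P_B = 0.01                                  # P(B = true)
P_E = 0.02                                  # P(E = true)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A = true | B, E)
P_R = {True: 0.9, False: 0.0001}            # P(R = true | E)
P_N = {True: 0.9, False: 0.05}              # P(N = true | A)

def bern(p_true, value):
    """Probability of a binary value given P(value = true)."""
    return p_true if value else 1.0 - p_true

def joint(b, e, r, a, n):
    """P(B, E, R, A, N) = P(B) P(E) P(A|B,E) P(R|E) P(N|A)."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_R[e], r) * bern(P_N[a], n))

# Sum over all 32 states; should be 1 up to floating-point rounding.
total = sum(joint(*s) for s in product([True, False], repeat=5))
# One free parameter per row of each table.
num_params = 1 + 1 + len(P_A) + len(P_R) + len(P_N)
print(total, num_params)
```

The factored form needs 10 numbers, compared with 31 free parameters for the full joint table over the same five binary variables.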
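Chain Rule: P(X1, ..., Xn) = P(X1) P(X2|X1) ... P(Xn|X1, ..., Xn-1). In a Bayesian network, with the variables ordered so that parents come before children, each factor reduces to P(Xi | Parents(Xi)).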
Types of Inference Tasks: Compute the probability or most likely state of a set of query variables given observed values of some other variables. Task types: belief updating; most probable explanation (MPE) queries; maximum a posteriori (MAP) queries.
Inference Direction Types of inference by reasoning direction
Examples: Diagnostic inference: from effects to causes. Causal inference: from causes to effects. Intercausal inference: between causes of a common effect. Mixed inference: a combination of two or more of the above.
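The reasoning directions can be made concrete with inference by enumeration over the burglary network. This is a minimal sketch and not an efficient algorithm: it sums the full joint distribution directly. The CPT values are hypothetical, and `query` is an illustrative helper, not part of any library.

```python
from itertools import product

VARS = ("B", "E", "R", "A", "N")

# Hypothetical CPT values, for illustration only.
P_B, P_E = 0.01, 0.02
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A = true | B, E)
P_R = {True: 0.9, False: 0.0001}                    # P(R = true | E)
P_N = {True: 0.9, False: 0.05}                      # P(N = true | A)

def bern(p_true, value):
    return p_true if value else 1.0 - p_true

def joint(s):
    """Joint probability of a full assignment s (dict: name -> bool)."""
    return (bern(P_B, s["B"]) * bern(P_E, s["E"])
            * bern(P_A[(s["B"], s["E"])], s["A"])
            * bern(P_R[s["E"]], s["R"]) * bern(P_N[s["A"]], s["N"]))

def query(target, evidence):
    """P(target = true | evidence), computed by summing the full joint."""
    num = den = 0.0
    for values in product([True, False], repeat=len(VARS)):
        s = dict(zip(VARS, values))
        if all(s[k] == v for k, v in evidence.items()):
            p = joint(s)
            den += p
            if s[target]:
                num += p
    return num / den

diagnostic = query("B", {"N": True})                 # effect to cause
causal = query("N", {"B": True})                     # cause to effect
given_alarm = query("B", {"A": True})
explained_away = query("B", {"A": True, "E": True})  # intercausal
mixed = query("A", {"N": True, "E": False})          # diagnostic + causal

print(diagnostic, causal, given_alarm, explained_away, mixed)
```

With these numbers, learning that an earthquake occurred lowers the probability of a burglary given the alarm, which is the "explaining away" effect behind intercausal inference.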
Influence diagram: Add action nodes and utility nodes to a Bayesian network to enable rational decision making. Algorithm: for each value of the action node, compute the expected value of the utility node given the action and the evidence; then return the action with maximum expected utility (MEU).
Value of information: Idea: compute the value of acquiring each possible piece of evidence. This can be done directly from the influence diagram.
Example: buying oil drilling rights. There are two blocks, A and B; exactly one has oil, worth k. The prior probabilities are 0.5 each and the outcomes are mutually exclusive. The current price of each block is k/2. A "consultant" offers an accurate survey of A. What is a fair price?
Solution: compute the expected value of information = expected value of the best action given the information minus expected value of the best action without the information. The survey may say "oil in A" or "no oil in A", with probability 0.5 each (given!). Value = [0.5 * value of "buy A" given "oil in A" + 0.5 * value of "buy B" given "no oil in A"] - 0 = (0.5 * k/2) + (0.5 * k/2) - 0 = k/2.
Influence diagram nodes: Survey, Oil, Buy, U.
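The oil example can be reproduced as a small maximum-expected-utility calculation. This is a sketch under stated assumptions: the function names are illustrative, and the code adds a "buy nothing" action with value 0, which the slide does not mention and which does not change the result.

```python
def best_value(p_oil_a, k, price):
    """Expected net value of the best action given P(oil in A)."""
    buy_a = p_oil_a * k - price
    buy_b = (1 - p_oil_a) * k - price
    # Assumption: buying nothing (value 0.0) is also allowed.
    return max(buy_a, buy_b, 0.0)

def value_of_information(k):
    """Expected value of an accurate survey of block A."""
    price = k / 2
    without_info = best_value(0.5, k, price)
    # The survey moves P(oil in A) to 1 or 0, each with probability 0.5.
    with_info = (0.5 * best_value(1.0, k, price)
                 + 0.5 * best_value(0.0, k, price))
    return with_info - without_info

print(value_of_information(100.0))  # 50.0
```

For k = 100 the survey is worth 50, matching the k/2 result derived above.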