Bayesian Networks for Data Mining
David Heckerman, Microsoft Research
(Data Mining and Knowledge Discovery 1 (1997))
The Bayesian approach
Question #1: What is Bayesian probability?
A person's degree of belief in a certain event.
Personal (subjective).
Example: your degree of belief that the coin will land heads.
The Classical approach
Probability is a physical property of the world.
Measured through repeated trials (frequency).
Example: the probability that a coin will land heads.
Question #2: What are the advantages and disadvantages of the Bayesian and classical interpretations of probability?
Bayesian probability:
+ Reflects an expert's knowledge.
+ Complies with the rules of probability.
- Arbitrary.
Classical probability:
+ Objective, unbiased.
- Not available in most situations.
Bayes' Theorem
Posterior = (likelihood × prior) / evidence
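The formula above can be checked on a small discrete case. The sketch below is only an illustration: the two hypotheses ("fair" and "biased" coin), their prior weights, and the helper name `posterior` are assumptions made up for this example, not part of the slides.

```python
def posterior(prior, likelihood):
    """Apply Bayes' theorem over a discrete set of hypotheses.

    prior[h]      = P(h)
    likelihood[h] = P(evidence | h)
    returns         P(h | evidence) for every hypothesis h
    """
    # Numerator of Bayes' theorem: likelihood x prior, per hypothesis.
    joint = {h: likelihood[h] * prior[h] for h in prior}
    # Denominator: the evidence P(e), obtained by summing the numerators.
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


# Hypothetical numbers: two candidate coins, equally believable up front.
prior = {"fair": 0.5, "biased": 0.5}
# Probability of observing heads under each hypothesis (assumed values).
likelihood_heads = {"fair": 0.5, "biased": 0.8}

post = posterior(prior, likelihood_heads)
print(post)
```

After seeing one head, belief shifts toward the biased coin, because that hypothesis explains the observation better.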
Bayesian Networks
A graphical model that encodes the joint probability distribution (JPD) for a set of variables X.
Its structure is a directed acyclic graph (DAG): the edges are directed and form no cycles.
Each node represents one variable and carries a local probability distribution (LPD) for that variable given its parents.
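A Bayesian network encodes the JPD as the product of its LPDs, one factor per variable conditioned on that variable's parents. The sketch below shows this for a hypothetical three-node network (Rain -> WetGrass <- Sprinkler). The structure, the probabilities, and the names `parents`, `cpt`, and `joint_prob` are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical DAG: each variable maps to the tuple of its parents.
parents = {
    "Rain": (),
    "Sprinkler": (),
    "WetGrass": ("Rain", "Sprinkler"),
}

# cpt[var][parent_values] = P(var = True | parents = parent_values).
# All numbers are made up for the example.
cpt = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    },
}


def local_prob(var, value, assignment):
    """LPD lookup: P(var = value | parents of var as set in assignment)."""
    key = tuple(assignment[p] for p in parents[var])
    p_true = cpt[var][key]
    return p_true if value else 1.0 - p_true


def joint_prob(assignment):
    """JPD of a full assignment: the product of one LPD per variable."""
    result = 1.0
    for var in parents:
        result *= local_prob(var, assignment[var], assignment)
    return result


variables = list(parents)
# Sanity check: the joint over all 2^3 assignments should sum to 1.
total = sum(
    joint_prob(dict(zip(variables, values)))
    for values in product([True, False], repeat=len(variables))
)
example = joint_prob({"Rain": True, "Sprinkler": False, "WetGrass": True})
print(total, example)
```

The three tables hold 1 + 1 + 4 = 6 numbers, while an unstructured joint table over three binary variables needs 7 free parameters. The saving grows quickly with more variables and sparse graphs.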
Bayesian Networks
Nodes
– Parents
– Children
Conditional probability tables
Construction
Inference
The computation of a probability of interest given a model is known as probabilistic inference.
P(X | e) = P(X, e) / P(e) = c P(X, e), where c = 1 / P(e) is a normalizing constant.
Example on board.
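As a stand-in for the board example, which is not recorded in these slides, here is a minimal sketch of inference by enumeration: sum the joint over every variable that is neither the query nor evidence, then normalize. The three-variable model, its numbers, and the function names `joint` and `infer` are assumptions for illustration. Enumeration is exponential in the number of variables and is shown only because it mirrors the formula directly.

```python
from itertools import product

VARS = ("Rain", "Sprinkler", "WetGrass")


def joint(rain, sprinkler, wet):
    """Joint probability of one full assignment (hypothetical numbers)."""
    p_rain = 0.2 if rain else 0.8
    p_sprinkler = 0.1 if sprinkler else 0.9
    p_wet_true = {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    }[(rain, sprinkler)]
    p_wet = p_wet_true if wet else 1.0 - p_wet_true
    return p_rain * p_sprinkler * p_wet


def infer(query, evidence):
    """Return P(query | evidence) as a dict {True: p, False: 1 - p}."""
    unnormalized = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(VARS)):
        world = dict(zip(VARS, values))
        # Keep only assignments consistent with the evidence e.
        if all(world[name] == val for name, val in evidence.items()):
            # Accumulate P(X, e), summing out the hidden variables.
            unnormalized[world[query]] += joint(*values)
    # c = 1 / P(e); this raises ZeroDivisionError if P(e) is zero.
    c = 1.0 / sum(unnormalized.values())
    return {x: c * p for x, p in unnormalized.items()}


posterior_rain = infer("Rain", {"WetGrass": True})
print(posterior_rain)
```

Observing wet grass raises the belief in rain well above its prior of 0.2, because rain is the most likely explanation under the assumed numbers.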
Learning
Learning from data
– Refine the structure and LPDs of a BN
– Combine prior knowledge with data
Result: improved knowledge
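To make "combine prior knowledge with data" concrete, the sketch below updates the LPD of a single binary variable (the coin from the earlier slides) using a Beta prior, a standard conjugate choice. It covers parameter learning only, not structure learning. The prior pseudo-counts, the observed counts, and the function names are assumptions for illustration.

```python
def update_beta(alpha, beta, heads, tails):
    """Posterior Beta parameters after observing coin flips.

    alpha and beta act as pseudo-counts of heads and tails that encode
    prior knowledge; the observed counts are simply added to them.
    """
    return alpha + heads, beta + tails


def mean(alpha, beta):
    """Expected probability of heads under a Beta(alpha, beta) belief."""
    return alpha / (alpha + beta)


# Hypothetical prior: a mild belief that the coin is roughly fair.
prior = (2.0, 2.0)
# Hypothetical data: 7 heads and 3 tails.
post = update_beta(*prior, heads=7, tails=3)

print(mean(*prior), mean(*post))
```

The posterior mean lies between the prior mean (0.5) and the raw frequency (0.7). With more data the estimate moves toward the observed frequency, and with little data the prior dominates.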
Question #3: Mention at least 3 advantages of Bayesian networks for data analysis. Explain each one.
– Handle incomplete data sets
– Learn about causal relationships
– Combine domain knowledge with data
– Avoid overfitting