CHAPTER 3: BAYESIAN DECISION THEORY. Making Decisions Under Uncertainty. Based on E. Alpaydın, Introduction to Machine Learning, © The MIT Press, 2004 (V1.1).


1 CHAPTER 3: BAYESIAN DECISION THEORY

2 Making Decisions Under Uncertainty
- Use Bayes' rule to calculate the probabilities of the classes
- Make a rational decision among multiple actions to minimize expected risk
- Learn association rules from data

3 Unobservable Variables
- Tossing a coin is, as far as we can observe, a completely random process: we cannot predict the outcome
- We can only talk about the probability that the outcome of the next toss will be heads or tails
- If we had access to extra knowledge (the exact composition of the coin, its initial position, the force applied, etc.), the exact outcome of the toss could be predicted

4 Unobservable Variables
- The unobservable variables are the extra knowledge that we do not have access to
- Coin toss: the only observable variable is the outcome of the toss
- x = f(z), where z denotes the unobservables, x the observable, and f is a deterministic function

5 Bernoulli Random Variable
- The result of tossing a coin is in {Heads, Tails}
- Define a random variable X in {1, 0}
- p_o is the probability of heads: P(X = 1) = p_o and P(X = 0) = 1 - P(X = 1) = 1 - p_o
- Suppose we are asked to predict the next toss
- If we know p_o, we predict heads when p_o > 1/2 and tails otherwise
- Choosing the more probable case minimizes the probability of error, which is 1 - p_o when we predict heads
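
A minimal sketch of this decision rule; the value p_o = 0.6 below is an arbitrary example, not taken from the slides:

```python
# Minimal sketch of the Bernoulli decision rule: predict the more probable
# outcome and report the resulting probability of error.
def predict_toss(p_heads):
    """Return the prediction and its probability of error, given p_o = P(X = 1)."""
    if p_heads > 0.5:
        return "heads", 1.0 - p_heads
    return "tails", p_heads

prediction, p_error = predict_toss(0.6)   # p_o = 0.6 is an assumed example value
print(prediction, p_error)                # heads 0.4
```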

6 Estimation
- What if we do not know P(X)?
- We want to estimate it from a given sample X
- This is the realm of statistics
- The sample X is generated from the probability distribution of the observables x^t
- We want to build an approximator p(x) to that distribution using the sample X
- In the coin toss example, the sample is the outcomes of the past N tosses, and the distribution is characterized by the single parameter p_o

7 Parameter Estimation
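
The estimator itself is not reproduced in this transcript; as a sketch, the usual estimate of p_o is the proportion of heads in the sample (the sample below is invented):

```python
# Estimate p_o as the fraction of heads observed in the past N tosses.
sample = [1, 0, 1, 1, 0, 1, 0, 1]    # invented outcomes, 1 = heads, 0 = tails
p_o_hat = sum(sample) / len(sample)  # sample proportion of heads
print(p_o_hat)                       # 0.625
```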

8 Classification
- Credit scoring: two classes, high-risk and low-risk customers
- Decide based on observable information: income and savings
- We have reason to believe that these two variables give us an idea about the credibility of a customer
- Represent them by two random variables X_1 and X_2
- We cannot observe a customer's intentions and moral codes
- We can observe the credibility of past customers
- Credibility is a Bernoulli random variable C conditioned on X = [X_1, X_2]^T
- Suppose we know P(C | X_1, X_2)

9 Classification
- Assume we know P(C | X_1, X_2)
- A new application arrives with X_1 = x_1, X_2 = x_2; the task is to decide high-risk or low-risk based on P(C | x_1, x_2)

10 Classification
- Similar to the coin toss, but here C is conditioned on two observable variables x = [x_1, x_2]^T
- The problem: calculate P(C | x)
- Use Bayes' rule

11 Conditional Probability
- Probability of A (the point falls inside A) given that we know B happened (the point is inside B)
- P(A | B) = P(A ∩ B) / P(B)

12 Bayes' Rule
- P(A | B) = P(A ∩ B) / P(B)
- P(B | A) = P(A ∩ B) / P(A), so P(A ∩ B) = P(B | A) P(A)
- Therefore P(A | B) = P(B | A) P(A) / P(B)
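
A quick numerical check of this identity; the probabilities below are arbitrary illustrative values:

```python
# Verify that P(A|B) computed from the definition equals P(B|A) * P(A) / P(B).
p_a = 0.3            # assumed P(A)
p_b = 0.4            # assumed P(B)
p_b_given_a = 0.5    # assumed P(B|A)

p_a_and_b = p_b_given_a * p_a    # P(A and B) = 0.15
p_a_given_b = p_a_and_b / p_b    # definition of conditional probability = 0.375
assert abs(p_a_given_b - p_b_given_a * p_a / p_b) < 1e-12
print(p_a_given_b)               # 0.375
```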

13 Bayes' Rule
P(C | x) = p(x | C) P(C) / p(x)   (posterior = likelihood × prior / evidence)
- Prior P(C = 1): the probability that a customer is high-risk, regardless of x
- It is the knowledge we have about the value of C before looking at the observables x

14 Bayes' Rule
P(C | x) = p(x | C) P(C) / p(x)   (posterior = likelihood × prior / evidence)
- Likelihood p(x | C): the probability that an event belonging to class C has the observation x
- p(x_1, x_2 | C = 1) is the probability that a high-risk customer has X_1 = x_1 and X_2 = x_2

15 Bayes' Rule
P(C | x) = p(x | C) P(C) / p(x)   (posterior = likelihood × prior / evidence)
- Evidence p(x): the probability that the observation x is seen at all, regardless of whether the example is positive or negative

16 Bayes' Rule
P(C | x) = p(x | C) P(C) / p(x)   (posterior = likelihood × prior / evidence)

17 Bayes' Rule for Classification
- Assume we know the prior, the evidence, and the likelihood
- We will learn how to estimate them from data later
- Plug them into Bayes' formula to obtain P(C | x)
- Choose C = 1 if P(C = 1 | x) > P(C = 0 | x), and C = 0 otherwise
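
A minimal sketch of this two-class rule for a single observation x; the prior and likelihood values are invented for illustration:

```python
# Two-class Bayes decision at one observation x, using assumed numbers.
prior = {1: 0.3, 0: 0.7}           # assumed P(C=1), P(C=0)
likelihood = {1: 0.08, 0: 0.02}    # assumed p(x|C=1), p(x|C=0) at this x

evidence = sum(likelihood[c] * prior[c] for c in (0, 1))              # p(x)
posterior = {c: likelihood[c] * prior[c] / evidence for c in (0, 1)}  # Bayes' rule

decision = 1 if posterior[1] > posterior[0] else 0
print(posterior, decision)         # posteriors sum to 1; here class 1 is chosen
```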

18 Bayes' Rule: K > 2 Classes
P(C_i | x) = p(x | C_i) P(C_i) / p(x), with p(x) = Σ_k p(x | C_k) P(C_k)

19 Bayes' Rule
- When deciding for a specific input x, the evidence p(x) is the same for all classes
- So we do not need it to compare posteriors: choose the class that maximizes p(x | C_i) P(C_i)
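
A sketch of the K-class decision without normalizing by the evidence; the class priors and likelihood values are invented:

```python
# K-class decision: argmax over prior * likelihood; the common factor p(x) cancels.
priors = [0.5, 0.3, 0.2]           # assumed P(C_k) for K = 3 classes
likelihoods = [0.01, 0.05, 0.04]   # assumed p(x|C_k) at the query x

scores = [p * l for p, l in zip(priors, likelihoods)]
best_class = max(range(len(scores)), key=lambda k: scores[k])
print(scores, best_class)          # [0.005, 0.015, 0.008] -> class 1
```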

20 Losses and Risks
- Decisions and errors are not equally good or costly
- Actions: α_i is the action of assigning the input to class i
- Loss of taking action α_i when the true state is C_k: λ_ik
- Expected risk (Duda and Hart, 1973): R(α_i | x) = Σ_k λ_ik P(C_k | x); choose the action with minimum risk
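
A sketch of choosing the minimum-risk action for one input; the loss matrix and posterior values are invented:

```python
# Expected risk R(a_i|x) = sum_k lambda_ik * P(C_k|x); pick the action with minimum risk.
loss = [[0.0, 10.0],    # assumed loss matrix: row i = action, column k = true class
        [1.0,  0.0]]
posterior = [0.8, 0.2]  # assumed P(C_0|x), P(C_1|x)

risk = [sum(loss[i][k] * posterior[k] for k in range(2)) for i in range(2)]
best_action = min(range(2), key=lambda i: risk[i])
print(risk, best_action)  # [2.0, 0.8] -> action 1, despite class 0 being more probable
```

Note how the asymmetric loss overrides the most probable class; with 0/1 loss the two decisions would coincide.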

21 Losses and Risks: 0/1 Loss
- λ_ik = 0 if i = k and 1 otherwise (all errors are equally costly)
- Then R(α_i | x) = Σ_{k ≠ i} P(C_k | x) = 1 - P(C_i | x)
- For minimum risk, choose the most probable class

22 Losses and Risks: Reject
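
The equations for this slide are not in the transcript. In the standard formulation, a reject action with loss 0 < λ < 1 is added (correct decisions cost 0, errors cost 1), and we choose the most probable class only if its posterior exceeds 1 - λ, rejecting otherwise; a sketch under that assumption:

```python
# Reject option (assumed standard formulation): choose the most probable class
# only if its posterior exceeds 1 - lambda_reject; otherwise reject.
def decide_with_reject(posterior, lambda_reject=0.2):
    best = max(range(len(posterior)), key=lambda k: posterior[k])
    if posterior[best] > 1.0 - lambda_reject:
        return best
    return "reject"

print(decide_with_reject([0.55, 0.45]))  # 'reject' (0.55 is not above 0.8)
print(decide_with_reject([0.90, 0.10]))  # 0
```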

23 Discriminant Functions
- Define a function g_i(x) for each class: the "goodness" of selecting class C_i given the observables x
- Choose C_i if g_i(x) = max_k g_k(x)
- Setting g_i(x) = -R(α_i | x), the maximum discriminant corresponds to the minimum conditional risk

24 Decision Regions
The discriminant functions divide the feature space into K decision regions R_1, ..., R_K, where R_i = {x : g_i(x) = max_k g_k(x)}

25 K = 2 Classes
- Define a single discriminant g(x) = g_1(x) - g_2(x) and choose C_1 if g(x) > 0, C_2 otherwise
- Log odds: log [P(C_1 | x) / P(C_2 | x)]
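
A sketch of the two-class decision via the log odds; the posterior value is invented:

```python
import math

# Decide between C_1 and C_2 using the sign of the log odds of the posteriors.
p_c1_given_x = 0.7                    # assumed P(C_1|x)
p_c2_given_x = 1.0 - p_c1_given_x     # P(C_2|x)

log_odds = math.log(p_c1_given_x / p_c2_given_x)
choice = "C1" if log_odds > 0 else "C2"
print(round(log_odds, 3), choice)     # 0.847 C1
```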

26 Correlation and Causality

27 Interaction Between Variables
- We have many variables that we want to represent and analyze
- We are interested in the structure of dependency among them, and in conditional and marginal probabilities
- Use a graphical representation: nodes (events) and arcs (dependencies)
- Numbers attached to the nodes and arcs specify the conditional probabilities

28 Causes and Bayes' Rule
Diagnostic inference: knowing that the grass is wet, what is the probability that rain is the cause?
P(R | W) = P(W | R) P(R) / P(W)   (the causal direction is rain → wet grass; Bayes' rule inverts the arc for the diagnostic direction)
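
A sketch of this diagnostic inference for the two-node network rain → wet grass; the probability table is not in the transcript, so the values below (P(R) = 0.4, P(W | R) = 0.9, P(W | ~R) = 0.2) are assumed for illustration:

```python
# Diagnostic inference in the single-cause network rain -> wet grass.
p_rain = 0.4                 # assumed P(R)
p_wet_given_rain = 0.9       # assumed P(W|R)
p_wet_given_no_rain = 0.2    # assumed P(W|~R)

p_wet = p_wet_given_rain * p_rain + p_wet_given_no_rain * (1 - p_rain)  # P(W) = 0.48
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet                    # Bayes' rule
print(round(p_rain_given_wet, 2))                                       # 0.75
```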

29 Causal vs Diagnostic Inference
Causal inference: if the sprinkler is on, what is the probability that the grass is wet?
P(W | S) = P(W | R, S) P(R | S) + P(W | ~R, S) P(~R | S)
         = P(W | R, S) P(R) + P(W | ~R, S) P(~R)   (R and S are independent)
         = 0.95 × 0.4 + 0.9 × 0.6 = 0.92

30 Causal vs Diagnostic Inference
Diagnostic inference: if the grass is wet, what is the probability that the sprinkler is on?
P(S | W) = 0.35 > 0.2 = P(S), so observing wet grass raises the probability that the sprinkler is on
P(S | R, W) = 0.21
Explaining away: knowing that it has rained decreases the probability that the sprinkler is on.
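
A sketch that reproduces the numbers quoted on slides 29-30. The full table P(W | R, S) is not in the transcript, so the entries not quoted above (P(W | R, ~S) = 0.90, P(W | ~R, ~S) = 0.10) and the priors P(R) = 0.4, P(S) = 0.2 are assumptions chosen to be consistent with the quoted results:

```python
# Sprinkler/rain/wet-grass network: reproduce P(S|W) ~ 0.35 and P(S|R,W) ~ 0.21.
p_r, p_s = 0.4, 0.2                                   # assumed priors
p_w = {(True, True): 0.95, (True, False): 0.90,       # P(W=1 | rain, sprinkler)
       (False, True): 0.90, (False, False): 0.10}     # entries partly assumed

def joint(r, s, w):
    """P(R=r, S=s, W=w) under the network factorization P(R) P(S) P(W|R,S)."""
    pr = p_r if r else 1 - p_r
    ps = p_s if s else 1 - p_s
    pw = p_w[(r, s)] if w else 1 - p_w[(r, s)]
    return pr * ps * pw

p_sw = sum(joint(r, True, True) for r in (True, False))                      # P(S, W)
p_w1 = sum(joint(r, s, True) for r in (True, False) for s in (True, False))  # P(W)
print(round(p_sw / p_w1, 2))                          # P(S|W)   = 0.35

p_rw = sum(joint(True, s, True) for s in (True, False))                      # P(R, W)
print(round(joint(True, True, True) / p_rw, 2))       # P(S|R,W) = 0.21
```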

31 Bayesian Networks: Causes

32 Bayesian Nets: Local Structure
- We only need to specify the interactions between neighboring events: each node's probability is conditioned only on its parents in the graph

33 Bayesian Networks: Classification
Bayes' rule inverts the arc from class to input: diagnostic inference gives P(C | x) = p(x | C) P(C) / p(x)

34 Naive Bayes' Classifier
Given the class C, the inputs x_j are independent:
p(x | C) = p(x_1 | C) p(x_2 | C) ... p(x_d | C)
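
A minimal sketch of a naive Bayes decision with binary inputs; all probability values are invented for illustration:

```python
# Naive Bayes with d = 2 binary inputs: p(x|C) factorizes across features.
priors = {0: 0.6, 1: 0.4}                       # assumed P(C)
p_feature_on = {0: [0.2, 0.5], 1: [0.7, 0.9]}   # assumed P(x_j = 1 | C) per feature j

def class_scores(x):
    """Unnormalized posteriors P(C) * prod_j p(x_j|C) for a binary feature vector x."""
    scores = {}
    for c, prior in priors.items():
        likelihood = 1.0
        for j, xj in enumerate(x):
            pj = p_feature_on[c][j]
            likelihood *= pj if xj == 1 else 1 - pj
        scores[c] = prior * likelihood
    return scores

scores = class_scores([1, 1])
print(scores, max(scores, key=scores.get))      # class 1 wins for x = [1, 1]
```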

35 Naive Bayes' Classifier
- Why "naive"? It ignores any dependency among the inputs
- Example: a dependency between "baby food" and "diapers"
- A hidden variable ("has children") could explain such a dependency
- Insert the corresponding nodes/arcs and estimate their values from data

36 Association Rules
- Association rule: X → Y
- Support(X → Y): P(X, Y), the fraction of customers who bought both X and Y
- Confidence(X → Y): P(Y | X) = P(X, Y) / P(X)
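
A sketch computing support and confidence from a small, invented transaction list; it also previews the next slide's point that high confidence can come with negligible support:

```python
# Support and confidence of the rule chips -> beer over invented market-basket data.
transactions = [
    {"chips", "beer"},
    {"beer", "diapers"},
    {"beer"},
    {"milk", "diapers"},
]

def support(x, y):
    """P(X, Y): fraction of all transactions containing both items."""
    return sum(1 for t in transactions if x in t and y in t) / len(transactions)

def confidence(x, y):
    """P(Y|X): among transactions containing X, the fraction that also contain Y."""
    with_x = [t for t in transactions if x in t]
    return sum(1 for t in with_x if y in t) / len(with_x)

print(support("chips", "beer"))     # 0.25
print(confidence("chips", "beer"))  # 1.0, but based on a single chips purchase
```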

37 Association Rules
- Suppose only one customer bought chips, and that same customer also bought beer
- Then the confidence of the rule chips → beer is P(beer | chips) = 1
- But its support is tiny; support is what indicates statistical significance

