Naïve Bayesian Classifiers
Before getting to Naïve Bayesian Classifiers, let’s first go over some basic probability theory.
$p(C_k \mid A)$ is the conditional probability of event $C_k$ happening given that event $A$ has occurred.
We can express the conditional probability $p(C_k \mid A)$ as follows:
– $p(C_k \mid A) = \dfrac{p(C_k \cap A)}{p(A)}$, or
– $p(C_k \mid A) = \dfrac{\#\text{ of times } C_k \text{ and } A \text{ occur together}}{\#\text{ of times } A \text{ occurs}}$

Naïve Bayesian Classifiers
What is $p(\text{PlayTennis} = \text{Yes} \mid \text{Outlook} = \text{Rain})$?

Naïve Bayesian Classifiers
What is $p(\text{Humidity} = \text{High} \mid \text{Wind} = \text{Weak})$?

Naïve Bayesian Classifiers
What is $p(\text{PlayTennis} = \text{Yes} \mid \text{Temp} = \text{Hot}, \text{Humidity} = \text{High})$?
How many conditional probabilities exist in this dataset?

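The counting definition of conditional probability can be sketched in a few lines of Python. This is only an illustration: the table the slides refer to is not reproduced in the text, so the sketch assumes the classic 14-row PlayTennis table from Mitchell's "Machine Learning", which may differ from the one on the slide. The names `ROWS`, `DATA`, `matches` and `cond_prob` are hypothetical helpers introduced here.

```python
from fractions import Fraction

# ASSUMPTION: the classic 14-row PlayTennis table (Mitchell, "Machine Learning").
# The table shown on the slides is not part of the text and may differ.
COLUMNS = ("Outlook", "Temp", "Humidity", "Wind", "PlayTennis")
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
DATA = [dict(zip(COLUMNS, row)) for row in ROWS]


def matches(record, condition):
    """True if the record agrees with every attribute=value pair in condition."""
    return all(record[attr] == value for attr, value in condition.items())


def cond_prob(event, given, data=DATA):
    """Estimate p(event | given) as (# event and given) / (# given)."""
    n_given = sum(1 for r in data if matches(r, given))
    if n_given == 0:
        raise ValueError("the conditioning event never occurs in the data")
    n_both = sum(1 for r in data if matches(r, given) and matches(r, event))
    return Fraction(n_both, n_given)


print(cond_prob({"PlayTennis": "Yes"}, {"Outlook": "Rain"}))                 # 3/5
print(cond_prob({"Humidity": "High"}, {"Wind": "Weak"}))                     # 1/2
print(cond_prob({"PlayTennis": "Yes"}, {"Temp": "Hot", "Humidity": "High"})) # 1/3
```

Exact fractions are used instead of floats so that each result can be read directly as "matching rows over conditioning rows". The printed values hold only for the assumed table.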
Naïve Bayesian Classifiers
Given $p(C_k \mid A) = \dfrac{p(C_k \cap A)}{p(A)}$, we know that
$p(A \mid C_k) = \dfrac{p(A \cap C_k)}{p(C_k)}$, and therefore $p(A \cap C_k) = p(A \mid C_k)\, p(C_k)$.
Now, since $p(A \cap C_k) = p(C_k \cap A)$,
– $p(C_k \mid A) = \dfrac{p(A \mid C_k)\, p(C_k)}{p(A)}$
This is known as the Bayesian Rule (Bayes' rule).

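Naïve Bayesian Classifiers
Some other useful equations, valid when the events $B_i$ are mutually exclusive and cover all possibilities, are:
– $p(A) = \sum_i p(A \cap B_i)$,
– $p(A) = \sum_i p(A \mid B_i)\, p(B_i)$
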
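Naïve Bayesian Classifiers
What is $\sum_i p(\text{PlayTennis} = \text{Yes} \cap \text{Humidity}_i)$?
$= p(\text{PlayTennis} = \text{Yes} \cap \text{Humidity} = \text{High}) + p(\text{PlayTennis} = \text{Yes} \cap \text{Humidity} = \text{Normal})$
$= p(\text{PlayTennis} = \text{Yes})$, since High and Normal are the only two values Humidity takes here.
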
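The Bayesian Rule and the two summation identities can be checked numerically. The sketch below is a minimal illustration that assumes joint counts of (Humidity, PlayTennis) taken from the classic 14-row PlayTennis table; the slides' own table is not in the text and may differ. The names `JOINT`, `p_joint`, `p_humidity` and `p_play_given_humidity` are hypothetical.

```python
from fractions import Fraction

# ASSUMPTION: joint counts of (Humidity, PlayTennis) from the classic 14-row
# PlayTennis table; the table on the slides may differ.
JOINT = {
    ("High", "Yes"): 3,
    ("High", "No"): 4,
    ("Normal", "Yes"): 6,
    ("Normal", "No"): 1,
}
N = sum(JOINT.values())
HUMIDITIES = ("High", "Normal")
CLASSES = ("Yes", "No")


def p_joint(humidity, play):
    """p(Humidity = humidity ∩ PlayTennis = play)."""
    return Fraction(JOINT[(humidity, play)], N)


def p_humidity(humidity):
    """p(Humidity = humidity), obtained by summing over the classes."""
    return sum(p_joint(humidity, c) for c in CLASSES)


def p_play_given_humidity(play, humidity):
    """p(PlayTennis = play | Humidity = humidity)."""
    return p_joint(humidity, play) / p_humidity(humidity)


# p(A) = sum_i p(A ∩ B_i)
p_yes_from_joint = sum(p_joint(h, "Yes") for h in HUMIDITIES)

# p(A) = sum_i p(A | B_i) p(B_i)
p_yes_from_cond = sum(
    p_play_given_humidity("Yes", h) * p_humidity(h) for h in HUMIDITIES
)

# Bayesian Rule: p(High | Yes) = p(Yes | High) p(High) / p(Yes)
p_high_given_yes_bayes = (
    p_play_given_humidity("Yes", "High") * p_humidity("High") / p_yes_from_joint
)
# Same quantity straight from the definition p(High ∩ Yes) / p(Yes)
p_high_given_yes_direct = p_joint("High", "Yes") / p_yes_from_joint

print(p_yes_from_joint, p_yes_from_cond)                # 9/14 9/14
print(p_high_given_yes_bayes, p_high_given_yes_direct)  # 1/3 1/3
```

Both routes to $p(\text{PlayTennis} = \text{Yes})$ agree, and the Bayesian Rule gives the same value as the direct definition, as the derivation says it must.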
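Naïve Bayesian Classifiers
Given that (Outlook = Sunny, Humidity = Normal), should we play tennis or not?
We can express this as:
– $p(\text{PlayTennis} = \text{Yes} \mid \text{Outlook} = \text{Sunny}, \text{Humidity} = \text{Normal}) = \dfrac{p(\text{Outlook} = \text{Sunny}, \text{Humidity} = \text{Normal} \mid \text{PlayTennis} = \text{Yes})\; p(\text{PlayTennis} = \text{Yes})}{p(\text{Outlook} = \text{Sunny}, \text{Humidity} = \text{Normal})}$
A general equation for this is:
$p(C_i \mid A_1 A_2 \ldots A_n) = \dfrac{p(A_1 A_2 \ldots A_n \mid C_i)\; p(C_i)}{\sum_k p(A_1 A_2 \ldots A_n \mid C_k)\; p(C_k)}$
However, the conditional probability $p(A_1 A_2 \ldots A_n \mid C_i)$ may be difficult to compute, because it needs an estimate for every combination of attribute values and many combinations occur rarely or never in the data.
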
Naïve Bayesian Classifiers
However, if we assume that the attributes of our query are conditionally independent given the class, we have the following:
$p(C_i \mid A_1 A_2 \ldots A_n) = \dfrac{p(A_1 \mid C_i)\, p(A_2 \mid C_i) \cdots p(A_n \mid C_i)\; p(C_i)}{\sum_k p(A_1 \mid C_k)\, p(A_2 \mid C_k) \cdots p(A_n \mid C_k)\; p(C_k)}$

Naïve Bayesian Classification
Naïve Bayesian Classification:
– $\text{Result} = \arg\max_{C_k} \left[\prod_i p(A_i \mid C_k)\right] p(C_k)$,
– where $p(A_i \mid C_k) = \dfrac{\#\text{ of } A_i \cap C_k}{\#\text{ of } C_k}$

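Naïve Bayesian Classification
Classify: (Outlook = Sunny, Humidity = Normal)
$\text{Result}_{\text{yes}} = \dfrac{p(\text{Outlook} = \text{Sunny} \cap \text{PlayTennis} = \text{Yes})}{p(\text{PlayTennis} = \text{Yes})} \cdot \dfrac{p(\text{Humidity} = \text{Normal} \cap \text{PlayTennis} = \text{Yes})}{p(\text{PlayTennis} = \text{Yes})} \cdot p(\text{PlayTennis} = \text{Yes})$
$\text{Result}_{\text{no}} = \dfrac{p(\text{Outlook} = \text{Sunny} \cap \text{PlayTennis} = \text{No})}{p(\text{PlayTennis} = \text{No})} \cdot \dfrac{p(\text{Humidity} = \text{Normal} \cap \text{PlayTennis} = \text{No})}{p(\text{PlayTennis} = \text{No})} \cdot p(\text{PlayTennis} = \text{No})$
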
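The two scores for the query (Outlook = Sunny, Humidity = Normal) can be computed with a short count-based sketch of the argmax rule. It assumes the Outlook, Humidity and PlayTennis columns of the classic 14-row PlayTennis table, which may not match the table on the slides, and it applies no smoothing, so an attribute value never seen with a class drives that class's score to zero. The names `ROWS`, `ATTRS`, `nb_scores` and `classify` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# ASSUMPTION: (Outlook, Humidity, PlayTennis) columns of the classic 14-row
# PlayTennis table; the table on the slides may differ.
ROWS = [
    ("Sunny", "High", "No"), ("Sunny", "High", "No"),
    ("Overcast", "High", "Yes"), ("Rain", "High", "Yes"),
    ("Rain", "Normal", "Yes"), ("Rain", "Normal", "No"),
    ("Overcast", "Normal", "Yes"), ("Sunny", "High", "No"),
    ("Sunny", "Normal", "Yes"), ("Rain", "Normal", "Yes"),
    ("Sunny", "Normal", "Yes"), ("Overcast", "High", "Yes"),
    ("Overcast", "Normal", "Yes"), ("Rain", "High", "No"),
]
ATTRS = ("Outlook", "Humidity")  # column order of the attributes in ROWS


def nb_scores(query, rows=ROWS):
    """Return {class: [prod_i p(A_i | C_k)] * p(C_k)} estimated from counts."""
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = Fraction(n_label, len(rows))  # prior p(C_k)
        for idx, attr in enumerate(ATTRS):
            # (# of A_i ∩ C_k) / (# of C_k)
            n_match = sum(
                1 for row in rows if row[-1] == label and row[idx] == query[attr]
            )
            score *= Fraction(n_match, n_label)
        scores[label] = score
    return scores


def classify(query, rows=ROWS):
    """Pick the class with the largest unnormalized score (the argmax)."""
    scores = nb_scores(query, rows)
    return max(scores, key=scores.get)


query = {"Outlook": "Sunny", "Humidity": "Normal"}
scores = nb_scores(query)
print(scores["Yes"], scores["No"])            # 2/21 3/70
print(classify(query))                        # Yes
print(scores["Yes"] / sum(scores.values()))   # 20/29, the normalized posterior
```

Dividing a score by the sum of all scores recovers the normalized posterior, but the argmax does not need it because the denominator is the same for every class.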
Naïve Bayesian Classifiers (Continuous)
How would we develop a Naïve Bayesian Classifier (NBC) for this problem?