Probabilistic Generative Models Rong Jin
Probabilistic Generative Model: Classify instance x into one of K classes. Class prior: p(C_k). Density function for class C_k: p(x | C_k).
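Probabilistic Generative Model: Classification decision: assign x to the class with the largest posterior, k* = argmax_k p(C_k | x) = argmax_k p(x | C_k) p(C_k). The key is to determine the parameters of the class priors and the class densities.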
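The decision rule can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the 1-D Gaussian densities, the equal priors, and the names `classify` and `gaussian_1d` are all assumptions made for the example.

```python
import math

def gaussian_1d(mean, var):
    """Build a 1-D Gaussian density (the Gaussian form is an assumption)."""
    def pdf(x):
        return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    return pdf

def classify(x, priors, densities):
    """Return the index k that maximizes log p(C_k) + log p(x | C_k)."""
    scores = [math.log(p) + math.log(f(x)) for p, f in zip(priors, densities)]
    return max(range(len(scores)), key=scores.__getitem__)

# Two hypothetical classes with equal priors, centered at 0 and 4.
priors = [0.5, 0.5]
densities = [gaussian_1d(0.0, 1.0), gaussian_1d(4.0, 1.0)]

label_near_zero = classify(0.5, priors, densities)
label_near_four = classify(3.5, priors, densities)
```

Working in log space avoids multiplying small probabilities, and the evidence p(x) is dropped because it is the same for every class.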
Probabilistic Generative Model: Given training data, estimate the class priors and the parameters of each class density.
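As an illustration of parameter estimation from training data, the sketch below computes maximum-likelihood estimates under the assumption that each class density is Gaussian (suggested by the covariance matrix discussed on the following slides). The function name `fit_gaussian_classes` and the toy data are hypothetical.

```python
from collections import defaultdict

def fit_gaussian_classes(X, y):
    """MLE of prior, mean, and covariance per class (Gaussian density assumed)."""
    groups = defaultdict(list)
    for xi, yi in zip(X, y):
        groups[yi].append(xi)
    n, d = len(X), len(X[0])
    params = {}
    for k, rows in groups.items():
        nk = len(rows)
        mean = [sum(r[j] for r in rows) / nk for j in range(d)]
        # MLE covariance divides by nk (not nk - 1).
        cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / nk
                for b in range(d)] for a in range(d)]
        params[k] = {"prior": nk / n, "mean": mean, "cov": cov}
    return params

# Toy 2-D data: four points in class 0, only two points in class 1.
X = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [5.0, 5.0], [7.0, 5.0]]
y = [0, 0, 0, 0, 1, 1]
params = fit_gaussian_classes(X, y)
```

In this toy data, class 1 has no variation in its second feature, so its estimated covariance matrix has a zero on the diagonal and cannot be inverted.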
Singularity of covariance matrix: An overfitting problem. Solutions: (1) diagonalize the covariance matrix; (2) smoothing/regularization.
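Both remedies can be shown on a 2x2 example. The sketch assumes that "smoothing/regularization" means adding a small constant to the diagonal (cov + lam * I); that is one common choice, not necessarily the one intended on the slide. The helper names are hypothetical.

```python
def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def diagonalize(cov):
    """Keep only the variances by zeroing the off-diagonal entries."""
    d = len(cov)
    return [[cov[a][b] if a == b else 0.0 for b in range(d)] for a in range(d)]

def regularize(cov, lam):
    """Add lam to every diagonal entry: cov + lam * I (assumed form of smoothing)."""
    d = len(cov)
    return [[cov[a][b] + (lam if a == b else 0.0) for b in range(d)] for a in range(d)]

# A rank-1 covariance, as produced by perfectly correlated features.
cov = [[1.0, 1.0], [1.0, 1.0]]

det_original = det2(cov)
det_diagonal = det2(diagonalize(cov))
det_regularized = det2(regularize(cov, 0.1))
```

Diagonalizing still fails when a feature has zero variance, whereas adding a positive constant to the diagonal always yields an invertible matrix.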
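Naïve Bayes: It is difficult to estimate p(x | C_k) for high-dimensional data x. Naïve Bayes approximation: p(x | C_k) ≈ p(x_1 | C_k) p(x_2 | C_k) ... p(x_d | C_k), a product of 1-D distributions. For a Gaussian density, this is equivalent to diagonalizing the covariance matrix.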
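A minimal sketch of the factorized density, assuming each 1-D distribution is Gaussian. The function names are hypothetical.

```python
import math

def log_gaussian_1d(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def naive_bayes_log_density(x, means, variances):
    """log p(x | C_k) under the factorized approximation: a sum of 1-D terms."""
    return sum(log_gaussian_1d(xj, m, v) for xj, m, v in zip(x, means, variances))

# At the class mean with unit variances, each dimension contributes -0.5 * log(2*pi).
value_at_mean = naive_bayes_log_density([1.0, 2.0], [1.0, 2.0], [1.0, 1.0])
```

Only d means and d variances are needed per class, instead of the d(d + 1)/2 entries of a full covariance matrix.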
Naïve Bayes: Text categorization: x is the word histogram of a document.
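For word histograms, one common instantiation is a multinomial model with add-one (Laplace) smoothing. The slide does not specify the word model, so the sketch below is an assumption; the vocabulary, documents, and function names are hypothetical.

```python
import math

def fit_multinomial_nb(histograms, labels, alpha=1.0):
    """Estimate log priors and smoothed log word probabilities per class."""
    vocab = len(histograms[0])
    log_prior, log_theta = {}, {}
    for k in sorted(set(labels)):
        docs = [h for h, y in zip(histograms, labels) if y == k]
        log_prior[k] = math.log(len(docs) / len(labels))
        # Add alpha to every word count so no word has zero probability.
        counts = [sum(h[j] for h in docs) + alpha for j in range(vocab)]
        total = sum(counts)
        log_theta[k] = [math.log(c / total) for c in counts]
    return log_prior, log_theta

def predict(hist, log_prior, log_theta):
    """Pick the class with the largest log p(C_k) + sum_j n_j * log theta_kj."""
    def score(k):
        return log_prior[k] + sum(n * lt for n, lt in zip(hist, log_theta[k]))
    return max(log_prior, key=score)

# Hypothetical vocabulary: ["goal", "match", "vote", "election"]
docs = [[3, 2, 0, 0], [2, 1, 0, 1], [0, 0, 4, 2], [0, 1, 1, 3]]
labels = ["sports", "sports", "politics", "politics"]
log_prior, log_theta = fit_multinomial_nb(docs, labels)

first = predict([2, 1, 0, 0], log_prior, log_theta)
second = predict([0, 0, 2, 1], log_prior, log_theta)
```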
Naïve Bayes: A bad approximation of the density, but good classification accuracy. Example: text categorization for 20 Newsgroups.
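Naïve Bayes: It is the ratio that matters. Classification depends on the ratio of the class posteriors, not on how accurately each class density is approximated.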
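Decision Boundary: Consider text categorization with two classes. Under naïve Bayes, the log-ratio of the class posteriors is linear in the word histogram x, so the decision boundary is linear.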
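The linearity can be checked numerically. Assuming a multinomial word model (an assumption, as the slide's formula is not available), the log posterior ratio equals w·x + b with w_j = log(theta_1j / theta_2j) and b = log(p(C_1) / p(C_2)); the multinomial coefficient cancels in the ratio. All parameter values below are hypothetical.

```python
import math

def nb_linear_boundary(theta1, theta2, prior1, prior2):
    """Weights and bias of the log posterior ratio w.x + b."""
    w = [math.log(a / b) for a, b in zip(theta1, theta2)]
    b = math.log(prior1 / prior2)
    return w, b

def log_posterior_ratio(hist, theta1, theta2, prior1, prior2):
    """Direct evaluation of log p(C_1 | x) - log p(C_2 | x)."""
    s1 = math.log(prior1) + sum(n * math.log(t) for n, t in zip(hist, theta1))
    s2 = math.log(prior2) + sum(n * math.log(t) for n, t in zip(hist, theta2))
    return s1 - s2

# Hypothetical word probabilities over a 3-word vocabulary.
theta1, theta2 = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
prior1, prior2 = 0.6, 0.4
hist = [3, 1, 0]

w, b = nb_linear_boundary(theta1, theta2, prior1, prior2)
linear_score = sum(wj * n for wj, n in zip(w, hist)) + b
direct_score = log_posterior_ratio(hist, theta1, theta2, prior1, prior2)
```

A positive score assigns the document to class 1, a negative score to class 2, and the boundary is the hyperplane w·x + b = 0.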
Decision Boundary: Consider two-class classification with Gaussian density functions and a shared covariance matrix. The result is a linear decision boundary.
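The linear boundary follows from a short derivation, reconstructed here from the stated assumptions. The symbols \mu_1, \mu_2 (class means), \Sigma (shared covariance), and \pi_1, \pi_2 (class priors) are notation chosen for this sketch rather than taken from the slide.

```latex
\begin{align*}
\log\frac{p(C_1 \mid x)}{p(C_2 \mid x)}
  &= \log\frac{p(x \mid C_1)\,\pi_1}{p(x \mid C_2)\,\pi_2} \\
  &= -\tfrac{1}{2}(x-\mu_1)^\top \Sigma^{-1} (x-\mu_1)
     + \tfrac{1}{2}(x-\mu_2)^\top \Sigma^{-1} (x-\mu_2)
     + \log\frac{\pi_1}{\pi_2} \\
  &= (\mu_1-\mu_2)^\top \Sigma^{-1} x
     - \tfrac{1}{2}\mu_1^\top \Sigma^{-1} \mu_1
     + \tfrac{1}{2}\mu_2^\top \Sigma^{-1} \mu_2
     + \log\frac{\pi_1}{\pi_2} \\
  &= w^\top x + b,
\qquad w = \Sigma^{-1}(\mu_1-\mu_2),
\quad b = -\tfrac{1}{2}\mu_1^\top \Sigma^{-1} \mu_1
          + \tfrac{1}{2}\mu_2^\top \Sigma^{-1} \mu_2
          + \log\frac{\pi_1}{\pi_2}.
\end{align*}
```

The normalizing constants and the quadratic term x^\top \Sigma^{-1} x cancel only because both classes share \Sigma; with class-specific covariances the boundary would be quadratic.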
Decision Boundary: Generative models essentially create linear decision boundaries. Why not model the linear decision boundary directly?
Assumption of Generative Models It misses the factor How important is ?
Ambiguous Training Data: The training data only indicates the set of class labels to which the true class assignment belongs.
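One way to use such data in a generative model is to sum the joint probability over each example's candidate label set. This formulation is an assumption made for illustration, since the slide does not state how the ambiguity is handled; the constant densities and all names below are hypothetical.

```python
import math

def ambiguous_log_likelihood(X, label_sets, priors, densities):
    """Sum over examples of log sum_{k in S_i} p(C_k) p(x_i | C_k)."""
    total = 0.0
    for x, candidates in zip(X, label_sets):
        total += math.log(sum(priors[k] * densities[k](x) for k in candidates))
    return total

# Hypothetical two-class setup with constant densities, to keep the arithmetic visible.
priors = [0.5, 0.5]
densities = [lambda x: 0.2, lambda x: 0.5]

X = [0.0, 0.0]
label_sets = [{0}, {0, 1}]  # first example is fully labeled, second is ambiguous
log_lik = ambiguous_log_likelihood(X, label_sets, priors, densities)
```

When every candidate set contains a single label, this reduces to the usual supervised log-likelihood.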