Latent variable discovery in classification models


1 Latent variable discovery in classification models
Nevin L. Zhang, Thomas D. Nielsen, Finn V. Jensen
Artificial Intelligence in Medicine 30 (2004) 283–299
Advisor: Professor Chung-Chian Hsu
Reporter: Wen-Chung Liao, 2006/7/12

2 Outline
Motivation
Objectives
HNB models
Learning HNB models
Results
Concluding Remarks
Personal Comments

3 Motivation
The naive Bayes model makes the often unrealistic assumption that the feature variables are mutually independent given the class variable. Latent variable discovery is especially interesting in medical applications.
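For reference, the independence assumption can be written out explicitly; this is the standard naive Bayes factorization, not something added by the paper. With class variable C and feature variables F_1, ..., F_n:

    P(C, F_1, \ldots, F_n) = P(C) \prod_{i=1}^{n} P(F_i \mid C)

Latent variables relax exactly this factorization by letting groups of features depend on a common hidden parent.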

4 Objectives
Show how latent variables can be detected by learning hierarchical naive Bayes (HNB) models from data.

5 HNB models
A hierarchical naive Bayes (HNB) model is a tree-shaped Bayesian network M = (m, θ), where m is the model structure and θ the parameters: the class variable is the root, the feature variables are the leaves, and latent variables sit in between; |Z| denotes the cardinality of a variable Z. Two notions are central: parsimonious models and regular models. For a latent variable Z in an HNB model, enumerate its neighbors (parent and children) as Z1, Z2, ..., Zk; the model is regular if, for every latent Z, |Z| <= (|Z1| |Z2| ... |Zk|) / max_i |Zi|.
Theorem 1. Parsimonious HNB models are regular.
Theorem 2. The set of all regular HNB models for a given set of class and feature variables is finite.
Lemma 1. In a regular HNB model, no two singly connected latent nodes can be neighbors.
Model selection uses the Bayesian information criterion: BIC(m|D) = log P(D|m, θ*) − (d(m)/2) log N, where θ* is the maximum-likelihood parameter estimate, d(m) is the number of free parameters of m, and N is the number of records in the data set D.
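A minimal sketch of the regularity check for a single latent variable, assuming the cardinality bound stated above; the function name and data representation are illustrative, not from the paper. A model is regular when every latent variable passes this check.

    from math import prod

    def is_regular(latent_card, neighbor_cards):
        """Check |Z| <= (product of neighbor cardinalities) / (max neighbor cardinality)."""
        bound = prod(neighbor_cards) // max(neighbor_cards)
        return latent_card <= bound

    # Example: a latent variable whose neighbors have cardinalities 3, 3, 3
    # may itself have at most 3 * 3 * 3 / 3 = 9 states.
    assert is_regular(9, [3, 3, 3]) and not is_regular(10, [3, 3, 3])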

6 Learning HNB models
A hill-climbing algorithm requires a search space and search operators. A natural search space is the set of all regular HNB models for a given set of class and feature variables. This space is restructured into two levels:
1. Given a model structure, find an optimal cardinality for the latent variables.
2. Find an optimal model structure.
A generic skeleton of the resulting search is sketched below.
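A hedged sketch of the hill climb, with the two levels injected as functions: 'neighbors' would enumerate the structures produced by the operators on the next slide, with latent cardinalities already optimized (level 1), and 'score' would be the BIC after EM parameter estimation. All names are placeholders, not the authors' code.

    def hill_climb(initial, neighbors, score):
        """Greedy search: move to the best-scoring neighbor until none improves."""
        current, current_score = initial, score(initial)
        while True:
            scored = [(score(c), c) for c in neighbors(current)]
            if not scored:
                return current
            best_score, best = max(scored, key=lambda t: t[0])
            if best_score <= current_score:
                return current  # local maximum of the score reached
            current, current_score = best, best_score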

7 Learning HNB models
Two subtasks: learning cardinality and learning model structures. To search the structure space, start with the naive Bayes model structure. At each step, modify the current model to construct a number of new model structures; the new structures are then evaluated, and the best structure is selected to seed the next search step. Three operators are used: parent-introduction, parent-alteration, and node-deletion (parent-introduction is sketched below).
Theorem 3. Starting from the naive Bayes model structure, we can reach any regular HNB model structure using parent-introduction and parent-alteration.
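As an illustration of the operators, here is a sketch of parent-introduction on a tree stored as a child-to-parent map. The representation and names are assumptions made for the example; the real algorithm additionally enforces regularity and optimizes the new latent node's cardinality.

    import itertools

    def parent_introductions(parents, new_latent):
        """Yield every structure obtained by introducing 'new_latent' as the
        parent of two nodes that currently share the same parent."""
        children = {}
        for node, par in parents.items():
            if par is not None:
                children.setdefault(par, []).append(node)
        for par, kids in children.items():
            for a, b in itertools.combinations(kids, 2):
                new = dict(parents)
                new[new_latent] = par           # new latent node below the old parent
                new[a] = new[b] = new_latent    # the chosen pair moves under it
                yield new

    # From a naive Bayes structure over class C and features a1..a3:
    nb = {"C": None, "a1": "C", "a2": "C", "a3": "C"}
    print(list(parent_introductions(nb, "Z1")))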

8 Results: synthetic data
Two questions: 1. Can our algorithm discover interesting latent variables? 2. Can our algorithm yield better classifiers than the naive Bayes classifier?
Three experiments were run on synthetic data sampled from HNB models. All variables have three states; the experiments vary the strength of correlation between the observed variables and the latent variables, and the strength of correlation among the observed variables. In each experiment, five training sets were used.
Result: in all cases, our algorithm recovered the structures of the original models precisely, i.e., it correctly detected the latent variables.

9 Evaluation on a test set of 5000 samples: for each of the 5000 records in the test set, we computed the posterior distribution of the class variable given the values of the feature variables, and calculated the KL divergence between the distribution under the generative model and that under the learned model.
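A minimal sketch of this evaluation, assuming each posterior is a discrete distribution over the class values; function and variable names are illustrative.

    from math import log

    def kl_divergence(p, q):
        """KL(p || q) for two discrete distributions given as equal-length sequences."""
        return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    def mean_posterior_kl(true_posteriors, learned_posteriors):
        """Average KL divergence over all test records."""
        total = sum(kl_divergence(p, q)
                    for p, q in zip(true_posteriors, learned_posteriors))
        return total / len(true_posteriors)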

10 Results: Monk data sets
The Monk's problems Monk-1, Monk-2, and Monk-3 each have one binary class variable and six feature variables a1, ..., a6 with 2-4 possible states. Each problem has a data set of 432 records. Between 30 and 40% of the records were used for training, and all records were used in testing; record counts were also increased by factors of 3, 9, and 27.
Fig. 7. Structures of HNB models constructed for Monk-1; they match the target concepts nicely.
Fig. 9. Structures of HNB models constructed for Monk-3; they match the target concepts nicely.

11 Results: Other UCI data sets
The Wisconsin Breast Cancer data set and the Pima Indians Diabetes data set were used; both consist of more than 500 records and have no more than 10 feature variables. For the breast cancer data set, the features "uniformity-of-cell-size" and "uniformity-of-cell-shape" always share the same (latent) parent. For the diabetes data set, there is always a latent node that is the parent of both "age" and "number-of-pregnancies".

12 Concluding remarks
HNB models provide a framework for detecting latent variables in naive Bayes models, and a hill-climbing algorithm for inducing HNB models from data has been developed. A major drawback of the algorithm is its high complexity: at each step of the search it generates a set of new candidate models based on the current model, estimates parameters for each candidate model using EM, scores them, and picks the best one to seed the next step of the search. A way to overcome this difficulty might be to use structural EM.

13 Personal Comments
Applications: business, medicine, ...
Advantages: the paper works out many properties of HNB models.
Drawbacks: the search space is still large; the algorithm has high complexity; the algorithm is ambiguous in places.
Possible improvements: use a GA, or find properties that can partition the search space.

