1 Latent Tree Analysis of Unlabeled Data Nevin L. Zhang Dept. of Computer Science & Engineering The Hong Kong Univ. of Sci. & Tech. http://www.cse.ust.hk/~lzhang

2 Outline
- Latent tree models
- Latent tree analysis algorithms
- What can LTA be used for:
  - Discovery of co-occurrence/correlation patterns
  - Discovery of latent variables/structures
  - Multidimensional clustering
- Examples
  - Danish beer survey data
  - Text data
  - TCM survey data

3 Latent Tree Models
- Tree-structured probabilistic graphical models
  - Leaves are observed (manifest variables): discrete or continuous
  - Internal nodes are latent (latent variables): discrete
  - Each edge is associated with a conditional distribution
  - One node carries a marginal distribution
  - Together, these define a joint distribution over all the variables (Zhang, JMLR 2004)
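To make the factorization concrete, here is a minimal sketch (not from the slides) of how a latent tree with one latent root H and two observed leaves X1, X2 defines a joint distribution: the root carries a marginal and each edge a conditional distribution. All probability tables below are made-up illustrative values.

```python
import itertools

# Toy latent tree: latent root H with two observed children X1 and X2.
# All states are 0/1; the tables are illustrative, not learned from data.
p_H = [0.6, 0.4]                    # marginal distribution of the root H
p_X1_given_H = [[0.9, 0.1],         # row h holds P(X1 = 0 | H = h), P(X1 = 1 | H = h)
                [0.2, 0.8]]
p_X2_given_H = [[0.8, 0.2],         # row h holds P(X2 = 0 | H = h), P(X2 = 1 | H = h)
                [0.3, 0.7]]

def joint(h, x1, x2):
    """Joint probability: root marginal times one conditional per edge."""
    return p_H[h] * p_X1_given_H[h][x1] * p_X2_given_H[h][x2]

def marginal_observed(x1, x2):
    """Marginalize out the latent variable to get P(X1, X2)."""
    return sum(joint(h, x1, x2) for h in range(2))

# Sanity check: the distribution over the observed variables sums to 1.
print(sum(marginal_observed(x1, x2)
          for x1, x2 in itertools.product(range(2), repeat=2)))
```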

4 Latent Tree Analysis
Learning latent tree models: from data on the observed variables, obtain a latent tree model by determining
- the number of latent variables,
- the number of possible states for each latent variable,
- the connections among nodes, and
- the probability distributions.

Model selection criterion: find the model that maximizes the BIC score
BIC(m|D) = log P(D|m, θ*) - (d/2) log N
where D is the data, N the sample size, m the model, θ* the MLE of the parameters, and d the number of free parameters.
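A minimal sketch of the BIC computation above; the log-likelihood values, parameter counts, and sample size below are placeholder numbers, not results from the talk.

```python
import math

def bic_score(log_likelihood, num_free_params, sample_size):
    """BIC(m|D) = log P(D | m, theta*) - (d / 2) * log N."""
    return log_likelihood - 0.5 * num_free_params * math.log(sample_size)

# Hypothetical comparison of two candidate models fitted to the same data.
print(bic_score(log_likelihood=-2510.3, num_free_params=25, sample_size=604))
print(bic_score(log_likelihood=-2498.7, num_free_params=40, sample_size=604))
# The candidate with the higher BIC score is preferred.
```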

5 Algorithms: EAST
- Search-based: Extension, Adjustment, Simplification until Termination
- Can deal with ~100 observed variables (Chen, Zhang et al. AIJ 2011)
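The slides do not spell out the search procedure, so the sketch below only illustrates the general shape of BIC-guided hill climbing over latent tree structures; it is not the published EAST algorithm, and the candidate-generating and scoring functions are hypothetical placeholders.

```python
def greedy_structure_search(data, initial_model, operator_families, fit_and_score):
    """Generic BIC-guided hill climbing over latent tree structures.

    operator_families: e.g. functions generating extension, adjustment, and
    simplification candidates (placeholders, not the actual EAST operators).
    fit_and_score: fits parameters and returns (fitted_model, bic) (placeholder).
    """
    best_model, best_bic = fit_and_score(initial_model, data)
    improved = True
    while improved:                          # loop "until Termination"
        improved = False
        for make_candidates in operator_families:
            for candidate in make_candidates(best_model):
                model, bic = fit_and_score(candidate, data)
                if bic > best_bic:           # keep the best-scoring neighbour
                    best_model, best_bic = model, bic
                    improved = True
    return best_model
```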

6 Unidimensionality Test (Liu, Zhang et al. MLJ 2013)

7 (Liu, Zhang et al. MLJ 2013)

8 Chow-Liu tree (1968)
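For context, the Chow-Liu procedure referenced here builds a tree over the observed variables by connecting the pairs with the highest mutual information. Below is a compact sketch of that idea (my own illustration, with made-up toy data), using a Kruskal-style maximum spanning tree over empirical pairwise mutual information.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def chow_liu_edges(data):
    """Chow-Liu tree: maximum spanning tree over pairwise mutual information.

    data: (n_samples, n_variables) array of discrete values.
    Returns the edges of the learned tree as (i, j) index pairs.
    """
    n_vars = data.shape[1]
    # Score every pair of observed variables by empirical mutual information.
    pairs = [(mutual_info_score(data[:, i], data[:, j]), i, j)
             for i in range(n_vars) for j in range(i + 1, n_vars)]
    pairs.sort(reverse=True)  # highest mutual information first

    # Kruskal-style maximum spanning tree with a simple union-find.
    parent = list(range(n_vars))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    for _, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:              # adding this edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges

# Toy usage with random binary data (purely illustrative).
rng = np.random.default_rng(0)
toy = rng.integers(0, 2, size=(200, 5))
print(chow_liu_edges(toy))
```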

9 (Liu, Zhang et al. MLJ 2013)
- Close to EAST in terms of model quality
- Can deal with 1,000 observed variables

10 Outline
- Latent tree models
- Latent tree analysis algorithms
- What can LTA be used for:
  - Discovery of co-occurrence/correlation patterns
  - Discovery of latent variables/structures
  - Multidimensional clustering
- Examples
  - Danish beer survey data
  - Text data
  - TCM survey data

11 Danish Beer Market Survey (Mourad et al. JAIR 2013)
- 463 consumers, 11 beer brands
- Questionnaire, for each brand:
  - Never seen the brand before (s0)
  - Seen before, but never tasted (s1)
  - Tasted, but do not drink regularly (s2)
  - Drink regularly (s3)

12 Why are the variables grouped as they are?
- GronTuborg and Carlsberg: main mass-market beers
- TuborgClas and CarlSpec: frequent beers, a bit darker than the above
- CeresTop, CeresRoyal, Pokal, …: minor local beers
- The brands are grouped this way because responses on the brands in each group are strongly correlated.
- Intuitively, latent tree analysis partitions the observed variables into groups such that
  - variables in each group are strongly correlated, and
  - the correlations within each group can be properly modeled using a single latent variable.

13 Multidimensional Clustering
- Each latent variable gives a partition of the consumers. For example, H1:
  - Class 1: likely to have tasted TuborgClas, CarlSpec and Heineken, but do not drink them regularly
  - Class 2: likely to have seen or tasted the beers, but do not drink them regularly
  - Class 3: likely to drink TuborgClas and CarlSpec regularly
- Intuitively, latent tree analysis is a technique for multiple clustering (see the sketch below).
  - K-means and mixture models give only one partition.
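A tiny runnable illustration of the idea that each latent variable yields its own partition: given the posterior distribution of every latent variable for every respondent (the tables below are made-up placeholders, not output of a fitted model), assigning each respondent to the most probable state of each latent variable produces one partition per latent variable.

```python
import numpy as np

# posteriors[latent] is an (n_respondents, n_states) array holding
# P(latent = state | respondent's answers), as a fitted latent tree model
# would provide.  The numbers here are purely illustrative.
posteriors = {
    "H1": np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.8, 0.1],
                    [0.2, 0.1, 0.7],
                    [0.6, 0.3, 0.1]]),
    "H2": np.array([[0.9, 0.1],
                    [0.4, 0.6],
                    [0.2, 0.8],
                    [0.8, 0.2]]),
}

# One partition per latent variable: each respondent goes to the class
# with the highest posterior probability under that variable.
partitions = {name: probs.argmax(axis=1) for name, probs in posteriors.items()}
print(partitions)  # e.g. H1 -> [0, 1, 2, 0], H2 -> [0, 1, 1, 0]
```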

14 Binary Text Data: WebKB (Liu et al. PGM 2012, MLJ 2013)
- 1041 web pages collected from 4 CS departments in 1997
- 336 words

15 Latent Tree Model for WebKB Data by the BI Algorithm
- 89 latent variables

16 Latent Tree Model for WebKB Data

17 (figure only)

18 (figure only)

19 Why are the variables grouped as they are?
- The words are grouped this way because words in each group tend to co-occur.
- On binary data, latent tree analysis partitions the observed word variables into groups such that
  - words in each group tend to co-occur, and
  - the correlations can be properly explained using a single latent variable.
- LTA is a method for identifying co-occurrence relationships.

20 Multidimensional Clustering
- LTA is an approach to topic detection:
  - Y66=4: object-oriented programming (OOP)
  - Y66=2: non-OOP programming
  - Y66=1: programming language
  - Y66=3: not on programming

21 Outline
- Latent tree models
- Latent tree analysis algorithms
- What can LTA be used for:
  - Discovery of co-occurrence/correlation patterns
  - Discovery of latent variables/structures
  - Multidimensional clustering
- Examples
  - Danish beer survey data
  - Text data
  - TCM survey data

22 Background of the Research
- Common practice in China, and increasingly in the Western world:
  - Patients with a WM (Western medicine) disease are divided into several TCM classes.
  - Different classes are treated differently using TCM treatments.
- Example:
  - WM disease: depression
  - TCM classes:
    - Liver-Qi Stagnation (肝气郁结). Treatment principle: soothe the liver and relieve stagnation (疏肝解郁); prescription: Chaihu Shugan San (柴胡疏肝散)
    - Deficiency of Liver Yin and Kidney Yin (肝肾阴虚). Treatment principle: nourish the kidney and the liver (滋肾养肝); prescription: Xiaoyao San combined with Liuwei Dihuang Wan (逍遥散合六味地黄丸)
    - Vacuity of both heart and spleen (心脾两虚). Treatment principle: boost qi and fortify the spleen (益气健脾); prescription: Guipi Tang (归脾汤)
    - …

23 Key Question
- How should patients with a WM disease be divided into subclasses from the TCM perspective?
  - What are the TCM classes?
  - What are the characteristics of each TCM class?
  - How should the different TCM classes be differentiated?
- Important for:
  - Clinical practice
  - Research
    - Randomized controlled trials of efficacy
    - Modern biomedical understanding of TCM concepts
- There is no consensus: different doctors/researchers use different schemes. This is a key weakness of TCM.

24 Key Idea
- Our objective: provide an evidence-based method for TCM patient classification.
- Key idea:
  - Cluster analysis of symptom data => empirical partition of patients
  - Check whether the partition corresponds to a TCM class concept
- Key technology: multidimensional clustering
  - This was the motivation for developing latent tree analysis.

25 Symptom Data of Depressive Patients (Zhao et al. JACM 2014)
- Subjects:
  - 604 depressive patients aged between 19 and 69, from 9 hospitals
  - Selected using the Chinese Classification of Mental Disorders clinical guideline CCMD-3
  - Exclusions: subjects who took anti-depression drugs within two weeks prior to the survey; women in the gestational or lactation period; etc.
- Symptom variables:
  - Drawn from the TCM literature on depression published between 1994 and 2004
  - Searched with the phrase "抑郁 and 证" (depression and syndrome) on the CNKI (China National Knowledge Infrastructure) database
  - Kept only those from studies where patients were selected using the ICD-9, ICD-10, CCMD-2, or CCMD-3 guidelines
  - 143 symptoms were reported in those studies altogether

26 The Depression Data
- Data as a table:
  - 604 rows, one per patient
  - 143 columns, one per symptom
  - Table cells: 0 = symptom not present, 1 = symptom present
- Removed: symptoms occurring fewer than 10 times
- 86 symptom variables entered latent tree analysis.
- The structure of the latent tree model obtained is shown on the next two slides.
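The filtering step described above can be sketched in a few lines of pandas; the file name and column layout here are assumptions for illustration, not the actual study data.

```python
import pandas as pd

# Hypothetical file: 604 rows (patients) x 143 binary symptom columns.
data = pd.read_csv("depression_symptoms.csv")

# Keep only symptoms that are present in at least 10 patients.
counts = data.sum(axis=0)
kept = data.loc[:, counts >= 10]

print(data.shape, "->", kept.shape)  # the slides report 143 -> 86 symptom variables
```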

27 Model Obtained for the Depression Data (Top)

28 Model Obtained for the Depression Data (Bottom)

29 The Empirical Partitions
- The first cluster (Y29 = s0) consists of 54% of the patients, while the second cluster (Y29 = s1) consists of 46% of the patients.
- The two symptoms ‘fear of cold’ and ‘cold limbs’ do not occur often in the first cluster, while they both tend to occur with high probabilities (0.8 and 0.85) in the second cluster.
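To show how such class-conditional symptom probabilities are read off once patients are assigned to clusters, here is a small runnable sketch; the data frame and the hard Y29 assignments below are made-up placeholders, not output of the actual model.

```python
import pandas as pd

# Hypothetical binary symptom table plus a hard cluster assignment for Y29
# (e.g. the most probable state of Y29 for each patient).
symptoms = pd.DataFrame({
    "fear of cold": [0, 1, 1, 0, 1, 0],
    "cold limbs":   [0, 1, 1, 0, 1, 0],
})
y29 = pd.Series(["s0", "s1", "s1", "s0", "s1", "s0"], name="Y29")

print(y29.value_counts(normalize=True))  # cluster proportions
print(symptoms.groupby(y29).mean())      # P(symptom present | Y29 = s0 or s1)
```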

30 Probabilistic Symptom Co-occurrence Patterns
- Probabilistic symptom co-occurrence pattern:
  - The table indicates that the two symptoms ‘fear of cold’ and ‘cold limbs’ tend to co-occur in the cluster Y29 = s1.
- The pattern is meaningful from the TCM perspective:
  - TCM asserts that YANG DEFICIENCY (阳虚) can lead to, among other symptoms, ‘fear of cold’ and ‘cold limbs’.
  - So, the co-occurrence pattern suggests the TCM syndrome type (证型) YANG DEFICIENCY (阳虚).
- The partition Y29 suggests that:
  - among depressive patients, there is a subclass of patients with YANG DEFICIENCY, and
  - in this subclass, ‘fear of cold’ and ‘cold limbs’ co-occur with high probabilities (0.8 and 0.85).

31 Probabilistic Symptom Co-occurrence Patterns
- Y28 = s1 captures the probabilistic co-occurrence of ‘aching lumbus’, ‘lumbar pain like pressure’ and ‘lumbar pain like warmth’.
- This pattern is present in 27% of the patients.
- It suggests that:
  - among depressive patients, there is a subclass that corresponds to the TCM concept of KIDNEY DEPRIVED OF NOURISHMENT (肾虚失养), and
  - the characteristics of the subclass are given by the distributions for Y28 = s1.

32 Probabilistic Symptom Co-occurrence Patterns
- Y27 = s1 captures the probabilistic co-occurrence of ‘weak lumbus and knees’ and ‘cumbersome limbs’.
- This pattern is present in 44% of the patients.
- It suggests that:
  - among depressive patients, there is a subclass that corresponds to the TCM concept of KIDNEY DEFICIENCY (肾虚), and
  - the characteristics of the subclass are given by the distributions for Y27 = s1.
- Y27, Y28 and Y29 together provide evidence for defining KIDNEY YANG DEFICIENCY.

33 Probabilistic Symptom Co-occurrence Patterns
- Y21 = s1: evidence for defining STAGNANT QI TURNING INTO FIRE (气郁化火)
- Y15 = s1: evidence for defining QI DEFICIENCY
- Y17 = s1: evidence for defining HEART QI DEFICIENCY
- Y16 = s1: evidence for defining QI STAGNATION
- Y19 = s1: evidence for defining QI STAGNATION IN HEAD

34 Probabilistic Symptom Co-occurrence Patterns
- Y9 = s1: evidence for defining DEFICIENCY OF BOTH QI AND YIN (气阴两虚)
- Y10 = s1: evidence for defining YIN DEFICIENCY (阴虚)
- Y11 = s1: evidence for defining DEFICIENCY OF STOMACH/SPLEEN YIN (脾胃阴虚)

35 Symptom Mutual-Exclusion Patterns
- Some empirical partitions reveal symptom exclusion patterns.
- Y1 reveals the mutual exclusion of ‘white tongue coating’, ‘yellow tongue coating’ and ‘yellow-white tongue coating’.
- Y2 reveals the mutual exclusion of ‘thin tongue coating’, ‘thick tongue coating’ and ‘little tongue coating’.

36 Summary of the TCM Data Analysis
- By analyzing 604 cases of depressive patient data using latent tree models, we have discovered a host of probabilistic symptom co-occurrence patterns and symptom mutual-exclusion patterns.
- Most of the co-occurrence patterns have clear TCM syndrome connotations, while the mutual-exclusion patterns are also reasonable and meaningful.
- The patterns can be used as evidence for defining TCM classes in the context of depressive patients and for differentiating between those classes.

37 Another Perspective: Statistical Validation of TCM Postulates (Zhang et al. JACM 2008)
- Figure labels: Yang Deficiency (Y29 = s1), Kidney deprived of nourishment (Y28 = s1)
- TCM terms such as Yang Deficiency were introduced to explain symptom co-occurrence patterns observed in clinical practice. …

38 Value of the Work in the View of Others
- D. Haughton and J. Haughton, Living Standards Analytics: Development through the Lens of Household Survey Data, Springer, 2012:
  - Zhang et al. provide a very interesting application of latent class (tree) models to diagnoses in traditional Chinese medicine (TCM).
  - The results tend to confirm known theories in Chinese traditional medicine.
  - This is a significant advance, since the scientific bases for these theories are not known.
  - The model proposed by the authors provides at least a statistical justification for them.

39 Summary
- Latent tree models:
  - Tree-structured probabilistic graphical models
  - Leaf nodes: observed variables
  - Internal nodes: latent variables
- What can LTA be used for:
  - Discovery of co-occurrence patterns in binary data
  - Discovery of correlation patterns in general discrete data
  - Discovery of latent variables/structures
  - Multidimensional clustering
  - Topic detection in text data
  - A key role in TCM patient classification

40 References
- N. L. Zhang (2004). Hierarchical latent class models for cluster analysis. Journal of Machine Learning Research, 5(6): 697-723.
- T. Chen, N. L. Zhang, T. F. Liu, Y. Wang, L. K. M. Poon (2011). Model-based multidimensional clustering of categorical data. Artificial Intelligence, 176(1): 2246-2269.
- T. F. Liu, N. L. Zhang, A. H. Liu, L. K. M. Poon (2012). A novel LTM-based method for multidimensional clustering. European Workshop on Probabilistic Graphical Models (PGM-12), 203-210.
- T. F. Liu, N. L. Zhang, P. X. Chen, A. H. Liu, L. K. M. Poon, Y. Wang (2013). Greedy learning of latent tree models for multidimensional clustering. Machine Learning, doi:10.1007/s10994-013-5393-0.
- R. Mourad, C. Sinoquet, N. L. Zhang, T. F. Liu, P. Leray (2013). A survey on latent tree models and applications. Journal of Artificial Intelligence Research, 47: 157-203. doi:10.1613/jair.3879.
- N. L. Zhang, S. H. Yuan, T. Chen, Y. Wang (2008). Statistical validation of TCM theories. Journal of Alternative and Complementary Medicine, 14(5): 583-587.
- N. L. Zhang, S. H. Yuan, T. Chen, Y. Wang (2008). Latent tree models and diagnosis in traditional Chinese medicine. Artificial Intelligence in Medicine, 42: 229-245.
- Z. X. Xu, N. L. Zhang, Y. Q. Wang, G. P. Liu, J. Xu, T. F. Liu, A. H. Liu (2013). Statistical validation of traditional Chinese medicine syndrome postulates in the context of patients with cardiovascular disease. The Journal of Alternative and Complementary Medicine.
- Y. Zhao, N. L. Zhang, T. F. Wang, Q. G. Wang (2014). Discovering symptom co-occurrence patterns from 604 cases of depressive patient data using latent tree models. The Journal of Alternative and Complementary Medicine.

41 Thank You !

