Consistent and Efficient Reconstruction of Latent Tree Models


1 Consistent and Efficient Reconstruction of Latent Tree Models
Myung Jin Choi. Joint work with Vincent Tan, Anima Anandkumar, and Alan S. Willsky.
Stochastic Systems Group, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology. September 29, 2010.

2 Latent Tree Graphical Models
[Figure: observed variables Disney, CVS, Microsoft, Apple, Dell.]

3 Latent Tree Graphical Models
[Figure: the same observed variables with hidden nodes Market, Computer, and Computer Equipment introduced above them.]


5 Outline
Reconstruction of a latent tree
Algorithm 1: Recursive Grouping
Algorithm 2: CLGrouping
Experimental results

6 Reconstruction of a Latent Tree
Gaussian model: each node is a scalar Gaussian variable.
Discrete model: each node is a discrete variable with K states.
Goal: reconstruct the latent tree using samples of the observed nodes.

7 Minimal Latent Trees (Pearl, 1988)
Conditions for minimal latent trees:
Each hidden node has at least three neighbors.
Any two variables are neither perfectly dependent nor independent.
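In terms of the information distances introduced on the later slides, these two conditions can be stated compactly (a formalization assumed from the paper; for Gaussian models the second condition means 0 < |ρ_ij| < 1):

```latex
\deg(h) \ge 3 \quad \text{for every hidden node } h,
\qquad
0 < d_{ij} < \infty \quad \text{for every pair of variables } i, j.
```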

8 Desired Properties for Algorithms
Consistent for minimal latent trees: correct recovery given exact distributions.
Computationally efficient.
Low sample complexity.
Good empirical performance.

10 Related Work
EM-based approaches (ZhangKocka04, HarmelingWilliams10, ElidanFriedman05): no consistency guarantees; computationally expensive.
Phylogenetic trees: neighbor-joining (NJ) method (SaitouNei87).


13 Information Distance
Defined for Gaussian and discrete distributions; the discrete case uses the joint probability matrix and the marginal probability matrices.
Algorithms use information distances of observed variables. Assume first that the exact information distances are given.
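The distance definitions on this slide are images in the original transcript; the formulas used in the paper are:

```latex
% Gaussian: rho_ij is the correlation coefficient of x_i and x_j
d_{ij} = -\log \lvert \rho_{ij} \rvert

% Discrete: J^{ij} is the joint probability matrix of (x_i, x_j),
% and M^i is the diagonal matrix of marginal probabilities of x_i
d_{ij} = -\log \frac{\lvert \det J^{ij} \rvert}{\sqrt{\det M^i \, \det M^j}}
```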

14 Additivity of Information Distances on Trees
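The additivity property (shown as an equation image in the original): on a tree, information distances add along paths,

```latex
d_{kl} = \sum_{(i,j) \in \mathrm{Path}(k,l)} d_{ij}
\qquad \text{for all nodes } k, l.
```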

15 Testing Node Relationships
Node j is a leaf node and node i is its parent if and only if the condition below holds for all k ≠ i, j. This test identifies (parent, leaf child) pairs.
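The condition (an image in the original) follows from additivity: every path from leaf j to any other node k passes through its parent i, so d_jk = d_ij + d_ik, i.e.

```latex
\Phi_{ijk} \;\triangleq\; d_{ik} - d_{jk} \;=\; -\,d_{ij}
\qquad \text{for all } k \neq i, j.
```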

16 Testing Node Relationships
Nodes i and j are leaves sharing the same parent (sibling nodes) if and only if the condition below holds for all k ≠ i, j. This test identifies leaf-sibling pairs.
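For leaf siblings i, j with common parent h, both paths to any k pass through h, so the condition (an image in the original) is that Φ_ijk is a constant strictly between −d_ij and d_ij:

```latex
\Phi_{ijk} = d_{ih} - d_{jh} \quad \text{(independent of } k\text{)},
\qquad
-\,d_{ij} < \Phi_{ijk} < d_{ij}.
```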

17 Recursive Grouping Step 1. Compute Φ_ijk = d_ik − d_jk for all observed triples (i, j, k).

18 Recursive Grouping Step 2. Identify (parent, leaf child) pairs and leaf-sibling pairs.

19 Recursive Grouping Step 3. Introduce a hidden parent node for each sibling group without a parent.

20 Recursive Grouping Step 4. Compute the information distances for the new hidden nodes, e.g. as below.
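The example equation is an image in the original; the distances to a new hidden parent h of siblings i and j follow from d_ij = d_ih + d_jh and Φ_ijk = d_ih − d_jh:

```latex
d_{ih} = \tfrac{1}{2}\,(d_{ij} + \Phi_{ijk}),
\qquad
d_{jh} = \tfrac{1}{2}\,(d_{ij} - \Phi_{ijk}),
\qquad
d_{kh} = d_{ki} - d_{ih}.
```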

21 Recursive Grouping Step 5. Remove the identified child nodes and repeat Steps 2-4.


24 Recursive Grouping
Identifies a group of family nodes at each step and introduces hidden nodes recursively.
Correctly recovers all minimal latent trees.
Computational complexity O(diam(T) m^3); worst case O(m^4).
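A minimal Python sketch of the family test at the core of each RG iteration, assuming exact distances (illustrative only, not the authors' MATLAB implementation; names are made up):

```python
import itertools

def find_families(d, nodes, eps=1e-9):
    """One relationship-testing pass of recursive grouping (exact distances).

    d     : (m, m) array of pairwise information distances
    nodes : list of indices of the nodes active at this level
    Returns a dict mapping pairs (i, j) to their tested relationship.
    """
    rels = {}
    for i, j in itertools.combinations(nodes, 2):
        others = [k for k in nodes if k != i and k != j]
        phis = [d[i, k] - d[j, k] for k in others]
        # Phi_ijk must be constant over k for i, j to be in one family.
        if not phis or max(phis) - min(phis) > eps:
            continue
        phi = phis[0]
        if abs(phi + d[i, j]) < eps:
            rels[(i, j)] = "i is parent of leaf j"   # Phi = -d_ij
        elif abs(phi - d[i, j]) < eps:
            rels[(i, j)] = "j is parent of leaf i"   # Phi = +d_ij
        else:
            rels[(i, j)] = "leaf siblings"           # -d_ij < Phi < d_ij
    return rels
```

Full RG then merges each identified sibling group under a new hidden parent, computes the parent's distances via the Φ relations above, and recurses on the reduced node set.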

25 CLGrouping Algorithm

27 Chow-Liu Tree
Minimum spanning tree of V using D as edge weights:
V = set of observed nodes; D = information distances.
Computational complexity O(m^2 log m).
For Gaussian models, MST(V; D) equals the Chow-Liu tree, the tree distribution minimizing KL-divergence to the distribution given by D.
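A sketch of this step for the Gaussian case (an illustrative use of SciPy's MST routine, not the paper's implementation):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_of_information_distances(samples):
    """Build MST(V; D) from samples of the observed variables.

    samples : (n, m) array, one row per joint sample of m observed nodes
    Returns a sparse (m, m) matrix whose nonzero entries are tree edges.
    Assumes no pair is exactly independent (rho != 0) or perfectly
    dependent (|rho| != 1), as required for minimal latent trees.
    """
    rho = np.corrcoef(samples, rowvar=False)  # empirical correlations
    d = -np.log(np.abs(rho))                  # information distances
    np.fill_diagonal(d, 0.0)                  # no self-loops
    return minimum_spanning_tree(d)
```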

28 Surrogate Node
V = set of observed nodes. The surrogate node of i is the observed node closest to i in information distance: Sg(i) = argmin over j in V of d_ij.

29 Property of the Chow-Liu Tree

30 CLGrouping Algorithm Step 1. Using information distances of observed nodes, construct the Chow-Liu tree, MST(V; D). Identify the set of internal nodes {3, 5}.

31 CLGrouping Algorithm Step 2. Select an internal node and its neighbors, and apply the recursive-grouping (RG) algorithm.

32 CLGrouping Algorithm
Step 3. Replace the sub-tree spanning the neighborhood with the output of RG.
This procedure recovers the hidden variables h1 and h2, whose surrogate node is 5, the internal node we selected. This holds in general: applying recursive grouping to an internal node and its neighbors recovers exactly the hidden nodes that have that internal node as their surrogate.

33 CLGrouping Algorithm Repeat Steps 2-3 until all internal nodes are operated on.

34 CLGrouping
Step 1: Construct the Chow-Liu tree MST(V; D).
Step 2: For each internal node and its neighbors, apply a latent-tree-learning subroutine (RG or NJ).
Correctly recovers all minimal latent trees.
Computational complexity O(m^2 log m + (#internal nodes)(maximum degree)^3), typically dominated by the O(m^2 log m) term.

35 Sample-based Algorithms
Compute the ML estimates of the information distances.
Use relaxed constraints for testing node relationships.
Consistent. More details in the paper.
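For the Gaussian case, the ML plug-in estimate replaces the true correlation with the empirical one (assuming zero-mean variables):

```latex
\hat d_{ij} = -\log \lvert \hat\rho_{ij} \rvert,
\qquad
\hat\rho_{ij} =
\frac{\sum_{s=1}^{n} x_i^{(s)} x_j^{(s)}}
     {\sqrt{\sum_{s=1}^{n} \big(x_i^{(s)}\big)^2 \;\sum_{s=1}^{n} \big(x_j^{(s)}\big)^2}}.
```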

36 Experimental Results: Simulations Using Synthetic Datasets
Compares RG, NJ, CLRG, and CLNJ using the Robinson-Foulds metric and KL-divergence.
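As a reminder (a standard definition, not spelled out on the slide), the Robinson-Foulds metric counts the edge-induced bipartitions present in one tree but not the other:

```latex
\mathrm{RF}(T_1, T_2) = \big\lvert \Sigma(T_1) \,\triangle\, \Sigma(T_2) \big\rvert,
```

where Σ(T) is the set of bipartitions induced by removing each edge of T, and △ denotes the symmetric difference.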


40 Performance Comparisons
For a double star, RG is clearly the best. NJ is poor at recovering the HMM structure. CLGrouping performs well on all three structures. Average running time for CLGrouping is under 1 second.

41 Monthly Stock Returns
Monthly returns of 84 companies in the S&P 100; samples from 1990 to 2007.
Latent tree learned using CLNJ.


46 20 Newsgroups with 100 Words
16,242 binary samples of 100 words, each entry indicating whether the word appears in the article.
Latent tree learned using regCLRG.


51 Contributions
We introduced two algorithms for learning latent trees.
Recursive Grouping identifies families at each layer of the tree and introduces hidden nodes recursively.
CLGrouping first constructs the Chow-Liu tree among the observed nodes, then applies latent-tree-learning subroutines such as RG or NJ locally to each internal node and its neighbors.

52 Contributions
Both algorithms are consistent; CLGrouping in particular shows superior experimental results in both accuracy and computational efficiency.
A longer version of the paper, with detailed comparisons to existing methods and proofs of all theoretical results, and a MATLAB implementation are available at the project webpage.

53 Testing Node Relationships
Nodes i and j belong to the same family if and only if Φ_ijk = d_ik − d_jk is constant for all k ≠ i, j.

54 Edge Contractions
Contract edge (h1, x1); then contract edge (h2, x1).

55 CLGrouping and Edge Contractions
The Chow-Liu tree is obtained from the latent tree by contracting each hidden node with its surrogate node. CLGrouping reverses these edge contraction operations.

59 Desired Properties for Algorithms
Consistency: correct recovery given exact distributions.
Computational efficiency.
Sample complexity.

61 CLGrouping Algorithm
In this subtree, h3 has only two neighbors, so it will not be recovered. The remaining tree has two hidden nodes, and since RG recovers the correct minimal latent tree, this is the output of RG.
Step 3. Replace the sub-tree spanning the neighborhood with the output of RG.

