Forest Learning from Data

1 Forest Learning from Data
Joe Suzuki July 17, 2017

2 Road Map
PART-I: July 17, 2017
  A Bayesian Approach to Data Compression
PART-II: July 24 (based on PART-I)
  Estimating Mutual Information (15 mins)
  Learning Forests from Data (25 mins)
  Learning Bayesian Networks from Data (5 mins)
  Exercise (45 mins)

3 Entropy

4 Mutual Information (MI)
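For reference, the standard definitions behind slides 3 and 4 can be written as follows for discrete random variables. The notation ($P_X$, $P_Y$, $P_{XY}$ for the marginal and joint distributions) is an assumption made here, since the slide bodies did not survive in the transcript.

```latex
\begin{align*}
H(X)   &= -\sum_{x} P_X(x)\,\log P_X(x), \\
I(X;Y) &= \sum_{x}\sum_{y} P_{XY}(x,y)\,\log\frac{P_{XY}(x,y)}{P_X(x)\,P_Y(y)}
        = H(X) + H(Y) - H(X,Y) \;\ge\; 0,
\end{align*}
% with I(X;Y) = 0 if and only if X and Y are independent.
```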

5 Correlation may not detect independence!
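A small illustration of the slide title, chosen here as an example and not taken from the slides: let X be uniform on {-1, 0, 1} and Y = X^2. Y is a deterministic function of X, yet the Pearson correlation is zero, while the joint distribution clearly differs from the product of the marginals.

```r
# X is uniform on {-1, 0, 1} and Y = X^2, so Y is completely determined by X.
x <- rep(c(-1, 0, 1), times = 100)
y <- x^2

r <- cor(x, y)                                   # sample Pearson correlation
joint <- table(x, y) / length(x)                 # empirical joint distribution
indep <- outer(rowSums(joint), colSums(joint))   # product of the two marginals

print(r)                          # zero up to rounding error
print(max(abs(joint - indep)))    # largest gap between joint and product: 2/9
```

Mutual information is positive exactly when the joint distribution and the product of the marginals differ, so it detects this dependence where correlation does not.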

6 ML Estimator of MI
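A minimal sketch of the maximum likelihood (plug-in) estimator, assuming it is obtained by replacing the probabilities in the definition of MI with relative frequencies. The function name `mi_ml` is hypothetical and is not part of BNSL; the result is in nats.

```r
# Plug-in (maximum likelihood) estimate of mutual information, in nats.
mi_ml <- function(x, y) {
  stopifnot(length(x) == length(y))
  pxy <- table(x, y) / length(x)      # relative frequencies of the pairs (x_i, y_i)
  px <- rowSums(pxy)                  # relative frequencies of x
  py <- colSums(pxy)                  # relative frequencies of y
  ratio <- pxy / outer(px, py)
  keep <- pxy > 0                     # 0 * log(0) is treated as 0
  sum(pxy[keep] * log(ratio[keep]))
}

x  <- rep(c(0, 1), times = 100)        # 0,1,0,1,...
y2 <- rep(c(0, 0, 1, 1), times = 50)   # every pair (x, y2) occurs equally often
mi_ml(x, x)     # equals log(2): y is a copy of x
mi_ml(x, y2)    # equals 0: the empirical joint is exactly the product
```

On random samples the plug-in value is nonnegative and is rarely exactly zero even when the two sequences are independent, which is one reason to look for an alternative estimator when testing independence.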

7

8 Bayesian Testing of Independence

9

10 Bayesian Estimation of MI
From Stirling’s formula, for large n
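The following is a sketch of one way a Bayesian estimator of MI can be built, under assumptions that the transcript does not confirm: each sequence is scored by its marginal likelihood Q under a Jeffreys (Dirichlet with all parameters 1/2) prior, the estimate is J_n = (1/n) log[Q(x,y) / (Q(x) Q(y))], and independence is declared when J_n <= 0 (equal prior probability for the two hypotheses). The names `log_q` and `mi_bayes` are hypothetical, and this is not claimed to be what `mi()` in BNSL computes.

```r
# Log marginal likelihood of a vector of counts under a Dirichlet(1/2, ..., 1/2)
# (Jeffreys) prior on the cell probabilities.
log_q <- function(counts) {
  k <- length(counts)
  n <- sum(counts)
  lgamma(k / 2) - k * lgamma(1 / 2) + sum(lgamma(counts + 1 / 2)) - lgamma(n + k / 2)
}

# J_n: per-symbol log ratio of the "dependent" and "independent" marginal likelihoods.
mi_bayes <- function(x, y) {
  stopifnot(length(x) == length(y))
  nxy <- table(x, y)                  # joint counts, zero cells included
  (log_q(as.vector(nxy)) - log_q(rowSums(nxy)) - log_q(colSums(nxy))) / length(x)
}

x  <- rep(c(0, 1), times = 100)
y2 <- rep(c(0, 0, 1, 1), times = 50)
mi_bayes(x, x)    # positive: dependence is detected
mi_bayes(x, y2)   # negative: independence is declared
```

Applying Stirling's formula to the gamma functions suggests that, for large n, such an estimate behaves like the plug-in estimate minus a penalty of order (log n)/n, which is what allows it to fall to zero or below when the sequences are independent.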

11 Experiments
500 trials for binary sequences of length n=200

12 BNSL: a CRAN package (J. Suzuki and J. Kawahara, 2017)
BNSL (Bayesian Network Structure Learning) collects research results by Joe Suzuki.
install.packages("BNSL")
library(BNSL)
n=200; p=0.5; x=rbinom(n,1,p); y=rbinom(n,1,p) # two independent binary sequences are generated
mi(x,y, proc=9) # I_n
mi(x,y) # J_n

13 Tree Approximation

14 Factorization w.r.t. A Tree

15 Find E s.t. D(P||P’) is minimized
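The link between slides 14 and 15 can be made explicit with a short derivation. The notation is assumed here (N variables $X_1,\dots,X_N$, vertex set $V$, edge set $E$ of a tree or forest), since the slide bodies are missing from the transcript.

```latex
\begin{align*}
P'(x_1,\dots,x_N)
  &= \prod_{i \in V} P(x_i) \prod_{\{i,j\} \in E} \frac{P(x_i,x_j)}{P(x_i)\,P(x_j)}, \\
D(P \,\|\, P')
  &= \sum_{x} P(x)\,\log\frac{P(x)}{P'(x)}
   = \sum_{i \in V} H(X_i) \;-\; H(X_1,\dots,X_N) \;-\; \sum_{\{i,j\} \in E} I(X_i;X_j).
\end{align*}
% Only the last sum depends on E, so minimizing D(P||P') over trees is the same as
% choosing E to maximize the total mutual information along its edges.
```

This reduces the search for E to a maximum-weight spanning tree problem with the mutual information values as edge weights.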

16 Kruskal’s Algorithm
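A compact sketch of Kruskal's algorithm for a maximum-weight spanning forest, written in base R for illustration. The function name `kruskal_forest` is hypothetical and this is not the `kruskal()` implementation in BNSL; the rule of never joining pairs whose weight is zero or negative is an assumption that makes the result a forest instead of a tree.

```r
# Kruskal's algorithm for a maximum-weight spanning forest.
# w: symmetric matrix of edge weights; pairs with weight <= 0 are never joined.
kruskal_forest <- function(w) {
  p <- nrow(w)
  pairs <- which(upper.tri(w), arr.ind = TRUE)      # all pairs i < j
  ord <- order(w[upper.tri(w)], decreasing = TRUE)  # heaviest edges first
  weights <- w[upper.tri(w)][ord]
  pairs <- pairs[ord, , drop = FALSE]

  parent <- seq_len(p)                              # union-find: each node is its own root
  find <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  edges <- matrix(integer(0), ncol = 2)
  for (k in seq_along(weights)) {
    if (weights[k] <= 0) break                      # all remaining weights are no larger
    ri <- find(pairs[k, 1])
    rj <- find(pairs[k, 2])
    if (ri != rj) {                                 # different components: no cycle is created
      parent[ri] <- rj
      edges <- rbind(edges, pairs[k, ])
    }
  }
  unname(edges)
}

w <- matrix(0, 4, 4)
w[1, 2] <- w[2, 1] <- 0.5
w[2, 3] <- w[3, 2] <- 0.4
w[1, 3] <- w[3, 1] <- 0.3
kruskal_forest(w)   # edges 1-2 and 2-3; 1-3 would close a cycle, node 4 stays isolated
```

The simple `find` without path compression keeps the sketch short; for a few dozen variables, as in the data sets used in these slides, that is more than fast enough.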

17 Chow-Liu Algorithm

18

19

20

21

22 Experiments using Asia data set
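library(BNSL)
library(igraph) # provides graph_from_edgelist()
mm=mi_matrix(asia, proc=9) # I_n is used
edge.list=kruskal(mm)
g=graph_from_edgelist(edge.list, directed=FALSE)
plot(g)
mm=mi_matrix(asia) # J_n is used instead; rerun the three lines above to draw the resulting forest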

23 Asia (8 variables) S. Lauritzen, D. Spiegelhalter. Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 50(2): , 1988

24 Asia Data Set

25 Alarm (37 variables) I. A. Beinlich, H. J. Suermondt, R. M. Chavez, and G. F. Cooper. The ALARM Monitoring System: A Case Study with Two Probabilistic Inference Techniques for Belief Networks. In Proceedings of the 2nd European Conference on Artificial Intelligence in Medicine, pages Springer-Verlag, 1989.

26 Alarm Data Set

27 Learning Bayesian Networks from Data
The number of candidate structures (DAGs) on p nodes grows faster than exponentially in p

28 For p=3 there are 25 DAGs, but only 11 BNs need to be considered, because Markov-equivalent DAGs express the same factorization
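The count of 25 DAGs for p=3, and how quickly the count grows, can be checked with Robinson's recurrence for labeled DAGs, a(m) = sum over k=1..m of (-1)^(k+1) C(m,k) 2^(k(m-k)) a(m-k) with a(0)=1. The recurrence is a standard result that is not stated in the slides, and the function name `count_dags` is hypothetical.

```r
# Number of DAGs on p labeled nodes, via Robinson's recurrence.
count_dags <- function(p) {
  a <- numeric(p + 1)   # a[m + 1] holds the number of DAGs on m nodes
  a[1] <- 1             # the single (empty) DAG on zero nodes
  for (m in seq_len(p)) {
    k <- 1:m
    a[m + 1] <- sum((-1)^(k + 1) * choose(m, k) * 2^(k * (m - k)) * a[m - k + 1])
  }
  a[p + 1]
}

sapply(1:5, count_dags)   # 1 3 25 543 29281
```

The values are held as doubles, so they are exact only for small p; the point of the sketch is the growth rate, not arbitrary-precision counting.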

29

30 7 local scores and 11 global scores

31 Summary
Estimating Mutual Information
Learning Forests from Data
Learning Bayesian Networks from Data

32 Problem Set #2

