Published by Moriah Sproat. Modified over 2 years ago.

Detecting Hidden Variables: A Structure-Based Approach

Gal Elidan, Noam Lotner, Nir Friedman
Hebrew University
{galel,noaml,nir}@huji.ac.il

Daphne Koller
Stanford University
koller@huji.ac.il

ABSTRACT: We examine how to detect hidden variables when learning probabilistic models. This problem is crucial for improving our understanding of the domain and as a preliminary step that guides the learning procedure. A natural approach is to search for "structural signatures" of hidden variables. We make this basic idea concrete, and show how to integrate it with structure-search algorithms. We evaluate this method on several synthetic and real-life datasets, and show that it performs surprisingly well.

What is a Bayesian Network

A Bayesian network represents a joint probability over a set of random variables using a DAG.

[Figure: example network over Visit to Asia, Smoking, Lung Cancer, Tuberculosis, Abnormality in Chest, Bronchitis, X-Ray, Dyspnea]

P(D|A,B) = 0.8
P(D|¬A,B) = 0.1
P(D|A,¬B) = 0.1
P(D|¬A,¬B) = 0.01

P(X1, …, Xn) = P(V) P(S) P(T|V) … P(X|A) P(D|A,B)

Why hidden variables?

[Figure: a network in which X1, X2, X3 are parents of H and Y1, Y2, Y3 are children of H, next to the network over X1, X2, X3, Y1, Y2, Y3 alone that introduces no new independencies]

Representation: The I-map (the minimal structure which implies only independencies that hold in the marginal distribution) is typically complex.
Improve Learning: Detecting the approximate position of a hidden variable is a crucial pre-processing step for the EM algorithm.
Understanding: A true hidden variable improves the quality and order of the explanation.

Characterizing Hidden Variables

[Figure: H with its parents and children]

The following theorem helps us detect structural signatures of hidden variables. When H is removed, the structure that preserves the I-map (not introducing new independencies) has:
a clique over the children of H;
all parents of H connected to all children of H.

The FindHidden Algorithm

Search for semi-cliques by expansion of 3-clique seeds.

[Figure: semi-clique S with N nodes]

Propose a candidate network:
(1) Introduce H as a parent of all nodes in S
(2) Replace all incoming edges to S by edges to H
(3) Remove all inter-S edges
(4) Make all children of S children of H if acyclic

Learning: Structural EM

Training data + current network
E-Step: Computation of expected counts N(X1), N(X2), N(X3), N(H, X1, X2, X3), ...
M-Step: Score & Parameterize
Bayesian scoring metric: [equation lost in extraction]
Re-iterate with best candidate

Applying the algorithm

[Figure: Original network, then FindHidden, then Structural EM, shown on the X1, X2, X3, H, Y1, Y2, Y3 example]

EM was applied with Fixed structure, Frozen structure (modify only the semi-clique neighborhood), and Flexible structure. We choose the best-scoring candidate produced by SEM.

The Alarm network

[Figure: four views of the Alarm network: the original network; HR is hidden and the structure is learned from data; FindHidden breaks the clique; EM adapts the structure. Nodes: PCWP, CO, HRBP, HREKG, HRSAT, ERRCAUTER, HR, HISTORY, CATECHOL, SAO2, EXPCO2, ARTCO2, ANAPHYLAXIS, PVSAT, TPR, LVFAILURE, ERRBLOW, STROKEVOLUME, LVEDVOLUME, HYPOVOLEMIA, CVP, BP, and Hidden in the last two views]

Real-life example: Stock data

[Figure: network with nodes HIDDEN (MARKET TREND), MICROSOFT, DELL, 3Com, COMPAQ, and "all other nodes"]

The hidden variable corresponds to the market trend: Strong vs. Stationary.

Results

Reference: network with no hidden variable.
Original: golden model for artificial datasets; best on test data.
Naive: hidden parent of all nodes; acts as a straw man.
Hidden: best FindHidden network; outperforms Naive and Reference, and exceeds Original on training data.

Efficient Frozen EM performs as well as inefficient Flexible EM.

[Figure: score on training data and log-loss on test data for the Original, Hidden, and Naive networks on the Stock, Tuberculosis, Insurance 1k, Alarm 1k, and Alarm 10k datasets]

Summary and Future Work

We introduced the importance of hidden variables and implemented a natural idea to detect them. FindHidden performed surprisingly well and proved extremely useful as a preliminary step to a learning algorithm.

Further extensions:
Experiment with multi-valued hidden variables
Explore additional structural signatures
Use additional information such as edge confidence
Detect hidden variables when the data is sparse
Explore hidden variables in Probabilistic Relational Models
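
The four candidate-construction steps of the FindHidden algorithm (introduce H, redirect incoming edges to H, remove inter-S edges, adopt the children of S when acyclic) can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the edge-set representation, the names `has_path` and `propose_candidate`, and the toy network with the extra node `Z` are all assumptions made for the example. It also assumes that the original edges from S to its children are kept in step (4), which the poster does not state.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]  # (parent, child)


def has_path(edges: Set[Edge], src: Hashable, dst: Hashable) -> bool:
    """Return True if a directed path from src to dst exists (depth-first search)."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(child for parent, child in edges if parent == node)
    return False


def propose_candidate(edges: Iterable[Edge], S: Set[Hashable], H: Hashable = "H") -> Set[Edge]:
    """Build a candidate network with a new hidden node H for semi-clique S."""
    edges = set(edges)
    outside_parents = {p for p, c in edges if c in S and p not in S}
    outside_children = {c for p, c in edges if p in S and c not in S}

    # Steps (2) and (3): drop every edge that points into S. This removes both
    # the incoming edges from outside S and the edges between members of S.
    candidate = {(p, c) for p, c in edges if c not in S}

    # Step (1): H becomes a parent of every node in S.
    candidate |= {(H, s) for s in S}

    # Step (2), continued: former outside parents of S now point to H.
    candidate |= {(p, H) for p in outside_parents}

    # Step (4): children of S also become children of H, unless the new edge
    # H -> child would close a directed cycle (the child already reaches H).
    for child in outside_children:
        if not has_path(candidate, child, H):
            candidate.add((H, child))
    return candidate


# Toy network (hypothetical): X1..X3 are parents of every Y, the Y's form a
# clique, and Y3 has one extra child Z.
xs, ys = ["X1", "X2", "X3"], ["Y1", "Y2", "Y3"]
network = {(x, y) for x in xs for y in ys}
network |= {("Y1", "Y2"), ("Y1", "Y3"), ("Y2", "Y3"), ("Y3", "Z")}
candidate = propose_candidate(network, S=set(ys))
print(sorted(candidate))
```

On this toy input the clique over Y1, Y2, Y3 is replaced by a single hidden parent, which mirrors the poster's X/H/Y illustration. The sketch does not handle the case where an outside parent of S is also a descendant of S; a fuller version would need an acyclicity check after step (2) as well.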

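The factorization in "What is a Bayesian Network" says that the joint probability is a product of one conditional term per node given its parents. A minimal Python sketch of that idea is shown here on a three-node fragment (A and B as parents of D) that reuses the poster's table for P(D|A,B). The priors for A and B are hypothetical placeholders chosen only so the example runs; in the full example network A is not a root node, and the poster gives no values for it. The names `parents`, `p_true`, `joint`, and `marginal_true` are likewise assumptions.

```python
from itertools import product

# Parents of each variable in the fragment A -> D <- B.
parents = {"A": (), "B": (), "D": ("A", "B")}

# P(variable = True | parent values). The D table is taken from the poster;
# the 0.5 priors for A and B are hypothetical placeholders.
p_true = {
    "A": {(): 0.5},
    "B": {(): 0.5},
    "D": {
        (True, True): 0.8,
        (False, True): 0.1,
        (True, False): 0.1,
        (False, False): 0.01,
    },
}


def joint(assignment):
    """Joint probability of a full assignment: product of P(node | parents)."""
    prob = 1.0
    for var, pa in parents.items():
        p = p_true[var][tuple(assignment[q] for q in pa)]
        prob *= p if assignment[var] else 1.0 - p
    return prob


def marginal_true(var):
    """P(var = True), obtained by summing the joint over all assignments."""
    names = list(parents)
    total = 0.0
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        if assignment[var]:
            total += joint(assignment)
    return total


print(joint({"A": True, "B": True, "D": True}))
print(marginal_true("D"))
```

Brute-force summation is exponential in the number of variables, so it is suitable only for illustrating the factorization on tiny networks.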