Learning Bayes Nets Based on Conditional Dependencies
Oliver Schulte, Simon Fraser University, Vancouver, Canada
with Wei Luo (SFU) and Russ Greiner (U of Alberta)
Outline
1. Brief Intro to Bayes Nets
2. Combining Dependency Information with Model Selection
3. System Architecture
4. Theoretical Analysis and Simulation Results
Bayes Nets: Overview
Bayes Net Structure = Directed Acyclic Graph.
Nodes = Variables of Interest.
Arcs = direct “influence”, “association”.
Parameters = CP Tables = Prob of Child given Parents.
Structure represents (in)dependencies.
Structure + parameters represent a joint probability distribution over the variables.
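As a minimal sketch of how structure plus CP tables determine a joint distribution, the Python fragment below multiplies P(child | parents) over all nodes. The three-variable network A -> C <- B and all probability values are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical network A -> C <- B with binary variables (not from the slides).
parents = {"A": [], "B": [], "C": ["A", "B"]}

# CP tables: map a tuple of parent values to P(node = 1 | parents).
cpt = {
    "A": {(): 0.3},
    "B": {(): 0.6},
    "C": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.9},
}

def joint_probability(assignment):
    """Multiply P(node | parents) over all nodes for a full assignment."""
    prob = 1.0
    for node, pars in parents.items():
        p_one = cpt[node][tuple(assignment[p] for p in pars)]
        prob *= p_one if assignment[node] == 1 else 1.0 - p_one
    return prob

# Summing the joint over all 2^3 assignments should give 1.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([0, 1], repeat=len(parents))
)
```

The check that `total` equals 1 is a quick sanity test that the CP tables define a proper distribution.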
Examples from CIspace (UBC)
Graphs entail Dependencies
[Figure: three example graphs over nodes A, B, C, annotated with the dependencies they entail.]
Dep(A,B), Dep(A,B|C)
Dep(A,B), Dep(A,B|C), Dep(B,C), Dep(B,C|A), Dep(A,C|B)
I-maps and Probability Distributions
Definition: Graph G is an I-map of probability distribution P ⇔ if Dependent(X,Y|S) in P, then X is d-connected to Y given S in G.
Example: If Dependent(Father Eye Color, Mother Eye Color | Child Eye Color) in P, then Father EC is d-connected to Mother EC given Child EC in G.
G is an I-map of P ⇔ G entails all conditional dependencies in P.
Theorem: Fix G, P. There is a parameter setting θ for G such that (G, θ) represents P ⇔ G is an I-map of P.
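The d-connection relation used in the I-map definition can be checked mechanically. The sketch below uses one standard method (restrict to ancestors, moralize, remove the conditioning set, test reachability); the function names and the dictionary encoding of the graph are assumptions made for this example, and it assumes X and Y are not in the conditioning set.

```python
def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen

def d_connected(parents, x, y, given=()):
    """True if x and y are d-connected given the set `given` in the DAG."""
    given = set(given)
    keep = ancestors(parents, {x, y} | given)
    # Moralize the ancestral subgraph: link child-parent and parent-parent pairs.
    adj = {n: set() for n in keep}
    for child in keep:
        pars = parents[child]
        for p in pars:
            adj[child].add(p)
            adj[p].add(child)
            for q in pars:
                if p != q:
                    adj[p].add(q)
    # x and y are d-connected iff some path avoids the conditioning set.
    seen, stack = {x}, [x]
    while stack:
        n = stack.pop()
        if n == y:
            return True
        for m in adj[n]:
            if m not in seen and m not in given:
                seen.add(m)
                stack.append(m)
    return False

# Eye-color example: Father -> Child <- Mother.
eye_color = {"Father": [], "Mother": [], "Child": ["Father", "Mother"]}
```

With this graph, `d_connected(eye_color, "Father", "Mother", ["Child"])` is true while the unconditional query is false, matching the eye-color example.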
Two Approaches to Learning Bayes Net Structure
Aim: find G that represents P with suitable parameters.
“Search and score”: select graph G as “model” with parameters to be estimated.
Test for dependencies, cover: find G that represents (in)dependencies in P.
Our Hybrid Approach
[Diagram labels: Sample; Set of (In)Dependencies; Final Output Graph]
The final selected graph maximizes a model selection score and covers all observed (in)dependencies.
Definition of Hybrid Criterion
Let d be a sample. Let S(G,d) be a score function. Let Dep be a set of conditional dependencies extracted from sample d.
Graph G optimizes score S given Dep and sample d ⇔
1. G entails the dependencies Dep, and
2. if any other graph G’ entails Dep, then S(G,d) ≥ S(G’,d).
[Figure: graph over A, B, C; sample with Case 1, Case 2, Case 3; score S 10.5]
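Read as a procedure over a finite set of candidates, the criterion filters out graphs that fail to entail Dep and then maximizes the score over the rest. The sketch below illustrates this; the candidate names G0, G1, G2, their scores, and their entailed-dependency sets are hypothetical, and `hybrid_select` is a name introduced only for this example.

```python
def dep(x, y, given=()):
    """Encode Dep(x, y | given) as a hashable, order-independent pair."""
    return (frozenset([x, y]), frozenset(given))

def hybrid_select(candidates, observed):
    """Return the best-scoring candidate whose entailed set contains `observed`.

    `candidates` maps a graph name to (score, set of entailed dependencies).
    Returns None if no candidate entails every observed dependency.
    """
    feasible = {
        name: score
        for name, (score, entailed) in candidates.items()
        if observed <= entailed
    }
    return max(feasible, key=feasible.get) if feasible else None

# Hypothetical candidates with illustrative scores.
candidates = {
    "G0": (7.0, set()),
    "G1": (5.0, {dep("A", "B"), dep("A", "B", ["C"])}),
    "G2": (4.0, {dep("A", "B"), dep("A", "B", ["C"]), dep("B", "C"),
                 dep("B", "C", ["A"]), dep("A", "C", ["B"])}),
}
```

Without constraints the highest score wins (G0 here); once Dep(A,B) is observed, G0 is ruled out and the best remaining graph is G1.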
Local Search Heuristics for Constrained Search
There is a general method for adapting any local search heuristic to accommodate observed dependencies. We present an adaptation of GES search, which we call IGES.
GES Search (Meek, Chickering)
Grow Phase: Add Edges. Shrink Phase: Delete Edges.
[Figure: sequence of graphs over A, B, C with Score = 5, Score = 7, Score = 8, Score = 8.5, Score = 9]
IGES Search
Step 1: Extract Dependencies From Sample.
[Diagram labels: Case 1, Case 2, Case 3; Testing Procedure; Dependencies]
1. Continue with Growth Phase until all dependencies are covered.
2. During Shrink Phase, delete edge only if dependencies are still covered.
[Figure: two graphs over A, B, C, one with Score = 7 and one with Score = 5 given Dep(A,B)]
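The two rules above can be sketched as a constrained greedy search. This is a deliberate simplification and not the actual GES or IGES procedure: it searches over sets of undirected edges rather than DAG equivalence classes, and the `score` and `covers` callables are hypothetical stand-ins supplied by the caller.

```python
from itertools import combinations

def constrained_greedy_search(nodes, score, covers):
    """Grow while the score improves or dependencies are uncovered, then shrink."""
    all_edges = [frozenset(pair) for pair in combinations(nodes, 2)]
    edges = frozenset()
    # Grow phase: keep adding the best edge until no gain AND all covered.
    while True:
        candidates = [edges | {e} for e in all_edges if e not in edges]
        if not candidates:
            break
        best = max(candidates, key=score)
        if score(best) <= score(edges) and covers(edges):
            break
        edges = best
    # Shrink phase: delete an edge only if the dependencies stay covered.
    while True:
        candidates = [edges - {e} for e in edges if covers(edges - {e})]
        if not candidates:
            break
        best = max(candidates, key=score)
        if score(best) <= score(edges):
            break
        edges = best
    return edges

AB = frozenset(["A", "B"])

def toy_score(edges):
    # Hypothetical score: each edge costs 1, the A-B edge earns back 0.5.
    return -len(edges) + (0.5 if AB in edges else 0.0)

def toy_covers(edges):
    # Stand-in for "Dep(A,B) is covered": here, simply that A-B is present.
    return AB in edges

constrained = constrained_greedy_search(["A", "B", "C"], toy_score, toy_covers)
unconstrained = constrained_greedy_search(["A", "B", "C"], toy_score, lambda e: True)
```

In this toy setting the unconstrained search stays at the empty graph because no single edge raises the score, while the constrained search accepts a lower score in order to cover Dep(A,B).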
Asymptotic Equivalence GES = IGES
Theorem: Assume that score function S is consistent and that joint probability distribution P satisfies the composition principle. Let Dep be a set of dependencies true of P. Then with P-probability 1, GES and IGES+Dep converge to the same output in the sample size limit.
So IGES inherits the convergence properties of GES.
Extracting Dependencies
We use the χ² test (with cell coverage condition).
Exhaustive testing of all triples Indep(X,Y|S) for cardinality(S) < k, with k chosen by the user.
More sophisticated testing strategy coming soon.
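A minimal sketch of a χ² dependence test for two binary variables at the 5% level follows. It covers only the unconditional case (empty S). The slides do not spell out the cell coverage condition, so the rule used here, a minimum expected count of 5 per cell, is an assumption, as is the function name.

```python
from collections import Counter

# Chi-square critical value for 1 degree of freedom at the 5% level.
CRITICAL_5PCT_DF1 = 3.841

def chi2_dependent(xs, ys, min_expected=5.0):
    """Report Dep(X,Y) for two binary samples via a 2x2 chi-square test.

    Returns False when any expected cell count is below `min_expected`
    (an assumed reading of the cell coverage condition).
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_counts, y_counts = Counter(xs), Counter(ys)
    stat = 0.0
    for a in (0, 1):
        for b in (0, 1):
            expected = x_counts[a] * y_counts[b] / n
            if expected < min_expected:
                return False  # too little data in this cell to trust the test
            stat += (joint[(a, b)] - expected) ** 2 / expected
    return stat > CRITICAL_5PCT_DF1

# Hypothetical data: ys mostly copies xs in the first case.
xs = [0] * 50 + [1] * 50
ys_related = [0] * 40 + [1] * 10 + [0] * 10 + [1] * 40
ys_unrelated = [0, 1] * 50
```

A conditional test Indep(X,Y|S) could reuse the same statistic within each configuration of S, which is not shown here.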
Simulation Setup: Methods
The hybrid approach is a general schema. Our Setup:
Statistical Test: χ², significance level 5%.
Score S: BDeu (with Tetrad default settings).
Search Method: GES, adapted.
Simulation Setup: Graphs and Data
Random DAGs with binary variables.
#Nodes: 4, 6, 8, 10.
Sample Sizes: 100, 200, 400, 800, 1600, 3200, 6400, 12800.
Random samples per graph per sample size; results are averaged.
Graphs generated with Tetrad’s random DAG utility.
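For readers who want to reproduce a setup of this kind without Tetrad, the sketch below generates a random DAG over binary variables and draws samples from it. It is not Tetrad's random DAG utility: the edge probability, the uniformly random CP tables, and all function names are assumptions made for this example.

```python
import random
from itertools import product

def random_dag(n_nodes, edge_prob, rng):
    """Pick a random node order and add each forward edge with `edge_prob`."""
    order = list(range(n_nodes))
    rng.shuffle(order)
    parents = {v: [] for v in range(n_nodes)}
    for i, child in enumerate(order):
        for parent in order[:i]:
            if rng.random() < edge_prob:
                parents[child].append(parent)
    return order, parents

def random_cpts(parents, rng):
    """Draw P(node = 1 | parent configuration) uniformly at random."""
    return {
        v: {cfg: rng.random() for cfg in product([0, 1], repeat=len(ps))}
        for v, ps in parents.items()
    }

def draw_samples(order, parents, cpts, n, rng):
    """Sample each node after its parents, following the topological order."""
    rows = []
    for _ in range(n):
        row = {}
        for v in order:
            p_one = cpts[v][tuple(row[p] for p in parents[v])]
            row[v] = 1 if rng.random() < p_one else 0
        rows.append(row)
    return rows

rng = random.Random(0)
order, parents = random_dag(6, 0.3, rng)
cpts = random_cpts(parents, rng)
data = draw_samples(order, parents, cpts, 100, rng)
```

Because edges only run from earlier to later nodes in the shuffled order, the generated graph is acyclic by construction.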
Show Some Graphs
Conclusion for I-map learning: The Underfitting Zone
Although not explicitly designed to cover statistically significant correlations, GES+BDeu does so fairly well, but not perfectly. IGES therefore helps to add in missing edges (on the order of 5) for 10-node graphs.
[Figure: divergence from true graph against sample size, for standard search + score and constrained S + S. Small sample size: little significance. Medium: underfitting of correlations. Large: convergence zone.]
Future Work: More Efficient Testing Strategy
Say that a graph G satisfies the Markov condition wrt sample d if, for all X, Y: if Y is a nonparental nondescendant of X, then we do not find Dep(X,Y|parents(X)).
Given sample d, look for graph G that maximizes score and satisfies the MC wrt d.
Requires only (#Var)² tests.
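To make the test count concrete, the sketch below lists the tests that the Markov condition calls for in a given graph: one test of Dep(X,Y|parents(X)) per ordered pair where Y is a nonparental nondescendant of X. The function names and the example graph are hypothetical; the number of tests is at most #Var × (#Var − 1), which is within the (#Var)² bound.

```python
def descendants(children, x):
    """Return all proper descendants of x."""
    seen, stack = set(), list(children[x])
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(children[n])
    return seen

def markov_condition_tests(parents):
    """List (X, Y, parents(X)) for every nonparental nondescendant Y of X."""
    children = {n: [] for n in parents}
    for child, pars in parents.items():
        for p in pars:
            children[p].append(child)
    tests = []
    for x in parents:
        desc = descendants(children, x)
        for y in parents:
            if y != x and y not in desc and y not in parents[x]:
                tests.append((x, y, tuple(parents[x])))
    return tests

# Hypothetical graph A -> C <- B.
graph = {"A": [], "B": [], "C": ["A", "B"]}
tests = markov_condition_tests(graph)
```

For this graph the only required tests are Dep(A,B) and Dep(B,A) with an empty conditioning set, since C has no nonparental nondescendants.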
Summary: Hybrid Criterion - test, search and score.
Basic Idea: Base Bayes net learning on dependencies that can be reliably obtained even on small to medium sample sizes.
Hybrid criterion: find the graph that maximizes a model selection score given the constraint of entailing statistically significant dependencies or correlations.
Theory and simulation evidence suggest that this speeds up convergence to the correct graph and addresses underfitting on small-medium samples.
THE END