Learning Bayes Nets Based on Conditional Dependencies. Oliver Schulte, Simon Fraser University, Vancouver, Canada, with Wei Luo (SFU) and Russ Greiner (U of Alberta)
Outline: 1. Brief Intro to Bayes Nets. 2. Combining Dependency Information with Model Selection. 3. System Architecture. 4. Theoretical Analysis and Simulation Results.
Bayes Nets: Overview. Bayes Net Structure = Directed Acyclic Graph. Nodes = Variables of Interest. Arcs = direct “influence”, “association”. Parameters = CP Tables = Probability of Child given Parents. The structure represents (in)dependencies. Structure + parameters represent a joint probability distribution over the variables.
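To make "structure + parameters represent a joint distribution" concrete, here is a minimal sketch. The network (Rain, Sprinkler, WetGrass) and all CP table values are hypothetical and are not taken from the slides; the point is only that the joint probability of a full assignment is the product of P(child | parents) over all nodes.

```python
from itertools import product

# Hypothetical structure (an assumption for illustration): Rain -> WetGrass <- Sprinkler.
parents = {"Rain": [], "Sprinkler": [], "WetGrass": ["Rain", "Sprinkler"]}

# Hypothetical CP tables: map a tuple of parent values to P(node = True | parents).
cpt = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.05,
    },
}


def joint_probability(assignment):
    """Return P(assignment) as the product of P(node | parents) over all nodes."""
    prob = 1.0
    for node, pars in parents.items():
        p_true = cpt[node][tuple(assignment[p] for p in pars)]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


# Summing over all 2^3 assignments should give 1 for a valid distribution.
names = list(parents)
total = sum(
    joint_probability(dict(zip(names, values)))
    for values in product([True, False], repeat=len(names))
)
```

For example, `joint_probability({"Rain": True, "Sprinkler": False, "WetGrass": True})` multiplies 0.2, 0.9 and 0.9.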
Examples from CIspace (UBC)
Graphs entail Dependencies. [Figure: three example graphs over nodes A, B, C, annotated with the dependencies they entail.] Dependency sets shown: {Dep(A,B), Dep(A,B|C)} and {Dep(A,B), Dep(A,B|C), Dep(B,C), Dep(B,C|A), Dep(A,C|B)}.
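The dependencies a graph entails can be enumerated mechanically with a d-connection check. The sketch below uses the standard moralized-ancestral-graph test for d-separation. The collider A -> B <- C used in the usage note is an assumption about what the slide's figure might show (the figure itself is not recoverable), and the function names are illustrative.

```python
from itertools import combinations


def ancestral_set(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_connected(parents, x, y, s):
    """True if x and y are d-connected given the set s in the DAG `parents`."""
    s = set(s)
    keep = ancestral_set(parents, {x, y} | s)
    # Moralize the ancestral subgraph: link each child to its parents and
    # link every pair of parents that share a child.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # x and y are d-connected iff a path joins them that avoids s.
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        if node == y:
            return True
        for m in adj[node]:
            if m not in seen and m not in s:
                seen.add(m)
                stack.append(m)
    return False


def entailed_dependencies(parents, variables):
    """Return all triples (x, y, s) such that the graph entails Dep(x, y | s)."""
    deps = set()
    for x, y in combinations(sorted(variables), 2):
        rest = sorted(v for v in variables if v not in (x, y))
        for k in range(len(rest) + 1):
            for s in combinations(rest, k):
                if d_connected(parents, x, y, s):
                    deps.add((x, y, s))
    return deps


collider = {"B": ["A", "C"]}  # A -> B <- C (assumed example)
collider_deps = entailed_dependencies(collider, ["A", "B", "C"])
```

Under these assumptions, `collider_deps` contains Dep(A,B), Dep(A,B|C), Dep(B,C), Dep(B,C|A) and Dep(A,C|B), but not Dep(A,C), since the collider B blocks the path between A and C when nothing is conditioned on.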
I-maps and Probability Distributions. Definition: Graph G is an I-map of probability distribution P if, whenever Dependent(X,Y|S) holds in P, X is d-connected to Y given S in G. Example: If Dependent(Father Eye Color, Mother Eye Color | Child Eye Color) in P, then Father EC is d-connected to Mother EC given Child EC in G. G is an I-map of P ⇔ G entails all conditional dependencies in P. Theorem: Fix G, P. There is a parameter setting θ for G such that (G, θ) represents P ⇔ G is an I-map of P.
Two Approaches to Learning Bayes Net Structure. Aim: find G that represents P with suitable parameters. Approach 1, “search and score”: select graph G as a “model” with parameters to be estimated. Approach 2, test for dependencies and cover them: find G that represents the (in)dependencies in P.
Our Hybrid Approach. [Diagram boxes: Sample; Set of (In)Dependencies; Final Output Graph.] The final selected graph maximizes a model selection score and covers all observed (in)dependencies.
Definition of Hybrid Criterion. Let d be a sample. Let S(G,d) be a score function. [Figure: a sample d with cases 1 to 3 and a graph over A, B, C with an example score S = 10.5.] Let Dep be a set of conditional dependencies extracted from sample d. Graph G optimizes score S given Dep and sample d ⇔ 1. G entails the dependencies Dep, and 2. if any other graph G’ entails Dep, then S(G,d) ≥ S(G’,d).
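The hybrid criterion can be read as "filter, then maximize". The sketch below assumes a small, explicitly listed candidate set in which each graph comes with its score S(G,d) and its set of entailed dependencies already computed. The graph names, scores and the function name `hybrid_optimum` are made up for illustration.

```python
def hybrid_optimum(candidates, dep):
    """Return the name of the best-scoring candidate that entails every dependency in dep.

    candidates maps a graph name to (score, set of entailed dependencies).
    Returns None if no candidate entails dep.
    """
    admissible = [
        (score, name)
        for name, (score, entailed) in candidates.items()
        if set(dep) <= entailed
    ]
    return max(admissible)[1] if admissible else None


# Hypothetical candidates over A, B, C; a dependency Dep(X,Y|S) is written (X, Y, S).
candidates = {
    "empty": (7.0, frozenset()),
    "A->B": (5.0, frozenset({("A", "B", ()), ("A", "B", ("C",))})),
    "A->B<-C": (
        4.0,
        frozenset({
            ("A", "B", ()), ("A", "B", ("C",)),
            ("B", "C", ()), ("B", "C", ("A",)),
            ("A", "C", ("B",)),
        }),
    ),
}

unconstrained = hybrid_optimum(candidates, set())
constrained = hybrid_optimum(candidates, {("A", "B", ())})
```

With no observed dependencies, the plain score maximizer (here the empty graph) wins; once Dep(A,B) is required, the empty graph is no longer admissible and the best remaining graph is selected.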
Local Search Heuristics for Constrained Search. There is a general method for adapting any local search heuristic to accommodate observed dependencies. We present an adaptation of GES search, which we call IGES.
GES Search (Meek, Chickering). Grow Phase: add edges. Shrink Phase: delete edges. [Figure: a sequence of graphs over A, B, C visited by the search, with scores 5, 7, 8, 8.5, 9.]
IGES Search. Step 1: Extract dependencies from the sample (sample cases, testing procedure, dependencies). Then: 1. Continue with the Growth Phase until all dependencies are covered. 2. During the Shrink Phase, delete an edge only if the dependencies are still covered. [Figure: two graphs over A, B, C, labeled Score = 7 and Score = 5, given Dep(A,B).]
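The two rules above can be illustrated with a toy constrained greedy search. This is not GES or IGES: it searches over undirected edge sets rather than equivalence classes of DAGs, it treats a dependency Dep(X,Y) as "covered" when X and Y are joined by a path (a deliberate simplification of d-connection), and the score is a made-up function that simply penalizes edges. All names are hypothetical.

```python
from itertools import combinations


def path_connected(graph, dep):
    """Simplified coverage test: is there an undirected path between dep's endpoints?"""
    x, y = dep
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        for edge in graph:
            if node in edge:
                (other,) = edge - {node}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return y in seen


def constrained_greedy(nodes, score, deps, covers=path_connected):
    """Greedy grow/shrink search that respects the observed dependencies."""
    all_edges = [frozenset(pair) for pair in combinations(sorted(nodes), 2)]
    graph = frozenset()
    # Grow phase: keep adding the best edge while the score improves
    # OR while some observed dependency is still uncovered.
    while True:
        additions = [graph | {e} for e in all_edges if e not in graph]
        if not additions:
            break
        best = max(additions, key=score)
        uncovered = not all(covers(graph, d) for d in deps)
        if score(best) > score(graph) or uncovered:
            graph = best
        else:
            break
    # Shrink phase: delete an edge only if the score improves
    # AND all dependencies remain covered.
    while True:
        deletions = [graph - {e} for e in all_edges if e in graph]
        allowed = [g for g in deletions if all(covers(g, d) for d in deps)]
        if not allowed:
            break
        best = max(allowed, key=score)
        if score(best) > score(graph):
            graph = best
        else:
            break
    return graph


def toy_score(graph):
    """Made-up score that always prefers fewer edges (mimics underfitting)."""
    return -len(graph)


plain = constrained_greedy("ABC", toy_score, deps=[])
constrained = constrained_greedy("ABC", toy_score, deps=[("A", "B")])
```

With this edge-averse toy score the unconstrained run returns the empty graph, while the run given Dep(A,B) is forced to keep an edge between A and B even though that lowers the score.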
Asymptotic Equivalence GES = IGES. Theorem: Assume that the score function S is consistent and that the joint probability distribution P satisfies the composition principle. Let Dep be a set of dependencies true of P. Then with P-probability 1, GES and IGES+Dep converge to the same output in the sample size limit. So IGES inherits the convergence properties of GES.
Extracting Dependencies. We use the χ² test (with a cell coverage condition). Exhaustive testing of all triples Indep(X,Y|S) for cardinality(S) < k, with k chosen by the user. A more sophisticated testing strategy is coming soon.
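As a rough illustration of the testing step, here is a minimal sketch of a marginal χ² test (empty conditioning set S) for two binary variables, written with the standard library only. The slides do not define the cell coverage condition; the rule used here (every expected cell count at least 5) is a common convention and is an assumption, as are the function and parameter names.

```python
import math
from collections import Counter


def chi_square_dependence(xs, ys, alpha=0.05, min_expected=5.0):
    """Test Dep(X, Y) for two 0/1 sequences of equal length.

    Returns (dependent, statistic, p_value). If any expected cell count is
    below min_expected (assumed coverage rule), returns (None, None, None).
    """
    n = len(xs)
    observed = Counter(zip(xs, ys))
    x_totals, y_totals = Counter(xs), Counter(ys)
    statistic = 0.0
    for a in (0, 1):
        for b in (0, 1):
            expected = x_totals[a] * y_totals[b] / n
            if expected < min_expected:
                return None, None, None
            statistic += (observed[(a, b)] - expected) ** 2 / expected
    # A 2x2 table has 1 degree of freedom, for which the chi-square
    # survival function is erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return p_value < alpha, statistic, p_value


xs = [0] * 50 + [1] * 50
copy_result = chi_square_dependence(xs, xs)               # Y is a copy of X
balanced_result = chi_square_dependence(xs, [0, 1] * 50)  # Y balanced within each X value
```

A conditional test Indep(X,Y|S) would repeat this within each configuration of S and combine the statistics and degrees of freedom, which this sketch leaves out.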
Simulation Setup: Methods. The hybrid approach is a general schema. Our setup: Statistical Test: χ², significance level 5%. Score S: BDeu (with Tetrad default settings). Search Method: GES, adapted.
Simulation Setup: Graphs and Data. Random DAGs with binary variables. #Nodes: 4, 6, 8, 10. Sample sizes: 100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600. 10 random samples per graph per sample size, with results averaged. Graphs generated with Tetrad’s random DAG utility.
Conclusion for I-map learning: The Underfitting Zone. Although not explicitly designed to cover statistically significant correlations, GES+BDeu does so fairly well, but not perfectly, so IGES helps to add in missing edges (on the order of 5) for 10-node graphs. [Figure: divergence from the true graph as a function of sample size, for standard search + score and for constrained search + score. Small sample size: little significance. Medium: underfitting of correlations. Large: convergence zone.]
Future Work: More Efficient Testing Strategy. Say that a graph G satisfies the Markov condition (MC) wrt sample d ⇔ for all X, Y: if Y is a nonparental nondescendant of X, then we do not find Dep(X,Y|parents(X)) in d. Given sample d, look for a graph G that maximizes the score and satisfies the MC wrt d. Requires only (#Var)² tests.
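To see why the number of tests is bounded by (#Var)², the sketch below lists the tests the Markov condition calls for in a given DAG: one test Dep(X, Y | parents(X)) per ordered pair (X, Y) where Y is a nonparental nondescendant of X. The function names and the chain A -> B -> C are hypothetical, and the sketch assumes every node appears as a key of `parents`.

```python
def descendants(children, x):
    """Return all nodes reachable from x by following child links."""
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        for c in children[node]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen


def markov_condition_tests(parents):
    """List the triples (X, Y, parents(X)) to test for the Markov condition.

    parents maps every node to the list of its parents.
    """
    nodes = sorted(parents)
    children = {n: [] for n in nodes}
    for child, ps in parents.items():
        for p in ps:
            children[p].append(child)
    tests = []
    for x in nodes:
        desc = descendants(children, x)
        for y in nodes:
            if y != x and y not in parents[x] and y not in desc:
                tests.append((x, y, tuple(sorted(parents[x]))))
    return tests


chain = {"A": [], "B": ["A"], "C": ["B"]}  # A -> B -> C (assumed example)
chain_tests = markov_condition_tests(chain)
```

For the chain, the only required test is Dep(C, A | B). At most one test is generated per ordered pair of distinct variables, so the count never exceeds (#Var)².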
Summary: Hybrid Criterion (test, search and score). Basic Idea: Base Bayes net learning on dependencies that can be reliably obtained even with small to medium sample sizes. Hybrid criterion: find the graph that maximizes the model selection score under the constraint of entailing the statistically significant dependencies or correlations. Theory and simulation evidence suggest that this (1) speeds up convergence to the correct graph and (2) addresses underfitting on small to medium samples.