Published by Antonia Potter. Modified about 1 year ago.
Experiments

We measured the running time (in seconds) and the number of expanded nodes with BFBnB, compared to the previous heuristic.

Dynamic Programming

Intuition. All DAGs must have a leaf. Optimal networks for a single variable are trivial. Recursively add new leaves and select their optimal parents until all variables have been added. All orderings have to be considered.

Recurrences.

Bayesian Network Structure Learning

Representation. Joint probability distribution over a set of variables.
Structure. DAG storing conditional dependencies. Vertices correspond to variables. Edges indicate relationships among variables.
Parameters. Conditional probability distributions.
Learning. Find the network with the minimal score for a complete dataset D. We often omit D for brevity.

Begin with a single variable. Pick one variable as a leaf. Find its optimal parents. Pick another leaf. Find its optimal parents from the current variables. Continue picking leaves and finding optimal parents.

Graph Search Formulation

The dynamic programming can be visualized as a search through an order graph.

The Order Graph

Calculation. Score(U), best subnetwork for U.
Node. Score(U) for U.
Successor. Add X as a leaf to U.
Path. Induces an ordering on variables.
Size. $2^n$ nodes, one for each subset.

Admissible Heuristic Search Formulation

Start Node. Top node, {}.
Goal Node. Bottom node, V.
Shortest Path. Corresponds to optimal structure.
g(U). Score(U).
h(U). Relax acyclicity.

Tightening Lower Bound

The lower bound was calculated from a pattern database heuristic called the k-cycle conflict heuristic. In particular, the static k-cycle conflict pattern database was shown to perform well.

Computing k-Cycle Conflict Heuristic. Its main idea is to relax the acyclicity constraint between groups of variables; acyclicity is enforced among the variables within each group. For an 8-variable problem, partition all the variables by Simple Grouping (SG) into two groups: $G_1 = \{X_1, X_2, X_3, X_4\}$, $G_2 = \{X_5, X_6, X_7, X_8\}$.
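The dynamic programming over the order graph described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used recurrence Score(U) = min over X in U of Score(U \ {X}) + BestScore(X, U \ {X}), and the names `best_score`, `optimal_network_score`, `TOY` and `toy_local_score` are hypothetical. The toy local scores are made up purely so the example runs; they are not derived from any dataset.

```python
from itertools import combinations

def best_score(x, candidates, local_score):
    """BestScore(x, candidates): minimal local score of x over all
    parent sets drawn from `candidates` (exhaustive, for illustration)."""
    cands = sorted(candidates)
    best = float("inf")
    for k in range(len(cands) + 1):
        for parents in combinations(cands, k):
            best = min(best, local_score(x, frozenset(parents)))
    return best

def optimal_network_score(variables, local_score):
    """Score(V) computed layer by layer over the order graph:
    one entry per subset U, 2^n entries in total."""
    variables = sorted(variables)
    score = {frozenset(): 0.0}  # start node: the empty set
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            u = frozenset(subset)
            # Try every x in U as the leaf added last.
            score[u] = min(
                score[u - {x}] + best_score(x, u - {x}, local_score)
                for x in u
            )
    return score[frozenset(variables)]  # goal node: all variables

# Hypothetical toy scores (assumption): a few cheap parent sets,
# everything else costs 10 plus one per parent.
TOY = {
    ("B", frozenset("A")): 4.0,
    ("A", frozenset("B")): 4.0,
    ("C", frozenset("AB")): 3.0,
}

def toy_local_score(x, parents):
    return TOY.get((x, parents), 10.0 + len(parents))

result = optimal_network_score("ABC", toy_local_score)
```

In this toy setting A and B cannot both take each other as a parent, because every path through the order graph fixes an ordering; that is how acyclicity is enforced by construction.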
We created the pattern databases with a backward breadth-first search in the order graph for each group.

E.g., how to calculate the heuristic for pattern $\{X_2, X_3, X_5, X_7\}$?

$P_1 = h_1(\{X_2, X_3\}) = \mathrm{BestScore}(X_2, \{X_1, X_4\} \cup G_2) + \mathrm{BestScore}(X_3, \{X_1, X_2, X_4\} \cup G_2)$

$P_2 = h_2(\{X_5, X_7\}) = \mathrm{BestScore}(X_5, \{X_6, X_8\} \cup G_1) + \mathrm{BestScore}(X_7, \{X_5, X_6, X_8\} \cup G_1)$

Additive pattern database heuristic: $h(\{X_2, X_3, X_5, X_7\}) = h_1(\{X_2, X_3\}) + h_2(\{X_5, X_7\}) = P_1 + P_2$

Selected References

1. Yuan, C.; Malone, B.; and Wu, X. Learning optimal Bayesian networks using A* search. In IJCAI '11.
2. Malone, B.; and Yuan, C. Improving the Scalability of Optimal Bayesian Network Learning with Frontier Breadth-First Branch and Bound Search. In UAI '11.
3. Felner, A.; Korf, R. E.; and Hanan, S. Additive pattern database heuristics. Journal of Artificial Intelligence Research (JAIR).
4. Malone, B.; and Yuan, C. Evaluating Anytime Algorithms for Learning Optimal Bayesian Networks. In UAI '13.

TIGHTENING BOUNDS FOR BAYESIAN NETWORKS STRUCTURE LEARNING
Xiannian Fan, Changhe Yuan and Brandon Malone

A recent breadth-first branch and bound algorithm (BFBnB) for learning Bayesian network structures (Malone et al. 2011) uses two bounds to prune the search space for better efficiency: one is a lower bound calculated from pattern database heuristics, and the other is an upper bound obtained by a hill climbing search. Whenever the lower bound of a search path exceeds the upper bound, the path is guaranteed to lead to suboptimal solutions and is discarded immediately. This paper introduces methods for tightening the bounds. The lower bound is tightened by using more informed variable groupings in creating the pattern databases, and the upper bound is tightened using an anytime learning algorithm. Empirical results show that these bounds improve the efficiency of Bayesian network learning by two to three orders of magnitude.

Tightening Upper Bound

Anytime window A* (AWA*) was shown to find high-quality, often optimal, solutions very quickly, and thus provides a tight upper bound.

More Informed Grouping Strategies

Rather than use SG (first half vs. second half grouping), we developed more informed grouping strategies.

1. Maximizing the correlation between the variables within each group, and minimizing the correlation between groups.
a) Family Grouping (FG): We created a correlation graph with the Max-Min Parents and Children (MMPC) algorithm, weighted its edges by negative p-value, and then performed graph partitioning.
b) Parents Grouping (PG): We created a correlation graph by considering, for each variable, only its optimal parent set out of all the other variables, weighted its edges by negative p-value, and then performed graph partitioning.
2. Using topological ordering information.
a) Topology Grouping (TG): We created a correlation graph by considering the topological ordering of an anytime Bayesian network solution found by AWA*, then partitioned the variables according to the ordering.

The effect of upper bounds generated by running AWA* for different amounts of time on the performance of BFBnB search.

The effect of different grouping strategies on the number of expanded nodes and running time. The four grouping methods are simple grouping (SG), FG, PG, and TG.
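As a rough illustration of the additive pattern database heuristic and the bound-based pruning described in this poster, the sketch below builds one pattern database per group and sums the lookups. It is a sketch under stated assumptions, not the authors' code: a pattern is taken to be the set of a group's variables not yet added to the network, each added variable may choose parents from its group's already placed variables plus every variable outside the group, and `build_pattern_database`, `additive_heuristic`, `should_prune`, `BASE` and `toy_best_score` are hypothetical names. The toy BestScore values are invented so the example runs.

```python
from itertools import combinations

def build_pattern_database(group, all_vars, best_score):
    """h_i(pattern) for every pattern inside one group, filled in order of
    increasing pattern size starting from the empty pattern (cost 0)."""
    group = frozenset(group)
    outside = frozenset(all_vars) - group
    pdb = {frozenset(): 0.0}
    for size in range(1, len(group) + 1):
        for pattern in combinations(sorted(group), size):
            p = frozenset(pattern)
            placed = group - p  # group variables already in the network
            # Add some x in p next; acyclicity is enforced only within
            # the group, so all outside variables are allowed as parents.
            pdb[p] = min(
                best_score(x, placed | outside) + pdb[p - {x}]
                for x in p
            )
    return pdb

def additive_heuristic(remaining, groups, pdbs):
    """Sum of per-group lookups for the variables still to be added."""
    remaining = frozenset(remaining)
    return sum(pdb[remaining & frozenset(g)] for g, pdb in zip(groups, pdbs))

def should_prune(g_cost, h_cost, upper_bound):
    """Discard a path whose lower bound g + h exceeds the upper bound."""
    return g_cost + h_cost > upper_bound

# Hypothetical toy BestScore (assumption): a per-variable base cost that
# drops by one for each additional candidate parent.
BASE = {"A": 10.0, "B": 12.0, "C": 9.0, "D": 11.0}

def toy_best_score(x, candidates):
    return BASE[x] - len(candidates)

groups = [frozenset("AB"), frozenset("CD")]
pdbs = [build_pattern_database(g, "ABCD", toy_best_score) for g in groups]
h_value = additive_heuristic("ABC", groups, pdbs)
```

With these toy values, the lookup for the remaining variables {A, B, C} adds the first group's entry for {A, B} to the second group's entry for {C}, mirroring the P_1 + P_2 sum in the example above.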
