Sporadic model building for efficiency enhancement of the hierarchical BOA. Genetic Programming and Evolvable Machines (2008) 9: 53-84. Martin Pelikan, Kumara Sastry, David E. Goldberg.

Presentation transcript:

Sporadic model building for efficiency enhancement of the hierarchical BOA. Genetic Programming and Evolvable Machines (2008) 9: 53-84. Martin Pelikan, Kumara Sastry, David E. Goldberg. Summarized by Seok Ho-Sik 1

 Question: How can the time complexity of model building in a multivariate probabilistic method be decreased?
 Observation: The evolution process of hBOA combines a computationally heavy step (structure learning) and a computationally light step (parameter learning).
 Insight: Reduce the computational burden by updating the structure of the probabilistic model only once in every few iterations (generations).
Core idea 2

 This paper describes and analyzes sporadic model building (SMB).
 It also presents a dimensional model that provides a heuristic for scaling the structure-building period.
Contributions 3

 Bayesian networks
Structure: an acyclic directed graph with one node per variable, with edges encoding the conditional dependencies between variables.
Parameters: the conditional probabilities of each variable given the variables that it depends on.
To learn the structure of a Bayesian network with decision trees, a simple greedy algorithm is used. It starts with an empty network, represented by a single-node decision tree per variable. Each iteration performs the one leaf split (in any of the decision trees) that most improves the score of the network; a sketch follows below.
hBOA 4
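A minimal sketch of this greedy loop in Python. The score is a toy conditional-entropy gain rather than hBOA's Bayesian-Dirichlet metric, per-variable parent lists stand in for full decision trees, and the acyclicity check a real implementation needs is omitted.

```python
import math
import random
from collections import Counter

def entropy(bits):
    counts = Counter(bits)
    n = len(bits)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def split_gain(pop, target, split_var):
    """Entropy reduction on column `target` when the population is
    partitioned by the value of column `split_var` (toy score)."""
    base = entropy([ind[target] for ind in pop])
    cond = 0.0
    for v in (0, 1):
        part = [ind[target] for ind in pop if ind[split_var] == v]
        if part:
            cond += len(part) / len(pop) * entropy(part)
    return base - cond

def greedy_structure(pop, n_vars, min_gain=0.01):
    """Start from an empty model and repeatedly add the parent edge
    (one leaf split) with the largest gain until no split helps enough."""
    parents = {v: [] for v in range(n_vars)}
    while True:
        candidates = [(split_gain(pop, t, s), t, s)
                      for t in range(n_vars) for s in range(n_vars)
                      if s != t and s not in parents[t]]
        if not candidates:
            return parents
        gain, t, s = max(candidates)
        if gain <= min_gain:
            return parents
        parents[t].append(s)  # one greedy "leaf split" on variable t

# Example: 100 random 6-bit individuals where bit 1 always copies bit 0;
# the learned model should link variables 0 and 1.
pop = []
for _ in range(100):
    ind = [random.randint(0, 1) for _ in range(6)]
    ind[1] = ind[0]
    pop.append(ind)
print(greedy_structure(pop, 6))
```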

 The hierarchical Bayesian optimization algorithm (hBOA) can solve nearly decomposable and hierarchical optimization problems in a quadratic number of evaluations.
 However, even quadratic performance may be insufficient for problems with many variables.
 Q1: What is the computational bottleneck? A: Structure learning of the Bayesian network  how can the computational burden be reduced?
 Q2: How often should the structure be updated? A: Once in every few iterations  how can the period be set automatically?
Sporadic model building Situations 5

 Although the model structure does not remain the same throughout the run, the structures in consecutive iterations of hBOA are often similar, because consecutive populations have similar structure with respect to the conditional dependencies and independencies.
 SMB builds the structure only once in every few iterations; in the remaining iterations only the parameters are updated.
 Question: When should the structure be built, and when should only the parameters be updated? A: The structure-building period t_sb  the number of iterations between structure updates (see the loop sketch below).
Sporadic model building (SMB) 6
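A minimal sketch of the SMB loop, assuming a fixed period t_sb. The callables select, learn_structure, learn_parameters, and sample are hypothetical stand-ins for hBOA's actual routines, and full population replacement simplifies hBOA's restricted tournament replacement.

```python
def smb_loop(population, select, learn_structure, learn_parameters,
             sample, t_sb, n_iters):
    """Sporadic model building: rebuild the expensive structure only once
    every t_sb iterations; refit the cheap parameters every iteration."""
    structure = None
    for it in range(n_iters):
        promising = select(population)
        if it % t_sb == 0:
            # heavy step, done sporadically: learn the dependency structure
            structure = learn_structure(promising)
        # light step, done every iteration: refit conditional probabilities
        params = learn_parameters(structure, promising)
        population = sample(structure, params, len(population))
    return population
```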

 1) Speedup of model building
In the ideal case, the speedup of building the model structure would be linearly proportional to t_sb. However, SMB may lead to an increase in the population size and in the number of iterations.
 2) Slowdown of evaluation
SMB may decrease the accuracy of the probabilistic models used to sample new candidate solutions  the population size N needed to build a sufficiently accurate model may increase with t_sb.
Because the model structure is adapted to the current population less frequently, SMB may slow down convergence, so the total number of iterations may also increase with t_sb.
Effects of SMB on BOA 7

 Dec-3: concatenated 3-bit deceptive function
The input string is partitioned into independent groups of 3 bits. A 3-bit deceptive function is applied to each group, and the contributions of all groups are added to form the fitness.
 Trap-5: concatenated 5-bit trap
The same construction as Dec-3, but with 5-bit groups and a 5-bit trap function.
 hTrap: hierarchical trap
Dec-3 and Trap-5 are separable problems of a fixed order; hTrap, in contrast, is a problem that should be solved through a hierarchical approach. A sketch of the two separable functions follows below.
Test problems 8
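The two separable test functions are easy to state in code. The Dec-3 constants below follow the common 0.9/0.8/0/1 convention, which is an assumption about the exact values used in the paper; Trap-5 is the standard 5-bit trap.

```python
def dec3(block):
    u = sum(block)                      # number of ones in the 3-bit group
    return (0.9, 0.8, 0.0, 1.0)[u]     # deceptive: gradient points away from 111

def trap5(block):
    u = sum(block)                      # number of ones in the 5-bit group
    return 5.0 if u == 5 else 4.0 - u   # optimum at 11111, deceptive elsewhere

def concatenated(bits, block_fn, k):
    """Partition `bits` into groups of k and sum the block contributions."""
    return sum(block_fn(bits[i:i + k]) for i in range(0, len(bits), k))

# Example: a 15-bit string of all ones hits the Trap-5 global optimum.
print(concatenated([1] * 15, trap5, 5))   # 15.0
print(concatenated([0] * 15, trap5, 5))   # 12.0, the deceptive attractor
```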

 Problem sizes n in steps of 15 for Dec-3 and Trap-5.
 n = 9, 27, 81, 243 for hTrap.
 hBOA with SMB was tested for t_sb from 1 to 20.
 Each run was terminated either when the global optimum was found or when hBOA completed a large number of generations; a sketch of this stopping rule follows below.
Experimental methodology 9
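A hedged sketch of the termination rule just described; step and is_optimum are hypothetical callables standing in for one hBOA generation and the optimum check.

```python
def run_until_optimum(step, is_optimum, max_gens=10_000):
    """Advance one generation at a time until the global optimum is found
    or the generation cap is exhausted; return the generations used."""
    for gen in range(max_gens):
        best = step()
        if is_optimum(best):
            return gen + 1    # success: generations needed to find the optimum
    return None               # run terminated by the generation cap
```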

 Speedup Ideal case: the structure building speed-up would be the ratio of the number of times the structure was learned without SMB and the number of times the structure was learned with SMB. SMB may lead to larger population size  real speed up =  Evaluation slowdown Results (1/4) 10

Results (2/4) 11

Results (3/4) 12

 SMB leads to a significant decrease of the asymptotic time complexity of model building.
 The optimal speedup obtained with SMB grows with problem size.
 The factor by which SMB increases the number of evaluations is insignificant compared to the speedup of model building.
 SMB with a constant structure-building period does not lead to an increase of the asymptotic complexity of hBOA with respect to the number of evaluations until convergence.
Results (4/4) 13

 The role of the dimensional model: to estimate the growth of t_sb and the achievable speedup.
 The basic idea
Assumption: most of the subproblems have been captured by the current structure.
Compute the maximum number of iterations without structural updates that still ensures the model is rebuilt before the incorrectly modeled subproblems converge to non-optimal values.
To obtain a worst-case analysis, the authors assume a situation where the decision variables in the incorrectly modeled subproblems converge to non-optimal values.
Dimensional model for separable problems of bounded order 14

 The structure-building period should grow at most as fast as the square root of the number of subproblems.
 Gambler's ruin model: modeling the growth of inferior partial solutions
A gambler starts with some initial capital x_0 and plays against an opponent with an initial capital N - x_0, so the total capital in play is N. The quantity of interest is t_A, the total number of tournaments needed to reach an absorbing state; the standard results are sketched below.
Population sizing 15
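The slide's numbered equations are not preserved; the textbook gambler's-ruin results below are given as background and may differ from the slide's exact derivation. With win probability p per tournament and q = 1 - p:

```latex
\[
  P(\text{reach } N \mid x_0) =
  \begin{cases}
    \dfrac{1 - (q/p)^{x_0}}{1 - (q/p)^{N}}, & p \neq q, \\[2ex]
    \dfrac{x_0}{N}, & p = q = \tfrac{1}{2},
  \end{cases}
  \qquad
  \mathbb{E}[t_A] = x_0 (N - x_0) \quad \text{for } p = q = \tfrac{1}{2}.
\]
```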

 Question
Will SMB and the rule for scaling t_sb achieve high speedups even on problems that are not trivially decomposable into subproblems of bounded order?
 Ising spin glass
Each node i carries a spin s_i with possible states ±1. Each edge carries a real value J_i,j that defines the relationship between the two connected spins s_i and s_j. The energy of a spin configuration sums the coupling contributions over all edges, and a Boltzmann distribution over spin configurations is assumed; a toy energy computation follows below.
Ising spin glasses (1/3) 16
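A short sketch of the energy computation on a toy instance. The 2D-torus layout, the random couplings, and the sign convention E = -sum of J_ij * s_i * s_j are assumptions for illustration; spin-glass sign conventions vary.

```python
import random

def ising_energy(spins, couplings):
    """spins: dict node -> +1/-1; couplings: dict (node_a, node_b) -> J_ab.
    Energy is the sum of -J_ab * s_a * s_b over all coupled pairs."""
    return sum(-j * spins[a] * spins[b] for (a, b), j in couplings.items())

# Toy instance: a 4x4 torus with nearest-neighbor couplings drawn from [-1, 1].
n = 4
nodes = [(r, c) for r in range(n) for c in range(n)]
couplings = {}
for r, c in nodes:
    couplings[((r, c), (r, (c + 1) % n))] = random.uniform(-1, 1)
    couplings[((r, c), ((r + 1) % n, c))] = random.uniform(-1, 1)

spins = {v: random.choice((-1, 1)) for v in nodes}
print(ising_energy(spins, couplings))
```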

 The structure-building speedup grows with problem size.
 For difficult problems, the evaluation slowdown with SMB becomes more significant than for separable problems, and it grows with problem size.
Ising spin glasses (2/3) 17

Ising spin glasses (3/3) 18

 Sporadic model building leads to significant improvements in the asymptotic time complexity of hBOA model building.
 The slowdown due to the increase in the number of evaluations is insignificant compared to the model-building speedup.
 Sporadic model building does not lead to an increase of the asymptotic complexity of hBOA with respect to the number of evaluations.
Summary and conclusions 19

 A statically determined t_sb is not enough  population sizing by the gambler's ruin model does not seem sufficient for determining t_sb.
 Lower and upper bounds on t_sb should be provided.
 A dynamically adjusted t_sb would be superior to the introduced method, since the computational time for parameter learning is likely to vary with the population.
My questions on this paper 20