Multi-Scale Search for Black-Box Optimization: Theory & Algorithms
Abdullah Al-Dujaili
October 2016
Black-Box Optimization
A recurrent topic of interest for centuries, with many applications: control/planning, machine learning, design/manufacture. Many sub-fields: convex, discrete, multi-objective optimization.
Black-Box Optimization
A search problem driven purely by point-wise evaluations: the solver observes only zero-order information (function values); it has no closed form and no access to gradients (higher-order information) or smoothness guarantees.
Black-Box Optimization
Mathematically: minimize f(x) over x in X = [l, u] subset of R^n, where f can only be queried point-wise.
Example in Graphics Geijtenbeek, Thomas, Michiel van de Panne, and A. Frank van der Stappen. "Flexible muscle-based locomotion for bipedal creatures." ACM Transactions on Graphics (TOG) 32.6 (2013): 206. The muscle routing and control parameters are optimized using the Covariance Matrix Adaptation [Hansen, 2006] black-box algorithm.
Challenges in Black-Box Optimization: dimensionality, separability, modality, complexity, ruggedness, and conditioning.
Approaches in Black-Box Optimization
Passive: evaluate a grid of n points and return the best one; inefficient.
Active: sequential decision-making in which the next point depends on the previous points, trading off exploration vs. exploitation as the solver queries the objective function.
Exploration vs. Exploitation
Initial investigations date back to Thompson in 1933 and Robbins in 1952. Formally known as the multi-armed bandit problem. In continuous black-box optimization, it takes the form of divide-and-conquer partitioning trees (hierarchical bandits).
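To make the bandit trade-off concrete, here is a minimal sketch of the classic UCB1 rule (an illustration, not material from the talk; the Bernoulli arms and `arm_means` are assumed examples): each arm's score adds an exploitation term, its empirical mean reward, to an exploration bonus that shrinks as the arm is pulled more often.

```python
import math
import random

def ucb1(arm_means, budget, seed=0):
    """Run UCB1 on Bernoulli arms; `arm_means` are hidden from the solver,
    which only observes sampled 0/1 rewards.  Returns the pull counts."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k
    sums = [0.0] * k
    # Initialise by pulling each arm once.
    for i in range(k):
        counts[i] = 1
        sums[i] = float(rng.random() < arm_means[i])
    for t in range(k + 1, budget + 1):
        # Score = empirical mean (exploitation) + confidence radius (exploration).
        scores = [sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])
                  for i in range(k)]
        i = max(range(k), key=scores.__getitem__)
        counts[i] += 1
        sums[i] += float(rng.random() < arm_means[i])
    return counts

# The best arm (index 2) should accumulate the majority of the pulls.
counts = ucb1([0.2, 0.5, 0.8], budget=2000)
```

The same tension reappears in partitioning-tree solvers: a node's empirical quality plays the role of the mean reward, and its cell size plays the role of the confidence radius.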
Multi-Scale Search for Black-Box Optimization
Employ a divide-and-conquer partitioning tree over the search space. Assign an exploration score and an exploitation score to each node. Iteratively expand nodes based on their scores.
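The three steps above can be sketched in one dimension. This toy (not any specific published algorithm; the scoring rule and test function are illustrative assumptions) scores each leaf interval by its centre value (exploitation) minus its width (exploration), and repeatedly trisects the best-scoring leaf:

```python
import heapq

def multiscale_1d(f, lo, hi, budget):
    """Toy 1-D multi-scale search: score = centre value - interval width;
    smaller score = more promising, so a min-heap orders the leaves."""
    def leaf(a, b):
        c = 0.5 * (a + b)
        fc = f(c)
        return (fc - (b - a), fc, a, b)

    leaves = [leaf(lo, hi)]
    evals = 1
    while evals + 2 <= budget:
        _, fc, a, b = heapq.heappop(leaves)
        w = (b - a) / 3.0
        # The middle child keeps the parent's centre, so its evaluation
        # is reused; only the two outer children cost new evaluations.
        heapq.heappush(leaves, (fc - w, fc, a + w, b - w))
        heapq.heappush(leaves, leaf(a, a + w))
        heapq.heappush(leaves, leaf(b - w, b))
        evals += 2
    return min(entry[1] for entry in leaves)

# Illustrative run on a quadratic with minimiser at 0.7 (assumed example).
best = multiscale_1d(lambda x: (x - 0.7) ** 2, 0.0, 1.0, budget=200)
```

The width term keeps large, under-explored intervals competitive even when their centre value is poor, which is the exploration half of the trade-off.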
Classical Method: Lipschitzian Optimization
At time t = 0, the interval is [a, b]. Given a Lipschitz constant L, every x in [a, b] satisfies f(x) >= max(f(a) - L(x - a), f(b) - L(b - x)); the interval with the smallest such lower bound is evaluated and split next.
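A minimal sketch of this scheme in the Piyavskii-Shubert style, written for minimisation (the test function and the value of L are illustrative assumptions): each interval's saw-tooth lower bound (f(a)+f(b))/2 - L(b-a)/2 is attained at x = (a+b)/2 + (f(a)-f(b))/(2L), and the interval with the smallest bound is split there.

```python
import heapq

def shubert_piyavskii(f, a, b, L, budget):
    """Piyavskii-Shubert sketch: maintain intervals keyed by their
    saw-tooth lower bound; always split the interval with the smallest
    bound at the bound's minimiser."""
    def bound(a, fa, b, fb):
        return (fa + fb) / 2.0 - L * (b - a) / 2.0

    fa, fb = f(a), f(b)
    best = min(fa, fb)
    heap = [(bound(a, fa, b, fb), a, fa, b, fb)]
    evals = 2
    while evals < budget:
        _, a, fa, b, fb = heapq.heappop(heap)
        # Minimiser of the two intersecting slope-L cones.
        x = (a + b) / 2.0 + (fa - fb) / (2.0 * L)
        fx = f(x)
        best = min(best, fx)
        evals += 1
        heapq.heappush(heap, (bound(a, fa, x, fx), a, fa, x, fx))
        heapq.heappush(heap, (bound(x, fx, b, fb), x, fx, b, fb))
    return best

# Quadratic with minimiser 0.3; L = 2 bounds |f'| <= 1.4 on [0, 1].
best = shubert_piyavskii(lambda x: (x - 0.3) ** 2, 0.0, 1.0, L=2.0, budget=60)
```

Note the dependence on a known (or over-estimated) L; DIRECT's contribution, cited below, is precisely to dispense with it.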
Graphical Interpretation of B-values
(figure: intervals plotted by size against function value; a line of slope C separates global search, which favors large interval sizes, from local search, which favors low function values, and determines the selected interval)
MSO Algorithms in Literature
Lipschitzian Optimization (LO): B. O. Shubert. A sequential method seeking the global maximum of a function. SIAM Journal on Numerical Analysis, 9(3):379-388, 1972. S. Piyavskii. An algorithm for finding the absolute extremum of a function. USSR Computational Mathematics and Mathematical Physics, 12(4):57-67, 1972.
Branch and Bound (BB): J. Pinter. Global Optimization in Action: Continuous and Lipschitz Optimization: Algorithms, Implementations and Applications, volume 6. Springer Science & Business Media, 1995.
Dividing RECTangles (DIRECT): Jones, D.R., Perttunen, C.D. and Stuckman, B.E., 1993. Lipschitzian optimization without the Lipschitz constant. Journal of Optimization Theory and Applications, 79(1), pp.157-181.
Multilevel Coordinate Search (MCS): Huyer, W. and Neumaier, A., 1999. Global optimization by multilevel coordinate search. Journal of Global Optimization, 14(4), pp.331-355.
Simultaneous Optimistic Optimization (SOO): Munos, R., 2011. Optimistic optimization of a deterministic function without the knowledge of its smoothness. In Advances in Neural Information Processing Systems.
Finite-time analysis of the above: Al-Dujaili, A., Suresh, S. and Sundararajan, N., 2016. MSO: a framework for bound-constrained black-box global optimization algorithms. Journal of Global Optimization, pp.1-35.
Naïve Multi-Scale Search for Black-Box Optimization (this talk): Al-Dujaili, A. and Suresh, S., 2016. A naive multi-scale search algorithm for global optimization problems. Information Sciences, 372, pp.294-312.
Recent Multi-scale Search Optimization (MSO)
MSO has been dominantly exploratory; the DIRECT algorithm may reduce to an exhaustive grid search. Some algorithms incorporate local search (exploitation) only as a separate component, e.g., the MCS algorithm. Expensive optimization (i.e., a limited number of function evaluations) is becoming more relevant. Goal: incorporate local search (exploitation) within the MSO framework itself.
Recent Algorithm for Expensive Black-Box Optimization: Naïve Multi-Scale Search Optimization (NMSO)
Function value as the exploitation score; depth as the exploration score. Depth-wise expansion until no further improvement is noticed, then revisit the root.
No Further Improvement in NMSO
Expand along one coordinate per depth. At depth h, choose the (or one) node with the best function value to expand. For the child nodes of the expanded node (h, i), compute their function values and decide whether to continue the sweep or put the child nodes in a basket for exploitation at a later stage. Nodes in the basket are expanded only if they have been selected/visited in V sweeps while in the basket.
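The sweep-and-basket behaviour can be sketched loosely as follows. This is a hypothetical simplification for illustration, not the authors' reference implementation: the cell layout, the stall-counting rule, and the 2D test function are all assumptions.

```python
def nmso_sketch(f, lo, hi, dim, budget, v=3):
    """Loose sketch of NMSO's depth-wise sweep.  A cell is
    [centre, sizes, value, visits_while_stalled]; the last field is
    None while the cell is active (not in the basket)."""
    def make(centre, sizes, val):
        return [centre, sizes, val, None]

    root_c = [(lo + hi) / 2.0] * dim
    depths = [[make(root_c, [float(hi - lo)] * dim, f(root_c))]]
    evals, best = 1, depths[0][0][2]
    while evals + 2 <= budget:
        h = 0                          # each sweep restarts from the root
        while h < len(depths) and evals + 2 <= budget:
            live = [c for c in depths[h] if c[3] is None or c[3] >= v]
            if not live:
                for c in depths[h]:
                    c[3] += 1          # basketed cells earn one visit
                h += 1
                continue
            cell = min(live, key=lambda c: c[2])   # best value: exploitation
            depths[h].remove(cell)
            # Trisect along one coordinate, cycling with depth; the middle
            # child keeps the parent's centre and evaluation.
            axis = h % dim
            step = cell[1][axis] / 3.0
            kids = []
            for shift in (-step, 0.0, step):
                c = list(cell[0])
                c[axis] += shift
                s = list(cell[1])
                s[axis] = step
                kids.append(make(c, s, cell[2] if shift == 0.0 else f(c)))
            evals += 2
            if len(depths) == h + 1:
                depths.append([])
            depths[h + 1].extend(kids)
            if min(k[2] for k in kids) < best:
                best = min(k[2] for k in kids)
                h += 1                 # improvement: keep descending
            else:
                for k in kids:
                    k[3] = 0           # no improvement: basket the children
                break                  # end the sweep, revisit the root
    return best

# 2D sphere shifted to (0.3, -0.2); the sketch should locate it closely.
best = nmso_sketch(lambda x: (x[0] - 0.3) ** 2 + (x[1] + 0.2) ** 2,
                   -1.0, 1.0, dim=2, budget=300)
```

The basket delays re-expanding stalled regions for roughly v sweeps, which is what biases the budget toward exploitation of the currently best branch.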
No Further Improvement in NMSO for a 2D Problem (figure)
Theoretical Analysis
Empirical Analysis Performance as a function of computational budget (number of function evaluations).
Demo Available @ http://ash-aldujaili.github.io/NMSO/
Limitations / Future Work
The algorithm is deliberately naïve, so many directions remain open: large scale (high dimensionality), adaptive encoding, parameter tuning/update, surrogate models.