Published by Tristin Hansard. Modified over 10 years ago.
1
Empirical Algorithmics Reading Group, Oct 11, 2007
Tuning Search Algorithms for Real-World Applications: A Regression Tree Based Approach
by Thomas Bartz-Beielstein & Sandor Markon
Presenter: Frank Hutter
2
Motivation
How to find a set of working parameters for direct search algorithms when the number of allowed experiments is low
– i.e. find good parameters with few evaluations
Taking a user's perspective:
– Adopt standard parameters from the literature
– But NFL (no free lunch) theorem: no single setting can do well everywhere
– Tune for an instance class; for optimization, tune even on a single instance
3
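Considered approaches
Regression analysis
ANOVA (analysis of variance)
DACE (design and analysis of computer experiments)
CART (classification and regression trees)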
4
Elevator Group Control
Multi-objective problem
– Overall service quality
– Traffic throughput
– Energy consumption
– Transport capacity
– Many more …
Here: only one objective
– Minimize the time customers have to wait until they can enter the elevator car
5
Optimization via Simulation
Goal: optimize expected performance $E[y(x_1, \dots, x_n)]$ ($x_1, \dots, x_n$ controllable)
Black-box function $y$
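Because the simulator is a noisy black box, the expected performance can only be estimated, typically by averaging repeated runs at the same setting. The following is a minimal sketch of that idea; `noisy_simulation` is a hypothetical stand-in (a quadratic plus Gaussian noise), not the elevator simulator from the paper, and all names are assumptions made for illustration.

```python
import random
import statistics


def noisy_simulation(x, rng):
    """Hypothetical black box: a quadratic bowl plus unit Gaussian noise."""
    return sum(xi * xi for xi in x) + rng.gauss(0.0, 1.0)


def estimate_expected_performance(simulate, x, n_runs, seed=0):
    """Approximate E[y(x)] by the sample mean over n_runs independent runs."""
    rng = random.Random(seed)
    samples = [simulate(x, rng) for _ in range(n_runs)]
    mean = statistics.fmean(samples)
    # Standard error of the mean: how much the estimate itself still fluctuates.
    std_err = statistics.stdev(samples) / n_runs ** 0.5
    return mean, std_err


mean, std_err = estimate_expected_performance(noisy_simulation, [1.0, 2.0], 2000)
```

The standard error shrinks only with the square root of the number of runs, which is one reason the experiment budget matters so much in this setting.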
6
Direct search algorithms
Do not construct a model of the fitness function
Interesting aside: same nomenclature as I use, but independent
Here
– Evolution strategy (special class of evolutionary algorithm)
– Simulated annealing
7
Evolution strategies (ES)
Start out with parental population at t=0
For each new generation:
– Create offspring
    Select parent family of size $\rho$ at random
    Apply recombination to object variables (?) and strategy parameters (?)
– Mutation of each offspring
– Selection
8
Many parameters in ES
Number of parent individuals ($\mu$)
Number of offspring individuals ($\lambda$)
Initial mean step sizes ($\sigma_i$)
– Can choose problem-specific, different $\sigma_i$ for each dimension (not done here)
Number of standard deviations (??)
Mutation strength (global/individual, extended log-normal rule ??)
Mixing number $\rho$ (size of each parent family)
Recombination operator
– For object variables
– For strategy variables
Selection mechanism, maximum life span
– Plus-strategies ($\mu + \lambda$) and comma-strategies ($\mu, \lambda$)
– Can be generalized by $\kappa$ (maximum age of individual)
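To make the generational loop and its parameters concrete, here is a minimal sketch of an evolution strategy. It assumes intermediate recombination, a single self-adapted step size per individual, and a plain log-normal mutation rule; the names `mu`, `lam`, `rho`, `sigma0` and `plus` are chosen here for illustration, and the maximum-age generalization is not modelled. It is not the implementation used in the paper.

```python
import math
import random


def evolution_strategy(f, dim, mu=5, lam=20, rho=2, sigma0=1.0,
                       plus=True, generations=100, seed=0):
    """Minimize f with a simple ES; returns (x, sigma, fitness) of the best parent."""
    rng = random.Random(seed)
    tau = 1.0 / math.sqrt(dim)  # assumed learning rate for the step-size mutation

    def make(x, sigma):
        return (x, sigma, f(x))

    # Parental population at t = 0.
    parents = [make([rng.uniform(-5.0, 5.0) for _ in range(dim)], sigma0)
               for _ in range(mu)]

    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            # Select a parent family of size rho at random.
            family = rng.sample(parents, rho)
            # Intermediate recombination of object variables and step size.
            x = [sum(p[0][i] for p in family) / rho for i in range(dim)]
            sigma = sum(p[1] for p in family) / rho
            # Mutate the strategy parameter first, then the object variables.
            sigma *= math.exp(tau * rng.gauss(0.0, 1.0))
            x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
            offspring.append(make(x, sigma))
        # Plus-selection keeps parents in the pool; comma-selection discards them.
        pool = parents + offspring if plus else offspring
        parents = sorted(pool, key=lambda ind: ind[2])[:mu]

    return min(parents, key=lambda ind: ind[2])


def sphere(x):
    return sum(v * v for v in x)


best_x, best_sigma, best_f = evolution_strategy(sphere, dim=3)
```

Even this stripped-down version exposes six tunable settings, which illustrates why screening for the influential ones is worthwhile before any fine tuning.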
9
Simulated Annealing
Proposal: Gaussian Markov kernel with scale proportional to the temperature
Decrease temperature on a logarithmic cooling schedule
Two parameters
– Starting temperature
– Number of function evaluations at each temperature
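A minimal sketch of this variant follows. The slides do not give the exact cooling formula, so the schedule `t_start / (1 + ln k)` and the Metropolis acceptance rule are assumptions; the two tunable parameters appear as `t_start` and `evals_per_temp`, both names chosen here.

```python
import math
import random


def simulated_annealing(f, x0, t_start=10.0, evals_per_temp=10,
                        max_evals=5000, seed=0):
    """Minimize f from x0; returns the best point seen and its value."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best_x, best_fx = list(x), fx
    evals, k = 1, 1
    while evals < max_evals:
        # Assumed logarithmic cooling schedule: T_k = T_start / (1 + ln k).
        temp = t_start / (1.0 + math.log(k))
        for _ in range(evals_per_temp):
            if evals >= max_evals:
                break
            # Gaussian proposal whose scale is proportional to the temperature.
            cand = [xi + temp * rng.gauss(0.0, 1.0) for xi in x]
            fc = f(cand)
            evals += 1
            # Metropolis rule: always accept improvements, sometimes accept worse.
            if fc <= fx or rng.random() < math.exp((fx - fc) / temp):
                x, fx = cand, fc
                if fx < best_fx:
                    best_x, best_fx = list(x), fx
        k += 1
    return best_x, best_fx


def sphere(x):
    return sum(v * v for v in x)


best_x, best_fx = simulated_annealing(sphere, [3.0, 3.0])
```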
10
Experimental Analysis of Search Heuristics
Which parameters have the greatest effect?
– Screening
Which parameter setting might lead to improved performance?
– Modelling
– Optimization
11
Design of experiments (DOE)
Choose two levels for each factor (parameter)
– Factors can be both qualitative and quantitative
$2^{k-p}$ fractional factorial design
– 2: number of levels for each factor
– $k$ parameters
– Only $2^{k-p}$ experiments
– Can be generated from a full factorial design on $k-p$ parameters
– Resolution = $(k-p)+1$ (is this always the case?)
    Resolution 2: not useful, main effects are confounded with each other
    Resolution 3: often used, main effects are unconfounded with each other
    Resolution 4: all main effects are unconfounded with all 2-factor interactions
    Resolution 5: all 2-factor interactions are unconfounded with each other
Here: $2^{9-5}_{III}$ fractional factorial design
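The construction "full factorial on k-p parameters, remaining columns generated from them" can be sketched as follows. Levels are coded as -1 and +1, and each extra column is the product of chosen base columns. The generators in the 9-factor example are hypothetical: the slides do not state which generators the paper used.

```python
from itertools import product


def fractional_factorial(n_base, generators):
    """Build a two-level fractional factorial design.

    n_base     -- number of base factors (k - p), run as a full factorial
    generators -- one tuple of base-column indices per extra factor; the
                  extra column is the product of those base columns
    """
    runs = []
    for base in product((-1, 1), repeat=n_base):
        extra = []
        for gen in generators:
            level = 1
            for idx in gen:
                level *= base[idx]
            extra.append(level)
        runs.append(base + tuple(extra))
    return runs


# 2^(3-1) design: factors A and B form a full factorial, C = A*B.
design = fractional_factorial(2, [(0, 1)])

# A 16-run design for 9 factors (2^(9-5)); these generators are illustrative only.
screening = fractional_factorial(4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3)])
```

In both examples every pair of columns is orthogonal, so main effects are not confounded with each other, but each generated column is by construction identical to a two-factor interaction of the base factors. That is the resolution III trade-off.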
12
Regression analysis
Using the stepAIC function in R (MASS package)
– Akaike's information criterion to penalize many parameters in the model
– Line search to improve the algorithm's performance (?)
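To illustrate how the criterion trades fit against model size, the sketch below computes AIC for two nested least-squares models, using the Gaussian form n·ln(RSS/n) + 2·(number of parameters), which equals AIC up to an additive constant. The data are made up, and this shows only the criterion, not the stepwise search that stepAIC performs.

```python
import math


def gaussian_aic(rss, n_obs, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to a constant.

    Requires rss > 0, since the logarithm is undefined for a perfect fit.
    """
    return n_obs * math.log(rss / n_obs) + 2 * n_params


def rss_about_mean(values):
    """Residual sum of squares when the values are predicted by their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Hypothetical responses from four runs; x is one factor coded as -1 / +1.
x = [-1, -1, 1, 1]
y = [10.0, 11.0, 20.0, 21.0]

# Model 0: intercept only (1 parameter).
aic_intercept = gaussian_aic(rss_about_mean(y), len(y), 1)

# Model 1: intercept plus the effect of x (2 parameters). For a two-level
# factor the least-squares fit is the mean response at each level.
rss_factor = (rss_about_mean([yi for xi, yi in zip(x, y) if xi < 0])
              + rss_about_mean([yi for xi, yi in zip(x, y) if xi > 0]))
aic_factor = gaussian_aic(rss_factor, len(y), 2)
```

Here the extra parameter is worth its penalty: the factor model has the lower AIC, so a stepwise procedure would keep the term.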
13
Tree based regression
Used for screening
Based on the fractional factorial design
Forward growing
– Splitting criterion: minimal variance within the two children
Backward pruning
– Snipping away branches to minimize penalized cost
Using rpart implementation from R
– 10-fold cross validation
– 1-SE rule: mean + 1 standard error as pessimistic estimate
– Threshold complexity parameter: visually chosen based on 1-SE rule
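The splitting criterion can be sketched for two-level designs as follows: for each factor, split the runs by level and pick the factor with the smallest summed within-child squared error. This is only the first step of tree growing under that assumption; it is not rpart, and pruning and cross-validation are left out. The data are made up.

```python
def sum_sq(values):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(design, responses):
    """Return (factor index, cost) of the split with minimal within-child error.

    design    -- list of runs, each a tuple of factor levels coded -1 / +1
    responses -- observed performance for each run
    """
    best = None
    for j in range(len(design[0])):
        low = [y for run, y in zip(design, responses) if run[j] < 0]
        high = [y for run, y in zip(design, responses) if run[j] >= 0]
        if not low or not high:
            continue  # this factor does not separate the runs
        cost = sum_sq(low) + sum_sq(high)
        if best is None or cost < best[1]:
            best = (j, cost)
    return best


# Hypothetical screening data: factor 0 drives the response, factor 1 barely matters.
design = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
responses = [10.0, 11.0, 20.0, 21.0]
factor, cost = best_split(design, responses)
```

The factor chosen for the root split is the one a screening analysis would flag as most influential.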
14
Experimental results
5000 fitness evaluations as termination criterion
Initialization already finds good parameters → only small improvements possible
Actual results not too important, but methods!
Questions
– Is strategy useful?
– Improve parameters
– Which analysis strategy works?
15
Two splits (, )
Regression analysis: only first split significant
Tuned algorithm found solution with quality y=32.252
– Which parameter settings?
– What does 32.252 mean?
– How about multiple runs?
Strategy useful?
Regression tree analysis
16
New Gupta vs. classical + selection
Tune old and new variants
Report new results and runtime for tuning
– Just that they do not report the runtime for tuning
17
Comparison of approaches on Simulated Annealing
Only two (continuous) parameters
Classical regression fails
– No significant effects
Regression tree
– Best around (10, 10)
– Based on a full factorial design with 2 levels each, this is pretty shaky
18
Comparison of approaches
E.g. regression trees for screening, then DACE if only a few continuous parameters remain (why the restriction to few?)