Multi-Objective Optimization Using Evolutionary Algorithms


1 Multi-Objective Optimization Using Evolutionary Algorithms

2 Short Review (I) Evolution Strategies (ESs) were developed in Germany and have been studied extensively in Europe. ESs use real-valued coding of the design parameters, since they model organic evolution at the level of the individual's phenotype. ESs rely on deterministic selection, with mutation as the primary variation operator, and they use strategy parameters such as on-line self-adaptation of the mutation step sizes. The selection of parents to form offspring is less constrained than it is in genetic algorithms and genetic programming; for instance, owing to the real-valued representation, it is easy to average the vectors of many individuals to form a single offspring. In a typical evolution strategy, μ parents are selected uniformly at random (i.e., not based on fitness), λ > μ offspring are generated through recombination and mutation, and then μ survivors are selected deterministically. The survivors are chosen either from the best λ offspring only, so that no parents survive (the (μ,λ)-ES), or from the best of the μ + λ parents and offspring combined (the (μ+λ)-ES). A selection sketch follows below.
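
A minimal sketch of the two survivor-selection schemes above, assuming a hypothetical real-valued minimization problem (the sphere objective, the population sizes, and the fixed step size are illustrative choices, not part of the original slides):

```python
import random

def sphere(x):
    # Illustrative objective: minimize the sum of squares.
    return sum(xi * xi for xi in x)

def es_generation(parents, lam, sigma=0.1, plus=False):
    """One generation of a simple (mu,lambda)- or (mu+lambda)-ES."""
    mu = len(parents)
    offspring = []
    for _ in range(lam):
        # Parents are chosen uniformly at random, not by fitness.
        p1, p2 = random.choice(parents), random.choice(parents)
        # Intermediate recombination: average the two parent vectors.
        child = [(a + b) / 2 for a, b in zip(p1, p2)]
        # Gaussian mutation with a fixed step size (no self-adaptation here).
        child = [xi + random.gauss(0, sigma) for xi in child]
        offspring.append(child)
    # (mu+lambda): survivors come from parents and offspring combined;
    # (mu,lambda): survivors come from the offspring only.
    pool = parents + offspring if plus else offspring
    return sorted(pool, key=sphere)[:mu]

population = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(5)]
for _ in range(50):
    population = es_generation(population, lam=20, plus=False)
```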

3 Short Review (II) In a standard genetic program, the representation is a variable-sized tree of functions and values. Each leaf in the tree carries a label from an available set of value labels, and each internal node carries a label from an available set of function labels. The entire tree corresponds to a single function that may be evaluated; typically, the tree is evaluated in a left-most, depth-first manner: a leaf evaluates to its corresponding value, and a function node is evaluated using the results of evaluating its children as arguments. Genetic programming and genetic algorithms are similar in most other respects, except that the reproduction operators are tailored to the tree representation. The most commonly used operator is subtree crossover, in which entire subtrees are swapped between two parents. An evaluation sketch follows below.
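
A minimal sketch of the depth-first evaluation just described, using nested tuples as the tree representation and a small hypothetical function set (only + and *):

```python
# A GP tree as nested tuples: internal nodes are (function_label, child, child),
# leaves are plain values or the variable name "x".
FUNCTIONS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def evaluate(node, x):
    # Leaf: a variable reference or a constant evaluates to itself.
    if node == "x":
        return x
    if not isinstance(node, tuple):
        return node
    # Internal node: evaluate children left to right (depth-first),
    # then apply the node's function to their results.
    label, *children = node
    args = [evaluate(child, x) for child in children]
    return FUNCTIONS[label](*args)

# The tree (x * x) + 3 evaluated at x = 2 gives 7.
tree = ("+", ("*", "x", "x"), 3)
assert evaluate(tree, 2) == 7
```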

4 Overview
- Principles of multi-objective optimization.
- Difficulties with the classical multi-objective optimization methods.
- Schematic of an ideal multi-objective optimization procedure.
- The original Genetic Algorithm (GA).
- Why use a GA?
- Multi-Objective Evolutionary Algorithms (MOEAs).
- An example of using an MOEA to solve an engineering design problem.
- Multi-criterion optimization for solving real-world engineering design or decision-making problems.

5 Classification of multi-objective algorithms based on how the objectives are integrated
We will use the following simple classification of Evolutionary Multi-Objective Optimization (EMOO) approaches:
- Non-Pareto techniques: aggregating approaches, lexicographic ordering, VEGA (Vector Evaluated Genetic Algorithm).
- Pareto techniques: pure Pareto ranking, MOGA, NSGA.
- Recent approaches: PAES, SPEA.
- Bio-inspired approaches: PSO, ant-colony based.

Algorithms that fall in the aggregating category combine (or aggregate) all the objectives of the problem into one single objective. The effect is that the multi-objective problem is transformed into a single-objective one, and single-objective algorithms can be used; under this category we find the weighted-sum method and the method of goal attainment. In the lexicographic ordering method, the user (decision maker) has to rank all the objectives in order of their importance; the algorithm then optimizes the first objective and, starting from the solution found, tries to optimize the second objective, and so on. This method is sensitive to the ordering of the objectives.

6 Principles of Multi-Objective Optimization
Real-world problems have more than one objective function, each of which may have a different individual optimal solution. The optimal solutions corresponding to different objectives differ because the objective functions are often conflicting (competing) with each other. There is therefore a set of trade-off optimal solutions instead of one optimal solution, generally known as Pareto-optimal solutions (named after the Italian economist Vilfredo Pareto, 1906). No one solution can be considered better than any other with respect to all objective functions; this is the non-dominated solution concept.

7 Multi-Objective Optimization
Multi-objective optimization is the optimization of several objective functions at the same time; at the end, the algorithm returns a set of different optimal trade-off values, which differs from returning the single value of an ordinary single-objective optimization problem. Two notions follow from having more than one objective function. Pareto-optimal solutions: the optimal trade-off solutions found in a multi-objective optimization problem. Pareto-optimal front: the curve formed by joining all these Pareto-optimal solutions in objective space.

8 Nondominated and dominated solutions
Non-dominated: given two objectives, two solutions are mutually non-dominated when neither is better than the other with respect to both objectives; both objectives are equally important (e.g., speed and price). Dominated: when solution a is no worse than b in all objectives, and a is strictly better than b in at least one objective, then solution a dominates solution b. Weak dominance: when a is no worse than b in all objectives, but there is no objective in which a is clearly better than b, a is said to weakly dominate b. A sketch of these tests follows below.
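
A minimal sketch of these dominance tests for a minimization problem, together with extraction of the non-dominated set (the objective vectors in the example are illustrative):

```python
def dominates(a, b):
    """True if a dominates b: no worse in all objectives (minimization),
    strictly better in at least one."""
    return all(ai <= bi for ai, bi in zip(a, b)) and \
           any(ai < bi for ai, bi in zip(a, b))

def weakly_dominates(a, b):
    # No worse than b in every objective; strict improvement not required.
    return all(ai <= bi for ai, bi in zip(a, b))

def nondominated(points):
    # Keep every point that no other point dominates.
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Objectives: (cost, accident_rate), both minimized.
solutions = [(1.0, 9.0), (8.0, 1.5), (5.0, 5.0), (4.0, 4.0)]
print(nondominated(solutions))   # (5.0, 5.0) is dominated by (4.0, 4.0)
```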

9 Multi-Objective Problems: Dominance
We say x dominates y if x is at least as good as y on all criteria and better on at least one. [Figure: objective space with axes f1 and f2, showing the Pareto front, a solution x, and the region dominated by x.]

10 Principles of Multi-Objective Optimization (cont.)
Simple car design example: two objectives, cost and accident rate, both of which are to be minimized. A multi-objective optimization algorithm must achieve two things: guide the search towards the global Pareto-optimal front, and maintain solution diversity along the Pareto-optimal front. For solutions A, B, and D, one objective can only be improved at the expense of at least one other objective. Point A represents a solution that incurs near-minimum cost but is highly accident-prone; point B, on the other hand, represents a solution that is costly but nearly the least accident-prone. One cannot really say whether solution A is better than solution B, or vice versa, because each is better than the other in one objective and worse in the other. Solution C is not optimal, because there exists another solution D in the search space that is better than C in both objectives. One cannot establish an absolute hierarchy among solutions A, B, D, or any other solution in the set; these solutions are known as Pareto-optimal solutions (named after the Italian economist Vilfredo Pareto, 1906). The set of best compromise solutions is referred to as the Pareto-optimal set, characterized by the fact that, starting from a solution within the set, one objective can only be improved at the expense of at least one other objective. The region beyond the Pareto-optimal front contains unattainable solutions that would be optimal in both objectives at once; the region behind the front is known as the feasible search space or feasible design space.

11 Non-Pareto Classification Techniques (Traditional Approaches)
Aggregating the objectives into a single, parameterized objective function and performing several runs with different parameter settings to obtain a set of solutions approximating the Pareto-optimal set:
- Weighting method (Cohon, 1978)
- Constraint method (Cohon, 1978)
- Goal programming (Steuer, 1986)
- Minimax approach (Koski, 1984)

12 Vector Evaluated Genetic Algorithm
Proposed by Schaffer in the mid-1980s (1984, 1985). Only the selection mechanism of the GA is modified, so that at each generation a number of sub-populations are generated by performing proportional selection according to each objective function in turn. Thus, for a problem with k objectives and a population size of M, k sub-populations of size M/k each are generated. These sub-populations are then shuffled together to obtain a new population of size M, on which the GA applies the crossover and mutation operators in the usual way (see the sketch below).
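
A minimal sketch of VEGA's modified selection step, assuming hypothetical objective functions that are maximized and return non-negative fitness values (a requirement of the roulette-wheel selection used here):

```python
import random

def proportional_select(population, fitness, n):
    """Roulette-wheel selection of n individuals by one fitness measure.
    Assumes non-negative fitness values that are to be maximized."""
    weights = [fitness(ind) for ind in population]
    return random.choices(population, weights=weights, k=n)

def vega_selection(population, objectives):
    """Build k sub-populations of size M/k, one per objective, then shuffle."""
    m, k = len(population), len(objectives)
    mating_pool = []
    for f in objectives:
        # Each objective in turn drives proportional selection.
        mating_pool += proportional_select(population, f, m // k)
    random.shuffle(mating_pool)   # mix before crossover and mutation
    return mating_pool
```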

13 Schematic of VEGA selection

14 Advantages and Disadvantages of VEGA
Efficient and easy to implement. However, if proportional selection is used, the shuffling and merging of all the sub-populations corresponds to averaging the fitness components associated with each of the objectives. In other words, under these conditions VEGA behaves as an aggregating approach and is therefore subject to the same problems as such techniques.

15 Problems in Multi-Objective Optimization
Weighting method example: fitness function = w1 F1(x) + w2 F2(x). Consider the problem of minimizing response time while maximizing throughput, where F1(x) = response time, F2(x) = throughput, and wi are the weight values. Then: it is hard to find the values of w1 and w2, and it is hard to form a single fitness function from objectives that pull in opposite directions (see the sketch below).
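
A minimal sketch of this weighted-sum scalarization, using hypothetical response-time and throughput models over a single batch-size variable x (the models and weight settings are illustrative assumptions; note the sign flip needed because one objective is minimized and the other maximized):

```python
def response_time(x):
    # Hypothetical model: larger batch size -> longer response time.
    return 1.0 + 0.1 * x

def throughput(x):
    # Hypothetical model: larger batch size -> higher, saturating throughput.
    return 50.0 * x / (10.0 + x)

def weighted_fitness(x, w1, w2):
    """Scalarized objective to MINIMIZE: response time enters positively
    (minimized), throughput negatively (maximized)."""
    return w1 * response_time(x) - w2 * throughput(x)

# Different (w1, w2) choices yield different "optimal" trade-offs,
# which is exactly why choosing the weights is hard.
for w1, w2 in [(1.0, 0.1), (1.0, 1.0), (0.1, 1.0)]:
    best = min(range(1, 101), key=lambda x: weighted_fitness(x, w1, w2))
    print((w1, w2), "->", best)
```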

16 Traditional Approaches
Difficulties with classical methods:
- Sensitivity to the shape of the Pareto-optimal front (e.g., the weighting method).
- Need for problem knowledge that may not be available.
- Restrictions on their use in some application areas.
- Need for several optimization runs to find the parameter settings that yield an approximation of the Pareto-optimal set.

17 Difficulties with the classical multi-objective optimization methods
Such methods, including the weighted sum, ε-perturbation, goal programming, min-max, and others:
- Must be repeated many times to find multiple optimal solutions.
- Require some knowledge about the problem being solved.
- Some are sensitive to the shape of the Pareto-optimal front (e.g., non-convex fronts).
- The spread of optimal solutions depends on the efficiency of the single-objective optimizer.
- They are not reliable for problems involving uncertainties or stochasticity.
- They are not efficient for problems having a discrete search space.

In practice, this can mean that very few alternative designs are explored, because of the time required to simulate each one. The weighted-sum method cannot find Pareto-optimal solutions in problems having a non-convex Pareto-optimal front. In general, there is a single objective function; multiple performance measures are usually handled either by combining them into the objective function using appropriate weights or by including them as constraints. For example, in an inventory problem one must consider ordering, holding, and backlogging or lost sales. This is usually addressed in one of two (ultimately equivalent) ways: by minimizing a single cost function that has all of these components, or by minimizing a cost function consisting of ordering and holding costs, subject to a service-level constraint on lost sales or backlogging. A stochastic constraint might be something like "the proportion of customers having to wait more than one minute for an operator should not exceed 1%".

18 Lexicographic Ordering (LO)
In this method, the user is asked to rank the objectives in order of importance. The optimum solution is then obtained by minimizing the objective functions one at a time, starting with the most important one and proceeding according to the assigned order of importance of the objectives (as sketched below). It is also possible to select a single objective at random to optimize at each run of a GA.
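
A minimal sketch of lexicographic selection over a finite candidate set, assuming the objectives are given in decreasing order of importance and minimized; the tolerance `tol` for "ties" on the more important objectives is a hypothetical implementation detail:

```python
def lexicographic_optimum(candidates, objectives, tol=1e-9):
    """Minimize objectives[0] first; among (near-)ties, minimize
    objectives[1]; and so on down the priority list."""
    survivors = list(candidates)
    for f in objectives:
        best = min(f(c) for c in survivors)
        # Keep only candidates optimal (within tol) for this objective,
        # then let the next objective break the remaining ties.
        survivors = [c for c in survivors if f(c) <= best + tol]
    return survivors[0]

# Example: minimize cost first, then weight.
items = [(3, 10), (3, 7), (5, 1)]          # (cost, weight) pairs
cost, weight = (lambda c: c[0]), (lambda c: c[1])
print(lexicographic_optimum(items, [cost, weight]))   # (3, 7)
```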

19 Advantages and Disadvantages of LO
Efficient and easy to implement, but it requires a pre-defined ordering of the objectives, and its performance is affected by that ordering. Selecting an objective at random is equivalent to a weighted combination of objectives in which each weight is defined in terms of the probability that each objective has of being selected. However, if tournament selection is used, the technique does not behave like VEGA, because tournament selection does not require scaling of the objectives (it relies on pair-wise comparisons); the approach may therefore work properly with concave Pareto fronts. It is inappropriate when there is a large number of objectives.

20 Schematic of an ideal Multi-Objective optimization procedure
In 1967, Rosenberg hinted at the potential of genetic algorithms in multi-objective optimization, but there was no significant study until 1989, when Goldberg outlined a new non-dominated sorting procedure. This research area remained largely unexplored until recently; it has since received a lot of interest (more than 630 publications) because a GA is capable of finding multiple optimal solutions in one single run. High-level information is then used to choose one of the trade-off solutions.

21 Pareto-based Techniques
Suggested by Goldberg (1989) to solve the problems with Schaffer's VEGA. These techniques use non-dominated ranking and selection to move the population towards the Pareto front. They require a ranking procedure and a technique to maintain diversity in the population (otherwise the GA will tend to converge to a single solution, because of the stochastic noise involved in the process).

22 The original Genetic Algorithm (GA)
Initially introduced by Holland in 1975, the GA is a general-purpose heuristic search algorithm that mimics the natural selection process in order to find optimal solutions:
1. Generate a population of random individuals (candidate solutions to the problem at hand).
2. Evaluate the fitness of each individual in the population.
3. Rank individuals based on their fitness.
4. Select individuals with high fitness to produce the next generation.
5. Apply the genetic operators crossover and mutation to generate a new population.
6. Go back to step 2 until the problem's objectives are satisfied.

The best individuals are allowed to survive, mate, and reproduce offspring; evolving solutions over time leads to better solutions. This is also known as the simple GA (a sketch of the loop follows below). Representation: the first step is to encode the problem variables as individuals, for example as binary strings, real numbers, or more complex computer code. Initialization: the initial population of individuals (chromosomes) can be generated randomly or by using prior knowledge of possibly good solutions; without any knowledge of the problem domain, the GA simply begins to process a random population of individuals. Fitness evaluation: each individual is assigned a fitness value as a measure of how well it optimizes the objective function; at each generation the fitness value of each chromosome is calculated, and the task of the GA is to find solutions that have high fitness among the set of all possible solutions. Termination condition: the evolution continues over a number of generations, using selection and the genetic operators, until a termination condition is met; potential termination conditions include reaching a maximum number of generations or evolving an optimally fit individual.
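
A minimal sketch of the simple GA loop described above, assuming a hypothetical bit-string representation and a onemax-style fitness (count of 1 bits); the population size, genome length, and mutation rate are illustrative:

```python
import random

GENOME_LEN, POP_SIZE = 20, 30

def fitness(ind):
    return sum(ind)                      # onemax: maximize the number of 1s

def tournament(pop, k=2):
    return max(random.sample(pop, k), key=fitness)

def crossover(p1, p2):
    cut = random.randrange(1, GENOME_LEN)      # single-point crossover
    return p1[:cut] + p2[cut:]

def mutate(ind, rate=0.01):
    return [1 - g if random.random() < rate else g for g in ind]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]
for generation in range(100):
    best = max(population, key=fitness)
    if fitness(best) == GENOME_LEN:          # termination: optimal individual
        break
    # Select parents, recombine, and mutate to form the next generation.
    population = [mutate(crossover(tournament(population),
                                   tournament(population)))
                  for _ in range(POP_SIZE)]
print(generation, fitness(max(population, key=fitness)))
```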

23 The original Genetic Algorithm (GA) – Flow Chart
A real-coded GA represents the parameters without coding, which makes the representation of solutions very close to the natural formulation of many problems; special crossover and mutation operators are designed to work with real parameters (a sketch of such operators follows below). Multi-objective fitness classes: non-dominated (best), dominated but feasible (average), infeasible points (worst). Genetic operators: (1) Selection: in tournament selection, individuals are selected at random for a tournament and the best individual is copied into the mating pool. (2) Crossover is inspired by the role of sexual reproduction in creating new and different individuals from two parents; the mechanism works by mixing the genes of one good solution with genes from other good solutions. Two candidate solutions with high fitness are selected randomly from the mating pool, and a crossover operator such as single-point crossover swaps genes to produce a new pair of offspring that are likely to be better individuals. (3) Mutation is inspired by the role of mutation in natural selection: with binary-coded GAs it flips a 0 bit to a 1 bit or vice versa. Mutation mostly yields a new individual that is less fit than its parent; nevertheless, it is used to prevent the GA from getting trapped in a local minimum or maximum, and hence often serves as a means of diversifying a converging population.
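
A minimal sketch of real-coded operators of the kind mentioned above, using blend (BLX-α) crossover and Gaussian mutation as one common, illustrative choice (the slides do not name specific operators):

```python
import random

def blend_crossover(p1, p2, alpha=0.5):
    """BLX-alpha: each child gene is drawn uniformly from an interval
    slightly wider than the one spanned by the parent genes."""
    child = []
    for a, b in zip(p1, p2):
        lo, hi = min(a, b), max(a, b)
        span = hi - lo
        child.append(random.uniform(lo - alpha * span, hi + alpha * span))
    return child

def gaussian_mutation(ind, sigma=0.1, rate=0.1, bounds=(-5.0, 5.0)):
    """Perturb each gene with probability `rate`, clipping to the bounds."""
    lo, hi = bounds
    return [min(hi, max(lo, g + random.gauss(0, sigma)))
            if random.random() < rate else g
            for g in ind]

parent1, parent2 = [1.0, 2.0, 3.0], [2.0, 1.0, 4.0]
child = gaussian_mutation(blend_crossover(parent1, parent2))
```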

24 Why use a GA? Use a GA when the search space is large, not well understood, and unstructured; there a GA can provide a surprisingly powerful heuristic search. It is simple, yet performs well on many different types of problems. The key point in deciding whether or not to use genetic algorithms for a particular problem centers on the question: what is the space to be searched? If that space is well understood and contains structure that can be exploited by special-purpose search techniques, the use of genetic algorithms is generally less efficient computationally. If the space to be searched is not so well understood and relatively unstructured, then GAs provide a surprisingly powerful search heuristic for large, complex spaces (De Jong, 1990). A problem may have no exact solution because of inherent ambiguities in the problem statement or the available data: in medical diagnosis, for example, a given set of symptoms may have several causes. A problem may also have an exact solution whose computational cost is prohibitive: in chess, the state space growth is combinatorially explosive, with the number of possible states increasing exponentially with the depth of the search. GAs have been employed in a wide range of applications including, but not limited to: optimization of functions with linear and nonlinear constraints, the traveling salesman problem, machine learning, parallel semantic networks, simulation of gas pipeline systems, scheduling problems, web search, software testing, and financial forecasting.

25 Multi-Objective Evolutionary Algorithm (MOEA)
An MOEA is a variation of the original GA, with additional operations to maintain multiple Pareto-optimal solutions in the population. Advantages:
- Deals simultaneously with a set of possible solutions.
- Can find several members of the Pareto-optimal set in a single run of the algorithm, instead of the series of separate runs required by the classical optimization methods.
- Explores solutions over the entire search space.
- Less susceptible to the shape or continuity of the Pareto front: MOEAs easily deal with discontinuous or concave fronts, whereas these two issues are a real concern for the classical approaches.

Disadvantage: not yet completely supported theoretically (compared with a method such as stochastic approximation, which has been around for half a century). For comparison with other stochastic search methods: stochastic approximation (SA) mimics the gradient search method from deterministic optimization, but in a rigorous statistical manner that takes into account the stochastic nature of the system model. The goal of response surface methodology (RSM) is to obtain an approximate functional relationship between the input variables and the output objective function. Simulated annealing can be thought of as a variation of local search (for deterministic objective functions) whose main idea is to accept all downhill (improving, assuming minimization) moves but sometimes accept uphill moves, with an acceptance probability that decreases to 0 at an appropriate rate; it is slow in converging to good solutions. Tabu search can be thought of as a variation on local search that incorporates two main strategies, adaptive memory and responsive exploration, which modify the neighborhood of a solution point as the search progresses and thus determine the effectiveness of the algorithm. In particular, the modification from which the method derives its name forbids certain points (classifying them tabu) from belonging to the current neighborhood of points being considered: short-term memory can prevent the search from revisiting recently visited points, whereas longer-term memory can encourage moves that have historically led to improvements (intensification) and moves into previously unexplored regions of the search space (diversification).

26 Multi-Objective Genetic Algorithm (MOGA)
Proposed by Fonseca and Fleming (1993). The approach uses a ranking scheme in which the rank of a given individual corresponds to the number of individuals in the current population by which it is dominated (see the sketch below). It uses fitness sharing and mating restrictions.
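
A minimal sketch of the MOGA ranking rule just described, assuming a minimization problem and the convention rank = 1 + (number of dominating individuals), so that all non-dominated individuals receive rank 1:

```python
def dominates(a, b):
    # Minimization: no worse everywhere, strictly better somewhere.
    return all(x <= y for x, y in zip(a, b)) and \
           any(x < y for x, y in zip(a, b))

def moga_ranks(objectives):
    """objectives: list of objective vectors, one per individual.
    Returns rank_i = 1 + number of individuals dominating individual i."""
    return [1 + sum(dominates(other, mine)
                    for other in objectives if other is not mine)
            for mine in objectives]

points = [(1.0, 9.0), (8.0, 1.5), (5.0, 5.0), (6.0, 6.0)]
print(moga_ranks(points))   # (6.0, 6.0) is dominated by (5.0, 5.0): rank 2
```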

27 Advantages and Disadvantages of MOGA
Efficient and relatively easy to implement, although its performance depends on an appropriate choice of the sharing factor. MOGA has been very popular and tends to perform well when compared with other EMOO approaches. Some applications: fault diagnosis, control system design, wing planform design.

28 Nondominated Sorting Genetic Algorithm (NSGA)
Proposed by Srinivas and Deb (1994). It is based on several layers of classification of the individuals. Non-dominated individuals get a certain dummy fitness value and are then removed from the population; the process is repeated until the entire population has been classified (see the sketch below). To maintain the diversity of the population, classified individuals are shared (in decision-variable space) using their dummy fitness values.
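
A minimal sketch of the layered (non-dominated) sorting at the heart of NSGA, reusing the `dominates` helper from the dominance slide; the sharing of dummy fitness values is omitted here:

```python
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and \
           any(x < y for x, y in zip(a, b))

def nondominated_sort(objectives):
    """Peel the population into fronts: front 0 is the non-dominated set,
    front 1 is non-dominated once front 0 is removed, and so on."""
    remaining = list(range(len(objectives)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objectives[j], objectives[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

points = [(1, 9), (9, 1), (5, 5), (6, 6), (7, 7)]
print(nondominated_sort(points))   # [[0, 1, 2], [3], [4]]
```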

29 NSGA – Flow Chart
Multi-objective fitness classes: non-dominated (best), dominated but feasible (average), infeasible points (worst). Before selection is performed, the population is ranked on the basis of domination: all non-dominated individuals are classified into one category (with a dummy fitness value proportional to the population size). To maintain the diversity of the population, these classified individuals are shared (in decision-variable space) using their dummy fitness values. This group of classified individuals is then removed from the population, and another layer of non-dominated individuals is considered (i.e., the remainder of the population is re-classified). The process continues until all individuals in the population are classified. Since individuals in the first front have the maximum fitness value, they always get more copies than the rest of the population; this allows the search to focus on non-dominated regions and results in convergence of the population toward such regions, while sharing helps distribute the population over these regions. For comparison: scatter search is also a population-based evolutionary search strategy, like GAs; however, Glover, Kelly, and Laguna (1999) claim that whereas naive GAs produce offspring through random combination of components of the parents, scatter search produces offspring more intelligently by incorporating history (i.e., past evaluations). In other words, diversity is preserved, but natural selection is used in reproduction prior to evaluation; this is clearly more important in the simulation setting, where estimation costs are so much higher than search costs. In the Particle Swarm Optimization (PSO) algorithm, the population dynamics simulates a bird flock's behavior, where social sharing of information takes place and individuals can profit from the discoveries and previous experience of all other companions during the search for food. In PSO, instead of using genetic operators, each individual (particle) updates its own position based on its own search experience and on the experience and discoveries of other individuals (companions). Adding a velocity term to the current position in order to generate the next position resembles the mutation operation in evolutionary programming; in PSO, however, the "mutation" operator is guided by the particle's own "flying" experience and benefits from the swarm's "flying" experience. In other words, PSO can be seen as performing mutation with a "conscience."

30 Demo – NSGA-II
[Gellert et al., 2012] Multi-Objective Optimizations for a Superscalar Architecture with Selective Value Prediction, IET Computers & Digital Techniques, Vol. 6, No. 4 (July). Features of NSGA-II.

31 The research area
Problems: The so-called "standard" settings (De Jong, 1990) for population size, crossover rate (0.6–0.9), and mutation rate do not work for complex problems. For complex real-world problems, GAs require parameter tuning in order to reach optimal solutions, and the task of tuning GA parameters is not trivial, owing to the complex and nonlinear interactions among the parameters and their dependence on many aspects of the particular problem being solved (e.g., the density of the search space).
Research: Self-adaptive MOEAs use information fed back from the MOEA during its execution to adjust the values of parameters attached to each individual in the population (a sketch follows below); improve the performance of MOEAs by finding widely spread Pareto-optimal solutions while reducing computing resources; make them easier to use and available to more users.
Population size is a major factor in determining the quality of the solutions: setting it too small will cause the GA to converge to sub-optimal solutions, while setting it above the optimum wastes computational resources. Crossover has two purposes: the main one is to explore the solution space; the second is to perform the search in a way that maximally preserves the genes stored in the parent individuals, because the parents are instances of good candidate solutions selected by the reproduction operator. The higher the crossover probability, the more promising solutions are mixed, but this also increases the disruption of good solutions. Mutation takes a single individual and randomly changes some of its characteristics; it is used to maintain diversity in the population. By itself, or with a high mutation probability, mutation represents a random search in the neighborhood of a particular solution, similar to an intelligent hill-climbing strategy, but it may also destroy already-found good solutions. (Our application does not perform any portfolio or risk-management optimization; it uses Latin hypercube sampling to generate data for uncertainty and sensitivity analysis.)
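
A minimal sketch of the self-adaptation idea mentioned under Research, using the classic ES-style log-normal update of a per-individual mutation step size; the learning-rate constant `tau` and the ten-gene individual are illustrative assumptions:

```python
import math
import random

TAU = 1.0 / math.sqrt(10)      # learning rate, commonly 1/sqrt(n) for n genes

def mutate_self_adaptive(genes, sigma):
    """Each individual carries its own step size sigma; the step size
    itself is mutated first, then used to perturb the genes."""
    new_sigma = sigma * math.exp(TAU * random.gauss(0, 1))
    new_genes = [g + new_sigma * random.gauss(0, 1) for g in genes]
    return new_genes, new_sigma

# An individual is (genes, sigma); selection acting on the genes
# indirectly selects for good step sizes as well.
individual = ([random.uniform(-5, 5) for _ in range(10)], 0.5)
child = mutate_self_adaptive(*individual)
```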

32 Multi-Objective Evolutionary Algorithms (MOEAs) – references
Some representative MOEAs in operational research through the past years:
- Non-dominated Sorting Genetic Algorithm (NSGA), Srinivas and Deb, 1994.
- NSGA-II, Deb et al., 2002.
- Strength Pareto Evolutionary Algorithm (SPEA), Zitzler and Thiele, 1999.
- SPEA2, Zitzler et al., 2001.
- Epsilon-NSGA-II, Kollat and Reed, 2005.
- Multi-objective Shuffled Complex Evolution Metropolis algorithm (MOSCEM-UA), Vrugt et al., 2003.

