# Modeling and Simulation: Exploring Dynamic System Behaviour, Chapter 9: Optimization

1 Modeling and Simulation: Exploring Dynamic System Behaviour, Chapter 9: Optimization

2 A Restricted Perspective The considerations that follow are restricted to the domain of continuous-time dynamic systems (CTDS). Consequently, we are able to circumvent the additional complexity that arises in the discrete-event dynamic system (DEDS) domain because of its inherent stochastic nature.

3 The Problem Statement An Extended CTDS Model. An essential constituent of any optimization problem is a collection of real-valued parameters (denoted by the m-vector $p$). These parameters are embedded within the CTDS model, and hence its generic representation changes from $\dot{x}(t) = f(x(t), t)$ to $\dot{x}(t) = f(x(t), t; p)$.

4 The Problem Statement (Cont.) Implicit in the notion of optimization is the goal of finding a “best” value for the parameter vector $p$ (denoted by $p^*$). This search is associated with a scalar real-valued criterion function, $J = J(p)$, and the best value for $p$ (i.e., $p^*$) is one which yields an extreme value for $J$, i.e., either a maximum or a minimum (note that $p^*$ may not be unique). For definiteness we assume the latter (note, however, that the maximization of $J(p)$ is equivalent to the minimization of $-J(p)$). Thus our goal is to find $p^*$ such that $J(p^*) \le J(p)$ for all $p \in \Phi$, where $\Phi$ is a constraint set.

5 The Constraint Set The constraint set, $\Phi$, reflects the fact that not all real-valued m-vectors are necessarily permissible candidates for $p$. $\Phi$ represents a set of functional constraints. These may either implicitly or explicitly restrict the values of some of the components of $p$. A simple explicit example is $p_1 > 0$.

6 The Constraint Set (cont.) In general, $\Phi$ is a set of $c_3$ functional constraints of the form: $\Phi_j(x(t; p)) > 0$ for $j = 1, 2, \ldots, c_1$; $\Phi_j(x(t; p)) \ge 0$ for $j = c_1 + 1, c_1 + 2, \ldots, c_2$; $\Phi_j(x(t; p)) = 0$ for $j = c_2 + 1, c_2 + 2, \ldots, c_3$.

7 The Constraint Set (cont.) Solving the constrained problem is necessarily a more challenging task. A variety of techniques exist for dealing with the added complexity; one such technique is called the Penalty Function Method. With this approach the solution to the constrained problem is obtained by solving a sequence of specially formulated unconstrained problems, but no “special” minimization procedure is introduced. The discussion that follows focuses on the unconstrained problem (the set $\Phi$ is empty).
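The slide does not say how the unconstrained problems are formulated, so the following is only a minimal sketch under stated assumptions: a quadratic exterior penalty, a hypothetical scalar criterion `J(p) = (p + 1)^2`, and the constraint relaxed to `p >= 0`. The helper `golden_section` stands in for whatever unconstrained minimizer is available; none of these names come from the presentation.

```python
import math


def golden_section(phi, lo, hi, tol=1e-9):
    """Minimize a unimodal scalar function phi on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = phi(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)


def J(p):
    """Hypothetical criterion; its unconstrained minimum is at p = -1."""
    return (p + 1.0) ** 2


def penalized(p, mu):
    """J plus a quadratic penalty (an assumed form) for violating p >= 0."""
    violation = min(0.0, p)  # nonzero only when the constraint is violated
    return J(p) + mu * violation ** 2


def penalty_method(mus=(1.0, 10.0, 100.0, 1000.0)):
    """Solve one unconstrained problem per penalty weight mu."""
    estimates = []
    for mu in mus:
        estimates.append(golden_section(lambda p: penalized(p, mu), -5.0, 5.0))
    return estimates
```

For this particular choice each unconstrained minimizer is `-1 / (1 + mu)`, so the estimates approach the constrained solution `p = 0` from the infeasible side as `mu` grows.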

8 Some Typical formats for J(p)

9 Evaluating the Criterion Function The most distinctive feature of the optimization problem within the CTDS context is the need to solve a set of differential equations each time a value of $J$ is required. This has two immediate consequences: (a) computational overhead can be high; (b) the precision with which the value of $J$ is generated is subject to the quality of the differential equation solving process.

10 Unconstrained Minimization Methods A wide range of methods is available for the unconstrained function minimization problem. These methods can be categorized in a variety of ways. The most fundamental is in terms of their need for gradient information. The need for gradient information introduces an additional level of complexity in the CTDS context.

11 The Nelder-Mead Simplex Method The simplex method is a member of the class of methods that do not require gradient information. These methods are based on heuristic arguments; i.e., they lack a formal (theoretical) foundation.

12 General Concept The process begins with a collection of $(m+1)$ points (i.e., a simplex) within the m-dimensional parameter space. Through a set of operations, the simplex is “moved” around in the parameter space and is ultimately contracted in size as it surrounds the minimizing argument, $p^*$. When $m = 2$, the simplex is a triangle.

13 The Simplex Procedure Choose $p_0$, an initial “guess” at the minimizing argument. Determine $m$ additional points based on $p_0$ to create an initial simplex $\{p_0, p_1, p_2, \ldots, p_m\}$. Begin step: evaluate $J$ at each of the points in the simplex. Let $p_L$ be the vertex that yields the largest value for $J$, $p_G$ be the vertex that yields the next largest value for $J$, and $p_S$ be the vertex that yields the smallest value for $J$. Correspondingly, let $J_L = J(p_L)$, $J_G = J(p_G)$ and $J_S = J(p_S)$.

14 The Simplex Procedure (cont) The centroid of the simplex with $p_L$ excluded is given by: $p_C = \frac{1}{m} \sum_{p_i \ne p_L} p_i$. A reflection step is carried out by “reflecting” the worst point $p_L$ about the centroid, to produce a new point $p_R$ where: $p_R = p_C + \alpha (p_C - p_L)$, where $\alpha > 0$ and is typically 1.

15 The Simplex Procedure (cont) a reflection step:

16 The Simplex Procedure (cont) One of three possible actions now takes place, depending on the value of $J_R = J(p_R)$: (i) If $J_G > J_R > J_S$, then $p_R$ replaces $p_L$ and the step is completed. Return to begin step (slide 13).

17 The Simplex Procedure (cont) (ii) If $J_R < J_S$, then a new “least point” has been uncovered and it is possible that further movement in the same direction could be advantageous. Consequently an expansion step is carried out to produce $p_E$ where: $p_E = p_C + \gamma (p_R - p_C)$, where $\gamma > 1$ and is typically 2.

18 The Simplex Procedure (cont) An expansion step

19 The Simplex Procedure (cont) An expansion step: if $J_E = J(p_E) < J_S$, then $p_L$ is replaced with $p_E$; otherwise $p_L$ is replaced with $p_R$. In either case, the step is completed. Return to begin step (slide 13).

20 The Simplex Procedure (cont) A contraction step: (iii) If $J_R > J_G$, then a contraction step is made to produce the point $p_D$ where: $p_D = p_C + \beta (\tilde{p} - p_C)$ (where $0 < \beta < 1$ and is typically 0.5). Here $\tilde{p}$ is either $p_R$ or $p_L$, depending on whether $J_R$ is smaller or larger than $J_L$. If $J_D = J(p_D) < J_G$, then the step ends (return to begin step, slide 13). Otherwise the simplex is shrunk about $p_S$ by halving the distances of all vertices from this point, and then the step ends (return to begin step, slide 13).

21 The Simplex Procedure (cont) Termination. There are two possible ways of terminating the process: (1) when the separation among the points in the current simplex has become sufficiently small; (2) when the variation among the criterion function values at the vertices of the current simplex is within a prescribed tolerance. In either case, the vertex $p_S$ is taken to be $p^*$.
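The procedure on slides 13 to 21 can be sketched as follows. This is an illustrative reading rather than a reference implementation, and it makes several assumptions the slides leave open: the initial simplex is built by offsetting each coordinate of `p0` by `step`, a reflected point with `J_R` equal to `J_S` or `J_G` is simply accepted, a successful contraction point replaces `p_L`, the contraction formula is taken to be `p_C + beta * (p_tilde - p_C)`, and termination uses criterion (2) only.

```python
def nelder_mead(J, p0, step=0.5, alpha=1.0, gamma=2.0, beta=0.5,
                tol=1e-12, max_iter=2000):
    """Minimize J over m-vectors (lists of floats) with the simplex procedure."""
    m = len(p0)
    simplex = [list(p0)]
    for i in range(m):  # assumed construction of the m additional points
        vertex = list(p0)
        vertex[i] += step
        simplex.append(vertex)

    for _iteration in range(max_iter):
        # Begin step: evaluate J once per vertex and order from best to worst.
        scored = sorted(((J(v), v) for v in simplex), key=lambda s: s[0])
        simplex = [v for _, v in scored]
        JS, pS = scored[0]
        JG = scored[-2][0]
        JL, pL = scored[-1]
        if JL - JS < tol:  # termination criterion (2)
            break

        # Centroid of all vertices except the worst, then reflection.
        pC = [sum(v[i] for v in simplex[:-1]) / m for i in range(m)]
        pR = [c + alpha * (c - l) for c, l in zip(pC, pL)]
        JR = J(pR)

        if JR < JS:  # case (ii): try an expansion
            pE = [c + gamma * (r - c) for c, r in zip(pC, pR)]
            simplex[-1] = pE if J(pE) < JS else pR
        elif JR <= JG:  # case (i), with ties accepted (assumption)
            simplex[-1] = pR
        else:  # case (iii): contraction, or shrink about pS
            p_tilde = pR if JR < JL else pL
            pD = [c + beta * (t - c) for c, t in zip(pC, p_tilde)]
            if J(pD) < JG:
                simplex[-1] = pD
            else:
                simplex = [[s + 0.5 * (x - s) for s, x in zip(pS, v)]
                           for v in simplex]
    return min(simplex, key=J)
```

In the CTDS setting every call to `J` hides a differential equation solve, which is why the sketch caches the vertex values within a step instead of re-evaluating them while sorting.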

22 The Conjugate Gradient Method The conjugate gradient (CG) method is a member of the class of gradient-dependent methods. The CG method also has the property of quadratic convergence. This means that the minimizing argument of a quadratic function of dimension $m$ will be located in no more than $m$ iterations of the method. This property is significant because an arbitrary function, $J(p)$, generally has a quadratic shape around its minimizing argument. Recall that the gradient of $J(p)$ at the argument $\omega$ is the m-vector of partial derivatives of $J$ taken with respect to $p_1, p_2, \ldots, p_m$ and evaluated at $\omega$. It is denoted by $J_p(\omega)$. Recall also that $J_p(p^*) = 0$.

23 The Line Search Problem A line search problem needs to be solved at each step of the CG process. Suppose $\omega$ is a point in m-space and $u$ is a given m-vector of unit length which we interpret as a direction. The m-vector $(\omega + \alpha u)$ can be regarded as a point in m-space reached by moving a distance $\alpha$ away from $\omega$ in the direction $u$. The line search problem is the problem of finding a value $\alpha^*$ for the scalar $\alpha$ which yields a minimum value for $J(\omega + \alpha u)$. This is a one-dimensional minimization problem. The challenge of obtaining its precise solution in an efficient manner should not be underestimated!
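One common way to attack the one-dimensional problem is a golden-section search; the slides do not prescribe a method, so this is a sketch under assumptions. It presumes that the minimum along the direction lies in a known bracket `[0, alpha_max]` and that `J` is unimodal there; `line_search`, `alpha_max` and the test criterion are hypothetical names.

```python
import math


def golden_section(phi, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function phi on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = phi(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)


def line_search(J, point, u, alpha_max=10.0):
    """Return the alpha in [0, alpha_max] that minimizes J(point + alpha * u)."""
    def along_line(alpha):
        return J([w + alpha * ui for w, ui in zip(point, u)])
    return golden_section(along_line, 0.0, alpha_max)
```

For example, with `J(p) = (p[0] - 3)^2 + (p[1] - 4)^2`, a start at the origin and `u = [0.6, 0.8]`, the minimizing distance is 5. Each probe of `along_line` is one full evaluation of `J`, which is where the efficiency concern on the slide comes from.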

24 The CG Method (cont) The CG method carries out a sequence of steps which generate a sequence of estimates for $p^*$. Two tasks are carried out at each step. (1) Generate a new estimate for $p^*$ via $p_k = p_{k-1} + \alpha^* r_{k-1}$, where $\alpha^* = \arg\min_{\alpha} J(p_{k-1} + \alpha r_{k-1})$ (in other words, $p_k$ is the result of a line search from $p_{k-1}$ in the direction $r_{k-1}$). (2) Generate a new search direction via $r_k = -J_p(p_k) + \beta_{k-1} r_{k-1}$.

25 The CG Method (cont) Initial step: to initiate the process there are two requirements: (a) $p_0$, a “best guess” of $p^*$; (b) $r_0$, an initial search direction, which is set equal to $-J_p(p_0)$.

26 The CG Method (cont) A variety of possible assignments are available for $\beta_{k-1}$. These have been proposed by different authors. The original was proposed by Fletcher and Reeves (1964): $\beta_{k-1} = \dfrac{J_p(p_k)^T J_p(p_k)}{J_p(p_{k-1})^T J_p(p_{k-1})}$
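The following sketch runs the CG iteration with the Fletcher-Reeves choice of beta on a quadratic test function, to illustrate the claim that an m-dimensional quadratic is minimized in at most m steps. The test function `J(p) = 0.5 p^T A p - b^T p`, its explicit gradient `A p - b`, and the closed-form line search are assumptions made only so that the example is exact; in the CTDS setting the gradient and `alpha*` would have to be obtained numerically.

```python
def dot(a, b):
    """Inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def matvec(A, v):
    """Product of a matrix (list of rows) with a vector."""
    return [dot(row, v) for row in A]


def cg_quadratic(A, b, p0, steps):
    """Fletcher-Reeves CG on J(p) = 0.5 p^T A p - b^T p, A symmetric positive definite."""
    p = list(p0)
    g = [ap - bi for ap, bi in zip(matvec(A, p), b)]  # gradient J_p(p_0)
    r = [-gi for gi in g]                             # r_0 = -J_p(p_0)
    for _ in range(steps):
        if dot(g, g) == 0.0:  # already at the minimizing argument
            break
        # Exact line search along r; this closed form is valid for quadratics only.
        alpha = -dot(g, r) / dot(r, matvec(A, r))
        p = [pi + alpha * ri for pi, ri in zip(p, r)]
        g_new = [ap - bi for ap, bi in zip(matvec(A, p), b)]
        beta = dot(g_new, g_new) / dot(g, g)          # Fletcher-Reeves
        r = [-gn + beta * ri for gn, ri in zip(g_new, r)]
        g = g_new
    return p
```

With `A = [[4, 1], [1, 3]]` and `b = [1, 2]`, two steps from `[2, 1]` land on the minimizer `[1/11, 7/11]` up to rounding, consistent with `m = 2`.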

27 Acquiring the Gradient Vector Within the CTDS context under consideration, the explicit determination of the gradient of the criterion function, i.e., $J_p(\omega)$, is not feasible. A finite difference approximation of the components of $J_p(\omega)$ can nevertheless be adequate for the implementation of the CG method.

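28 Acquiring the Gradient Vector (cont) Recall that … A finite difference approximation is: $\dfrac{\partial J}{\partial p_k}(\omega) \approx \dfrac{J(\omega + \Delta e_k) - J(\omega)}{\Delta}$. Here $\Delta$ is a suitably small positive scalar and $e_k$ is the $k$th column of the $m \times m$ identity matrix.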

29 Acquiring the Gradient Vector (cont) The construction of the m-dimensional gradient vector requires a value for $J(\omega + \Delta e_k)$ for $k = 1, 2, \ldots, m$. Consequently the underlying differential equations of the model must be solved $m$ times. Computational overhead can become an issue! Note also that considerable care must be taken to ensure that the value of $\Delta$ is not unreasonably small, since there is a possibility of results becoming hopelessly corrupted by numerical noise.
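A forward-difference gradient along these lines might look like the sketch below. The function name `fd_gradient` and the default `delta` are illustrative assumptions; a suitable `delta` depends on how accurately `J` itself is computed.

```python
def fd_gradient(J, omega, delta=1e-6):
    """Approximate J_p(omega) with forward differences.

    Uses one baseline evaluation J(omega) plus m perturbed evaluations
    J(omega + delta * e_k), one per component.
    """
    baseline = J(omega)
    gradient = []
    for k in range(len(omega)):
        shifted = list(omega)
        shifted[k] += delta  # omega + delta * e_k
        gradient.append((J(shifted) - baseline) / delta)
    return gradient
```

For instance, `fd_gradient(lambda p: p[0] ** 2 + 3.0 * p[0] * p[1], [1.0, 2.0])` is close to the exact gradient `[8, 3]`. When `J` comes from a numerical differential equation solution, the integration error acts as noise in the numerator and is amplified by `1 / delta`, which is the hazard the slide warns about.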
