
1 Multivariate Unconstrained Optimisation
First we consider algorithms for functions for which derivatives are not available. One could try to extend a direct method such as Golden Section from one dimension to several.
[Figure: Golden Section bracket a, x, y, b in one dimension; a grid of trial points in two dimensions]
The number of function evaluations increases as e^n, where n is the number of dimensions.

2 The Polytope Algorithm
This is a direct search method, also known as the "simplex" method. In the n-dimensional case, at each stage we have n+1 points x_1, x_2, …, x_{n+1} such that
F(x_1) ≤ F(x_2) ≤ … ≤ F(x_{n+1}).
The algorithm seeks to replace the worst point, x_{n+1}, with a better one. The x_i lie at the vertices of an n-dimensional polytope.

3 The Polytope Algorithm 2
The new point is formed by reflecting the worst point through the centroid of the best n vertices. Mathematically the new point can be written
x_r = c + α(c − x_{n+1}),
where α > 0 is the reflection coefficient. In two dimensions the polytope is a triangle; in three dimensions it is a tetrahedron.

4 Polytope Example
For n = 2 we have three points at each step.
[Figure: triangle with vertices x_1, x_2 and worst point x_3; centroid c of x_1 and x_2; reflected point x_r = c + α(c − x_3)]

5 Detailed Polytope Algorithm
1. Evaluate F(x_r) ≡ F_r. If F_1 ≤ F_r ≤ F_n, then x_r replaces x_{n+1}.
2. If F_r < F_1 then x_r is the new best point, and we assume the direction of reflection is "good" and attempt to expand the polytope in that direction by defining the point x_e = c + β(x_r − c), where β > 1. If F_e < F_r then x_e replaces x_{n+1}; otherwise x_r replaces x_{n+1}.

6 Detailed Polytope Algorithm 2
3. If F_r > F_n then the polytope is too big and we attempt to contract it by defining
x_c = c + γ(x_{n+1} − c)  if F_r ≥ F_{n+1},
x_c = c + γ(x_r − c)      if F_r < F_{n+1},
where 0 < γ < 1. If F_c < min(F_r, F_{n+1}) then x_c replaces x_{n+1}; otherwise a further contraction is done.
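To make steps 1–3 concrete, here is a minimal MATLAB sketch of a single polytope iteration. It assumes the vertices are stored as the columns of a matrix X, already sorted so that F(X(:,1)) ≤ … ≤ F(X(:,n+1)), and that the "further contraction" is done by shrinking all vertices towards the best one, which is a common choice but is not spelled out on the slides. The name polytope_step and the argument order are illustrative.

% One iteration of the polytope algorithm of slides 5-6 (a sketch).
% X is n-by-(n+1); its columns are vertices sorted by increasing F value.
function X = polytope_step(F, X, alpha, beta, gamma)
    n  = size(X, 1);
    Fv = zeros(1, n+1);
    for i = 1:n+1
        Fv(i) = F(X(:,i));
    end
    c  = mean(X(:,1:n), 2);             % centroid of the best n vertices
    xr = c + alpha*(c - X(:,n+1));      % reflect the worst vertex
    Fr = F(xr);
    if Fr < Fv(1)                       % new best point: try to expand
        xe = c + beta*(xr - c);
        if F(xe) < Fr
            X(:,n+1) = xe;
        else
            X(:,n+1) = xr;
        end
    elseif Fr <= Fv(n)                  % F_1 <= F_r <= F_n: accept reflection
        X(:,n+1) = xr;
    else                                % F_r > F_n: contract
        if Fr >= Fv(n+1)
            xc = c + gamma*(X(:,n+1) - c);
        else
            xc = c + gamma*(xr - c);
        end
        if F(xc) < min(Fr, Fv(n+1))
            X(:,n+1) = xc;
        else                            % further contraction: shrink towards best vertex
            Xbest = repmat(X(:,1), 1, n+1);
            X = Xbest + 0.5*(X - Xbest);
        end
    end
end

After each call the columns would be re-sorted by function value, and the iteration stops once the vertices (or their function values) agree to within a tolerance.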

7 MATLAB Example Polytope
>> banana = @(x)10*(x(2)-x(1)^2)^2+(1-x(1))^2;
>> [x,fval] = fminsearch(banana,[-1.2, 1],optimset('Display','iter'))
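MATLAB's fminsearch implements a polytope (Nelder–Mead simplex) method of this kind; with the 'Display','iter' option the iteration log also reports which operation (reflect, expand, contract, shrink) was applied at each step.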

8 Polytope Example by Hand

9 Polytope Example
Start with an equilateral triangle:
x_1 = (0,0), x_2 = (0,0.5), x_3 = (√3,1)/4 ≈ (0.433,0.25)
Take α = 1, β = 1.5, and γ = 0.5.

10 Polytope Example: Step 1
The polytope is:
i        1          2          3
x_i      (0,0)      (0,0.5)    (0.433,0.25)
F(x_i)   9.7918     7.3153     4.8601
Worst point is x_1; c = (x_2 + x_3)/2 = (0.2165,0.375).
Relabel points: x_3 → x_1, x_1 → x_3.
x_r = c + α(c − x_3) = (0.433,0.75) and F(x_r) = 3.6774.
F(x_r) < F(x_1), so x_r is the best point, so try to expand.
x_e = c + β(x_r − c) = (0.5413,0.9375) and F(x_e) = 3.1086.
F(x_e) < F(x_r), so accept the expansion.

11 After Step 1

12 Polytope Example: Step 2
The polytope is:
i        1              2          3
x_i      (0.433,0.25)   (0,0.5)    (0.5413,0.9375)
F(x_i)   4.8601         7.3153     3.1086
Worst point is x_2; c = (x_1 + x_3)/2 = (0.4871,0.5938).
Relabel points: x_3 → x_1, x_2 → x_3, x_1 → x_2.
x_r = c + α(c − x_3) = (0.9743,0.6875) and F(x_r) = 2.0093.
F(x_r) < F(x_1), so x_r is the best point, so try to expand.
x_e = c + β(x_r − c) = (1.2179,0.7344) and F(x_e) = 2.2837.
F(x_e) > F(x_r), so reject the expansion (x_r replaces the worst point).

13 After Step 2

14 Polytope Example: Step 3
The polytope is:
i        1                 2              3
x_i      (0.5413,0.9375)   (0.433,0.25)   (0.9743,0.6875)
F(x_i)   3.1086            4.8601         2.0093
Worst point is x_2; c = (x_1 + x_3)/2 = (0.7578,0.8125).
Relabel points: x_3 → x_1, x_2 → x_3, x_1 → x_2.
x_r = c + α(c − x_3) = (1.0826,1.375) and F(x_r) = 3.1199.
F(x_r) > F(x_2), so the polytope is too big; need to contract.
x_c = c + γ(x_r − c) = (0.9202,1.0938) and F(x_c) = 2.2476.
F(x_c) < F(x_r), so accept the contraction.

15 After Step 3

16 Polytope Example: Step 4
The polytope is:
i        1                 2                 3
x_i      (0.9743,0.6875)   (0.5413,0.9375)   (0.9202,1.0938)
F(x_i)   2.0093            3.1086            2.2476
Worst point is x_2; c = (x_1 + x_3)/2 = (0.9472,0.8906).
Relabel points: x_3 → x_2, x_2 → x_3.
x_r = c + α(c − x_3) = (1.3532,0.8438) and F(x_r) = 2.7671.
F(x_r) > F(x_2), so the polytope is too big; need to contract.
x_c = c + γ(x_r − c) = (1.1502,0.8672) and F(x_c) = 2.1391.
F(x_c) < F(x_r), so accept the contraction.

17 After Step 4

18 Polytope Example: Step 5
The polytope is:
i        1                 2                 3
x_i      (0.9743,0.6875)   (0.9202,1.0938)   (1.1502,0.8672)
F(x_i)   2.0093            2.2476            2.1391
Worst point is x_2; c = (x_1 + x_3)/2 = (1.0622,0.7773).
Relabel points: x_3 → x_2, x_2 → x_3.
x_r = c + α(c − x_3) = (1.2043,0.4609) and F(x_r) = 2.6042.
F(x_r) ≥ F(x_3), so the polytope is too big; need to contract.
x_c = c + γ(x_3 − c) = (0.9912,0.9355) and F(x_c) = 2.0143.
F(x_c) < F(x_r), so accept the contraction.

19 After Step 5

20 Polytope Example: Step 6
The polytope is:
i        1                 2                 3
x_i      (0.9743,0.6875)   (1.1502,0.8672)   (0.9912,0.9355)
F(x_i)   2.0093            2.1391            2.0143
Worst point is x_2; c = (x_1 + x_3)/2 = (0.9827,0.8117).
Relabel points: x_3 → x_2, x_2 → x_3.
x_r = c + α(c − x_3) = (0.8153,0.7559) and F(x_r) = 2.1314.
F(x_r) > F(x_2), so the polytope is too big; need to contract.
x_c = c + γ(x_r − c) = (0.8990,0.7837) and F(x_c) = 2.0012.
F(x_c) < F(x_r), so accept the contraction.

21 Polytope Example: Final Result So after 6 steps the best estimate of the minimum is x = (0.8990,0.7837) for which F(x)=2.0012.

22 Alternating Variables Method
Start from the point x = (a_1, a_2, …, a_n).
Take the first variable, x_1, and minimise F(x_1, a_2, …, a_n) with respect to x_1. This gives a new value x_1 = a_1'.
Take the second variable, x_2, and minimise F(a_1', x_2, a_3, …, a_n) with respect to x_2. This gives x_2 = a_2'.
Continue with each variable in turn, cycling through the variables repeatedly, until the minimum is reached.

23 AVM in Two Dimensions
[Figure: path of the AVM in two dimensions from the labelled starting point]
The method of minimisation over each variable can be any univariate method.

24 AVM Example in 2D
Minimise F(x,y) = x^2 + y^2 + xy − 2x − 4y, starting at (0,0).

25 AVM Example in 2D
x        y        F(x,y)     |error|
0        0         0          4
1        0        -1          3
1        1.5      -3.25       0.75
0.25     1.5      -3.8125     0.1875
0.25     1.875    -3.953      0.047
0.0625   1.875    -3.988      0.012
0.0625   1.968    -3.9971     0.0029
0.0156   1.968    -3.9992     0.0008
0.0156   1.992    -3.9998     0.0002
0.004    1.992    -3.99995    0.00005
0.004    1.998    -3.99999    0.00001
(|error| is the difference between F(x,y) and the true minimum value F = -4 at (0,2).)
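A minimal MATLAB sketch of the alternating-variables iteration for this example. Because F is quadratic, each one-dimensional minimisation has a closed form (dF/dx = 0 gives x = 1 − y/2 for fixed y, and dF/dy = 0 gives y = 2 − x/2 for fixed x); a general implementation would instead call a univariate method such as fminbnd at each sweep.

% Alternating Variables Method for F(x,y) = x^2 + y^2 + x*y - 2*x - 4*y, from (0,0).
F = @(x,y) x.^2 + y.^2 + x.*y - 2*x - 4*y;
x = 0;  y = 0;
for k = 1:6
    x = 1 - y/2;                              % exact minimisation over x, y fixed
    fprintf('%8.4f %8.4f %10.5f\n', x, y, F(x,y));
    y = 2 - x/2;                              % exact minimisation over y, x fixed
    fprintf('%8.4f %8.4f %10.5f\n', x, y, F(x,y));
end
% The iterates approach the true minimum (0,2), where F = -4, matching the table above.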

26 Definition of Gradient Vector
The gradient vector of F at x is the vector of first partial derivatives,
g(x) = ( ∂F/∂x_1, ∂F/∂x_2, …, ∂F/∂x_n )^T.
The gradient vector is also written as ∇F(x).

27 Definition of Hessian Matrix
The Hessian matrix of F at x is the n × n matrix G(x) of second partial derivatives, with entries
G_ij(x) = ∂²F/∂x_i∂x_j.
The Hessian matrix is symmetric, and is also written as ∇²F(x).
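These definitions can be checked numerically with central finite differences. The sketch below is illustrative only: the test function, evaluation point and step size h are assumptions, not taken from the slides.

% Central-difference approximations to the gradient g and Hessian G of F at x.
% Test function: F(x) = x1^2 + 2*x2^2 + x1*x2, whose exact Hessian is [2 1; 1 4].
F = @(x) x(1)^2 + 2*x(2)^2 + x(1)*x(2);
x = [1; -1];
h = 1e-4;
n = numel(x);
g = zeros(n,1);  G = zeros(n,n);
I = eye(n);
for i = 1:n
    g(i) = (F(x + h*I(:,i)) - F(x - h*I(:,i))) / (2*h);          % dF/dx_i
    for j = 1:n
        G(i,j) = (F(x + h*I(:,i) + h*I(:,j)) - F(x + h*I(:,i) - h*I(:,j)) ...
                - F(x - h*I(:,i) + h*I(:,j)) + F(x - h*I(:,i) - h*I(:,j))) / (4*h^2);
    end
end
g, G    % compare with the exact gradient [2*x1 + x2; 4*x2 + x1] and Hessian [2 1; 1 4]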

28 Conditions for a Minimum of a Multivariate Function
1. |g(x*)| = 0. That is, all partial derivatives are zero.
2. G(x*) is positive definite. That is, x^T G(x*) x > 0 for all vectors x ≠ 0.
The second condition implies that the eigenvalues of G(x*) are strictly positive.

29 Stationary Points
If g(x*) = 0 then x* is said to be a stationary point. There are three types of stationary point:
1. Minimum, e.g., x^2 + y^2 at (0,0)
2. Maximum, e.g., 1 − x^2 − y^2 at (0,0)
3. Saddle point, e.g., x^2 − y^2 at (0,0)
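The three example stationary points can be distinguished in MATLAB from the eigenvalues of the Hessian at (0,0), anticipating the eigenvalue discussion of slide 34; the Hessians below were obtained by differentiating the three functions by hand.

% Hessians at (0,0) of the three examples on this slide:
G_min    = [2 0; 0 2];      % x^2 + y^2      -> eigenvalues  2,  2  (minimum)
G_max    = [-2 0; 0 -2];    % 1 - x^2 - y^2  -> eigenvalues -2, -2  (maximum)
G_saddle = [2 0; 0 -2];     % x^2 - y^2      -> eigenvalues  2, -2  (saddle point)
eig(G_min), eig(G_max), eig(G_saddle)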

30 Definition: Level Surface
F(x) = constant defines a "level surface". For different values of the constant we generate different level surfaces. For example, in 3-D suppose
F(x,y,z) = x^2/4 + y^2/9 + z^2/4.
Then F(x,y,z) = constant is an ellipsoidal surface centred on the origin, so the level surfaces are a series of concentric ellipsoidal surfaces. The gradient vector at a point x is normal to the level surface passing through x.

31 Definition: Tangent Hyperplane
For a differentiable multivariate function F, the tangent hyperplane at a point x_t on the surface F(x) = constant is normal to the gradient vector at x_t.

32 Definition: Quadratic Function
If the Hessian matrix of F is constant then F is said to be a quadratic function. In this case F can be expressed as
F(x) = (1/2) x^T G x + c^T x + α
for a constant matrix G, vector c, and scalar α. Then ∇F(x) = Gx + c and ∇²F(x) = G.

33 Example Quadratic Function
F(x,y) = x^2 + 2y^2 + xy − x + 2y
The gradient vector is zero at a stationary point, so Gx + c = 0 there. We need to solve Gx = −c to find the stationary point:
x* = −G^{-1} c  ⇒  x* = (6/7, −5/7)^T

34 Hessian Matrix Again
We can predict the behaviour of a general nonlinear function near a stationary point x* by looking at the eigenvalues of the Hessian matrix. Let u_j and λ_j denote the jth eigenvector and eigenvalue of G.
If λ_j > 0 the function will increase as we move away from x* in direction u_j.
If λ_j < 0 the function will decrease as we move away from x* in direction u_j.
If λ_j = 0 the function will stay constant (to second order) as we move away from x* in direction u_j.

35 Example Again
For the example of slide 33, λ_1 = 1.5858 and λ_2 = 4.4142, so F increases as we move away from the stationary point at (6/7, −5/7)^T in any direction. So the stationary point is a minimum.
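This 2-D example can be checked in MATLAB in the same style as the 4-D example on the next slide; the variable names below are illustrative.

>> G = [2 1; 1 4];             % Hessian of x^2 + 2*y^2 + x*y - x + 2*y
>> c = [-1; 2];
>> xstar = G\(-c)              % stationary point (6/7, -5/7)
>> lambda = eig(G)             % 1.5858 and 4.4142: both positive, so a minimum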

36 Example in 4D
In MATLAB:
>> c = [-1 2 3 4]';
>> G = [2 1 0 2; 1 2 -1 3; 0 -1 4 1; 2 3 1 -2];
>> x = G\(-c)
>> [u,lambda] = eig(G)

37 Descent Methods Seek a general algorithm for unconstrained minimisation of a smooth multivariate function. Require that F decreases at each iteration. A method that imposes this type of condition is called a descent method.

38 A General Descent Algorithm
1. Let x_k be the current iterate. If converged then quit; x_k is the estimate of the minimum.
2. Compute a nonzero vector p_k giving the direction of search.
3. Compute a positive scalar step length α_k for which F(x_k + α_k p_k) < F(x_k).
4. The new estimate of the minimum is x_{k+1} = x_k + α_k p_k.
5. Increment k by 1, and go to step 1.

39 Method of Steepest Descent
The direction in which F decreases most steeply is −∇F, so we use this as the search direction. The new iterate is x_{k+1} = x_k − α_k ∇F(x_k), where α_k is a non-negative scalar chosen so that x_{k+1} is the minimum point along the line from x_k in the direction −∇F(x_k). Thus, α_k minimises F(x_k − α∇F(x_k)) with respect to α.

40 Steepest Descent Algorithm
Initialise: x_0, k = 0
Loop: u = ∇F(x_k)
      if |u| = 0 then quit
      else minimise h(α) = F(x_k − αu) to get α_k
      x_{k+1} = x_k − α_k u
      k = k + 1
      if (not finished) go to Loop

41 Example
F(x,y) = x^3 + y^3 − 2x^2 + 3y^2 − 8
∇F(x,y) = 0 gives 3x^2 − 4x = 0, so x = 0 or 4/3; and 3y^2 + 6y = 0, so y = 0 or −2.
(x,y)      G                    Type
(0,0)      Indefinite           Saddle point
(0,-2)     Negative definite    Maximum
(4/3,0)    Positive definite    Minimum
(4/3,-2)   Indefinite           Saddle point

42 Solve with Steepest Descent
Take x_0 = (1, −1)^T; then ∇F(x_0) = (−1, −3)^T.
h(α) ≡ F(x_0 − α∇F(x_0)) = F(1 + α, −1 + 3α) = (1 + α)^3 + (3α − 1)^3 − 2(1 + α)^2 + 3(3α − 1)^2 − 8
Minimise h(α) with respect to α:
dh/dα = 3(1 + α)^2 + 9(3α − 1)^2 − 4(1 + α) + 18(3α − 1) = 84α^2 + 2α − 10 = 0
So α = 1/3 or −5/14. α must be non-negative, so α_0 = 1/3.

43 Solve with Steepest Descent
x_1 = x_0 − α_0 ∇F(x_0) = (1, −1)^T − (−1/3, −1)^T = (4/3, 0)^T.
This is the exact minimum. We were lucky that the search direction at x_0 points directly towards (4/3, 0)^T; usually we would need to do more than one iteration to get a good solution.
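A minimal MATLAB sketch of the steepest descent algorithm applied to this example, doing the exact line search numerically with fminbnd. The step bracket [0, 1], the iteration limit and the stopping tolerance are illustrative assumptions, not taken from the slides.

% Steepest descent for F(x,y) = x^3 + y^3 - 2*x^2 + 3*y^2 - 8, from (1, -1).
F    = @(v) v(1)^3 + v(2)^3 - 2*v(1)^2 + 3*v(2)^2 - 8;
grad = @(v) [3*v(1)^2 - 4*v(1); 3*v(2)^2 + 6*v(2)];

x = [1; -1];                          % starting point x_0
for k = 1:20
    u = grad(x);
    if norm(u) < 1e-8, break; end     % gradient (numerically) zero: stop
    h     = @(alpha) F(x - alpha*u);  % function along the search line
    alpha = fminbnd(h, 0, 1);         % approximate exact line search on [0, 1]
    x     = x - alpha*u;
end
x, F(x)   % expect x close to (4/3, 0), the minimum found above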

44 Newton's Method
Approximate F locally by a quadratic function and minimise this exactly. By Taylor's Theorem,
F(x) ≈ F(x_k) + g(x_k)^T (x − x_k) + (1/2)(x − x_k)^T G(x_k)(x − x_k)
     = F(x_k) − g(x_k)^T x_k + (1/2) x_k^T G(x_k) x_k + (g(x_k) − G(x_k) x_k)^T x + (1/2) x^T G(x_k) x
The right-hand side is minimised when g(x_k) − G(x_k) x_k + G(x_k) x_{k+1} = 0, so
x_{k+1} = x_k − [G(x_k)]^{-1} g(x_k).
The search direction is −[G(x_k)]^{-1} g(x_k) and the step length is 1.

45 Newton's Method Example
Rosenbrock's function: F(x,y) = 10(y − x^2)^2 + (1 − x)^2
Use Newton's Method starting at (−1.2, 1)^T.

46 MATLAB Solution
>> F = @(x,y)10*(y-x^2)^2+(1-x)^2
>> fgrad1 = @(x,y)-40*x*(y-x^2)-2*(1-x)
>> fgrad2 = @(x,y)20*(y-x^2)
>> G11 = @(x,y)120*x^2-40*y+2
>> x = [-1.2;1]
>> x = x - inv([G11(x(1),x(2)) -40*x(1); -40*x(1) 20])*[fgrad1(x(1),x(2)) fgrad2(x(1),x(2))]'
The last command performs one Newton step.
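Repeating that last command by hand gives the iterations shown on the next slide. A loop version is sketched below; it is a minimal sketch in which the gradient and Hessian are packaged as vector- and matrix-valued handles rather than the separate components above.

% Newton's method on F(x,y) = 10*(y-x^2)^2 + (1-x)^2, starting at (-1.2, 1).
F = @(v) 10*(v(2)-v(1)^2)^2 + (1-v(1))^2;
g = @(v) [-40*v(1)*(v(2)-v(1)^2) - 2*(1-v(1)); 20*(v(2)-v(1)^2)];   % gradient
G = @(v) [120*v(1)^2 - 40*v(2) + 2, -40*v(1); -40*v(1), 20];        % Hessian

v = [-1.2; 1];
fprintf('%10.4f %10.4f %12.4e\n', v(1), v(2), F(v));
for k = 1:7
    v = v - G(v)\g(v);        % Newton step: solve G(v) p = g(v), subtract p
    fprintf('%10.4f %10.4f %12.4e\n', v(1), v(2), F(v));
end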

47 MATLAB Iterations
x         y         F(x,y)
-1.2000   1.0000    6.7760
-0.9755   0.9012    3.9280
 0.0084  -0.9679    10.3533
 0.0571   0.0009    0.8892
 0.9573   0.1060    6.5695
 0.9598   0.9212    0.0016
 1.0000   0.9984    2.62 × 10^-5
 1.0000   1.0000    2.41 × 10^-14

48 Notes on Newton’s Method Newton’s Method converges quadratically if the quadratic model is a good fit to the objective function. Problems arise if the quadratic model is not a good fit outside a small neighbourhood of the current point.

