CS5321 Numerical Optimization


1 CS5321 Numerical Optimization
04 Trust Region Methods 9/5/2019

2 Trust region method
At each iteration, solve the model problem m_k and let p_k be the solution. Then evaluate p_k and update the trust region. The quadratic model is

    min_{‖p‖ ≤ Δ_k} m_k(p) = f_k + g_kᵀp + ½ pᵀB_k p

where f_k = f(x_k), g_k is the gradient, B_k is the Hessian (or an approximation to it), and Δ_k is the trust-region radius.
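The quadratic model above can be sketched directly in numpy. This is a minimal illustration, not code from the lecture; the function name `model` is my own:

```python
import numpy as np

def model(p, fk, gk, Bk):
    """Quadratic trust-region model m_k(p) = f_k + g_k^T p + 1/2 p^T B_k p."""
    return fk + gk @ p + 0.5 * p @ Bk @ p

# At p = 0 the model equals f_k, by construction.
fk, gk, Bk = 1.0, np.array([1.0, 0.0]), np.eye(2)
print(model(np.zeros(2), fk, gk, Bk))
```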

3 Solve model problem mk
The model problem is a constrained optimization problem:

    min_{‖p‖ ≤ Δ_k} m_k(p) = f_k + g_kᵀp + ½ pᵀB_k p

If ‖B⁻¹g‖ ≤ Δ and B is positive definite, the solution is the full step p = −B⁻¹g. Otherwise, the optimal direction varies with Δ.

4 B+I is positive semidefinite
Optimal conditions p* is the optimal solution if and only if it satisfies (B+I) p*=−g ( −||p*||)=0 B+I is positive semidefinite   0. is called Largrangian modifier (chap 12) Assume ||B−1g||k. for   0. Define ()=||−(B+I)−1g ||− and solve ()=0 This is a univariable nonlinear equation. (chap 11) 9/5/2019
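Solving φ(λ) = 0 can be done with any one-dimensional root finder. Below is a minimal bisection sketch (not from the slides), assuming B is positive definite and the unconstrained step is longer than Δ, so that φ(0) > 0:

```python
import numpy as np

def phi(lam, B, g, delta):
    """phi(lambda) = ||(B + lambda I)^{-1} g|| - delta."""
    n = B.shape[0]
    p = np.linalg.solve(B + lam * np.eye(n), g)
    return np.linalg.norm(p) - delta

def solve_lambda(B, g, delta, lo=0.0, hi=1e6, tol=1e-10):
    """Bisection on phi; assumes phi(lo) > 0 > phi(hi), i.e. the
    unconstrained minimizer lies outside the trust region."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if phi(mid, B, g, delta) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

In practice a safeguarded Newton iteration on 1/φ converges much faster (chap 11 covers such methods); bisection is shown here only for clarity.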

5 Approximation methods
The equation φ(λ) = ‖−(B + λI)⁻¹g‖ − Δ = 0 is difficult to formulate and solve; solving it exactly is practical only when the number of variables is small. Four approximation methods:
1. Cauchy point
2. The dogleg method
3. Two-dimensional subspace minimization
4. Steihaug's algorithm (chap 7)

6 1. Cauchy point
The Cauchy point minimizes the model along the steepest descent direction (gradient + line search) within the trust region:

    p_k^C = −τ_k (Δ_k/‖g_k‖) g_k,  where
    τ_k = 1                                    if g_kᵀB_k g_k ≤ 0;
    τ_k = min(‖g_k‖³/(Δ_k g_kᵀB_k g_k), 1)     otherwise.

Slow convergence, but easy to compute; used as a reference direction. More on this in chap 5.
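The Cauchy-point formula above translates directly into numpy. A minimal sketch (the function name is mine):

```python
import numpy as np

def cauchy_point(g, B, delta):
    """Minimizer of the quadratic model along -g within the trust region."""
    gnorm = np.linalg.norm(g)
    gBg = g @ B @ g
    if gBg <= 0:
        tau = 1.0   # model decreases all the way to the boundary
    else:
        tau = min(gnorm**3 / (delta * gBg), 1.0)
    return -tau * (delta / gnorm) * g
```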

7 2. Dogleg method
Requires B_k to be positive definite. Use the piecewise-linear path p(τ) to approximate the optimal trajectory:

    p(τ) = τ p^U                      for 0 ≤ τ ≤ 1
    p(τ) = p^U + (τ−1)(p^B − p^U)     for 1 ≤ τ ≤ 2

where p^U = −(gᵀg / gᵀBg) g is the Cauchy point and p^B = −B⁻¹g is the Newton direction.

8 3. 2-dim subspace minimization
Minimize the model over the linear span of g and B⁻¹g:

    min m_k(p) = f + gᵀp + ½ pᵀBp   s.t. ‖p‖ ≤ Δ_k,  p ∈ span{g, B⁻¹g}

Matrix B can be indefinite. Find α such that B + αI is positive definite.
If ‖(B + αI)⁻¹g‖ ≤ Δ_k, take p = −(B + αI)⁻¹g + v with vᵀ(B + αI)⁻¹g ≤ 0.
Otherwise, p ∈ span{g, (B + αI)⁻¹g}.
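The subspace idea can be demonstrated with a deliberately crude sketch: reduce the model to two dimensions via an orthonormal basis of span{g, B⁻¹g}, then compare the interior stationary point with boundary samples. This brute-force version is mine, for illustration only; real implementations solve the reduced 2-D trust-region problem exactly:

```python
import numpy as np

def two_dim_subspace(g, B, delta, n_angles=720):
    """Approximately minimize g^T p + 1/2 p^T B p over the 2-D subspace
    span{g, B^{-1} g} subject to ||p|| <= delta.  (The constant f_k is
    dropped: it does not affect the minimizer.)"""
    model = lambda p: g @ p + 0.5 * p @ B @ p
    V = np.column_stack([g, np.linalg.solve(B, g)])
    Q, _ = np.linalg.qr(V)                  # orthonormal basis of the subspace
    Gr, gr = Q.T @ B @ Q, Q.T @ g           # reduced 2x2 problem
    best_p, best_m = np.zeros_like(g), 0.0
    try:                                     # interior candidate, if feasible
        y = -np.linalg.solve(Gr, gr)
        if np.linalg.norm(y) <= delta and model(Q @ y) < best_m:
            best_p, best_m = Q @ y, model(Q @ y)
    except np.linalg.LinAlgError:
        pass
    for t in np.linspace(0, 2 * np.pi, n_angles, endpoint=False):
        p = Q @ (delta * np.array([np.cos(t), np.sin(t)]))
        if model(p) < best_m:
            best_p, best_m = p, model(p)
    return best_p
```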

9 Evaluation function
The solution p_k is evaluated by the ratio

    ρ_k = (f(x_k) − f(x_k + p_k)) / (m_k(0) − m_k(p_k))

The numerator f(x_k) − f(x_k + p_k) is the actual reduction; the denominator m_k(0) − m_k(p_k) is the predicted reduction.
If ρ_k is close to 1, m_k is a good model; in this case, if ‖p_k‖ = Δ_k, increase Δ_k.
If ρ_k is close to 0 or negative, shrink the trust-region radius Δ_k.
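The ratio and the radius update can be sketched as follows. The thresholds 1/4 and 3/4 and the factors 1/4 and 2 are common textbook choices, not values stated on this slide:

```python
import numpy as np

def rho_ratio(f, model, x, p):
    """rho_k = actual reduction / predicted reduction."""
    actual = f(x) - f(x + p)
    predicted = model(np.zeros_like(p)) - model(p)
    return actual / predicted

def update_radius(rho, p_norm, delta, delta_max=10.0):
    """Shrink on poor agreement; grow when the model is good and the
    step reached the trust-region boundary."""
    if rho < 0.25:
        return 0.25 * delta
    if rho > 0.75 and np.isclose(p_norm, delta):
        return min(2.0 * delta, delta_max)
    return delta
```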

10 Scaling
Poorly scaled problems are sensitive to certain directions.
Example: f(x, y) = 10⁹x² + y².
Solution 1: if the scale of the final solution is known, rescale the variables.
Solution 2: make the trust region elliptical:

    min m_k(p) = f + gᵀp + ½ pᵀBp   s.t. ‖Dp‖ ≤ Δ_k

where D is a diagonal matrix.
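The elliptical constraint reduces to the spherical case by the change of variables q = Dp. A minimal sketch of the transformed subproblem data (function name and the choice of D are mine):

```python
import numpy as np

def scale_subproblem(g, B, D):
    """With q = D p, the constraint ||D p|| <= delta becomes ||q|| <= delta,
    and the model becomes f + (D^{-1}g)^T q + 1/2 q^T (D^{-1} B D^{-1}) q.
    Returns the scaled gradient and Hessian."""
    Dinv = np.diag(1.0 / np.diag(D))
    return Dinv @ g, Dinv @ B @ Dinv
```

For the example f(x, y) = 10⁹x² + y², choosing D = diag(√(B₁₁), √(B₂₂)) turns the badly conditioned Hessian into the identity.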

11 Global convergence
Theorem 4.5. Suppose ‖B_k‖ is bounded, and f is bounded below on the level set S = {x | f(x) ≤ f(x₀)} and Lipschitz continuously differentiable in a neighborhood of S. If

    m_k(0) − m_k(p_k) ≥ c₁‖g_k‖ min(Δ_k, ‖g_k‖/‖B_k‖)

and ‖p_k‖ ≤ γΔ_k for some c₁ ∈ (0, 1] and γ ≥ 1, then

    lim inf_{k→∞} ‖g_k‖ = 0.

