Curved Trajectories towards Local Minimum of a Function Al Jimenez Mathematics Department California Polytechnic State University San Luis Obispo, CA 93407 …Taylor Series and Rotations Spring, 2008
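Introduction and Notation
The Problem: find a local minimum of $f(x)$, $x \in \mathbb{R}^n$.
Derivatives: gradient $\nabla f(x)$ and Hessian $\nabla^2 f(x)$.
A local min $x^*$ is a critical point: $\nabla f(x^*) = 0$.
Necessary condition: $\nabla^2 f(x^*) \succeq 0$ (positive semidefinite).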
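Typical Iterative Methods
A sequence $\{x_k\}$ is generated from $x_0$ by $x_{k+1} = x_k + p_k v_k$, such that $f(x_{k+1}) < f(x_k)$.
With $v_k$ a vector with the property $\nabla f(x_k)^T v_k < 0$: a descent direction.
And $p_k > 0$ typically approximates the solution of $\min_{p > 0} f(x_k + p\, v_k)$, called the line search or the scalar search.
Proven to converge for smooth functions.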
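The iteration described on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's code: it assumes steepest descent for $v_k$ and a backtracking (Armijo) rule as the approximate scalar search, and the names `descent_iterate`, `f`, and `grad` are hypothetical.

```python
def descent_iterate(f, grad, x0, tol=1e-8, max_iter=500):
    """Generate x_{k+1} = x_k + p_k * v_k with f decreasing at every step.

    Assumptions (not from the slides): v_k = -grad f(x_k), and p_k is found
    by halving a trial step until an Armijo sufficient-decrease test holds.
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        gnorm2 = sum(gi * gi for gi in g)
        if gnorm2 < tol * tol:           # gradient ~ 0: critical point reached
            break
        v = [-gi for gi in g]            # descent direction: g . v = -|g|^2 < 0
        fx = f(x)
        p = 1.0
        # Approximate scalar search: shrink p until f decreases sufficiently.
        while p > 1e-16 and f([xi + p * vi for xi, vi in zip(x, v)]) > fx - 1e-4 * p * gnorm2:
            p *= 0.5
        x = [xi + p * vi for xi, vi in zip(x, v)]
    return x


# Example on a smooth convex quadratic with minimum at (1, -2).
f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 2.0) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 2.0)]
x_min = descent_iterate(f, grad, [0.0, 0.0])
```

Any other descent direction could be substituted for `v` without changing the structure of the loop.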
Current Methods
Selecting $v_k$ has a huge effect on convergence rate:
– Steepest descent: 1st order
– Newton's direction: 2nd order, but may not be a descent direction when far from a min
– Conjugate directions: uses $v_{k-1}$, $v_{k-2}$, ...
– Quasi-Newton/variable metric: also uses $v_{k-1}$, $v_{k-2}$, ...
– High-order tensor models: fit prior iteration values
– Number of derivatives available affects method
The scalar search:
– Accuracy of scalar minimization
– Quadratic models: trust region
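The remark that Newton's direction may fail to be a descent direction far from a minimum can be checked numerically. The sketch below is illustrative only: it solves $H v = -g$ for a 2x2 Hessian by Cramer's rule and tests the sign of $g^T v$. The helper names and the sample matrices are assumptions chosen for the example.

```python
def newton_direction_2d(g, H):
    """Solve H v = -g for a 2x2 matrix H using Cramer's rule."""
    (a, b), (c, d) = H
    det = a * d - b * c
    return [(-g[0] * d + b * g[1]) / det, (c * g[0] - a * g[1]) / det]


def is_descent(g, v):
    """A direction v is a descent direction when g . v < 0."""
    return sum(gi * vi for gi, vi in zip(g, v)) < 0.0


g = [1.0, 1.0]
H_pos_def = [[2.0, 0.0], [0.0, 1.0]]      # positive definite Hessian
H_indefinite = [[2.0, 0.0], [0.0, -1.0]]  # indefinite Hessian (saddle-like region)

v_good = newton_direction_2d(g, H_pos_def)
v_bad = newton_direction_2d(g, H_indefinite)
```

With the positive definite matrix the Newton direction passes the descent test; with the indefinite one it does not, which is the situation the slide warns about.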
Infinite Series of Solution Matrix vector products, but shown with exponents for connections with scalar Taylor series.
Infinite Series of Solution… Define: Then: For p = 1:
Curved Trajectories Algorithm
At the $k$th iteration, estimate, then calculate:
Select order, modify $d_i$, and select $p_k$
2nd order:
3rd order:
4th order:
Challenges
High-order terms accurately approximated from the gradient and the Hessian
Scalar searches along polynomial curved trajectories
Performance for large problems:
– Exploit sparse Hessian: store nonzeros only, no operations on zeros
Far from solution:
– Hessian not positive definite (solved): Hessian modified, and CG step used as last resort
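The idea of storing only the nonzeros of a sparse Hessian, and never operating on zeros, can be illustrated with a matrix-vector product over a coordinate dictionary. This is a generic sketch under assumed names (`sparse_matvec`, `H_nonzeros`); the slides do not say which storage scheme was actually used.

```python
def sparse_matvec(nonzeros, x):
    """Compute y = H x where H is given as {(row, col): value} for nonzeros only.

    The loop touches each stored entry once, so the cost scales with the
    number of nonzeros rather than with n * n.
    """
    y = [0.0] * len(x)
    for (i, j), h_ij in nonzeros.items():
        y[i] += h_ij * x[j]
    return y


# Tridiagonal 3x3 example: only 7 of the 9 entries are stored.
H_nonzeros = {
    (0, 0): 2.0, (0, 1): -1.0,
    (1, 0): -1.0, (1, 1): 2.0, (1, 2): -1.0,
    (2, 1): -1.0, (2, 2): 2.0,
}
y = sparse_matvec(H_nonzeros, [1.0, 2.0, 3.0])
```

Iterative solvers such as conjugate gradients need only this product, so the same storage would serve a CG fallback step.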
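Rotations
At point we have $h(p)$ is the trajectory and $R(\theta)$ is the rotation matrix.
$h(0) = 0$ and $R(0) = I$, and for 2 coordinates, counterclockwise: $R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$
At the $k$th step far from solution we want:
But settle for $p_k$, $\theta_k$: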
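For two coordinates, the counterclockwise rotation matrix is the standard one, and the property $R(0) = I$ is easy to confirm. The sketch below uses hypothetical helper names (`rotation_2d`, `rotate`) and only shows how a trajectory offset would be rotated; how $\theta_k$ is chosen is not covered here.

```python
import math


def rotation_2d(theta):
    """Counterclockwise 2x2 rotation matrix R(theta); R(0) is the identity."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def rotate(theta, h):
    """Apply R(theta) to a 2-vector h, e.g. a trajectory offset h(p)."""
    R = rotation_2d(theta)
    return [R[0][0] * h[0] + R[0][1] * h[1],
            R[1][0] * h[0] + R[1][1] * h[1]]


# Rotating the unit vector along the first axis by 90 degrees
# gives (up to rounding) the unit vector along the second axis.
quarter_turn = rotate(math.pi / 2.0, [1.0, 0.0])
```

Because rotation preserves length, a rotated offset keeps the same step size and changes only its direction.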
Rotations (continued)
Gives
Trajectory angle with the gradient for $R(0) = I$
Observations:
Rotation Challenges/Results
Select an effective $\theta_k$ without too much work:
– Use the existing strategy to calculate $p_k$, then calculate a $\theta_k$ from $\theta^*$ and $\theta_G$. Then calculate a new $p_k$ using the rotated trajectory.
– Good results with
– $\theta_k > 40°$ indicates elongated ellipse contours, and rotation seems unproductive in this case.
– Effective when the CTA series is convergent and the iteration is not close to the minimum point.
Functions of more than 2 variables later
More than Two Coordinates
Ignore coordinates with insignificant Newton correction magnitudes.
Success achieved by adding the 3rd coordinate to the first two as follows:
– Calculate the rotation by pairing the 3rd coordinate with each of the top 2 coordinates.
– This results in a rotation matrix:
– Where the angles $\theta_1$, $\theta_2$, $\theta_3$ are each calculated between two coordinates as explained before.
The 4th coordinate is added by pairing rotations with the first 3 coordinates, and so on.
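One way to picture the pairwise construction is as a product of plane rotations, one per coordinate pair. The sketch below is an assumption, not the rotation matrix from the slide: the multiplication order, the assignment of $\theta_1$, $\theta_2$, $\theta_3$ to the pairs (1,2), (1,3), (2,3), and the helper names are all chosen for illustration.

```python
import math


def plane_rotation(n, i, j, theta):
    """n x n identity with a counterclockwise rotation in the (i, j) plane."""
    R = [[1.0 if r == col else 0.0 for col in range(n)] for r in range(n)]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    R[i][i], R[i][j] = cos_t, -sin_t
    R[j][i], R[j][j] = sin_t, cos_t
    return R


def matmul(A, B):
    """Dense matrix product of two lists of rows."""
    return [[sum(A[r][k] * B[k][col] for k in range(len(B)))
             for col in range(len(B[0]))] for r in range(len(A))]


def rotation_3d(theta1, theta2, theta3):
    """Compose pairwise rotations; the order R23 * R13 * R12 is an assumption."""
    R12 = plane_rotation(3, 0, 1, theta1)  # pair: coordinates 1 and 2
    R13 = plane_rotation(3, 0, 2, theta2)  # pair: coordinates 1 and 3
    R23 = plane_rotation(3, 1, 2, theta3)  # pair: coordinates 2 and 3
    return matmul(R23, matmul(R13, R12))


R = rotation_3d(0.3, -0.2, 0.5)
```

Whatever the order, a product of plane rotations is orthogonal and reduces to the identity when all angles are zero, matching $R(0) = I$ in the two-coordinate case. A 4th coordinate would add three more plane rotations in the same way.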