Performance Surfaces.


Performance Surfaces

Taylor Series Expansion
$F(x) = F(x^*) + \frac{dF(x)}{dx}\Big|_{x=x^*}(x - x^*) + \frac{1}{2}\,\frac{d^2F(x)}{dx^2}\Big|_{x=x^*}(x - x^*)^2 + \cdots + \frac{1}{n!}\,\frac{d^nF(x)}{dx^n}\Big|_{x=x^*}(x - x^*)^n + \cdots$
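
The truncated expansion above can be evaluated numerically. Below is a minimal Python sketch (not from the slides) that sums the first few Taylor terms of a scalar function; the choice of cos(x) and the expansion point x* = 0 are assumptions made purely for illustration.

```python
import math

def taylor_approx(f_derivs, x_star, x, order):
    """Truncated Taylor series of F(x) about x_star.

    f_derivs[k] holds the k-th derivative of F evaluated at x_star
    (f_derivs[0] is F(x_star) itself).
    """
    return sum(f_derivs[k] / math.factorial(k) * (x - x_star) ** k
               for k in range(order + 1))

# Illustrative choice (an assumption, not from the slides): F(x) = cos(x), x* = 0.
# The derivatives of cos at 0 cycle through 1, 0, -1, 0, 1, ...
derivs = [1.0, 0.0, -1.0, 0.0, 1.0]
for order in (0, 1, 2, 4):
    print(order, taylor_approx(derivs, 0.0, 0.5, order), math.cos(0.5))
```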

Example
Taylor series of $F(x)$ about $x^* = 0$, together with the successive Taylor series approximations (zeroth-, first-, second-order, and so on).

Plot of Approximations

Vector Case
$F(\mathbf{x}) = F(\mathbf{x}^*) + \frac{\partial F}{\partial x_1}\Big|_{\mathbf{x}=\mathbf{x}^*}(x_1 - x_1^*) + \frac{\partial F}{\partial x_2}\Big|_{\mathbf{x}=\mathbf{x}^*}(x_2 - x_2^*) + \cdots + \frac{\partial F}{\partial x_n}\Big|_{\mathbf{x}=\mathbf{x}^*}(x_n - x_n^*)$
$\qquad + \frac{1}{2}\,\frac{\partial^2 F}{\partial x_1^2}\Big|_{\mathbf{x}=\mathbf{x}^*}(x_1 - x_1^*)^2 + \frac{1}{2}\,\frac{\partial^2 F}{\partial x_1 \partial x_2}\Big|_{\mathbf{x}=\mathbf{x}^*}(x_1 - x_1^*)(x_2 - x_2^*) + \cdots$

Matrix Form
$F(\mathbf{x}) = F(\mathbf{x}^*) + \nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}(\mathbf{x} - \mathbf{x}^*) + \frac{1}{2}(\mathbf{x} - \mathbf{x}^*)^T\, \nabla^2 F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}\,(\mathbf{x} - \mathbf{x}^*) + \cdots$
Gradient: $\nabla F(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial F}{\partial x_1} & \dfrac{\partial F}{\partial x_2} & \cdots & \dfrac{\partial F}{\partial x_n} \end{bmatrix}^T$
Hessian: $\nabla^2 F(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial^2 F}{\partial x_1^2} & \dfrac{\partial^2 F}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 F}{\partial x_1 \partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2 F}{\partial x_n \partial x_1} & \dfrac{\partial^2 F}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 F}{\partial x_n^2} \end{bmatrix}$
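
As a rough companion to the matrix form, the Python sketch below estimates the gradient and Hessian by central finite differences and uses them in the second-order Taylor approximation. The test function, expansion point, step sizes, and perturbation are assumptions chosen only for illustration.

```python
import numpy as np

def gradient(F, x, h=1e-5):
    """Central-difference estimate of the gradient of F at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (F(x + e) - F(x - e)) / (2 * h)
    return g

def hessian(F, x, h=1e-4):
    """Central-difference estimate of the Hessian of F at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (F(x + ei + ej) - F(x + ei - ej)
                       - F(x - ei + ej) + F(x - ei - ej)) / (4 * h * h)
    return H

# Hypothetical test function and point, assumed for illustration only.
F = lambda x: x[0]**2 + 2*x[0]*x[1] + 2*x[1]**2
x_star = np.array([0.5, 0.0])
g, H = gradient(F, x_star), hessian(F, x_star)

# Second-order Taylor approximation around x_star (exact for a quadratic).
dx = np.array([0.1, -0.05])
approx = F(x_star) + g @ dx + 0.5 * dx @ H @ dx
print(approx, F(x_star + dx))
```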

Directional Derivatives
First derivative (slope) of F(x) along the $x_i$ axis: $\partial F(\mathbf{x})/\partial x_i$ (the $i$th element of the gradient).
Second derivative (curvature) of F(x) along the $x_i$ axis: $\partial^2 F(\mathbf{x})/\partial x_i^2$ (the $(i,i)$ element of the Hessian).
First derivative (slope) of F(x) along a vector $\mathbf{p}$: $\dfrac{\mathbf{p}^T \nabla F(\mathbf{x})}{\lVert\mathbf{p}\rVert}$
Second derivative (curvature) of F(x) along a vector $\mathbf{p}$: $\dfrac{\mathbf{p}^T \nabla^2 F(\mathbf{x})\,\mathbf{p}}{\lVert\mathbf{p}\rVert^2}$
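
A small numeric check of the two directional-derivative formulas. The symmetric Hessian A, the point x, and the direction p below are assumed values; for the quadratic $F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}$ the gradient is $\mathbf{A}\mathbf{x}$ and the Hessian is $\mathbf{A}$.

```python
import numpy as np

# Assumed symmetric Hessian, point, and direction (illustration only).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # Hessian of F(x) = 0.5 x^T A x
x = np.array([0.5, 0.0])
p = np.array([1.0, -1.0])

grad = A @ x                          # gradient of the quadratic at x
slope = p @ grad / np.linalg.norm(p)  # first directional derivative along p
curv = p @ A @ p / (p @ p)            # second directional derivative along p
print("slope along p:", slope)
print("curvature along p:", curv)
```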

Example
Slope of F(x) at a point $\mathbf{x}^*$ along a direction $\mathbf{p}$: evaluate $\dfrac{\mathbf{p}^T \nabla F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}}{\lVert\mathbf{p}\rVert}$.

Plots
Directional derivatives of F(x), plotted over the $(x_1, x_2)$ plane.

Minima
Strong Minimum: The point $\mathbf{x}^*$ is a strong minimum of F(x) if a scalar $\delta > 0$ exists such that $F(\mathbf{x}^*) < F(\mathbf{x}^* + \Delta\mathbf{x})$ for all $\Delta\mathbf{x}$ with $\delta > \lVert\Delta\mathbf{x}\rVert > 0$.
Global Minimum: The point $\mathbf{x}^*$ is a unique global minimum of F(x) if $F(\mathbf{x}^*) < F(\mathbf{x}^* + \Delta\mathbf{x})$ for all $\Delta\mathbf{x} \neq \mathbf{0}$.
Weak Minimum: The point $\mathbf{x}^*$ is a weak minimum of F(x) if it is not a strong minimum, and a scalar $\delta > 0$ exists such that $F(\mathbf{x}^*) \leq F(\mathbf{x}^* + \Delta\mathbf{x})$ for all $\Delta\mathbf{x}$ with $\delta > \lVert\Delta\mathbf{x}\rVert > 0$.
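
As a quick sanity check of the strong-minimum definition (illustrative only, with an assumed F and candidate point x*), one can sample small nonzero perturbations and verify that $F(\mathbf{x}^*) < F(\mathbf{x}^* + \Delta\mathbf{x})$ for all of them:

```python
import numpy as np

# Assumed example function and candidate minimum (not from the slides).
F = lambda x: x[0]**2 + x[1]**2
x_star = np.array([0.0, 0.0])

rng = np.random.default_rng(1)
delta = 1e-2
# Sampling is not a proof, only a spot check of the definition.
ok = all(F(x_star) < F(x_star + dx)
         for dx in delta * rng.normal(size=(1000, 2)))
print(ok)
```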

Scalar Example
Plot of a one-variable function with a strong maximum, a strong minimum, and the global minimum marked.

Vector Example

First-Order Optimality Condition
$F(\mathbf{x}^* + \Delta\mathbf{x}) = F(\mathbf{x}^*) + \nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} + \frac{1}{2}\Delta\mathbf{x}^T\, \nabla^2 F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} + \cdots$
For small $\Delta\mathbf{x}$: $F(\mathbf{x}^* + \Delta\mathbf{x}) \approx F(\mathbf{x}^*) + \nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x}$.
If $\mathbf{x}^*$ is a minimum, this implies $\nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} \geq 0$.
If $\nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} > 0$, then $F(\mathbf{x}^* - \Delta\mathbf{x}) \approx F(\mathbf{x}^*) - \nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} < F(\mathbf{x}^*)$. But this would imply that $\mathbf{x}^*$ is not a minimum. Therefore $\nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} = 0$.
Since this must be true for every $\Delta\mathbf{x}$: $\nabla F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*} = \mathbf{0}$.
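
The argument above can be made concrete numerically: wherever the gradient is nonzero, a small step opposite the gradient lowers F, so that point cannot be a minimum. The function and point in this sketch are assumptions for illustration.

```python
import numpy as np

# Assumed function and a point where the gradient is nonzero.
F = lambda x: x[0]**2 + x[1]**2
x = np.array([1.0, -0.5])

grad = 2 * x              # analytic gradient of this particular F
dx = -1e-3 * grad         # small step opposite the gradient
print(F(x + dx) < F(x))   # True: x cannot be a minimum while the gradient is nonzero
```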

Second-Order Condition
If the first-order condition is satisfied (zero gradient), then $F(\mathbf{x}^* + \Delta\mathbf{x}) = F(\mathbf{x}^*) + \frac{1}{2}\Delta\mathbf{x}^T\, \nabla^2 F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} + \cdots$
A strong minimum will exist at $\mathbf{x}^*$ if $\Delta\mathbf{x}^T\, \nabla^2 F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} > 0$ for any $\Delta\mathbf{x} \neq \mathbf{0}$. Therefore the Hessian matrix must be positive definite. A matrix $\mathbf{A}$ is positive definite if $\mathbf{z}^T\mathbf{A}\mathbf{z} > 0$ for any $\mathbf{z} \neq \mathbf{0}$. This is a sufficient condition for optimality.
A necessary condition is that the Hessian matrix be positive semidefinite. A matrix $\mathbf{A}$ is positive semidefinite if $\mathbf{z}^T\mathbf{A}\mathbf{z} \geq 0$ for any $\mathbf{z}$.
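
In practice the definiteness conditions are often checked by inspecting the eigenvalues of the (symmetric) Hessian. A minimal sketch, with example matrices chosen only for illustration:

```python
import numpy as np

def definiteness(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    eigvals = np.linalg.eigvalsh(A)   # eigvalsh assumes a symmetric/Hermitian input
    if np.all(eigvals > tol):
        return "positive definite"
    if np.all(eigvals >= -tol):
        return "positive semidefinite"
    return "indefinite or negative (semi)definite"

# Assumed example Hessians (illustration only).
print(definiteness(np.array([[2.0, 1.0], [1.0, 2.0]])))   # positive definite
print(definiteness(np.array([[1.0, 1.0], [1.0, 1.0]])))   # positive semidefinite
```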

Example
For a quadratic F(x), the gradient is linear in $\mathbf{x}$ and the Hessian is a constant matrix (not a function of $\mathbf{x}$ in this case). To test the definiteness, check the eigenvalues of the Hessian. If the eigenvalues are all greater than zero, the Hessian is positive definite. Both eigenvalues in this example are positive, therefore the stationary point is a strong minimum.

Quadratic Functions
$F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x} + \mathbf{d}^T\mathbf{x} + c$ (symmetric $\mathbf{A}$)
Useful properties of gradients: $\nabla(\mathbf{h}^T\mathbf{x}) = \nabla(\mathbf{x}^T\mathbf{h}) = \mathbf{h}$ and $\nabla(\mathbf{x}^T\mathbf{Q}\mathbf{x}) = \mathbf{Q}\mathbf{x} + \mathbf{Q}^T\mathbf{x} = 2\mathbf{Q}\mathbf{x}$ (for symmetric $\mathbf{Q}$).
Gradient of the quadratic function: $\nabla F(\mathbf{x}) = \mathbf{A}\mathbf{x} + \mathbf{d}$
Hessian of the quadratic function: $\nabla^2 F(\mathbf{x}) = \mathbf{A}$
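
The gradient formula for the quadratic can be verified numerically. In this sketch, A, d, c, and the evaluation point are assumed values used only to check that a finite-difference gradient matches $\mathbf{A}\mathbf{x} + \mathbf{d}$:

```python
import numpy as np

# Assumed coefficients of an illustrative quadratic F(x) = 0.5 x^T A x + d^T x + c.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])            # symmetric
d = np.array([1.0, -1.0])
c = 0.5
F = lambda x: 0.5 * x @ A @ x + d @ x + c

x = np.array([0.2, -0.7])
h = 1e-6
# Central-difference gradient, one coordinate at a time.
num_grad = np.array([(F(x + h * e) - F(x - h * e)) / (2 * h) for e in np.eye(2)])
print(num_grad)       # numerical gradient
print(A @ x + d)      # analytic gradient A x + d (matches)
```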

Eigensystem of the Hessian
Consider a quadratic function that has a stationary point at the origin and whose value there is zero: $F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}$.
Perform a similarity transform on the Hessian matrix, using the eigenvectors as the new basis vectors. Since the Hessian matrix is symmetric, its eigenvectors are orthogonal. With $\mathbf{B} = [\mathbf{z}_1\ \mathbf{z}_2\ \cdots\ \mathbf{z}_n]$,
$\mathbf{A}' = \mathbf{B}^T\mathbf{A}\mathbf{B} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = \Lambda$
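
The orthogonality of the eigenvectors and the diagonalization $\mathbf{B}^T\mathbf{A}\mathbf{B} = \Lambda$ can be checked directly with NumPy; the symmetric matrix below is an assumed example:

```python
import numpy as np

# Assumed symmetric Hessian (illustration only).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, B = np.linalg.eigh(A)   # columns of B are orthonormal eigenvectors
print(B.T @ B)               # identity matrix: the eigenvectors are orthogonal
print(B.T @ A @ B)           # diagonal matrix Lambda of eigenvalues
print(lam)
```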

Second Directional Derivative
$\dfrac{\mathbf{p}^T \nabla^2 F(\mathbf{x})\,\mathbf{p}}{\lVert\mathbf{p}\rVert^2} = \dfrac{\mathbf{p}^T\mathbf{A}\mathbf{p}}{\lVert\mathbf{p}\rVert^2}$
Represent $\mathbf{p}$ with respect to the eigenvectors (new basis): $\mathbf{p} = \mathbf{B}\mathbf{c}$.
$\dfrac{\mathbf{p}^T\mathbf{A}\mathbf{p}}{\lVert\mathbf{p}\rVert^2} = \dfrac{\mathbf{c}^T\mathbf{B}^T\mathbf{A}\mathbf{B}\mathbf{c}}{\mathbf{c}^T\mathbf{B}^T\mathbf{B}\mathbf{c}} = \dfrac{\mathbf{c}^T\Lambda\mathbf{c}}{\mathbf{c}^T\mathbf{c}} = \dfrac{\sum_{i=1}^{n}\lambda_i c_i^2}{\sum_{i=1}^{n} c_i^2}$
$\lambda_{\min} \leq \dfrac{\mathbf{p}^T\mathbf{A}\mathbf{p}}{\lVert\mathbf{p}\rVert^2} \leq \lambda_{\max}$
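
A quick empirical check, with an assumed Hessian and random directions, that the curvature along any direction lies between the smallest and largest eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # assumed symmetric Hessian
lam = np.linalg.eigvalsh(A)         # eigenvalues in ascending order

# The curvature along any direction p is bounded by the extreme eigenvalues.
for _ in range(5):
    p = rng.normal(size=2)
    curv = p @ A @ p / (p @ p)
    print(lam[0] <= curv <= lam[-1], round(curv, 3))
```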

Eigenvector (Largest Eigenvalue)
Choose $\mathbf{p} = \mathbf{z}_{\max}$. Then $\mathbf{c} = \mathbf{B}^T\mathbf{p} = \mathbf{B}^T\mathbf{z}_{\max} = \begin{bmatrix} 0 & 0 & \cdots & 1 & \cdots & 0 \end{bmatrix}^T$ (only the entry corresponding to $\mathbf{z}_{\max}$ is nonzero), so
$\dfrac{\mathbf{z}_{\max}^T\mathbf{A}\mathbf{z}_{\max}}{\lVert\mathbf{z}_{\max}\rVert^2} = \dfrac{\sum_{i=1}^{n}\lambda_i c_i^2}{\sum_{i=1}^{n} c_i^2} = \lambda_{\max}$
The eigenvalues represent curvature (second derivatives) along the eigenvectors (the principal axes).
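
Consistent with the statement above, the curvature along the eigenvector of the largest eigenvalue equals that eigenvalue (assumed example matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # assumed symmetric Hessian
lam, B = np.linalg.eigh(A)
z_max = B[:, -1]                        # eigenvector of the largest eigenvalue
print(z_max @ A @ z_max / (z_max @ z_max), lam[-1])   # equal: curvature = lambda_max
```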

Circular Hollow
(Any two independent vectors in the plane would work.)

Elliptical Hollow

Elongated Saddle
$F(\mathbf{x}) = -\frac{1}{4}x_1^2 - \frac{3}{2}x_1 x_2 - \frac{1}{4}x_2^2 = \frac{1}{2}\mathbf{x}^T \begin{bmatrix} -0.5 & -1.5 \\ -1.5 & -0.5 \end{bmatrix} \mathbf{x}$

Stationary Valley
$F(\mathbf{x}) = \frac{1}{2}x_1^2 - x_1 x_2 + \frac{1}{2}x_2^2 = \frac{1}{2}\mathbf{x}^T \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \mathbf{x}$

Quadratic Function Summary
If the eigenvalues of the Hessian matrix are all positive, the function will have a single strong minimum.
If the eigenvalues are all negative, the function will have a single strong maximum.
If some eigenvalues are positive and other eigenvalues are negative, the function will have a single saddle point.
If the eigenvalues are all nonnegative, but some eigenvalues are zero, then the function will either have a weak minimum or will have no stationary point.
If the eigenvalues are all nonpositive, but some eigenvalues are zero, then the function will either have a weak maximum or will have no stationary point.
Stationary point: $\nabla F(\mathbf{x}) = \mathbf{A}\mathbf{x} + \mathbf{d} = \mathbf{0} \;\Rightarrow\; \mathbf{x}^* = -\mathbf{A}^{-1}\mathbf{d}$ (when $\mathbf{A}$ is invertible).
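
This summary can be wrapped into a small classification routine. The sketch below is an illustration (not the slides' code): it inspects the Hessian eigenvalues of $F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x} + \mathbf{d}^T\mathbf{x} + c$ and returns the stationary point $\mathbf{x}^* = -\mathbf{A}^{-1}\mathbf{d}$ when $\mathbf{A}$ is invertible; the example A and d are assumed.

```python
import numpy as np

def classify_quadratic(A, d, tol=1e-10):
    """Classify F(x) = 0.5 x^T A x + d^T x + c by the Hessian eigenvalues
    and return the stationary point x* = -A^{-1} d when A is invertible."""
    lam = np.linalg.eigvalsh(A)
    if np.all(lam > tol):
        kind = "single strong minimum"
    elif np.all(lam < -tol):
        kind = "single strong maximum"
    elif np.any(lam > tol) and np.any(lam < -tol):
        kind = "saddle point"
    elif np.all(lam > -tol):
        kind = "weak minimum or no stationary point"
    else:
        kind = "weak maximum or no stationary point"
    x_star = -np.linalg.solve(A, d) if abs(np.linalg.det(A)) > tol else None
    return kind, x_star

# Assumed example (illustration only).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
d = np.array([1.0, -1.0])
print(classify_quadratic(A, d))
```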