Linear & Nonlinear Programming -- Basic Properties of Solutions and Algorithms.


1 Linear & Nonlinear Programming -- Basic Properties of Solutions and Algorithms

2 Outline  First-order Necessary Condition  Examples of Unconstrained Problems  Second-order Conditions  Convex and Concave Functions  Minimization and Maximization of Convex Functions  Global Convergence of Descent Algorithms  Speed of Convergence

3 Introduction  Consider optimization problems of the form: minimize f(x) subject to x ∈ Ω, where f is a real-valued function and Ω, the feasible set, is a subset of E^n.

4 Weierstrass Theorem  If f is continuous and Ω is compact, a solution point of the minimization problem exists.

5 Two Kinds of Solution Points  Definition of a relative minimum point –A point x* ∈ Ω is said to be a relative minimum point of f over Ω if there is an ε > 0 such that f(x) ≥ f(x*) for all x ∈ Ω within a distance ε of x*. If f(x) > f(x*) for all x ∈ Ω, x ≠ x*, within a distance ε of x*, then x* is said to be a strict relative minimum point of f over Ω.  Definition of a global minimum point –A point x* ∈ Ω is said to be a global minimum point of f over Ω if f(x) ≥ f(x*) for all x ∈ Ω. If f(x) > f(x*) for all x ∈ Ω, x ≠ x*, then x* is said to be a strict global minimum point of f over Ω.

6 Two Kinds of Solution Points (cont’d)  We can find relative minimum points by using differential calculus or a convergent stepwise procedure.  Global conditions and global solutions can, as a rule, only be found if the problem possesses certain convexity properties that essentially guarantee that any relative minimum is a global minimum.

7 Feasible Directions  To derive necessary conditions satisfied by a relative minimum point x*, the basic idea is to consider movement away from the point in some given direction.  A vector d is a feasible direction at x if there is an ᾱ > 0 such that x + αd ∈ Ω for all α, 0 ≤ α ≤ ᾱ.

8 Feasible Directions (cont’d)  Proposition 1 (first-order necessary conditions) –Let Ω be a subset of E^n and let f ∈ C¹ be a function on Ω. If x* is a relative minimum point of f over Ω, then for any d ∈ E^n that is a feasible direction at x*, we have ∇f(x*)d ≥ 0. Proof:

9 Feasible Directions (cont’d)  Corollary (unconstrained case) –Let Ω be a subset of E^n and let f ∈ C¹ be a function on Ω. If x* is a relative minimum point of f over Ω and x* is an interior point of Ω, then ∇f(x*) = 0. Since in this case d can be any direction from x*, we have both ∇f(x*)d ≥ 0 and ∇f(x*)(−d) ≥ 0 for all d. This implies ∇f(x*) = 0.

10 Example 1 about Feasible Directions  Example 1 (unconstrained) minimize f(x1, x2) = x1² − x1x2 + x2² − 3x2. There are no constraints, so Ω = E^n. Setting the partial derivatives to zero gives 2x1 − x2 = 0 and −x1 + 2x2 − 3 = 0. These have the unique solution x1 = 1, x2 = 2, so it is a global minimum of f.
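The stationarity condition can be verified numerically. The objective below is an assumed reconstruction (the slide's own formula did not survive transcription), chosen so that its unique stationary point is (1, 2):

```python
# Central-difference gradient check at the claimed minimizer (1, 2).
# f is an assumed reconstruction: f(x1, x2) = x1^2 - x1*x2 + x2^2 - 3*x2.

def f(x1, x2):
    return x1**2 - x1*x2 + x2**2 - 3*x2

def grad(x1, x2, h=1e-6):
    """Numerical gradient of f via central differences."""
    g1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2*h)
    g2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2*h)
    return (g1, g2)

print(grad(1.0, 2.0))  # both components near zero
```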

11 Example 2 about Feasible Directions  Example 2 (a constrained case) minimize f(x1, x2) = x1² − x1 + x2 + x1x2 subject to x1 ≥ 0, x2 ≥ 0. Since we know that there is a global minimum at x1 = 1/2, x2 = 0, we have ∇f(x*) = (0, 3/2), and the condition ∇f(x*)d = (3/2)d2 ≥ 0 holds for every feasible direction d, since every feasible direction at x* has d2 ≥ 0.
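The first-order condition at the constrained minimizer can be checked the same way. The objective f(x1, x2) = x1² − x1 + x2 + x1x2 is an assumed reconstruction consistent with the stated minimum at (1/2, 0):

```python
# First-order necessary condition at x* = (1/2, 0) for an assumed
# reconstruction of the constrained example:
# f(x1, x2) = x1^2 - x1 + x2 + x1*x2, subject to x1 >= 0, x2 >= 0.

def grad(x1, x2):
    # analytic gradient of the assumed f
    return (2*x1 - 1 + x2, 1 + x1)

g = grad(0.5, 0.0)  # (0.0, 1.5)

# At (1/2, 0) only the constraint x2 >= 0 is active, so a direction
# d = (d1, d2) is feasible iff d2 >= 0.
feasible = [(1, 0), (-1, 0), (0, 1), (2, 3), (-5, 4)]
values = [g[0]*d1 + g[1]*d2 for d1, d2 in feasible]
print(values)  # every directional derivative is nonnegative
```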

12 Example 2 (cont’d) [Figure: the feasible region x1 ≥ 0, x2 ≥ 0 in the (x1, x2) plane, with the minimum point (1/2, 0) and a feasible direction d = (d1, d2) drawn at that point.]

13 Example 3 of Unconstrained Problems  The problem faced by an electric utility when selecting its power-generating facilities. Its power-generating requirements are summarized by a curve, h(x), as shown in Fig. 6.2(a), which shows the total hours in a year that a power level of at least x is required, for each x. For convenience the curve is normalized so that the upper limit is unity. The power company may meet these requirements by installing generating equipment, such as (1) nuclear or (2) coal-fired plants, or by purchasing power from the central energy grid.

14 Example 3 (cont’d)  Associated with type i (i = 1, 2) of generating equipment is a yearly unit capital cost bi and a unit operating cost ci. The unit price of power purchased from the grid is c3. The requirements are satisfied as shown in Fig. 6.2(b), where x1 and x2 denote the capacities of the nuclear and coal-fired plants, respectively.

15 Example 3 (cont’d) [Fig. 6.2 Power requirement curve: (a) hours required h(x) versus power (megawatts), normalized to an upper limit of 1; (b) the same curve with the requirement split into nuclear (capacity x1), coal (capacity x2), and purchased power.]

16 Example 3 (cont’d)  The total cost is c(x1, x2) = b1x1 + b2x2 + c1∫[0, x1] h(x)dx + c2∫[x1, x1+x2] h(x)dx + c3∫[x1+x2, 1] h(x)dx,  and the company wishes to minimize it over the set defined by x1 ≥ 0, x2 ≥ 0, x1 + x2 ≤ 1.

17 Example 3 (cont’d)  Assuming that the solution is interior to the constraints, by setting the partial derivatives equal to zero we obtain the two equations which represent the necessary conditions: (1) b1 + (c1 − c2)h(x1) + (c2 − c3)h(x1 + x2) = 0, (2) b2 + (c2 − c3)h(x1 + x2) = 0.  In addition, if x1 = 0, then equality (1) relaxes to ≥ 0; if x2 = 0, then equality (2) relaxes to ≥ 0.
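A small numerical sketch can make the interior solution concrete. Everything here is invented for illustration: the requirement curve h(x) = 1 − x and the cost data b1 = 0.4, b2 = 0.1, c1 = 0.2, c2 = 0.6, c3 = 1.0 are assumptions, and the total cost is taken to be capital cost plus operating cost integrated under h:

```python
# Brute-force minimization of an assumed total cost for Example 3.
# Assumed data (not from the slides): h(x) = 1 - x, b1=0.4, b2=0.1,
# c1=0.2, c2=0.6, c3=1.0.

b1, b2 = 0.4, 0.1
c1, c2, c3 = 0.2, 0.6, 1.0

def seg(lo, hi):
    """Integral of h(x) = 1 - x from lo to hi (closed form)."""
    return (hi - lo) - (hi*hi - lo*lo) / 2

def cost(x1, x2):
    return (b1*x1 + b2*x2
            + c1*seg(0, x1)            # nuclear operating cost
            + c2*seg(x1, x1 + x2)      # coal operating cost
            + c3*seg(x1 + x2, 1))      # purchased-power cost

# Grid search over the feasible set x1 >= 0, x2 >= 0, x1 + x2 <= 1.
n = 200
best = min((cost(i/n, j/n), i/n, j/n)
           for i in range(n + 1) for j in range(n + 1 - i))
print(best[1], best[2])  # interior minimizer for this assumed data
```

For this data the grid minimizer lands at the interior stationary point, where conditions (1) and (2) hold with equality.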

18 Second-order Conditions  Proposition 1 (second-order necessary conditions) –Let Ω be a subset of E^n and let f ∈ C² be a function on Ω. If x* is a relative minimum point of f over Ω, then for any d which is a feasible direction at x* we have i) ∇f(x*)d ≥ 0; ii) if ∇f(x*)d = 0, then dᵀ∇²f(x*)d ≥ 0. Proof: The first condition is just Proposition 1 of Section 6.1, and the second condition applies only if ∇f(x*)d = 0.

19 Proposition 1 (cont’d)  Proof (cont’d):

20 Example 1 about Proposition 1  For the same problem as Example 2 of Section 6.1, we have ∇f(x*)d = (3/2)d2 for d = (d1, d2).  Thus condition (ii) of Proposition 1 applies only if d2 = 0. In that case we have dᵀ∇²f(x*)d = 2d1² ≥ 0, so condition (ii) is satisfied.

21 Proposition 2  Proposition 2 (unconstrained case) –Let x* be an interior point of the set Ω, and suppose x* is a relative minimum point over Ω of the function f ∈ C². Then i) ∇f(x*) = 0; ii) dᵀ∇²f(x*)d ≥ 0 for all d. –This means that F(x*), the simplified notation for ∇²f(x*), is positive semidefinite.

22 Example 2 about Proposition 2  Consider the problem: minimize f(x1, x2) = x1³ − x1²x2 + 2x2² subject to x1 ≥ 0, x2 ≥ 0. –If we assume the solution is in the interior of the feasible set, that is, if x1 > 0, x2 > 0, then the first-order necessary conditions are 3x1² − 2x1x2 = 0 and −x1² + 4x2 = 0.

23 Example 2 (cont’d)  One boundary solution is x1 = x2 = 0.  Another solution is at x1 = 6, x2 = 9. –If we fix x1 at x1 = 6, the objective attains a relative minimum with respect to x2 at x2 = 9. –Conversely, with x2 fixed at x2 = 9, the objective attains a relative minimum w.r.t. x1 at x1 = 6. –Despite this fact, the point x1 = 6, x2 = 9 is not a relative minimum point, because the Hessian matrix F(x*) at x1 = 6, x2 = 9 is not positive semidefinite, since its determinant is negative.
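The Hessian test can be reproduced directly. The objective f(x1, x2) = x1³ − x1²x2 + 2x2² is an assumed reconstruction consistent with the stationary points (0, 0) and (6, 9):

```python
# Definiteness of the Hessian at (6, 9) for the assumed objective
# f(x1, x2) = x1^3 - x1^2*x2 + 2*x2^2.

def hessian(x1, x2):
    # analytic Hessian of the assumed f
    return [[6*x1 - 2*x2, -2*x1],
            [-2*x1,        4.0]]

F = hessian(6.0, 9.0)                    # [[18, -12], [-12, 4]]
det = F[0][0]*F[1][1] - F[0][1]*F[1][0]  # -72
# For a symmetric 2x2 matrix, a negative determinant means one positive
# and one negative eigenvalue, so F is indefinite (not positive semidefinite).
print(det)
```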

24 Sufficient Conditions for a Relative Minimum  We give here conditions that apply only to unconstrained problems, or to problems where the minimum point is interior to the feasible set, since the corresponding conditions for problems where the minimum is achieved on a boundary point of the feasible set are a good deal more difficult and of marginal practical or theoretical value.  A more general result, applicable to problems with functional constraints, is given in Chapter 10.

25 Proposition 3  Proposition 3 (second-order sufficient conditions, unconstrained case) –Let f ∈ C² be a function defined on a region in which the point x* is an interior point. Suppose in addition that i) ∇f(x*) = 0; ii) F(x*) is positive definite. Then x* is a strict relative minimum point of f. Proof: Since F(x*) is positive definite, there is an a > 0 such that dᵀF(x*)d ≥ a|d|² for all d. The result then follows by Taylor’s Theorem.

26 Convex Functions  Definition –A function f defined on a convex set Ω is said to be convex if, for every x1, x2 ∈ Ω and every α, 0 ≤ α ≤ 1, there holds f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2). If, for every α, 0 < α < 1, and x1 ≠ x2, there holds f(αx1 + (1 − α)x2) < αf(x1) + (1 − α)f(x2), then f is said to be strictly convex.
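The defining inequality is easy to probe numerically; a minimal sketch with a sample convex function (f(x) = x², chosen here purely for illustration):

```python
# Sampling test of the convexity inequality
# f(a*x1 + (1-a)*x2) <= a*f(x1) + (1-a)*f(x2).
import random

random.seed(0)

def f(x):
    return x * x  # a convex example function

def convexity_violations(f, trials=1000, lo=-10.0, hi=10.0):
    """Count sampled violations of the convexity inequality (with a
    small tolerance for floating-point rounding)."""
    bad = 0
    for _ in range(trials):
        x1, x2 = random.uniform(lo, hi), random.uniform(lo, hi)
        a = random.random()
        if f(a*x1 + (1-a)*x2) > a*f(x1) + (1-a)*f(x2) + 1e-9:
            bad += 1
    return bad

print(convexity_violations(f))  # 0 for a convex function
```

A sampling check like this can only refute convexity, never prove it, but it catches non-convex candidates quickly.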

27 Concave Functions  Definition –A function g defined on a convex set Ω is said to be concave if the function f = −g is convex. The function g is strictly concave if −g is strictly convex.

28 Graph of a Strictly Convex Function [Figure: a strictly convex f plotted between x1 and x2; the chord joining (x1, f(x1)) and (x2, f(x2)) lies strictly above the graph at αx1 + (1 − α)x2.]

29 Graph of a Convex Function [Figure: a convex f between x1 and x2; the chord lies on or above the graph at αx1 + (1 − α)x2.]

30 Graph of a Concave Function [Figure: a concave f between x1 and x2; the chord lies on or below the graph at αx1 + (1 − α)x2.]

31 Graph of a Function Neither Convex nor Concave [Figure: a function f that is neither convex nor concave over the interval from x1 to x2.]

32 Combinations of Convex Functions  Proposition 1 –Let f1 and f2 be convex functions on the convex set Ω. Then f1 + f2 is convex on Ω.  Proposition 2 –Let f be a convex function over the convex set Ω. Then af is convex for any a ≥ 0.

33 Combinations (cont’d)  From the above two propositions it follows that a positive combination a1f1 + a2f2 + … + amfm of convex functions is again convex.

34 Convex Inequality Constraints  Proposition 3 –Let f be a convex function on a convex set Ω. The set Γc = {x ∈ Ω : f(x) ≤ c} is convex for every real number c.

35 Proof –Let x1, x2 ∈ Γc; then f(x1) ≤ c, f(x2) ≤ c, and for 0 < α < 1, f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2) ≤ c. Thus αx1 + (1 − α)x2 ∈ Γc.

36 Properties of Differentiable Convex Functions  Proposition 4 –Let f ∈ C¹. Then f is convex over a convex set Ω if and only if f(y) ≥ f(x) + ∇f(x)(y − x) for all x, y ∈ Ω.
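A numerical illustration of this first-order inequality, using exp as the convex sample function (an illustrative choice, not from the slides):

```python
# Checking f(y) >= f(x) + f'(x)*(y - x) for f = exp on random pairs.
# exp is convenient here because it is convex and is its own derivative.
import math
import random

random.seed(0)
f = math.exp  # f' = f for exp

ok = all(
    f(y) >= f(x) + f(x) * (y - x) - 1e-9  # small tolerance for rounding
    for x, y in ((random.uniform(-3, 3), random.uniform(-3, 3))
                 for _ in range(1000))
)
print(ok)  # True: the tangent line underestimates the function
```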

37 Recall  The original definition essentially states that linear interpolation between two points overestimates the function, while Proposition 4 states that a linear approximation based on the local derivative underestimates the function.

38 Recall (cont’d) [Figure: a convex f between x1 and x2; the chord (linear interpolation) lies above the graph at αx1 + (1 − α)x2.]

39 Recall (cont’d) [Figure: the tangent line to f at x lies below the graph, so its value at y underestimates f(y).]

40 Twice Continuously Differentiable Functions  Proposition 5 –Let f ∈ C². Then f is convex over a convex set Ω containing an interior point if and only if the Hessian matrix F of f is positive semidefinite throughout Ω.

41 Proof  By Taylor’s theorem we have f(y) = f(x) + ∇f(x)(y − x) + ½(y − x)ᵀF(x + α(y − x))(y − x) for some α, 0 ≤ α ≤ 1. If the Hessian is everywhere positive semidefinite, the last term is nonnegative, so f(y) ≥ f(x) + ∇f(x)(y − x), and f is convex by Proposition 4.

42 Minimization and Maximization of Convex Functions  Theorem 1 –Let f be a convex function defined on the convex set Ω. Then the set Γ where f achieves its minimum is convex, and any relative minimum of f is a global minimum.  Proof (by contradiction)

43 Minimization and Maximization of Convex Functions (cont’d)  Theorem 2 –Let f ∈ C¹ be convex on the convex set Ω. If there is a point x* ∈ Ω such that, for all y ∈ Ω, ∇f(x*)(y − x*) ≥ 0, then x* is a global minimum point of f over Ω.  Proof

44 Minimization and Maximization of Convex Functions (cont’d)  Theorem 3 –Let f be a convex function defined on the bounded, closed convex set Ω. If f has a maximum over Ω, then it is achieved at an extreme point of Ω.

45 Global Convergence of Descent Algorithms  A good portion of the remainder of this book is devoted to the presentation and analysis of various algorithms designed to solve nonlinear programming problems. They have the common heritage of all being iterative descent algorithms.  Iterative –The algorithm generates a sequence of points, each point being calculated on the basis of the points preceding it.  Descent –As each new point is generated by the algorithm, the corresponding value of some function (evaluated at the most recent point) decreases in value.

46 Global Convergence of Descent Algorithms (cont’d)  Globally convergent –If for arbitrary starting points the algorithm is guaranteed to generate a sequence of points converging to a solution, then the algorithm is said to be globally convergent.
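A minimal sketch of a globally convergent descent iteration: fixed-step steepest descent on a convex quadratic reaches the same solution from arbitrary starting points. The quadratic and the step size are illustrative choices, not from the slides.

```python
# Steepest descent with a fixed step on a convex quadratic.
# f(x1, x2) = (x1 - 1)^2 + 2*(x2 + 3)^2 has its unique minimum at (1, -3);
# each coordinate contracts toward the solution at every iteration.

def grad(x1, x2):
    return (2*(x1 - 1), 4*(x2 + 3))

def descend(x1, x2, step=0.1, iters=200):
    for _ in range(iters):
        g1, g2 = grad(x1, x2)
        x1, x2 = x1 - step*g1, x2 - step*g2
    return x1, x2

for start in [(100, 100), (-50, 7), (0, 0)]:
    print(descend(*start))  # all runs end near (1, -3)
```

The fixed step works here because the quadratic's curvature is known; practical methods choose the step by a line search instead.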

47 Algorithm and Algorithmic Map  We formally define an algorithm A as a mapping taking points in a space X into other points in X; the generated sequence {xk} is then defined by xk+1 = A(xk). –With this intuitive idea of an algorithm in mind, we now generalize the concept somewhat so as to provide greater flexibility in our analysis.  Definition –An algorithm A is a mapping defined on a space X that assigns to every point x ∈ X a subset of X.

48 Mappings  Given xk, the algorithm yields A(xk), which is a subset of X. From this subset an arbitrary element xk+1 is selected. In this way, given an initial point x0, the algorithm generates sequences through the iteration xk+1 ∈ A(xk).  The most important aspect of the definition is that the mapping A is a point-to-set mapping of X.

49 Example 1  Suppose for x on the real line we define A(x) = [−|x|/2, |x|/2], so that A(x) is an interval of the real line. Starting at x0 = 100, each of the sequences below might be generated from iterative application of this algorithm. 100, 50, 25, 12, −6, −2, 1, 1/2,… 100, −40, 20, −5, −2, 1, 1/4, 1/8,… 100, 10, 1/16, 1/100, −1/1000,…
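The point-to-set iteration is easy to simulate. The map A(x) = [−|x|/2, |x|/2] used here is a reconstruction inferred from the sample sequences (each satisfies |xk+1| ≤ |xk|/2); the arbitrary selection from A(xk) is modeled by a uniform draw:

```python
# Simulating x_{k+1} ∈ A(x_k) with A(x) = [-|x|/2, |x|/2] (assumed map).
import random

random.seed(0)

def A(x):
    return (-abs(x) / 2, abs(x) / 2)

def run(x0, steps=50):
    xs = [x0]
    for _ in range(steps):
        lo, hi = A(xs[-1])
        xs.append(random.uniform(lo, hi))  # arbitrary element of A(x_k)
    return xs

seq = run(100.0)
print(seq[-1])  # |x_50| <= 100 / 2**50: essentially zero
```

Whatever selections are made, the magnitude at least halves each step, so every generated sequence converges to 0.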

50 Descent  Definition –Let Γ ⊂ X be a given solution set and let A be an algorithm on X. A continuous real-valued function Z on X is said to be a descent function for Γ and A if it satisfies i) if x ∉ Γ and y ∈ A(x), then Z(y) < Z(x); ii) if x ∈ Γ and y ∈ A(x), then Z(y) ≤ Z(x).

51 Closed Mapping  Definition –A point-to-set mapping A from X to Y is said to be closed at x ∈ X if the assumptions xk → x, yk ∈ A(xk), and yk → y imply y ∈ A(x). The point-to-set map A is said to be closed on X if it is closed at each point of X.

52 Closed Mappings in Different Spaces  Many complex algorithms are regarded as the composition of two or more simpler mappings.  Definition –Let A : X → Y and B : Y → Z be point-to-set mappings. The composite mapping C = BA is defined as the point-to-set mapping C(x) = ∪{B(y) : y ∈ A(x)}. The definition is illustrated in Fig. 6.6.

53 Fig. 6.6 [Figure: the composite mapping C = BA; A maps x ∈ X to the set A(x) ⊂ Y, B maps each y ∈ A(x) into Z, and C(x) ⊂ Z is the union of the sets B(y).]

54 Corollaries of Closed Mapping  Corollary 1  Corollary 2

55 Global Convergence Theorem  The Global Convergence Theorem is used to establish convergence for the following situation.  There is a solution set. Points are generated according to the algorithm, and each new point always strictly decreases a descent function Z unless the solution set is reached.  For example, in nonlinear programming, the solution set may be the set of minimum points (perhaps only one point), and the descent function may be the objective function itself.

56 Global Convergence Theorem (cont’d)  Let A be an algorithm on X, and suppose that, given x0, the sequence {xk} is generated satisfying xk+1 ∈ A(xk). Let a solution set Γ ⊂ X be given, and suppose i) all points xk are contained in a compact set S ⊂ X; ii) there is a continuous function Z on X that is a descent function for Γ and A; iii) the mapping A is closed at points outside Γ. Then the limit of any convergent subsequence of {xk} is a solution.

57 Global Convergence Theorem (cont’d)  Corollary –If, under the conditions of the Global Convergence Theorem, Γ consists of a single point x̄, then the sequence {xk} converges to x̄.

58 Examples to Illustrate the Conditions  Example 4 –On the real line consider a point-to-point algorithm A together with a solution set and a descent function. Condition (iii) does not hold, since A is not closed at x = 1; as a result, the limit of a convergent subsequence of {xk} need not be a solution.

59 Examples (cont’d)  Example 5 –On the real line X consider a solution set, the descent function Z(x) = e^(−x), and the algorithm A(x) = x + 1. All the conditions of the convergence theorem hold except (i), since the sequence generated from any starting point x0 diverges to infinity; the points are not contained in any compact set S.

60 6.7 Order of Convergence  Definition –Let the sequence {rk} converge to the limit r*. The order of convergence of {rk} is defined as the supremum of the nonnegative numbers p satisfying 0 ≤ lim sup(k→∞) |rk+1 − r*| / |rk − r*|^p < ∞. –Larger values of the order p imply faster convergence.

61 Order of Convergence (cont’d) –If the sequence has order p and the limit β = lim(k→∞) |rk+1 − r*| / |rk − r*|^p exists, then asymptotically we have |rk+1 − r*| ≈ β |rk − r*|^p.

62 Examples  Example 1 –The sequence with rk = a^k, where 0 < a < 1. Solution –It converges to zero with order unity, since |rk+1| / |rk| = a.

63 Examples (cont’d)  Example 2 –The sequence with rk = a^(2^k) for 0 < a < 1. Solution –It converges to zero with order two, since rk+1 / rk² = 1.
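Both examples can be checked numerically (a = 0.5 is an arbitrary choice in (0, 1)):

```python
# r_k = a^k has order one: the ratio r_{k+1} / r_k is the constant a.
# r_k = a^(2^k) has order two: the ratio r_{k+1} / r_k^2 is exactly 1.
a = 0.5

r_lin = [a**k for k in range(1, 10)]
ratios1 = [r_lin[k+1] / r_lin[k] for k in range(len(r_lin) - 1)]

r_quad = [a**(2**k) for k in range(1, 6)]
ratios2 = [r_quad[k+1] / r_quad[k]**2 for k in range(len(r_quad) - 1)]

print(ratios1)  # all equal to a = 0.5
print(ratios2)  # all equal to 1.0
```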

64 Linear Convergence  Most algorithms discussed in this book have an order of convergence equal to unity.  Definition –If the sequence {rk} converges to r* in such a way that lim(k→∞) |rk+1 − r*| / |rk − r*| = β < 1, the sequence is said to converge linearly to r* with convergence ratio β.

65 Linear Convergence (cont’d)  Linear convergence is sometimes referred to as geometric convergence, since a sequence with convergence ratio β can be said to have a tail that converges at least as fast as the geometric sequence cβ^k for some constant c.  The smaller the ratio, the faster the rate.  The limiting case where β = 0 is referred to as superlinear convergence.

66 Examples  Example 3 –The sequence with rk = 1/k. Solution –It converges to zero. The convergence is of order one, but it is not linear, since lim(k→∞) rk+1 / rk = 1; that is, β is not less than one.

67 Examples (cont’d)  Example 4 –The sequence with rk = (1/k)^k. Solution –The sequence is of order unity, since rk+1 / rk^p → ∞ for p > 1. However, lim(k→∞) rk+1 / rk = 0, and hence this is superlinear convergence.
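The contrast between Examples 3 and 4 shows up directly in the step-wise ratios:

```python
# Step-wise ratios r_{k+1} / r_k for the two sequences:
# 1/k has ratio k/(k+1) -> 1 (order one, but not linear),
# (1/k)^k has ratio -> 0 (superlinear).

def ratio(seq_fn, k):
    """Step-wise ratio r_{k+1} / r_k for a positive sequence."""
    return seq_fn(k + 1) / seq_fn(k)

harmonic = lambda k: 1.0 / k          # Example 3
superlin = lambda k: (1.0 / k) ** k   # Example 4

print([round(ratio(harmonic, k), 4) for k in (10, 100, 1000)])  # approaches 1
print([round(ratio(superlin, k), 6) for k in (5, 10, 15)])      # approaches 0
```

Note that (1/k)^k underflows to 0.0 in double precision for k beyond roughly 140, so the ratios are only sampled at small k.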

68 Average Rates  All the definitions given above can be referred to as step-wise concepts of convergence, since they define bounds on the progress made in a single step: from k to k + 1.  Another approach is to define concepts related to the average progress per step over a large number of steps.

69 Average Rates (cont’d)  Definition –Let the sequence {rk} converge to r*. The average order of convergence is the infimum of the numbers p > 1 such that lim(k→∞) |rk − r*|^(1/p^k) = 1. The order is infinity if the equality holds for no p > 1.

70 Examples  Example 5 –For the sequence rk = a^(2^k), 0 < a < 1. Solution –For p = 2 we have (rk)^(1/2^k) = a < 1; for p > 2 we have (rk)^(1/p^k) = a^((2/p)^k) → 1. Thus the average order is two.

71 Examples (cont’d)  Example 6 –For the sequence rk = a^k, with 0 < a < 1. Solution –For p > 1 we have (rk)^(1/p^k) = a^(k/p^k) → 1. Thus the average order is unity.

72 Convergence Ratio by the Average Method  The most important case is that of unity order.  We define the average convergence ratio as lim(k→∞) |rk − r*|^(1/k).  Thus for the geometric sequence rk = c·a^k, 0 < a < 1, the average convergence ratio is a.
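A quick check of the last claim (c = 5 and a = 0.5 are arbitrary choices):

```python
# Average convergence ratio |r_k|^(1/k) for the geometric sequence
# r_k = c * a^k: the constant c washes out in the limit, leaving a.
c, a = 5.0, 0.5

def avg_ratio(k):
    return (c * a**k) ** (1.0 / k)

print([round(avg_ratio(k), 4) for k in (10, 100, 1000)])  # tends to a = 0.5
```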

