
296.3 Page 1: Algorithms in the Real World
Linear and Integer Programming II
– Ellipsoid algorithm
– Interior point methods

296.3 Page 2: Ellipsoid Algorithm
First known-to-be-polynomial-time algorithm for linear programming (Khachiyan 1979). Solves: find x subject to Ax ≤ b, i.e., find a feasible solution.
Run time: O(n^4 L), where L = number of bits needed to represent A and b.
Problem in practice: it essentially always takes this much time.

296.3 Page 3: Reduction from the General Case
To solve: maximize z = c^T x subject to Ax ≤ b, x ≥ 0.
Convert to: find x, y subject to
  Ax ≤ b
  -x ≤ 0
  -A^T y ≤ -c
  -y ≤ 0
  -c^T x + y^T b ≤ 0
One could instead add the constraint -c^T x ≤ -z_0 and binary-search over values of z_0 to find a feasible solution with maximum z, but the approach based on the dual gives a direct solution.

296.3 Page 4: Ellipsoid Algorithm
Consider a sequence of smaller and smaller ellipsoids, each with the feasible region inside. For iteration k: c_k = center of E_k. Eventually c_k has to be inside F, and we are done.
[Figure: ellipsoid E_k with center c_k enclosing the feasible region F]

296.3 Page 5: Ellipsoid Algorithm
To find the next, smaller ellipsoid: find the most violated constraint a_k, then find the smallest ellipsoid that includes the half of E_k on the feasible side of that constraint.
[Figure: E_k cut through its center c_k by the violated constraint a_k, with the next ellipsoid enclosing the kept half]
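The slide does not give the update formulas, so the following is a minimal sketch of one iteration using the standard central-cut update, assuming E_k is represented by its center x and a matrix B with E_k = {z : (z - x)^T B^{-1} (z - x) ≤ 1}, and a is the normal of the most violated constraint a^T z ≤ b (function and variable names are mine):

```python
import numpy as np

def ellipsoid_step(x, B, a):
    """One central-cut update: shrink (x, B) to the smallest ellipsoid
    containing the half of E_k on the feasible side of the violated
    constraint a^T z <= b (standard formulas, not from the slide)."""
    n = len(x)
    g = B @ a / np.sqrt(a @ (B @ a))   # scaled direction of the cut
    x_next = x - g / (n + 1)           # move center into the kept half
    B_next = (n * n / (n * n - 1.0)) * (B - (2.0 / (n + 1)) * np.outer(g, g))
    return x_next, B_next
```

Iterating this until the center satisfies all constraints yields a feasible point; the ellipsoid volume shrinks by at least a fixed factor per step, which is where the polynomial bound comes from.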

296.3 Page 6: Interior Point Methods
Travel through the interior with a combination of
1. an optimization term (moves toward the objective), and
2. a centering term (keeps away from the boundary).
Used since the 1950s for nonlinear programming. Karmarkar proved a variant is polynomial time in 1984.
[Figure: a path through the interior of the feasible region in the (x_1, x_2) plane]

296.3 Page 7: Methods
– Affine scaling: simplest, but no known time bounds
– Potential reduction: O(nL) iterations
– Central trajectory: O(n^{1/2} L) iterations
Each iteration involves solving a linear system, so it takes polynomial time. The “real world” time depends heavily on the matrix structure.

296.3 Page 8: Example times
Central trajectory method (Lustig, Marsten, Shanno 94). Time depends on Cholesky non-zeros (i.e., the “fill”).

                     fuel     continent  car       initial
size (K)             13x31    9x57       43x107    19x12
non-zeros            186K     189K       183K      80K
iterations           –        –          –         –
time (sec)           –        –          –         –
Cholesky non-zeros   1.2M     .3M        .2M       6.7M

296.3 Page 9: Assumptions
We are trying to solve the problem:
minimize z = c^T x
subject to Ax = b, x ≥ 0
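As a quick illustration of this standard form, here is a hypothetical toy instance (data made up for illustration) solved with SciPy's linprog; the equality constraints and nonnegativity bounds match the assumptions above:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0, 0.0])        # objective: minimize c^T x
A = np.array([[1.0, 1.0, 1.0]])      # equality constraints Ax = b
b = np.array([4.0])

res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3)  # x >= 0
print(res.x, res.fun)                 # here x = [0, 0, 4], z = 0
```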

296.3 Page 10: Outline
1. Centering methods overview
2. Picking a direction to move toward the optimum
3. Staying on the Ax = b hyperplane (projection)
4. General method
5. Example: affine scaling
6. Example: potential reduction
7. Example: log barrier

296.3 Page 11: Centering: option 1
The “analytical center”: minimize y = -Σ_{i=1}^{n} lg x_i.
y goes to infinity as x approaches any boundary x_i = 0.
[Figure: the analytical center among boundaries x_1, ..., x_5]

296.3 Page 12: Centering: option 2
Elliptical scaling: the idea is to bias the search space based on an ellipsoid (the Dikin ellipsoid) centered at the current point. More on this later.
[Figure: Dikin ellipsoid centered at (c_1, c_2)]

296.3 Page 13: Finding the Optimal Solution
Let’s say f(x) is the combination of the “centering term” c(x) and the “optimization term” z(x) = c^T x. We would like f to have the same minimum over the feasible region as z(x), but it can otherwise be quite different; in particular c(x), and hence f(x), need not be linear.
Goal: find the minimum of f(x) over the feasible region, starting at some interior point x_0. We can do this by taking a sequence of steps toward the minimum. How do we pick a direction for a step?

296.3 Page 14: Picking a direction: steepest descent
Option 1: take the direction of steepest descent of f at x_0, given by the negative gradient: d = -∇f(x_0).
Problem: the gradient might be changing rapidly, so local steepest descent might not give us a good direction. Any ideas for a better way to select a direction?

296.3 Page 15: Picking a direction: Newton’s method
To find the minimum of f(x), take the derivative and set it to 0. Consider the truncated Taylor series
  f(x) ≈ f(x_0) + f'(x_0)(x - x_0) + (1/2) f''(x_0)(x - x_0)^2;
setting its derivative to zero gives the step x - x_0 = -f'(x_0) / f''(x_0).
In matrix form, for arbitrary dimension: Δx = -H_f(x_0)^{-1} ∇f(x_0), where H_f is the Hessian of f.
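In code, the Newton direction is obtained by solving a linear system with the Hessian rather than inverting it. A small sketch, with a hypothetical test function of my choosing:

```python
import numpy as np

def newton_direction(grad, hess, x):
    """Solve H(x) d = -grad(x) for the Newton step d = -H^{-1} grad."""
    return np.linalg.solve(hess(x), -grad(x))

# Hypothetical f(x) = x_1^4 + x_2^2, minimized at the origin.
grad = lambda x: np.array([4 * x[0] ** 3, 2 * x[1]])
hess = lambda x: np.array([[12 * x[0] ** 2, 0.0], [0.0, 2.0]])
print(newton_direction(grad, hess, np.array([1.0, 1.0])))  # [-1/3, -1]
```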

296.3 Page 16: Next Step?
Now that we have a direction, what do we do?

296.3 Page 17: Remaining on the support plane
Constraint: Ax = b. A is an n x (n + m) matrix, so the equation describes an m-dimensional hyperplane in an (n + m)-dimensional space. The hyperplane's basis is the null space of A. A defines the “slope”; b defines an “offset.”
[Figure: parallel lines x_1 + 2x_2 = 3 and x_1 + 2x_2 = 4 in the (x_1, x_2) plane]

296.3 Page 18: Projection
We need to project our direction c onto the plane defined by the null space of A. That is, we want to calculate Pc, where P = I - A^T (AA^T)^{-1} A is the projection matrix onto the null space of A.

296.3 Page 19: Calculating Pc
Pc = (I - A^T (AA^T)^{-1} A)c = c - A^T w, where A^T w = A^T (AA^T)^{-1} Ac. Multiplying by A gives AA^T w = AA^T (AA^T)^{-1} Ac = Ac, so all we need to do is solve for w in
  AA^T w = Ac.
This can be solved with a sparse solver as described in the graph separator lectures, and it is the workhorse of the interior-point methods. Note that AA^T will be denser than A.
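A minimal dense sketch of this computation (in practice a sparse Cholesky factorization of AA^T would replace the dense solve):

```python
import numpy as np

def project(A, c):
    """Return Pc = c - A^T w where AA^T w = Ac: the projection of c
    onto the null space of A, exactly as derived above."""
    w = np.linalg.solve(A @ A.T, A @ c)
    return c - A.T @ w

A = np.array([[1.0, 2.0, 3.0]])
d = project(A, np.array([1.0, 1.0, 1.0]))
print(A @ d)   # ~[0]: the projected direction stays on Ax = b
```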

296.3 Page 20: Next step?
We now have a direction c and its projection d onto the constraint plane defined by Ax = b. What do we do now? To decide how far to go, we can find the minimum of f(x) along the line defined by d; this is not too hard if f(x) is reasonably nice (e.g., has one minimum along the line). Alternatively, we can go some fraction of the way to the boundary (e.g., 90%).

296.3 Page 21: General Interior Point Method
Pick a start x_0. Factor AA^T (i.e., find an LU decomposition). Repeat until done (within some threshold):
– decide on a function f(x) to optimize (might be the same for all iterations)
– select a direction d based on f(x) (e.g., with Newton’s method)
– project d onto the null space of A (using the factored AA^T and solving a linear system)
– decide how far to go along that direction
Caveat: every method is slightly different. (A code sketch of this loop follows.)
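A sketch of that loop, assuming the direction comes from a supplied gradient of f and using the fraction-to-boundary rule from page 20 (the helper names and the fixed iteration count are my choices, not from the slide):

```python
import numpy as np

def interior_point(A, f_grad, x0, iters=50, frac=0.9):
    """Generic interior-point loop: direction from f, projection onto
    the null space of A, then a damped step toward the boundary."""
    x = x0.copy()
    AAt = A @ A.T                      # a real solver factors this once
    for _ in range(iters):
        d = -f_grad(x)                 # descent direction from f
        w = np.linalg.solve(AAt, A @ d)
        d = d - A.T @ w                # keep the step on Ax = b
        neg = d < 0                    # components heading toward x_i = 0
        step = frac * np.min(-x[neg] / d[neg]) if neg.any() else 1.0
        x = x + step * d
    return x
```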

296.3 Page 22: Affine Scaling Method
A biased steepest descent. On each iteration solve:
minimize c^T y
subject to Ay = 0, y^T D^{-2} y ≤ 1   (the Dikin ellipsoid)
Note that:
1. the ellipsoid is centered around the current solution x;
2. y is in the null space of A and can therefore be used as the direction d without projection;
3. we are optimizing in the desired direction c.
What does the Dikin ellipsoid do for us?

296.3 Page 23: Dikin Ellipsoid
For x > 0 (a true “interior point”), let D = diag(x_1, ..., x_n). The constraint y^T D^{-2} y ≤ 1 forces |y_i| ≤ x_i, which prevents y from crossing any face x_i = 0; Ay = 0 keeps y on the right hyperplane. The optimal value lies on the boundary of the ellipsoid due to convexity. The ellipsoid biases the search away from corners.
[Figure: Dikin ellipsoid around x inside the feasible region, with step y]

296.3 Page 24: Finding the Optimum
minimize c^T y
subject to Ay = 0, y^T D^{-2} y ≤ 1
But this looks harder to solve than a linear program! We now have a nonlinear constraint y^T D^{-2} y ≤ 1. In fact, the symmetry and lack of corners make it easier to solve.

296.3 Page 25: How to compute
Substitute variables: y = Dy'.
minimize c^T Dy'
subject to ADy' = 0, y'^T D D^{-2} D y' ≤ 1, i.e., y'^T y' ≤ 1
The sphere y'^T y' ≤ 1 is unbiased, so we project the direction (c^T D)^T = Dc onto the null space of B = AD:
  y' = (I - B^T (BB^T)^{-1} B) Dc, and y = Dy' = D (I - B^T (BB^T)^{-1} B) Dc.
As before, solve for w in BB^T w = BDc; then y = D(Dc - B^T w) = D^2 (c - A^T w).

296.3 Page 26: Affine Interior Point Method
Pick a start x_0. Symbolically factor AA^T. Repeat until done (within some threshold):
– B = AD_i
– solve BB^T w = BDc for w, where BB^T = A D D^T A^T (use the symbolically factored AA^T; it has the same non-zero structure)
– d = D_i (D_i c - B^T w)
– move in direction d a fraction γ of the way to the boundary (something like γ = 0.96 is used in practice)
Note that D_i changes on each iteration, since it depends on x_i. (A code sketch of one iteration follows.)
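A dense sketch of one such iteration. Note that d = D^2(c - A^T w) is the projected gradient of c^T x, so for the minimization stated on page 9 the code below steps along -d; that sign convention is my assumption, not spelled out on the slide:

```python
import numpy as np

def affine_scaling_step(A, c, x, gamma=0.96):
    """One affine scaling iteration: B = AD, solve BB^T w = BDc,
    d = D(Dc - B^T w), then move a fraction gamma to the boundary."""
    D = np.diag(x)
    B = A @ D
    w = np.linalg.solve(B @ B.T, B @ (D @ c))
    d = -(D @ (D @ c - B.T @ w))      # negate to decrease c^T x (assumption)
    neg = d < 0
    step = gamma * np.min(-x[neg] / d[neg]) if neg.any() else 1.0
    return x + step * d
```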

296.3 Page 27: Potential Reduction Method
minimize z = q ln(c^T x - b^T y) - Σ_{j=1}^{n} ln(x_j)
subject to Ax = b, x ≥ 0, A^T y + s = c (the dual problem), s ≥ 0
The first term of z is the optimization term; the second term is the centering term. The duality gap (c^T x - b^T y) goes to 0 near the solution. The objective function is not linear; use hill climbing or a “Newton step” to optimize.

296.3 Page 28: Central Trajectory (log barrier)
Dates back to the 1950s for nonlinear problems. On step k:
1. minimize c^T x - μ_k Σ_{j=1}^{n} ln(x_j), s.t. Ax = b, x > 0
2. select μ_{k+1} ≤ μ_k
Each minimization can be done with a constrained Newton step. μ_k needs to approach zero to terminate. A primal-dual version using higher-order approximations is currently the best interior-point method in practice.
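For concreteness, here are the barrier objective and its gradient, which the constrained Newton step works with on each iteration (a sketch; the halving schedule in the comment is an illustrative choice, not from the slide):

```python
import numpy as np

def barrier_value(c, x, mu):
    """Central-trajectory objective: c^T x - mu * sum_j ln(x_j)."""
    return c @ x - mu * np.sum(np.log(x))

def barrier_grad(c, x, mu):
    """Its gradient, c - mu / x, fed to the Newton step (with Ax = b
    maintained by projection as on page 19)."""
    return c - mu / x

# Illustrative schedule: mu_{k+1} = mu_k / 2 until mu is near zero.
```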

296.3 Page 29: Summary of Algorithms
1. Actual algorithms used in practice are very sophisticated.
2. Practice matches theory reasonably well.
3. Interior-point methods dominate when
   A. n is large,
   B. the Cholesky factors are small (i.e., low fill), or
   C. the problem is highly degenerate.
4. Simplex dominates when starting from a previous solution very close to the final solution.
5. The ellipsoid algorithm is not currently practical.
6. Large problems can take hours or days to solve; parallelism is very important.