# Robust Optimization Concepts and Examples


Robust Optimization Concepts and Examples
Yuriy Zinchenko and Shane G. Henderson, ORIE, Cornell University

Zinchenko and Henderson 2005
Outline
- What can go wrong with LP?
- A familiar blend problem
- The general picture
- Robust linear programming
- Software, resources, practicalities
- Radiation therapy for cancer treatment

What can go wrong with LP?
Tough LP problem: max x + y s/t 1 x  y  1, x, y ≥ 0?

Blend Problem
Inputs come at different costs (\$\$, \$\$\$, \$), but their properties change with time. Blend to get the required output properties at minimum cost, for any input properties within reason.

Blend constraints
Typical constraint looks like

$$\text{Low} \le 10 x_1 + 12 x_2 + 7 x_3 \le \text{High}$$

Changes to

$$\text{Low} \le a_1 x_1 + a_2 x_2 + a_3 x_3 \le \text{High}$$

for any vector $a$ that is “close” to $(10, 12, 7)$.
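One simple way to read “close” is as an interval around each nominal coefficient. The sketch below assumes a hypothetical per-coefficient tolerance `r` (the slide does not specify one) and nonnegative blend amounts, and computes the range of $a_1 x_1 + a_2 x_2 + a_3 x_3$ over that box. The function names are illustrative only.

```python
NOMINAL = (10.0, 12.0, 7.0)  # nominal coefficients from the slide


def blend_range(x, r, nominal=NOMINAL):
    """Smallest and largest a.x when each a_i lies in [nominal_i - r, nominal_i + r].

    Assumes every x_i >= 0, so the extremes are attained at the box corners
    a = nominal - r and a = nominal + r. The tolerance r is a hypothetical
    parameter, not something given in the presentation.
    """
    if any(xi < 0 for xi in x):
        raise ValueError("blend amounts must be nonnegative")
    center = sum(ai * xi for ai, xi in zip(nominal, x))
    spread = r * sum(x)
    return center - spread, center + spread


def robustly_feasible(x, low, high, r):
    """True if Low <= a.x <= High holds for every a in the box."""
    worst_low, worst_high = blend_range(x, r)
    return low <= worst_low and worst_high <= high
```

For example, with `x = (0.5, 0.3, 0.2)` and `r = 0.5` the nominal value is about 10 and the worst cases are about 9.5 and 10.5, so the blend passes `Low = 9, High = 11` but not `Low = 9.8`.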

General robust LP

$$\min\ c^T x \quad \text{s/t} \quad A^{(1)} x \le b_1, \quad A^{(2)} x \le b_2, \quad A^{(3)} x \le b_3$$

A more detailed view
Simple linear constraint

$$a x \le 1, \quad x \ge 0$$

with $a$ “close” to 1, namely $0 \le a \le 2$. Want $x$ to work for all such $a$. How do we deal with it?

The robust constraint reduces step by step:

$$
\begin{aligned}
& a x \le 1,\ x \ge 0 \quad \text{for all } 0 \le a \le 2 \\
\iff\ & \max_{0 \le a \le 2} a x \le 1,\ x \ge 0 \\
\iff\ & 2x \le 1,\ x \ge 0 \\
\iff\ & x \le 1/2,\ x \ge 0
\end{aligned}
$$
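The reduction can be checked numerically. The following is a minimal sketch, with illustrative function names, that compares the closed-form worst case (the maximum of $a x$ over the interval, attained at an endpoint) against a brute-force scan over a grid of values of $a$.

```python
def worst_case_lhs(x, a_lo=0.0, a_hi=2.0):
    """Maximum of a*x over a in [a_lo, a_hi].

    a*x is linear in a, so the maximum is attained at an endpoint.
    """
    return max(a_lo * x, a_hi * x)


def robust_ok(x, rhs=1.0):
    """Robust feasibility via the closed form: x >= 0 and max_a a*x <= rhs."""
    return x >= 0 and worst_case_lhs(x) <= rhs


def brute_force_ok(x, rhs=1.0, steps=200):
    """Same check by scanning a over an evenly spaced grid on [0, 2]."""
    grid = [2.0 * k / steps for k in range(steps + 1)]
    return x >= 0 and all(a * x <= rhs for a in grid)
```

Both checks accept `x = 0.5` and reject `x = 0.75`, in line with the conclusion $x \le 1/2$.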

A slightly more involved example: $a x + b y \le 1$
where $(a, b)$ is “close” to $(1, 1)$, namely in an ellipsoidal (here spherical) “uncertainty” set $U$: $(a, b)$ is in $U$ if $(a, b) = (a_0, b_0) + (\Delta a, \Delta b)$ with $(a_0, b_0) = (1, 1)$ and $\Delta a^2 + \Delta b^2 \le 1$.

Ellipsoidal “uncertainty” set $U$
$(a, b) = (a_0, b_0) + (\Delta a, \Delta b)$, with $(a_0, b_0) = (1, 1)$ and $\Delta a^2 + \Delta b^2 \le 1$. Want $(x, y)$ to satisfy $a x + b y \le 1$ for all $(a, b)$ from $U$.
[Figure: the set $U$ centered at $(a_0, b_0)$]

What can we say about $a x + b y$?
$a x + b y \le 1$ for all $(a, b)$ in $U$ is the same as

$$\max_{(a, b) \in U} (a x + b y) \le 1$$

Expanding the left-hand side:

$$a x + b y = (a_0 + \Delta a) x + (b_0 + \Delta b) y = (a_0 x + b_0 y) + (\Delta a\, x + \Delta b\, y)$$

For a moment, think of $(x, y)$ as your objective function (fixed). Is

$$\max_{(a, b) \in U} (a x + b y) \le 1\,?$$

This is the same as asking whether

$$(a_0 x + b_0 y) + \max_{\Delta a^2 + \Delta b^2 \le 1} (\Delta a\, x + \Delta b\, y) \le 1\,?$$

[Figure: the set $U$ centered at $(a_0, b_0)$, with the direction $(x, y)$]

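Is

$$\max_{\Delta a^2 + \Delta b^2 \le 1} (\Delta a\, x + \Delta b\, y) \le 1 - (a_0 x + b_0 y)\,?$$

Here

$$\Delta a\, x + \Delta b\, y \le \|(x, y)\| = (x^2 + y^2)^{1/2},$$

the “length” of $(x, y)$, with equality when $(\Delta a, \Delta b)$ points along $(x, y)$, so the maximum equals $\|(x, y)\|$.
[Figure: the set $U$ centered at $(a_0, b_0)$, with two directions $(x_1, y_1)$ and $(x_2, y_2)$]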

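Putting it together, the following are equivalent:

$$
\begin{aligned}
& a x + b y \le 1 \quad \text{for all } (a, b) \in U \\
\iff\ & \max_{(a, b) \in U} (a x + b y) \le 1 \\
\iff\ & (a_0 x + b_0 y) + \max_{\Delta a^2 + \Delta b^2 \le 1} (\Delta a\, x + \Delta b\, y) \le 1 \\
\iff\ & \|(x, y)\| \le 1 - (a_0 x + b_0 y)
\end{aligned}
$$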
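As a sanity check on this chain of equivalences, the sketch below compares the closed-form worst case $(a_0 x + b_0 y) + \|(x, y)\|$ with a brute-force maximum over points sampled on the boundary of $U$. The function names and the sample count are illustrative choices, not part of the presentation.

```python
import math

A0, B0 = 1.0, 1.0  # center of the uncertainty set, from the slides


def robust_lhs(x, y):
    """Closed-form worst case of a*x + b*y over the unit disc around (A0, B0)."""
    return A0 * x + B0 * y + math.hypot(x, y)


def sampled_worst_case(x, y, n=3600):
    """Brute-force maximum of a*x + b*y over n points on the boundary of U."""
    best = -math.inf
    for k in range(n):
        t = 2.0 * math.pi * k / n
        a, b = A0 + math.cos(t), B0 + math.sin(t)
        best = max(best, a * x + b * y)
    return best


def robust_ok(x, y):
    """The second-order cone form: ||(x, y)|| <= 1 - (A0*x + B0*y)."""
    return math.hypot(x, y) <= 1.0 - (A0 * x + B0 * y)
```

For `(x, y) = (0.3, 0.4)` the worst case is about `0.7 + 0.5 = 1.2`, so that point is not robustly feasible, while `(0.2, 0.1)` is.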

Good news
Constraints of this type,

$$\|(x, y)\| \le 1 - (1 \cdot x + 1 \cdot y),$$

can be handled easily: this is so-called second-order cone programming (SOCP). Not much harder than linear programming!

General Robust LP formulation
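$$\max\ c^T x \quad \text{s/t} \quad A^{(i)} x \le b_i, \quad i = 1, \dots, m$$

where $c, x \in \mathbb{R}^n$, $A^{(i)} \in \mathbb{R}^{1 \times n}$, and $A^{(i)} = A^{(i)}_0 + w_i P_i$ with $w_i \in \mathbb{R}^{1 \times k_i}$, $\|w_i\| \le 1$, $P_i \in \mathbb{R}^{k_i \times n}$, $i = 1, \dots, m$.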

SOCP equivalent:

$$\max\ c^T x \quad \text{s/t} \quad \|P_i x\| \le b_i - A^{(i)}_0 x, \quad i = 1, \dots, m$$

Probabilistic interpretation: think of $A^{(i)}$ as taken from an $\alpha$-level set of your favorite probability distribution (e.g. multivariate normal). The robust constraint then reads: satisfy the constraint with a given probability $\alpha$.
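To make the SOCP form concrete for a single row, here is a small pure-Python sketch. It checks $\|P_i x\| \le b_i - A^{(i)}_0 x$ for a given $x$ and also builds the worst-case row $A^{(i)}_0 + w_i P_i$, using $w_i = (P_i x)^T / \|P_i x\|$, which is the maximizer of $w_i P_i x$ over $\|w_i\| \le 1$. All function names are illustrative; a real model would hand the constraint to a conic solver rather than test candidate points.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(P, x):
    """P is a list of rows (k_i rows, n columns); returns the vector P x."""
    return [dot(row, x) for row in P]


def norm(v):
    return math.sqrt(dot(v, v))


def robust_row_ok(a0, P, b, x):
    """SOCP form of one robust row: ||P x|| <= b - a0.x."""
    return norm(mat_vec(P, x)) <= b - dot(a0, x)


def worst_case_row(a0, P, x):
    """Row a0 + w P that maximizes (a0 + w P) x over ||w|| <= 1."""
    Px = mat_vec(P, x)
    length = norm(Px)
    w = [v / length for v in Px] if length > 0 else [0.0] * len(Px)
    # (w P)_j = sum_k w_k * P[k][j]
    return [a0[j] + sum(w[k] * P[k][j] for k in range(len(P)))
            for j in range(len(a0))]
```

With `a0 = [1, 1]`, `P` the 2 x 2 identity, and `x = [0.3, 0.4]`, the worst-case row is about `[1.6, 1.8]` and its product with `x` is about `1.2`, matching the two-variable example.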

Where’d the ellipse come from?
- Expert opinion
- Statistics: averages live in ellipsoids
- Doesn’t have to be an ellipse. Can be some other shape (e.g., boxes)

Software
- Commercial: Mosek (http://www.mosek.com/)
- “Free”: SeDuMi (http://sedumi.mcmaster.ca/), SDPT3.x (http://www.math.nus.edu.sg/~mattohkc/sdpt3.html/)

Practicalities
- Realistic problem sizes: number of variables/constraints on the order of $10^3$ to $10^4$; depends (greatly) on the problem data structure/sparsity
- Possible to obtain a “good”, “inexpensive” approximation with LP
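The slides do not say which LP approximation is meant. As one simple illustration for the two-variable case, the sketch below replaces $\|(x, y)\| \le t$ with $K$ linear inequalities $u_k \cdot (x, y) \le t \cos(\pi / K)$, where the $u_k$ are evenly spaced unit directions. Because some $u_k$ is always within an angle $\pi / K$ of $(x, y)$, any point passing the linear test also satisfies the norm constraint, so the surrogate is conservative. The function names and the default `K` are assumptions made for this example.

```python
import math


def lp_cuts(K):
    """K evenly spaced unit directions u_k in the plane."""
    return [(math.cos(2.0 * math.pi * k / K), math.sin(2.0 * math.pi * k / K))
            for k in range(K)]


def lp_approx_ok(x, y, t, K=16):
    """Conservative linear surrogate for ||(x, y)|| <= t.

    Requires u_k.(x, y) <= t * cos(pi / K) for every direction u_k.
    Needs K >= 3 so that cos(pi / K) is positive.
    """
    shrink = math.cos(math.pi / K)
    return all(c * x + s * y <= t * shrink for c, s in lp_cuts(K))
```

The price of linearity is some lost feasible region: `(0.99, 0.0)` has norm below 1 but fails the test with `K = 16`. Increasing `K` shrinks that gap at the cost of more constraints.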

Generality
Possible to extend this approach to quite a few other convex programming problems.

Resources
- Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, by A. Ben-Tal and A. S. Nemirovskii
- Google for Robust Optimization (robust LP etc.)

In practice...
Joint work with Millie Chu (Cornell) and Michael B. Sharpe (Princess Margaret Hospital, Toronto)

Cancer treatment
- About 1.3 million new cancer cases in the U.S. each year
- Nearly 60% receive radiation therapy (in conjunction with surgery, chemotherapy, etc.)

- Radiation delivered by a linear accelerator
- Cancer cells more susceptible than normal cells
- Overlay beams from different angles
- Dose given in daily fractions for ~6 weeks

- Block parts of the radiation beam: discretize the whole beam into a grid of smaller “beamlets”
- Choose different intensities for each beamlet

(Intensity Modulated Radiation Therapy Collaborative Working Group, 2001)

Treatment Planning
Goal: Choose beam angles and beamlet intensities that deliver enough radiation to kill all tumor cells, while avoiding healthy organs & tissue as much as possible.
- Take CT scan
- Delineate target region and healthy structures
- Discretize body as small cubes, or “voxels”
- Formulate & solve a mathematical program to find a “good” plan

(Princess Margaret Hospital)

Robust Treatment Planning
- Setup errors + patient motion + structural changes during treatment = uncertainty in geometry
- Don’t rescan patient much, if at all
- Use RO to “robustify” the mathematical program

Model Formulation
Many different formulations exist; we use a penalty formulation:

$$
\begin{aligned}
\text{minimize:} \quad & \text{penalty objective} \\
\text{subject to:} \quad & \Pr(\text{dose to voxel } i \text{ in healthy structure } k \le U_k) \ge 0.95 \\
& \Pr(\text{dose to voxel } i \text{ in tumor} \ge L) \ge 0.95 \\
& x = \text{beamlet intensities} \ge 0
\end{aligned}
$$

Computational Results
- Prostate: tumor + 5 healthy regions
- 5 equi-spaced beams, ~225 beamlets from each angle
- Voxel size = 2 cm, ~400 total voxels
- Solver: Mosek, v
- Solve time = 6 seconds (LP), 45 minutes (SOCP)

Dose-Volume Histograms
[Figure: DVH of expected dose. Axis: % of structure receiving ≥ x Gy. Curves: deterministic solution’s plan and stochastic solution’s plan.]
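A dose-volume histogram reports, for each dose level, the percentage of a structure receiving at least that dose. A minimal sketch of that computation, assuming a plain list of per-voxel total doses in Gy (the function name and data layout are illustrative):

```python
def dvh(doses, levels):
    """Percent of voxels receiving at least each dose level.

    doses  : total dose per voxel of one structure, in Gy
    levels : dose levels (Gy) at which to evaluate the histogram
    """
    n = len(doses)
    if n == 0:
        raise ValueError("structure has no voxels")
    return [100.0 * sum(1 for dose in doses if dose >= d) / n for d in levels]
```

For instance, `dvh([10, 20, 30, 40], [0, 25, 50])` evaluates to `[100.0, 50.0, 0.0]`.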

Comparison
- Simulate 10 treatments (45 fractions each)
- For each of the 10 treatments, and for each solution (deterministic & stochastic):
  - calculated dose delivered to each voxel in each fraction
  - summed over the 45 fractions to get total dose delivered to each voxel
  - plotted DVH
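The simulation loop can be sketched as follows. How the per-fraction geometry was sampled is not stated on the slide; this example assumes each fraction independently draws one of a finite set of shifts with given probabilities. The data layout `dose_by_shift[j][i]` (per-beamlet unit doses to voxel `i` under shift `j`) and the function name are hypothetical.

```python
import random


def simulate_treatment(dose_by_shift, probs, x, n_fractions=45, rng=None):
    """Total dose per voxel over one simulated treatment.

    dose_by_shift[j][i] : list of doses to voxel i from each beamlet at unit
                          intensity, when the patient is shifted by s_j
    probs               : probability of each shift (assumed sampling model)
    x                   : beamlet intensities
    """
    rng = rng or random.Random(0)
    n_voxels = len(dose_by_shift[0])
    shifts = list(range(len(probs)))
    total = [0.0] * n_voxels
    for _ in range(n_fractions):
        j = rng.choices(shifts, weights=probs, k=1)[0]  # shift for this fraction
        for i in range(n_voxels):
            total[i] += sum(a * xb for a, xb in zip(dose_by_shift[j][i], x))
    return total
```

Running this once per treatment and per plan gives the per-voxel totals from which each DVH is plotted.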

DVH – Treatments 1 to 10
[Figures: one DVH per simulated treatment, each comparing the deterministic (det) and stochastic (stoch) plans.]

Conclusions
- LP “pushes you into a corner”
- True situation never same as data
- Robust LP: find good solution that is always feasible within reason
- Efficient solution methods: can solve real problems
- Software available

The End

A Bit More Detail
$D_i(x)$ = dose delivered to voxel $i$ in $N$ fractions, with intensities $x$; a random variable.

$D_i(x)$ is the sum of $N$ random variables ($N = 45$); assume iid and apply the CLT, so $D_i(x)$ is approximately normally distributed.

Take $n$ sample shifts, $s_1, \dots, s_n$, with associated probabilities $p = (p_1, \dots, p_n)^T$. Let

$$a_i(\cdot)^T = \begin{pmatrix} a_i(s_1)^T \\ a_i(s_2)^T \\ \vdots \\ a_i(s_n)^T \end{pmatrix},$$

where $a_i(s_j)^T$ is the dose delivered to voxel $i$, shifted by $s_j$, from each beamlet with unit intensity, so that $N p^T a_i(\cdot)^T x$ = expected total dose delivered to voxel $i$, for $N$ fractions.

Let $v_i(x)$ = sample variance of dose delivered to voxel $i$. Then

$$D_i(x) \sim \text{Normal}\left(N p^T a_i(\cdot)^T x,\ N v_i(x)\right)$$
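The two parameters of the normal approximation can be computed directly from the shift samples. The sketch below assumes that $v_i(x)$ is the probability-weighted variance of the per-fraction dose across the $n$ shifts; the slide only calls it a “sample variance”, so this exact definition is an assumption, as are the function and argument names.

```python
def dose_moments(a_shift, probs, x, n_fractions=45):
    """Mean and variance of total dose to one voxel under the CLT model.

    a_shift[j] : doses to the voxel from each beamlet at unit intensity,
                 under shift s_j (row j of the matrix a_i(.)^T)
    probs      : shift probabilities p_1..p_n
    x          : beamlet intensities
    Returns (N * p^T a x, N * v(x)), with v(x) taken to be the
    probability-weighted variance of the per-fraction dose (an assumption).
    """
    per_fraction = [sum(a * xb for a, xb in zip(row, x)) for row in a_shift]
    mean = sum(p * d for p, d in zip(probs, per_fraction))
    var = sum(p * (d - mean) ** 2 for p, d in zip(probs, per_fraction))
    return n_fractions * mean, n_fractions * var
```

With two equally likely shifts giving per-fraction doses of 2 and 4, the per-fraction mean is 3 and variance is 1, so over 45 fractions the total dose has mean 135 and variance 45.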

Probabilistic Constraints
Want constraints to be violated with low probability (say, $\delta = 0.05$).

Example: maximum dose constraint on voxel $i$ in $H_k$, with maximum dose $m_k$. Assuming $D_i(x) \sim \text{Normal}\left(N p^T a_i(\cdot)^T x,\ N v_i(x)\right)$, we want

$$P(D_i(x) > m_k) \le \delta$$

This gives a second-order cone program (SOCP).
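Under the normal model, the chance constraint has a standard deterministic equivalent: $P(D > m_k) \le \delta$ holds exactly when the mean plus $z_{1-\delta}$ standard deviations is at most $m_k$, where $z_{1-\delta}$ is the standard normal quantile. When the variance is a positive semidefinite quadratic form in $x$ (an assumption here), the standard deviation is a norm of a linear function of $x$, which is where the second-order cone comes from. A minimal sketch with illustrative names, using only the Python standard library:

```python
import math
from statistics import NormalDist


def max_dose_ok(mean, variance, m_k, delta=0.05):
    """Deterministic equivalent of P(D > m_k) <= delta for D ~ Normal(mean, variance)."""
    z = NormalDist().inv_cdf(1.0 - delta)  # standard normal quantile, about 1.645 for delta = 0.05
    return mean + z * math.sqrt(variance) <= m_k


def violation_probability(mean, variance, m_k):
    """P(D > m_k) under the same normal model (requires variance > 0)."""
    return 1.0 - NormalDist(mean, math.sqrt(variance)).cdf(m_k)
```

For a total dose with mean 135 and variance 45, a cap of `m_k = 150` satisfies the constraint at `delta = 0.05`, while a cap of 140 does not.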

Dose-Volume Constraints
- Physicians like constraints of the form: “at most a fraction $f_k$ of structure $H_k$ gets a dose of at least $d_k$”
- 0-1 variable for each voxel: = 1 if dose is > $d_k$
- MIP: hard to solve! Many voxels get near max allowed dose
- Alternative: upper bound the “excess” dose. For healthy structure $H_k$, the requirement becomes linear constraints ☺