Georgina Hall Princeton, ORFE Joint work with Amir Ali Ahmadi


1 Difference of Convex (DC) Decomposition of Nonconvex Polynomials with Algebraic Techniques
Georgina Hall, Princeton ORFE. Joint work with Amir Ali Ahmadi. MOPTA 2015, 7/13/2015.

2 Difference of Convex (DC) programming
Problems of the form

  min f_0(x) s.t. f_i(x) ≤ 0, i = 1, …, m,

where f_i(x) := g_i(x) − h_i(x), i = 0, …, m, and g_i: ℝⁿ → ℝ, h_i: ℝⁿ → ℝ are convex.

3 Concave-Convex Computational Procedure (CCCP)
Heuristic for minimizing DC programming problems. Has been used extensively in:
- machine learning (sparse support vector machines (SVM), transductive SVMs, sparse principal component analysis)
- statistical physics (minimizing Bethe and Kikuchi free energies).
Idea (iteration k, given an initial point x_0 and decompositions f_i = g_i − h_i, i = 0, …, m):
1. Convexify by linearizing h: f_i^k(x) := g_i(x) − (h_i(x_k) + ∇h_i(x_k)^T (x − x_k)). This is convex, since g_i is convex and the linearized part is affine.
2. Solve the convex subproblem: take x_{k+1} to be a solution of min f_0^k(x) s.t. f_i^k(x) ≤ 0, i = 1, …, m.
3. Set k := k + 1 and repeat.
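The iteration above can be sketched in a few lines of code. A minimal unconstrained 1-D instance, using the running example f(x) = x⁴ − 3x² + 2x − 2 with the split g(x) = x⁴, h(x) = 3x² − 2x + 2 (the decomposition that appears on a later slide); here the convexified subproblem min_x g(x) − h(x_k) − h′(x_k)(x − x_k) has the closed-form minimizer x = (h′(x_k)/4)^{1/3}, so no solver is needed:

```python
# CCCP sketch for min_x f(x), f = g - h, with g(x) = x**4 and
# h(x) = 3x**2 - 2x + 2 (both convex; h'(x) = 6x - 2).
# Minimizing the convexified g(x) minus the linearization of h gives the
# first-order condition 4x**3 = h'(x_k), i.e. x_{k+1} = (h'(x_k)/4)**(1/3).

def f(x):
    return x**4 - 3 * x**2 + 2 * x - 2

def cccp(x0, iters=100):
    x = x0
    for _ in range(iters):
        v = (6 * x - 2) / 4.0          # h'(x_k) / 4
        root = abs(v) ** (1 / 3)       # real cube root, sign handled below
        x = root if v >= 0 else -root
    return x

x_star = cccp(2.0)   # starting from x_0 = 2, as in the toy example
```

Starting from x_0 = 2 the iterates converge to the stationary point x = 1 with f(1) = −2, a local but not global minimum of f, which illustrates that CCCP is a heuristic.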

4 Concave-Convex Computational Procedure (CCCP)
Toy example: min_x f(x), where f(x) := g(x) − h(x).
- Convexify f(x) to obtain f⁰(x).
- Initial point: x_0 = 2.
- Minimize f⁰(x) and obtain x_1.
- Reiterate.
(Figure: the iterates x_0, x_1, x_2, x_3, x_4, … converge to a point x_∞.)

5 CCCP for nonconvex polynomial optimization problems (1/2)
CCCP relies on the input functions being given as a difference of convex functions. We will consider polynomials in n variables and of degree d.
- Any polynomial can be written as a difference of convex polynomials.
  - Proof by Wang, Schwing and Urtasun.
  - An alternative proof is given later in this presentation, as a corollary of a stronger theorem.
- What if we don't have access to such a decomposition?

6 CCCP for nonconvex polynomial optimization problems (2/2)
In fact, for any polynomial there exist infinitely many decompositions f(x) = g(x) − h(x).
Example: f(x) = x⁴ − 3x² + 2x − 2. Possible decompositions:
- g(x) = x⁴, h(x) = 3x² − 2x + 2
- g(x) = x⁴ + x², h(x) = 3x² + x² − 2x + 2, etc.
Which one would be a natural choice for CCCP?
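As a quick numeric sanity check (our own sketch, not part of the slides), one can verify that both candidate pairs are convex and that each difference recovers f; in one variable, convexity is just nonnegativity of the second derivative:

```python
# f(x) = x**4 - 3x**2 + 2x - 2 and the two candidate DC splits from the slide.
# In one variable, a polynomial g is convex iff g''(x) >= 0 everywhere.

def f(x):
    return x**4 - 3 * x**2 + 2 * x - 2

splits = [
    (lambda x: x**4,        lambda x: 3 * x**2 - 2 * x + 2),   # (g1, h1)
    (lambda x: x**4 + x**2, lambda x: 4 * x**2 - 2 * x + 2),   # (g2, h2)
]
second_derivs = [
    (lambda x: 12 * x**2,     lambda x: 6.0),   # g1'', h1''
    (lambda x: 12 * x**2 + 2, lambda x: 8.0),   # g2'', h2''
]

grid = [i / 10.0 for i in range(-50, 51)]
for (g, h), (gpp, hpp) in zip(splits, second_derivs):
    assert all(abs(g(x) - h(x) - f(x)) < 1e-9 for x in grid)  # f = g - h
    assert all(gpp(x) >= 0 and hpp(x) >= 0 for x in grid)     # both convex
```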

7 Picking the “best” decomposition (1/2)
Algorithm: linearize h(x) around a point x_k to obtain a convexified version of f(x).
Idea: pick h(x) such that it is as close as possible to affine.
Mathematical translation: minimize the curvature of h (H_h denotes the Hessian of h).
At a point a:
  min_{g,h} λ_max(H_h(a)) s.t. f = g − h, g, h convex.
Over a region Ω:
  min_{g,h} max_{x∈Ω} λ_max(H_h(x)) s.t. f = g − h, g, h convex.
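The objective λ_max(H_h(a)) is easy to evaluate for a given candidate h. A small sketch, using a hypothetical bivariate h(x) = 3x₁² + x₁x₂ + 2x₂² of our own choosing, whose Hessian is constant:

```python
import numpy as np

# lambda_max(H_h(a)) for the hypothetical h(x) = 3*x1**2 + x1*x2 + 2*x2**2,
# whose (constant) Hessian is [[6, 1], [1, 4]].
H = np.array([[6.0, 1.0], [1.0, 4.0]])
lam_max = np.linalg.eigvalsh(H)[-1]   # eigvalsh returns eigenvalues in ascending order
```

For a quadratic h the Hessian is constant, so the point and region formulations coincide; for higher-degree h the Hessian depends on x, which is what makes the region version harder.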

8 Picking the “best” decomposition (2/2)
Theorem: Finding the "best" decomposition of a degree-4 polynomial over a box is NP-hard.
Proof idea: reduction from testing convexity of quartic polynomials, which is NP-hard (Ahmadi, Olshevsky, Parrilo, Tsitsiklis).
The same is likely to hold for the point version, but we have been unable to prove it.
How can we efficiently find such a decomposition?

9 Convex relaxations for DC decompositions (1/6)
SOS, DSOS, SDSOS polynomials (Ahmadi, Majumdar): families of nonnegative polynomials.

Type | Characterization | Testing membership
Sum of squares (sos) | ∃ polynomials q_i s.t. p(x) = Σ_i q_i²(x) | SDP
Scaled diagonally dominant sum of squares (sdsos) | p = Σ_i α_i m_i² + Σ_{i,j} [(β_i⁺ m_i + γ_j⁺ m_j)² + (β_i⁻ m_i − γ_j⁻ m_j)²], with m_i, m_j monomials and α_i ≥ 0 | SOCP
Diagonally dominant sum of squares (dsos) | p = Σ_i α_i m_i² + Σ_{i,j} [β_{ij}⁺ (m_i + m_j)² + β_{ij}⁻ (m_i − m_j)²], with m_i, m_j monomials and α_i, β_{ij}⁺, β_{ij}⁻ ≥ 0 | LP

10 Convex relaxations for DC decompositions (2/6)
DSOS-convex, SDSOS-convex, SOS-convex polynomials.
Definitions:
- p is dsos-convex if y^T H_p(x) y is dsos (testable by LP).
- p is sdsos-convex if y^T H_p(x) y is sdsos (testable by SOCP).
- p is sos-convex if y^T H_p(x) y is sos (testable by SDP).
These are tractable sufficient conditions for convexity: if y^T H_p(x) y is dsos/sdsos/sos, then y^T H_p(x) y ≥ 0 for all x, y ∈ ℝⁿ, i.e., H_p(x) ⪰ 0 for all x, i.e., p is convex.
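A cheap numeric counterpart of these conditions (a sampling sketch on a polynomial of our own choosing; unlike the LP/SOCP/SDP tests, sampling gives no certificate): check that the Hessian is positive semidefinite at many sampled points.

```python
import numpy as np

# Sampling check (necessary, not sufficient) that
# p(x1, x2) = x1**4 + x1**2 * x2**2 + x2**4 is convex:
# its Hessian should be PSD at every sampled point.
def hessian(x1, x2):
    return np.array([
        [12 * x1**2 + 2 * x2**2, 4 * x1 * x2],
        [4 * x1 * x2, 12 * x2**2 + 2 * x1**2],
    ])

rng = np.random.default_rng(0)
min_eig = min(np.linalg.eigvalsh(hessian(*rng.uniform(-5, 5, 2)))[0]
              for _ in range(1000))
```

For this p the Hessian is in fact PSD everywhere, so the sampled minimum eigenvalue is nonnegative; the algebraic notions above replace such sampling with a single convex feasibility problem.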

11 Convex relaxations for DC decompositions (3/6)
Comparison of these sets on a parametric family of polynomials:
  p(x1, x2) = 2x1⁴ + 2x2⁴ + a·x1³x2 + b·x1²x2² + c·x1x2³.
(Figure: for c = −0.5, c = 0, and c = 1, the regions in the (a, b)-plane where p is dsos-convex, sdsos-convex, and sos-convex; here sos-convex coincides with convex, and the sets are nested: dsos-convex ⊆ sdsos-convex ⊆ sos-convex.)

12 Convex relaxations for DC decompositions (4/6)
How can we use these concepts to do a DC decomposition at a point a?

Original problem:
  min λ_max(H_h(a)) s.t. f = g − h, g, h convex,
equivalently
  min t s.t. H_h(a) ⪯ tI, f = g − h, g, h convex.

- Relaxation 1 (sos-convex): min t s.t. H_h(a) ⪯ tI, f = g − h, g, h sos-convex. [SDP]
- Relaxation 2 (sdsos-convex): same, with g, h sdsos-convex. [SOCP + "small" SDP]
- Relaxation 3 (dsos-convex): same, with g, h dsos-convex. [LP + "small" SDP]
- Relaxation 4 (sdsos-convex + sdd): min t s.t. tI − H_h(a) sdd (**), f = g − h, g, h sdsos-convex. [SOCP]
- Relaxation 5 (dsos-convex + dd): min t s.t. tI − H_h(a) dd (*), f = g − h, g, h dsos-convex. [LP]

(*) Q is diagonally dominant (dd) ⇔ q_ii ≥ Σ_{j≠i} |q_ij| for all i.
(**) Q is scaled diagonally dominant (sdd) ⇔ ∃ diagonal D > 0 s.t. DQD is dd.
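The two footnote conditions are easy to check numerically. A small sketch (the example matrices are our own illustration, not from the slides):

```python
import numpy as np

def is_dd(Q):
    """Diagonal dominance: q_ii >= sum_{j != i} |q_ij| for every row i."""
    Q = np.asarray(Q, dtype=float)
    off_diag = np.sum(np.abs(Q), axis=1) - np.abs(np.diag(Q))
    return bool(np.all(np.diag(Q) >= off_diag))

# Q is sdd but not dd: scaling by D = diag(2, 1) gives
# D @ Q @ D = [[4, 4], [4, 5]], which is dd.
Q = np.array([[1.0, 2.0], [2.0, 5.0]])
D = np.diag([2.0, 1.0])
```

Here `is_dd(Q)` is False while `is_dd(D @ Q @ D)` is True, so Q is sdd (and hence PSD) without being dd; that gap is exactly what distinguishes relaxations 4 and 5.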

13 Convex relaxations for DC decompositions (5/6)
Can any polynomial be written as the difference of two dsos/sdsos/sos-convex polynomials?
Lemma about cones: let K ⊆ E be a full-dimensional cone in a vector space E. Then any v ∈ E can be written as v = k1 − k2 with k1, k2 ∈ K.
Proof sketch: pick k in the interior of K. Then there exists α ∈ (0, 1) such that k′ := (1 − α)v + αk ∈ K, which gives v = (1/(1 − α)) k′ − (α/(1 − α)) k, a difference of two elements of K.
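A concrete instance of the lemma (our own illustration), with E the symmetric matrices and K the PSD cone: any symmetric A splits as A = (A + βI) − βI, with both terms PSD once β is large enough.

```python
import numpy as np

# Split an arbitrary symmetric matrix as a difference of two PSD matrices,
# mirroring the cone lemma with K = PSD cone, E = symmetric matrices.
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                          # arbitrary symmetric "v"
beta = max(0.0, -np.linalg.eigvalsh(A)[0]) + 1.0
K1 = A + beta * np.eye(4)                  # PSD: eigenvalues shifted up by beta
K2 = beta * np.eye(4)                      # PSD: positive multiple of identity
```

The theorem on the next slide plays the same game with K the cone of dsos-convex polynomials, the hard part being to show that this cone is full-dimensional.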

14 Convex relaxations for DC decompositions (6/6)
Theorem: Any polynomial can be written as the difference of two dsos-convex polynomials.
Corollary: The same holds for sdsos-convex, sos-convex, and convex.
Proof idea: by the lemma, it suffices to show that the dsos-convex polynomials form a full-dimensional cone. "Obvious" choices (e.g., p(x) = (Σ_i x_i²)^{d/2}) do not work. Instead, induction on n: for n = 2, take
  p(x1, x2) = a_0 x1^d + a_1 x1^{d−2} x2² + … + a_{d/4} x1^{d/2} x2^{d/2} + … + a_1 x1² x2^{d−2} + a_0 x2^d,
with a_1 = 1, a_{k+1} = ((d − 2k)/(2k + 2)) a_k for k = 1, …, d/4 − 1, and a_0 > 2^{d−2}/(d(d−1)) + (d/(4(d−1))) a_{d/4}.

15 Comparing the different relaxations (1/4)
Impact of the relaxation on solve time for random f (d = 4), solving
  min_{t,g,h} t s.t. tI − H_h(a) psd/sdd/dd, f = g − h, g, h dsos/sdsos/sos-convex.
Reported solve times in seconds for n = 6, 10, 16 (computer: 8 GB RAM, 2.40 GHz processor):

Type of relaxation | n = 6 | n = 10 | n = 16
dsos-convex + dd | 1.05 | 2.79 | 20.80
dsos-convex + psd | 1.19 | 3.19 | 25.36
sdsos-convex + sdd | 1.21 | 5.17 | 34.66
sdsos-convex + psd | — | 5.29 | 39.43
sos-convex + psd (MOSEK) | 2.02 | 193.07 | +∞ (optimal values 93.74, 317.63)
sos-convex + psd (SeDuMi) | 11.48 | 193.06 | nearly 3 hrs

16 Comparing the different relaxations (2/4)
Iterative decomposition algorithm implemented for unconstrained f:
1. Decompose f = g − h, using one of the relaxations at the point x_k.
2. Minimize the convexified f^k, using an SDP subroutine [Lasserre; de Klerk and Laurent].
(Figure: value of the objective after 3 minutes, for the algorithm above with the 5 different relaxations; f random with n = 9, d = 4; average over 25 iterations; solver: MOSEK.)

17 Comparing the different relaxations (3/4)
Constrained case: min_{x∈B} f(x), where B = {x : Σ_i x_i² ≤ R²}.
Three strategies compared:
1. Single decomposition: decompose f = g − h once at x_0, then repeatedly minimize the convexified f^k. Relaxation: min t s.t. H_h(a) ⪯ tI, f = g − h, g, h sdsos-convex.
2. Iterative decomposition: re-decompose f = g − h at each point x_k, then minimize the convexified f^k, using the same relaxation.
3. One min-max decomposition: decompose f = g − h once over all of B, then repeatedly minimize the convexified f^k. What relaxation to use?
For the min-max strategy, the original problem is
  min_{g,h} max_{x∈Ω} λ_max(H_h(x)) s.t. f = g − h, g, h convex,
with the equivalent formulation
  min_{t,g,h} t s.t. x ∈ B ⇒ tI − H_h(x) ⪰ 0, f = g − h, g, h convex.
First relaxation: min_{t,g,h} t s.t. x ∈ B ⇒ tI − H_h(x) ⪰ 0, f = g − h, g, h sdsos-convex.
Second relaxation: min_{t,g,h} t s.t. tI − H_h(x) ⪰ (R² − Σ_i x_i²) τ(x) with y^T τ(x) y sos, f = g − h, g, h sdsos-convex.

18 Comparing the different relaxations (4/4)
Constrained case: single decomposition vs. iterative decomposition vs. min-max decomposition.
(Figure: value of the objective after 3 minutes, for the algorithms described above; f random with n = 10, d = 4; radius R a random integer between 100 and 400; average over 200 iterations.)

19 Main messages
- To apply CCCP to polynomial optimization, a DC decomposition is needed.
- The choice of decomposition impacts convergence speed.
- It is not computationally tractable to find the "best" decomposition.
- Efficient convex relaxations exist, based on the concepts of dsos-convex (LP), sdsos-convex (SOCP), and sos-convex (SDP) polynomials.
- dsos-convex and sdsos-convex relaxations scale to a larger number of variables.

20 Thank you for listening
Questions?

