
1 Bregman Iterative Algorithms for L1 Minimization with Applications to Compressed Sensing. W. Yin, S. O., D. Goldfarb, J. Darbon. Problem: Let A be an m x n matrix with m < n (usually m << n) and f a vector in R^m. Basis Pursuit (S. Chen, D. Donoho, M.A. Saunders): (BP) min_u ||u||_1 subject to Au = f.

2 Basis Pursuit arises in compressed sensing (Candes, Romberg, Tao, Donoho, Tanner, Tsaig, Rudelson, Vershynin, Tropp). Fundamental principle: through optimization, the sparsity of a signal can be exploited for signal recovery from incomplete measurements. Let u* in R^n be highly sparse, i.e. k = #{j : u*_j != 0} << n.

3 Principle: Encode u* by f = Au*. Then recover u* from f by solving Basis Pursuit.

4 Proven [Candes, Tao]: Recovery is perfect whenever k, m, n satisfy certain conditions. Types of matrices A allowing high compression ratios (m << n) include (a) random matrices with i.i.d. entries, and (b) random ensembles of orthonormal transforms (e.g. matrices formed from random sets of the rows of Fourier transforms).
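The two families of sensing matrices above can be sketched in a few lines of NumPy. This is an illustrative setup only (the sizes m, n, k and the 1/sqrt(m) scaling are choices made here, not values from the talk); it builds a Gaussian matrix, a random subset of the rows of an orthonormal DFT matrix, and a k-sparse signal with its measurements f = A u*.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 64, 256, 8            # m << n, k-sparse signal (illustrative sizes)

# (a) Random matrix with i.i.d. Gaussian entries (scaled so columns have unit expected norm)
A = rng.standard_normal((m, n)) / np.sqrt(m)

# (b) Random ensemble of rows of an orthonormal transform: here the DFT
F = np.fft.fft(np.eye(n)) / np.sqrt(n)        # orthonormal DFT matrix
rows = rng.choice(n, size=m, replace=False)   # random subset of rows
A_fourier = F[rows, :]

# A k-sparse signal u* and its m < n compressed measurements f = A u*
u_star = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
u_star[support] = rng.standard_normal(k)
f = A @ u_star
```

Because the DFT matrix is orthonormal, the partial-transform matrix satisfies A_fourier A_fourier^H = I, which the fast transform-based solvers later in the talk exploit.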

5 Huge number of potential applications of compressed sensing; see e.g. Rich Baraniuk's website. L1 minimization is widely used for compressive imaging, MRI and CT, multisensor networks and distributed sensing, analog-to-information conversion, and biosensing. (BP) can be transformed into a linear program and then solved by conventional methods, but these are not tailored for A large scale and dense, and they do not exploit orthonormality (e.g. for a Fourier matrix).

6 One might instead solve the unconstrained problem (UNC) min_u mu*||u||_1 + (1/2)||Au - f||^2. Here mu must be small to heavily weight the fidelity term; moreover, the solution of (UNC) is never that of (BP) unless f = 0. Here: using Bregman iterative regularization, we solve (BP) by a very small number of solutions of (UNC) with different values of f.

7 The method involves only (a) matrix-vector multiplications and (b) component-wise shrinkages. The method generalizes to the constrained problem min_u J(u) subject to Au = f for other convex J; we can solve this through a finite number of Bregman iterations of min_u mu*J(u) + (1/2)||Au - f||^2 (again, with a sequence of "f" values).

8 Also: we have a two-line algorithm, involving only matrix-vector multiplications and shrinkage operators, generating {u^k} that converges rapidly to an approximate solution of (BP). In fact, the numerical evidence is overwhelming that it converges to a true solution if mu is large enough. Also: the algorithms are robust with respect to noise, both experimentally and with theoretical justification.

9 Background. To solve (UNC): Figueiredo, Nowak and Wright; Kim, Koh, Lustig and Boyd; van den Berg and Friedlander. Shrinkage (soft thresholding) with iteration used by: Chambolle, DeVore, Lee and Lucier; Figueiredo and Nowak; Daubechies, Defrise and De Mol; Elad, Matalon and Zibulevsky; Hale, Yin and Zhang; Darbon and Osher; Combettes and Pesquet.

10 The shrinkage people developed an algorithm to solve min_u mu*||u||_1 + H(u) for convex differentiable H, and get the iterative scheme u^{k+1} = argmin_u mu*||u||_1 + <H'(u^k), u - u^k> + (1/(2*delta))||u - u^k||^2. Since this minimization is component-wise separable in u, we can solve it by scalar shrinkage. Crucial for the speed!

11 where for y, alpha in R we define shrink(y, alpha) = sign(y) * max(|y| - alpha, 0), so that u^{k+1}_j = shrink(u^k_j - delta * H'(u^k)_j, mu*delta); i.e., make this a semi-implicit method (in numerical-analysis terms). Equivalently: replace H(u) by its first-order Taylor expansion at u^k, and force u to be close to u^k by the penalty term (1/(2*delta))||u - u^k||^2.
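The scheme of the two slides above can be sketched directly. This is a minimal illustration (the function names `shrink` and `shrinkage_iteration` are ours, and the step count and step size are arbitrary choices): for H(u) = (1/2)||Au - f||^2 one has H'(u) = A^T(Au - f), and each sweep is one gradient step followed by one component-wise scalar shrinkage — the iteration often called ISTA in the literature.

```python
import numpy as np

def shrink(y, alpha):
    """Soft thresholding: shrink(y, alpha) = sign(y) * max(|y| - alpha, 0)."""
    return np.sign(y) * np.maximum(np.abs(y) - alpha, 0.0)

def shrinkage_iteration(A, f, mu, delta, iters=500):
    """Iterative shrinkage for min_u mu*||u||_1 + 0.5*||Au - f||^2:
    linearize H(u) = 0.5*||Au - f||^2 at u^k, penalize distance to u^k, and
    solve the separable subproblem by scalar shrinkage:
        u^{k+1} = shrink(u^k - delta * A^T (A u^k - f), mu * delta)."""
    u = np.zeros(A.shape[1])
    for _ in range(iters):
        u = shrink(u - delta * (A.T @ (A @ u - f)), mu * delta)
    return u
```

For convergence, delta should not exceed 1/||A^T A||; only matrix-vector products and component-wise operations appear, which is what makes the method fast at large scale.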

12 This was adapted for solving total-variation problems, and the resulting "linearized" approach was solved by a graph/network-based algorithm, very fast (Darbon and Osher; Wang, Yin and Zhang). Also: Darbon and Osher did the linearized Bregman approach described here, but for TV deconvolution.

13 Bregman Iterative Regularization (Bregman 1967). Introduced by Osher, Burger, Goldfarb, Xu and Yin in an image-processing context. Extended the Rudin-Osher-Fatemi model (ROF): min_u J(u) + (lambda/2)||u - b||^2 with J(u) the total variation of u, b a noisy measurement of a clean image, and lambda a tuning parameter. They used the Bregman distance based on J: D_J^p(u, v) = J(u) - J(v) - <p, u - v>, where p is a subgradient of J at v.

14 Not really a distance (unless J is quadratic): it is not symmetric. However, D_J^p(u, v) >= D_J^p(w, v) for all w on the line segment connecting u and v. Instead of solving (ROF) once, our Bregman iterative regularization procedure solves (BROF): u^{k+1} = argmin_u D_J^{p^k}(u, u^k) + (lambda/2)||u - b||^2, for k = 0, 1, 2, ..., starting with u^0 = 0, p^0 = 0 (which gives (ROF) for u^1). The p^{k+1} is automatically chosen from optimality: p^{k+1} = p^k + lambda*(b - u^{k+1}).

15 The difference is in the use of regularization: Bregman iterative regularization regularizes by minimizing the total-variation-based Bregman distance from u to the previous iterate u^k. Earlier results: (a) ||u^k - b|| converges monotonically to zero; (b) u^k gets closer to the unknown clean image, in the sense that the Bregman distance to it diminishes in k, at least as long as ||u^k - b|| remains above the noise level. Numerically, it's a big improvement.

16 For all k, (BROF), the iterative procedure, can be reduced to (ROF) with the input b^{k+1} = b + (b^k - u^k), i.e., add back the noise. This is totally general. Algorithm: Bregman iterative regularization (for J(u), H(u) convex, H differentiable): u^{k+1} = argmin_u D_J^{p^k}(u, u^k) + H(u), p^{k+1} = p^k - H'(u^{k+1}). Results: the iterative sequence {u^k} satisfies (1) monotonic decrease in H: H(u^{k+1}) <= H(u^k); (2) convergence to the original in H with exact data: if H(u*) = 0, then H(u^k) <= J(u*)/k, which tends to 0.

17 (3) Approach towards the original in D with noisy data. Let f, g, u*, and sigma represent the noisy data, noiseless data, perfect recovery, and noise level (||f - g|| <= sigma); then the Bregman distance from u* to u^k diminishes in k, as long as ||Au^k - f|| >= sigma.

18 Motivation: Xu, Osher (2006), wavelet-based denoising with {psi_j} a wavelet basis. Write the signal in this basis and solve min_u mu*sum_j |u_j| + (1/2)*sum_j (u_j - f_j)^2, which decouples into independent scalar problems (observed in 1998 by Chambolle, DeVore, Lee and Lucier).

19 Each scalar solution is soft thresholding: u_j = shrink(f_j, mu). Interesting: Bregman iterations give firm thresholding instead, i.e., components are recovered exactly rather than shrunk by mu. So Bregman iterations take finitely many steps to recover each nonzero component. Spikes return in decreasing order of their magnitudes, and sparse data comes back very quickly.
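The firm-thresholding behavior is easy to reproduce. Below is a small sketch (the function name `bregman_denoise` and the test data are ours) of Bregman iterations for the decoupled problem min_u mu*||u||_1 + 0.5*||u - f||^2: each subproblem is solved exactly by soft thresholding, and the residual is added back to the data, exactly as on slide 16.

```python
import numpy as np

def shrink(y, alpha):
    return np.sign(y) * np.maximum(np.abs(y) - alpha, 0.0)

def bregman_denoise(f, mu, iters):
    """Bregman iterations for min_u mu*||u||_1 + 0.5*||u - f||^2.
    Each subproblem decouples and is solved exactly by scalar shrinkage;
    the update f^{k+1} = f + (f^k - u^{k+1}) 'adds back the noise'."""
    fk = f.copy()
    history = []
    for _ in range(iters):
        u = shrink(fk, mu)        # exact solve of the decoupled subproblem
        fk = f + (fk - u)         # add back the residual
        history.append(u)
    return history

f = np.array([5.0, 2.0, 0.5, 0.0])
hist = bregman_denoise(f, mu=3.0, iters=12)
```

Tracing `hist`, the largest spike (5.0) returns first and exactly (not shrunk by mu = 3), then 2.0, and finally 0.5: spikes come back in decreasing order of magnitude, and every recovered component equals its true value, i.e. firm rather than soft thresholding.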

20 Next: a simple case where A is a single row a^T, so the constraint is <a, u> = f with f a scalar. Obvious solution: u = (f/a_j) e_j, where a_j is the component of a with largest magnitude.

21 Assume a_j = a_1 > 0, f > 0, and |a_1| strictly greater than all the other |a_i|. Then it is easy to see that the Bregman iterations give the exact solution in finitely many steps! This helps explain our success in the general case.

22 Convergence results. Again, the procedure: u^{k+1} = argmin_u mu*||u||_1 + (1/2)||Au - f^k||^2, with f^{k+1} = f + (f^k - Au^{k+1}) and f^1 = f. Here the recent fast method (FPC) of Hale, Yin and Zhang is used to compute each minimizer.
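The outer Bregman loop above can be sketched as follows. This is an illustration, not the authors' code: the inner solver here is a plain shrinkage iteration standing in for FPC (which uses the same shrinkage step with clever continuation in mu), and the names `solve_unc` and `bregman_bp` are ours.

```python
import numpy as np

def shrink(y, alpha):
    return np.sign(y) * np.maximum(np.abs(y) - alpha, 0.0)

def solve_unc(A, f, mu, delta, iters=2000):
    """Inner solver for min_u mu*||u||_1 + 0.5*||Au - f||^2 by iterative
    shrinkage (a simple stand-in for the FPC method of Hale, Yin, Zhang)."""
    u = np.zeros(A.shape[1])
    for _ in range(iters):
        u = shrink(u - delta * (A.T @ (A @ u - f)), mu * delta)
    return u

def bregman_bp(A, f, mu, delta, outer=10):
    """Nonlinear Bregman for (BP): repeatedly solve (UNC) with updated data
    f^{k+1} = f + (f^k - A u^{k+1}), driving Au^k -> f."""
    fk = f.copy()
    u = np.zeros(A.shape[1])
    for _ in range(outer):
        u = solve_unc(A, fk, mu, delta)
        fk = fk + (f - A @ u)     # equivalent to f^{k+1} = f + (f^k - A u^{k+1})
    return u
```

On a toy problem with orthonormal rows, e.g. A = [[1,0,0],[0,1,0]] and f = (1, 2), each outer step pushes the soft-thresholding bias back into the data, and the iterates reach the exact (BP) solution (1, 2, 0) after two outer iterations.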

23 This is nonlinear Bregman; it converges in a few iterations. However, even faster is linearized Bregman (Darbon-Osher, used for TV deblurring), described below as a 2-line code. For nonlinear Bregman: Theorem: Suppose an iterate u^k satisfies Au^k = f. Then u^k solves (BP). Proof: by nonnegativity of the Bregman distance, for any u with Au = f, ||u||_1 >= ||u^k||_1 + <p^k, u - u^k> = ||u^k||_1, since p^k lies in the range of A^T.
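The one-line proof can be written out in full. This sketch uses the Bregman distance defined on slide 13 and the fact, which holds for this iteration, that each subgradient has the form p^k = A^T v^k:

```latex
0 \le D_J^{p^k}(u, u^k) = \|u\|_1 - \|u^k\|_1 - \langle p^k,\, u - u^k \rangle,
\qquad \text{and with } p^k = A^T v^k,\ Au = Au^k = f:
\]
\[
\langle p^k,\, u - u^k \rangle = \langle v^k,\, Au - Au^k \rangle
 = \langle v^k,\, f - f \rangle = 0,
\]
\[
\text{hence } \|u\|_1 \ge \|u^k\|_1 \text{ for every feasible } u,
\text{ i.e. } u^k \text{ solves (BP).}
```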

24 Theorem: There exists an integer K < infinity such that any u^k with k >= K is a solution of (BP). Idea: uses the fact that each p^k lies in the range of A^T and in mu times the subdifferential of ||.||_1 at u^k. The result still works if we replace mu by mu_k for all k.


26 For dense Gaussian matrices A, we can solve large-scale problem instances with more than 8 x 10^6 nonzeros in A, e.g. n x m = 4096 x 2045, in 11 seconds. For partial DCT matrices, much faster: 1,000,000 x 600,000 in 7 minutes, but more like 40 seconds for the linearized Bregman approach! Also, one can't just use the (UNC) minimizer with mu very small: it takes too long. Need Bregman.

27 Extensions: Finite Convergence. Let J be convex on H, a Hilbert space.

28 Theorem: Let H(u) = h(Au - f), with h convex, differentiable, nonnegative, and vanishing only at 0. Then Bregman iteration returns a solution of min_u J(u) subject to Au = f under very general conditions. Idea: H(u^k) decreases to 0, so Au^k tends to f, etc.

29 Strictly convex cases: e.g., regularize by adding (epsilon/2)||u||^2 to J, for epsilon > 0. Simple to prove.

30 Theorem: the residual decays exponentially to zero; the proof is easy.

31 Linearized Bregman (started with Osher-Darbon). Let H(u) = (1/2)||Au - f||^2. Differs from standard Bregman because we replace H(u) by the sum of its first-order approximation at u^k and a proximity term at u^k. Then we can use fast methods (graph cuts for TV, or shrinkage for L1) to solve the above!!

32 Consider (BP). Let J(u) = mu*||u||_1. This yields the iteration v^{k+1} = v^k + A^T(f - Au^k), u^{k+1} = delta * shrink(v^{k+1}, mu), starting from u^0 = v^0 = 0.

33 We get a 2-line code. Linearized Bregman: two lines, matrix multiplication and scalar shrinkage.
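The two lines of slide 32 translate almost verbatim into NumPy. A minimal sketch (the wrapper name `linearized_bregman` and the iteration budget are ours; the two update lines are the algorithm as stated):

```python
import numpy as np

def shrink(y, alpha):
    return np.sign(y) * np.maximum(np.abs(y) - alpha, 0.0)

def linearized_bregman(A, f, mu, delta, iters=3000):
    """The 'two-line' linearized Bregman iteration for (BP):
        v^{k+1} = v^k + A^T (f - A u^k)
        u^{k+1} = delta * shrink(v^{k+1}, mu)
    Only matrix-vector products and scalar shrinkage are required."""
    u = np.zeros(A.shape[1])
    v = np.zeros(A.shape[1])
    for _ in range(iters):
        v = v + A.T @ (f - A @ u)
        u = delta * shrink(v, mu)
    return u
```

On the toy problem A = [[1,0,0],[0,1,0]], f = (1, 2), the iteration reaches its fixed point (1, 2, 0) after two sweeps; in general the limit solves the slightly regularized problem of slide 34, which approaches the (BP) solution as mu grows.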

34 Theorem: Let J be strictly convex and C^2, and let u_OPT be the optimal solution of min_u J(u) subject to Au = f. Then if u^k converges to w, the error decays exponentially under suitable conditions; the proof is easy. So for J(u) = mu*||u||_1 + (1/(2*delta))||u||^2 this would mean that w approaches a minimizer of ||u||_1 subject to Au = f, as mu tends to infinity.

35 Theorem (strict convexity and smoothness of J are not needed for this). The proof follows easily from Osher, Burger, Goldfarb, Xu, Yin.

36 (Again, strict convexity and smoothness are not needed.) NOISE: Theorem (follows Bachmyer): the generalized Bregman distance to the noise-free solution diminishes with increasing k, as long as:

37 i.e., as long as the error Au^k - f is not too small compared to the error in the "denoised" solution. Of course, if the noise-free signal is the solution of the Basis Pursuit problem, then this Bregman distance monotonically decreases.

38 Note: this means that for Basis Pursuit, the Bregman distance based on ||.||_1 is diminishing for these values of k. Here each component of p^k belongs to [-1, 1], determined by the iterative procedure.
