First Order Methods

This module aims to implement the first order methods for sparse signal recovery problems proposed in [BCandesG11]. The implementation is adapted from TFOCS [BCG12].

First order methods exploit information about the values and gradients/subgradients (but not Hessians) of the functions comprising the models under consideration [Bec17].

API

Solvers

fom(smooth_f, prox_h, A, b, x0[, options])

Driver routine for first order methods

scd(prox_f, conj_neg_h, A, b, mu, x0, z0[, …])

Driver routine for solving smooth conic dual problems via first order methods

l1rls(A, b, lambda_, x0[, options])

Solver for the l1-regularized least squares problem

l1rls_jit(A, b, lambda_, x0[, options])

Solver for the l1-regularized least squares problem (JIT-compiled variant)

lasso(A, b, tau, x0[, options])

Solver for LASSO problem

lasso_jit(A, b, tau, x0[, options])

Solver for LASSO problem (JIT-compiled variant)

owl1rls(A, b, lambda_, x0[, options])

Solver for the ordered weighted l1 norm regularized least squares problem

owl1rls_jit(A, b, lambda_, x0[, options])

Solver for the ordered weighted l1 norm regularized least squares problem (JIT-compiled variant)

dantzig_scd(A, b, delta, mu, x0, z0[, options])

Solver for the (smoothed) Dantzig selector problem using smoothed conic dual formulation

Data types

FomOptions

Options for the fom() driver routine

FomState

State of the first order method solver

Utilities

matrix_affine_func([A, b])

Returns an affine function for a matrix A and vector b

First Order Methods

[BCandesG11] considers problems in the unconstrained form:

(1)   \text{minimize } \phi(x) = g(x) + h(x)

where:

  • Both g, h: R^n → (−∞, ∞] are convex functions.

  • g is a smooth convex function.

  • h is a non-smooth convex function.

Of particular interest are problems where g takes a specific form:

(2)   g(x) = f(\mathcal{A}(x) + b)

where

  • A is a linear operator from R^n to R^m

  • b ∈ R^m is a translation vector

  • f: R^m → R is a smooth convex function

and the function h is a prox-capable convex function (to be described later).

We can then rewrite ϕ as:

(3)   \text{minimize } \phi(x) = f(\mathcal{A}(x) + b) + h(x)

For a smooth function, its gradient ∇f must exist and be easy to compute.

For a prox-capable function, it should have an efficient proximal operator:

(4)   p_f(x, t) = \underset{z \in \mathbb{R}^n}{\arg\min} \left( f(z) + \frac{1}{2t} \|z - x\|_2^2 \right)

for any x ∈ R^n and any step size t > 0.
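As a concrete example, the proximal operator of h(x) = λ‖x‖₁ has the well-known soft-thresholding closed form. The following minimal JAX sketch is purely illustrative (cr.sparse.opt ships its own prox operators):

    import jax.numpy as jnp

    def prox_l1(x, t, lam=1.0):
        # Proximal operator of h(x) = lam * ||x||_1 at step size t:
        #   argmin_z  lam * ||z||_1 + (1 / (2 t)) * ||z - x||_2^2
        # whose closed form is elementwise soft thresholding.
        return jnp.sign(x) * jnp.maximum(jnp.abs(x) - lam * t, 0.0)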


The routine fom() provides the solver for the minimization problem described above.

  • Unconstrained smooth minimization problems can be handled by choosing h(x)=0. See cr.sparse.opt.prox_zero().

  • Convex constraints can be handled by adding their indicator functions as part of h.

For solving the minimization problem, an initial solution x0 must be provided. If one is unsure of the initial solution, a safe default is x0 = 0 ∈ R^n.
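To make the roles of the gradient and the proximal operator concrete, here is a minimal reference implementation of proximal gradient descent for problem (3). It is an illustrative sketch only, not the actual fom() implementation (which adds acceleration variants, step size adaptation, and stopping criteria):

    import jax.numpy as jnp
    from jax import grad

    def proximal_gradient(f, prox_h, A, b, x0, t=0.01, num_iters=200):
        # Minimize f(A x + b) + h(x) with a fixed step size t, where
        # f is the smooth function and prox_h(v, t) is the proximal
        # operator of h as defined in (4).
        g = lambda x: f(A @ x + b)   # smooth part g(x) = f(A(x) + b)
        grad_g = grad(g)             # JAX computes the gradient of g
        x = x0
        for _ in range(num_iters):
            # Gradient step on g followed by a prox step on h.
            x = prox_h(x - t * grad_g(x), t)
        return x

    # With f(v) = 0.5 * ||v||_2^2, translation vector -b and the prox_l1
    # sketch above, this loop reduces to ISTA for the l1-regularized
    # least squares problem discussed later:
    #   x_hat = proximal_gradient(lambda v: 0.5 * jnp.sum(v ** 2),
    #                             prox_l1, A, -b, jnp.zeros(A.shape[1]))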

Smooth Conic Dual Problems

A number of sparse signal recovery problems can be cast as conic optimization problems, which can then be solved by efficient first order methods. This procedure often involves the following steps:

  • Cast the sparse recovery problem as a conic optimization problem

  • Augment the objective function with a strongly convex term, so that the dual of the perturbed problem becomes smooth.

  • Formulate the dual of the smooth problem.

  • Solve the smooth conic dual problem using a first order method.

  • Apply continuation to mitigate the effect of the strongly convex term added to the original objective function.

Our goal is to solve the conic problems of the form [BCandesG11]

(5)   \underset{x}{\text{minimize}} \; f(x) \quad \text{subject to } \mathcal{A}(x) + b \in \mathcal{K}

where:

  • x ∈ R^n is the optimization variable

  • f: R^n → (−∞, ∞] is a convex objective function which is possibly extended-valued but not necessarily smooth

  • A: R^n → R^m is a linear operator

  • b ∈ R^m is a translation vector

  • K ⊆ R^m is a closed, convex cone

Dual Problem

The Lagrangian associated with the conic problem is given by:

(6)   \mathcal{L}(x, \lambda) = f(x) - \langle \lambda, \mathcal{A}(x) + b \rangle

where λ ∈ R^m is the Lagrange multiplier, which must lie in the dual cone K*.

The dual function is:

(7)   g(\lambda) = \inf_{x \in \mathbb{R}^n} \mathcal{L}(x, \lambda) = -f^*(\mathcal{A}^*(\lambda)) - \langle b, \lambda \rangle

where:

  • A*: R^m → R^n is the adjoint of the linear operator A

  • f*: R^n → (−∞, ∞] is the convex conjugate of f, defined by

(8)   f^*(z) = \sup_{x} \; \langle z, x \rangle - f(x)
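The second equality in (7) is the standard conjugate calculation, spelled out here for completeness (using ⟨λ, A(x)⟩ = ⟨A*(λ), x⟩):

    \inf_{x} \left( f(x) - \langle \lambda, \mathcal{A}(x) + b \rangle \right)
      = - \sup_{x} \left( \langle \mathcal{A}^*(\lambda), x \rangle - f(x) \right) - \langle b, \lambda \rangle
      = - f^*(\mathcal{A}^*(\lambda)) - \langle b, \lambda \rangle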

Thus, the dual problem is given by:

(9)   \underset{\lambda}{\text{maximize}} \; -f^*(\mathcal{A}^*(\lambda)) - \langle b, \lambda \rangle \quad \text{subject to } \lambda \in \mathcal{K}^*

Smoothing

The dual function g may not be differentiable (or finite) on all of K*.

We perturb the original problem as follows:

(10)   \underset{x}{\text{minimize}} \; f_\mu(x) \triangleq f(x) + \mu d(x) \quad \text{subject to } \mathcal{A}(x) + b \in \mathcal{K}

where:

  • μ > 0 is a fixed smoothing parameter

  • d: R^n → R is a strongly convex function obeying

(11)   d(x) \geq d(x_0) + \frac{1}{2} \|x - x_0\|_2^2

A specific choice for d(x) is:

(12)   d(x) = \frac{1}{2} \|x - x_0\|_2^2

The new objective function fμ is strongly convex.

The Lagrangian becomes:

(13)   \mathcal{L}_\mu(x, \lambda) = f(x) + \mu d(x) - \langle \lambda, \mathcal{A}(x) + b \rangle

The dual function becomes:

(14)   g_\mu(\lambda) = \inf_{x \in \mathbb{R}^n} \mathcal{L}_\mu(x, \lambda) = -(f + \mu d)^*(\mathcal{A}^*(\lambda)) - \langle b, \lambda \rangle

The smoothed conic dual problem is then given by:

(15)   \underset{\lambda}{\text{maximize}} \; -(f + \mu d)^*(\mathcal{A}^*(\lambda)) - \langle b, \lambda \rangle \quad \text{subject to } \lambda \in \mathcal{K}^*

First order methods may be employed to solve this problem with provable performance guarantees.

Composite Forms

Often, it is convenient to split the dual variable λ ∈ R^m as λ = (z, τ) where z ∈ R^(m − m̄) and τ ∈ R^m̄. For example, if the dual cone is K* = L1^n, the ℓ1 norm cone (i.e. the epigraph of the ℓ1 norm function), then splitting λ ∈ R^(n+1) as λ = (z, τ) with z ∈ R^n and τ ∈ R gives:

(16)   \lambda \in \mathcal{L}_1^n \iff \|z\|_1 \leq \tau

This is particularly useful if the smoothed dual function g_μ(λ) = g_μ(z, τ) is linear in τ. Then we can rewrite g_μ as:

(17)   g_\mu(\lambda) = -g_{\mathrm{sm}}(z) - \langle \nu_\mu, \tau \rangle

Since g_μ is a smooth concave function, g_sm will be a smooth convex function.

Problem Instantiations

In the rest of the document, we discuss how specific sparse signal recovery problems can be modeled and solved using this module, either as primal problems or as smooth conic dual problems.

L1 regularized least square problem

We consider the problem:

(18)   \text{minimize } \frac{1}{2} \|Ax - b\|_2^2 + \lambda \|x\|_1

Choose:

  • f(v) = ½ ‖v‖₂²

  • A(x) = Ax, with the translation vector set to −b

  • h(x) = λ ‖x‖₁

With these choices, g(x) = f(Ax − b) = ½ ‖Ax − b‖₂², and it is straightforward to use fom() to solve the L1RLS problem. This is implemented in the function l1rls().
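A usage sketch on synthetic data follows; the import path cr.sparse.fom is an assumption (check your installation for the exact module path), while the argument order matches the signature l1rls(A, b, lambda_, x0) listed in the API section above:

    import jax.numpy as jnp
    from jax import random
    from cr.sparse.fom import l1rls  # assumed import path

    # Synthetic sparse recovery problem.
    key = random.PRNGKey(0)
    k1, k2 = random.split(key)
    A = random.normal(k1, (100, 400)) / jnp.sqrt(100.0)
    x_true = jnp.zeros(400).at[:5].set(1.0)   # 5-sparse ground truth
    b = A @ x_true + 0.01 * random.normal(k2, (100,))

    sol = l1rls(A, b, 0.1, jnp.zeros(400))    # lambda_ = 0.1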

LASSO

LASSO (least absolute shrinkage and selection operator) is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the resulting statistical model.

We consider the problem:

(19)   \underset{x}{\text{minimize}} \; \frac{1}{2} \|Ax - b\|_2^2 \quad \text{subject to } \|x\|_1 \leq \tau

Choose:

  • f(v) = ½ ‖v‖₂²

  • A(x) = Ax, with the translation vector set to −b

  • h(x) = the indicator function of the ℓ1 ball {x : ‖x‖₁ ≤ τ}

With these choices, it is straightforward to use fom() to solve the LASSO problem. This is implemented in the function lasso().
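A corresponding sketch for lasso(), reusing the synthetic A and b above; here τ directly caps the ℓ1 norm of the solution (import path assumed as before):

    from cr.sparse.fom import lasso  # assumed import path

    # Documented signature: lasso(A, b, tau, x0)
    sol = lasso(A, b, 5.0, jnp.zeros(400))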

Ordered weighted L1 regularized least square problem

We consider the problem:

(20)   \underset{x \in \mathbb{R}^n}{\text{minimize}} \; \frac{1}{2} \|Ax - b\|_2^2 + \sum_{i=1}^n \lambda_i |x|_{(i)}

described in [lBvdBSC13], where |x|_(1) ≥ |x|_(2) ≥ … ≥ |x|_(n) are the entries of x sorted by decreasing magnitude and λ₁ ≥ λ₂ ≥ … ≥ λₙ ≥ 0 is a non-increasing weight sequence.

Choose:

  • f(v) = ½ ‖v‖₂²

  • A(x) = Ax, with the translation vector set to −b

  • h(x) = Σ_i λ_i |x|_(i), the ordered weighted ℓ1 norm

With these choices, it is straightforward to use fom() to solve the ordered weighted ℓ1 regularized least squares problem. This is implemented in the function owl1rls().
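For owl1rls() the single scalar λ is replaced by a non-increasing weight vector; a sketch under the same assumptions, again reusing A and b from above:

    import jax.numpy as jnp
    from cr.sparse.fom import owl1rls  # assumed import path

    # Linearly decaying, non-increasing weights for the OWL norm.
    lambdas = jnp.linspace(1.0, 0.01, 400)

    # Documented signature: owl1rls(A, b, lambda_, x0)
    sol = owl1rls(A, b, lambdas, jnp.zeros(400))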

The Dantzig selector

We wish to recover an unknown vector x_0 ∈ R^n from the data y ∈ R^m and the model:

(21)   y = A x_0 + z

where

  • A is a known m × n design matrix

  • z is the noise term

  • There are fewer observations/measurements than unknowns (m<n)

We consider the optimization problem

(22)   \underset{x}{\text{minimize}} \; \|x\|_1 \quad \text{subject to } \|A^T (y - Ax)\|_\infty \leq \delta

  • We are minimizing the ℓ1 norm of the solution (thus promoting sparsity).

  • We have an upper bound δ on the correlation between the residual vector r = y − Ax and the columns/atoms of A.

This formulation is known as the Dantzig selector.

The conic form

We can rewrite the Dantzig selector problem as:

(23)   \underset{x}{\text{minimize}} \; \|x\|_1 \quad \text{subject to } (A^T (y - Ax), \delta) \in \mathcal{L}_\infty^n

where L∞^n is the epigraph of the ℓ∞ norm. It is a convex cone.

Then the Dantzig selector can be modeled as a standard conic form as follows:

  • f(x) = ‖x‖₁

  • A(x) = (−A^T A x, 0) is a mapping from R^n to R^(n+1)

  • b = (A^T y, δ); note that b ∈ R^(n+1)

  • K = L∞^n

Note carefully that:

(24)   \mathcal{A}(x) + b = (-A^T A x, 0) + (A^T y, \delta) = (A^T (y - Ax), \delta)

Thus:

(25)   \mathcal{A}(x) + b \in \mathcal{K} = \mathcal{L}_\infty^n \iff \|A^T (y - Ax)\|_\infty \leq \delta

Dual problem

The dual of the L∞^n cone is the L1^n cone. The dual variable λ will lie in the dual cone L1^n. It is easier to work with the split λ = (z, τ) such that

(26)   \lambda \in \mathcal{L}_1^n \iff \|z\|_1 \leq \tau

The convex conjugate of the ℓ1 norm function f(x) = ‖x‖₁ is the indicator function of the ℓ∞ norm ball:

(27)   f^*(x) = I_{\|\cdot\|_\infty \leq 1}(x) = \begin{cases} 0 & \text{if } \|x\|_\infty \leq 1 \\ \infty & \text{otherwise} \end{cases}

The adjoint of the linear operator is given by:

(28)   \mathcal{A}^*(\lambda) = \mathcal{A}^*((z, \tau)) = -A^T A z

Plugging into the dual problem definition:

(29)   \underset{\lambda}{\text{maximize}} \; -f^*(\mathcal{A}^*(\lambda)) - \langle b, \lambda \rangle \quad \text{subject to } \lambda \in \mathcal{K}^*

We get:

(30)   \underset{z, \tau}{\text{maximize}} \; -I_{\|\cdot\|_\infty \leq 1}(-A^T A z) - \langle A^T y, z \rangle - \delta \tau \quad \text{subject to } (z, \tau) \in \mathcal{L}_1^n

For δ > 0, the solution λ will lie on the boundary of the convex cone L1^n, giving us:

(31)   \|z\|_1 = \tau

Plugging this back, we get the unconstrained maximization problem:

(32)   \underset{z}{\text{maximize}} \; -I_{\|\cdot\|_\infty \leq 1}(-A^T A z) - \langle A^T y, z \rangle - \delta \|z\|_1
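The function dantzig_scd() solves the smoothed version of this problem via the smoothed conic dual formulation. A usage sketch follows; the import path is an assumption, and the measurement vector y is passed as the argument b of the documented signature dantzig_scd(A, b, delta, mu, x0, z0):

    import jax.numpy as jnp
    from jax import random
    from cr.sparse.fom import dantzig_scd  # assumed import path

    # Synthetic measurements y = A x0 + noise.
    key = random.PRNGKey(42)
    k1, k2 = random.split(key)
    A = random.normal(k1, (100, 400)) / jnp.sqrt(100.0)
    x_true = jnp.zeros(400).at[:5].set(1.0)
    y = A @ x_true + 0.01 * random.normal(k2, (100,))

    # delta bounds ||A^T (y - A x)||_inf; mu is the smoothing parameter;
    # x0 and z0 initialize the primal and dual variables.
    sol = dantzig_scd(A, y, 0.1, 0.01, jnp.zeros(400), jnp.zeros(400))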