The alternating descent conditional gradient method
A ubiquitous prior in modern statistical signal processing asserts that an observed signal is the noisy measurement of a few weighted sources. In other words, compared to the entire dictionary of possible sources, the set of sources actually present is sparse. In many cases of practical interest the sources are parameterized and the measurement of multiple weighted sources is linear in their individual measurements.
As a concrete example, consider the idealized task of identifying the aircraft that lead to an observed radar signal. The sources are the aircraft themselves, and each is parameterized by, perhaps, its position and velocity relative to the radar detectors. The sparse inverse problem is to recover the number of aircraft present, along with each of their parameters.
sparse inverse problem
We want to estimate the parameters , their number, , and the corresponding scalar weights, . This problem is nonconvex, due to and .
ADCG combines interleaves steps from three algorithms: matching pursuit, constrained minimization (basis pursuit), and nonconvex optimization.
step 1: basis pursuit
The algorithm state is a list of estimated parameters, . At the begining of each iteration the algorithm estimates the corresponding weights by solving the following finite-dimensional convex problem:
Note that some of the resulting may be zero, in which case we can forget the corresponding parameters.
step 2: matching pursuit
In this step we find a single new parameter to add to our set of candidate parameters. If we let , the parameter we add is the solution to the following matching pursuit problem:
Note that this problem is nonconvex in , but does not grow in dimensionality with each iteration. In many applications, is low dimensional, and the matching pursuit problem is readily solved by gridding followed by gradient descent. In other applications, such as matrix completion, special structure in allows for efficient computation of the matching pursuit step.
step 3: nonconvex optimization
Starting with the weights and parameters from step 2, we do nonconvex optimization on the functional
keeping the constraints.
In our implementation we do block coordinate descent over and to take advantage of the convexity over , but this is a design choice. The main idea is to use the fact that is differentiable to adjust the parameters.
One can apply ADCG to a wide range of problems:
We give brief descriptions of these problems in the applications section of this website.
For a full explanation of ADCG, please read our preprint:
The Alternating Descent Conditional Gradient Method for Sparse Inverse Problems - N. Boyd, G. Schiebinger, B. Recht.
If you have any suggestions or have found ADCG useful in your work please let us know!