Applications
superresolution imaging
neural spike identification
lidar
designing radiation therapy
matrix completion
linear system identification
Linear time-invariant (LTI) dynamical systems describe the evolution of an output $y_t \in \mathbf{R}$ based on the input $u_t \in \mathbf{R}$, where $t$ indexes time. The internal state at time $t$ of the system is parameterized by a vector $x_t \in \mathbf{R}^n$, and its relationship to the output is described by
$$x_{t+1} = A x_t + B u_t, \qquad y_t = C x_t.$$
Here $A$ is a fixed matrix, while $x_0$, $B$, and $C$ are unknown parameters.
The task of learning these parameters from input-output data can be posed as a sparse inverse problem as follows.
Each source is a small LTI system parameterized by $\theta = (B, C, x_0)$, where $B$ and $C$ both lie in the unit ball in $\mathbf{R}^k$, and $x_0$ is in $\mathbf{R}^k$, for a small state dimension $k$.
The LTI system that each source describes has output $y_t = C x_t$, where $x_{t+1} = A x_t + B u_t$ and the initial state is $x_0$.
The mapping from the parameters $\theta$ to the output of the corresponding LTI system on input $u$ is differentiable. In terms of the overall LTI system, adding the outputs of two weighted sources corresponds to concatenating the corresponding parameters.
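As a sketch of this forward map, the code below simulates a single source's output and checks the concatenation property numerically: summing two weighted outputs matches simulating the block-diagonally concatenated system. The specific matrices and the state dimension are illustrative assumptions, not values from the text.

```python
import numpy as np

def lti_output(A, B, C, x0, u):
    """Output y_t = C x_t of the recursion x_{t+1} = A x_t + B u_t, from x0."""
    x = np.asarray(x0, dtype=float)
    ys = []
    for u_t in u:
        ys.append(float(C @ x))
        x = A @ x + B * u_t
    return np.array(ys)

rng = np.random.default_rng(0)
u = rng.standard_normal(50)

# Two illustrative 2-dimensional sources with weights w1, w2.
A1, B1, C1, x01 = 0.9 * np.eye(2), rng.standard_normal(2), rng.standard_normal(2), rng.standard_normal(2)
A2, B2, C2, x02 = 0.5 * np.eye(2), rng.standard_normal(2), rng.standard_normal(2), rng.standard_normal(2)
w1, w2 = 0.7, 1.3

sum_of_outputs = w1 * lti_output(A1, B1, C1, x01, u) + w2 * lti_output(A2, B2, C2, x02, u)

# Concatenated system: block-diagonal dynamics, stacked B and x0,
# with the weights folded into the output map C.
A = np.block([[A1, np.zeros((2, 2))], [np.zeros((2, 2)), A2]])
B = np.concatenate([B1, B2])
C = np.concatenate([w1 * C1, w2 * C2])
x0 = np.concatenate([x01, x02])
concatenated = lti_output(A, B, C, x0, u)

assert np.allclose(sum_of_outputs, concatenated)
```

The block-diagonal structure is what makes the weighted sum of sources equivalent to one larger LTI system.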
bayesian experimental design
In experimental design we seek to estimate a vector $x \in \mathbf{R}^n$ from measurements of the form
$$y_i = \phi(\theta_i)^T x + \epsilon_i, \qquad i = 1, \ldots, m.$$
Here $\phi$ is a known differentiable feature function and the $\epsilon_i$ are independent noise terms.
We want to choose $\theta_1, \ldots, \theta_m$ to minimize our uncertainty about $x$; if each measurement requires a costly experiment, this corresponds to getting the most information from a fixed number of experiments.
In general, this task is intractable.
However, if we assume the $\epsilon_i$ are independently distributed as standard normals and $x$ comes from a standard normal prior, we can analytically derive the posterior distribution of $x$ given $y_1, \ldots, y_m$, as the full joint distribution of $(x, y_1, \ldots, y_m)$ is normal.
One notion of how much information $y_1, \ldots, y_m$ carry about $x$ is the entropy of the posterior distribution of $x$ given the measurements.
We can then choose $\theta_1, \ldots, \theta_m$ to minimize the entropy of the posterior, which is equivalent to minimizing the (log) volume of an uncertainty ellipsoid.
With this setup, the posterior entropy is (up to additive constants and a positive multiplicative factor) simply
$$-\log\det\Big(I + \sum_{i=1}^m \phi(\theta_i)\phi(\theta_i)^T\Big).$$
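This expression can be checked from Gaussian conjugacy. Writing $\Phi$ for the matrix with rows $\phi(\theta_i)^T$, and using the standard normal prior and noise assumed above, a sketch of the derivation is:

```latex
\Sigma_{\mathrm{post}}
  = \left(I + \Phi^T \Phi\right)^{-1}
  = \Big(I + \sum_{i=1}^m \phi(\theta_i)\phi(\theta_i)^T\Big)^{-1},
\qquad
H = \tfrac{1}{2}\log\det\left(2\pi e\, \Sigma_{\mathrm{post}}\right)
  = \mathrm{const} - \tfrac{1}{2}\log\det\Big(I + \sum_{i=1}^m \phi(\theta_i)\phi(\theta_i)^T\Big).
```

The additive constant and the factor $\tfrac{1}{2}$ are the quantities dropped in the text.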
To put this in our framework, we can take $\psi(\theta) = \phi(\theta)\phi(\theta)^T$, and $\ell(F) = -\log\det(I + F)$.
We relax the requirement to choose exactly $m$ measurement parameters and instead search for a sparse measure with bounded total mass, giving us a sparse inverse problem.
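A minimal sketch of evaluating this design objective numerically; the polynomial feature function $\phi$ below is an illustrative assumption, not from the text.

```python
import numpy as np

def posterior_entropy(thetas, phi):
    """-log det(I + sum_i phi(theta_i) phi(theta_i)^T), i.e. the posterior
    entropy up to the additive constant and positive factor dropped in the
    text (standard normal prior and noise assumed)."""
    feats = np.stack([phi(t) for t in thetas])     # m x n feature matrix
    M = np.eye(feats.shape[1]) + feats.T @ feats   # I + sum_i phi phi^T
    _, logdet = np.linalg.slogdet(M)               # numerically stable log det
    return -logdet

# Illustrative feature function (an assumption for the demo):
phi = lambda t: np.array([1.0, t, t**2])

# Spread-out design points are more informative, i.e. give lower entropy:
tight = posterior_entropy([0.1, 0.11, 0.12], phi)
spread = posterior_entropy([-1.0, 0.0, 1.0], phi)
assert spread < tight
```

Minimizing this quantity over the $\theta_i$ is exactly the design problem the relaxation above makes tractable.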
design of numerical quadrature rules
In many numerical computing applications we require fast procedures to approximate integration against a fixed measure $p$. One way to do this is to use a quadrature rule:
$$\int f(x)\, dp(x) \approx \sum_{i=1}^m w_i f(x_i).$$
The quadrature rule, given by the weights $w_i$ and the points $x_i$, is chosen so that the above approximation holds for functions $f$ in a certain function class. The pairs $(w_i, x_i)$ are known as quadrature nodes. In practice, we want quadrature rules with very few nodes to speed evaluation of the rule.
Often we don't have an a priori description of the function class from which $f$ is chosen, but we might have a finite number of examples of functions in the class, $f_1, \ldots, f_n$, along with their integrals against $p$, $y_1, \ldots, y_n$. In other words, we know that
$$y_j = \int f_j(x)\, dp(x), \qquad j = 1, \ldots, n.$$
A reasonable quadrature rule should approximate the integrals of the known $f_j$ well.
We can phrase this task as a sparse inverse problem where each source is a single quadrature node. In our notation, $\psi(\theta) = (f_1(\theta), \ldots, f_n(\theta))$. Assuming each function $f_j$ is differentiable, $\psi$ is differentiable. A common choice of $\ell$ for this application is simply the squared loss.
Note that in this application there is no need to constrain the weights to be positive.
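As an illustration of the squared-loss fit, the sketch below fixes candidate nodes and solves for the weights by least squares. The example functions $f_j$ and the measure (the uniform measure on $[0, 1]$, whose moments are $1/(k+1)$) are assumptions for the demo, not from the text.

```python
import numpy as np

# Example functions f_j and their integrals y_j against p = Uniform[0, 1].
fs = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2, lambda x: x**3]
ys = np.array([1.0, 1 / 2, 1 / 3, 1 / 4])

# Fix candidate nodes and fit the weights w by least squares,
# matching psi(theta) = (f_1(theta), ..., f_n(theta)).
nodes = np.array([0.2, 0.5, 0.8])
Psi = np.stack([f(nodes) for f in fs])      # n_funcs x n_nodes
w, *_ = np.linalg.lstsq(Psi, ys, rcond=None)

approx = Psi @ w                            # the rule applied to each f_j
residual = np.max(np.abs(approx - ys))      # near zero for this symmetric setup
```

Here only the weights are optimized; the full sparse inverse problem also moves the nodes, which is where the differentiability of $\psi$ is used.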
fitting mixture models
Given a parametric family of distributions $p(\cdot \mid \theta)$, we consider the task of recovering the components of a mixture model from i.i.d. samples.
To be more precise, we are given data $x_1, \ldots, x_n$ sampled i.i.d. from a distribution of the form
$$\int p(\cdot \mid \theta)\, d\mu(\theta).$$
The task is to recover the mixing distribution $\mu$.
If we assume is sparse, we can phrase this as a sparse inverse problem.
To do so, we choose $\psi(\theta) = (p(x_1 \mid \theta), \ldots, p(x_n \mid \theta))$.
A common choice for $\ell$ is the (negative) log-likelihood of the data: i.e., $\ell(v) = -\sum_{i=1}^n \log v_i$.
The obvious constraint is that $\mu$ be a probability measure: the weights must be nonnegative and sum to one.
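A minimal sketch of this negative log-likelihood for a one-dimensional Gaussian family $p(\cdot \mid \theta) = \mathcal{N}(\theta, 1)$; the family, the data, and the candidate components are illustrative assumptions, not from the text.

```python
import numpy as np

def neg_log_likelihood(weights, mus, data, sigma=1.0):
    """ell applied to the forward map: -sum_i log sum_k w_k p(x_i | theta_k),
    for the Gaussian family p(. | theta) = N(theta, sigma^2)."""
    data = np.asarray(data)[:, None]   # n x 1
    mus = np.asarray(mus)[None, :]     # 1 x K
    dens = np.exp(-0.5 * ((data - mus) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    mix = dens @ np.asarray(weights)   # i-th entry: mixture density at x_i
    return -np.sum(np.log(mix))

rng = np.random.default_rng(0)
# Data from an equal mixture of N(-2, 1) and N(2, 1):
data = np.concatenate([rng.normal(-2, 1, 200), rng.normal(2, 1, 200)])

# The true components fit better (lower NLL) than a single centered component;
# note both candidate weight vectors are probability weights, as required.
nll_true = neg_log_likelihood([0.5, 0.5], [-2.0, 2.0], data)
nll_wrong = neg_log_likelihood([1.0], [0.0], data)
assert nll_true < nll_wrong
```

The sparse inverse problem searches over the component parameters $\theta_k$ and weights jointly, rather than fixing them as this demo does.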
