ED Mathématiques et Informatique
Fast non-convex algorithms for off-the-grid sparse signal recovery
by Pierre-Jean BENARD (IMB - Institut de Mathématiques de Bordeaux)
The defense will take place at 10:00 a.m. in the conference room of the Institut de Mathématiques de Bordeaux (IMB), UMR 5251, Bâtiment A33, Université de Bordeaux, 351 cours de la Libération, F-33405 Talence,
in front of a jury composed of:
- Yann TRAONMILIN - Research Scientist - Institut de mathématiques de Bordeaux, IMB - Thesis supervisor
- Pierre WEISS - Senior Researcher - Institut de Recherche en Informatique de Toulouse - Reviewer
- Jérôme IDIER - Senior Researcher - Laboratoire des Sciences du Numérique de Nantes - Reviewer
- Jean-François AUJOL - Professor - Institut de mathématiques de Bordeaux, IMB - Thesis co-supervisor
- Caroline CHAUX - Senior Researcher - IPAL, CNRS, International Research Lab - Examiner
- Baudouin DENIS DE SENNEVILLE - Senior Researcher - Institut de mathématiques de Bordeaux, IMB - Examiner
- Irène WALDSPURGER - Research Scientist - CEREMADE, Université Paris-Dauphine - Examiner
The recovery of sparse signals in infinite dimension is a classic inverse problem with many applications. Let $x_{0}$ be the unknown signal to recover, modeled as a sum of $K$ Dirac measures on $\mathbb{R}^{d}$. Under this model, the only information required to recover $x_{0}$ is its associated amplitudes $a$ and positions $t$. The observation $y$ corresponds to the acquisition of $m$ linear measurements; in the noise-free case it can be modeled as $y = A x_{0}$, with $A$ a linear acquisition operator. One way of recovering $x_{0}$ from $y$ and $A$ is to find a minimizer of a non-convex least-squares problem:

$(\mathcal{P}) \quad x^{*} \in \operatorname{argmin}_{x \in \Sigma_{K, \epsilon}} \| y - Ax \|_{2}^{2}$

where $\Sigma_{K, \epsilon}$ is the set of signals made of $K$ spikes with a separation constraint between spikes. Theoretical recovery guarantees have been given for acquisition operators $A$ satisfying a restricted isometry property (RIP). This and other theoretical results partly explain the success of a greedy method: Continuous Orthogonal Matching Pursuit (COMP) with descent. However, this method remains costly due to its repeated descents on all parameters.

We therefore introduce the Over-Parametrized COMP algorithm with projected gradient descent (OP-COMP + PGD) to overcome this problem. Our method consists of two distinct parts. The first part (OP-COMP) is a modification of COMP with descent in which we remove the repeated descents on all parameters; over-parametrizing the signal compensates for the missing descents. In the second part, PGD performs a single descent on all parameters. It also applies a projection that merges spikes that are too close to each other, which ensures that the estimated signal remains within the solution set $\Sigma_{K, \epsilon}$.
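The least-squares objective of $(\mathcal{P})$ and the merging projection used by PGD can be sketched as follows for a 1-D spike train. The Fourier-type choice of $A$, the function names, and the merging rule (amplitude-weighted averaging of positions) are illustrative assumptions, not the thesis' exact implementation.

```python
import numpy as np

def forward(a, t, freqs):
    """Apply a Fourier-type acquisition operator A to a spike train:
    (A x)(f) = sum_k a_k * exp(-2i*pi*f*t_k).  Illustrative choice of A."""
    return np.exp(-2j * np.pi * np.outer(freqs, t)) @ a

def objective(a, t, y, freqs):
    """Non-convex least-squares objective || y - A x ||_2^2."""
    r = y - forward(a, t, freqs)
    return np.real(np.vdot(r, r))

def merge_close_spikes(a, t, eps):
    """Projection step: merge spikes whose positions are closer than the
    separation eps.  Amplitudes are summed; the merged position is an
    amplitude-weighted average (an illustrative merging rule)."""
    if len(a) == 0:
        return a, t
    order = np.argsort(t)
    a, t = a[order], t[order]
    out_a, out_t = [a[0]], [t[0]]
    for ak, tk in zip(a[1:], t[1:]):
        if tk - out_t[-1] < eps:
            s = out_a[-1] + ak
            if s != 0:
                out_t[-1] = (out_a[-1] * out_t[-1] + ak * tk) / s
            else:
                out_t[-1] = 0.5 * (out_t[-1] + tk)
            out_a[-1] = s
        else:
            out_a.append(ak)
            out_t.append(tk)
    return np.array(out_a), np.array(out_t)
```

At the true parameters the residual vanishes, so the objective is zero; the merging projection keeps the iterate inside the separated set $\Sigma_{K, \epsilon}$ after each gradient step.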
We accompany this new algorithm with a theoretical study of OP-COMP showing that, for a linear operator $A$ satisfying a RIP, the first $K$ iterations of OP-COMP initialize spikes close to those of $x_{0}$. Combined with results on the size of the basins of attraction of $(\mathcal{P})$, this ensures that, with a good initialization, the estimated signal lies in the basin of attraction of the global minimum, so OP-COMP + PGD recovers $x_{0}$ exactly. Experimentally, we compare OP-COMP + PGD to COMP with descent on several configurations. In summary, the reconstruction quality of OP-COMP + PGD is identical to that of COMP with descent, and for $K \geq 15$ OP-COMP becomes less costly, with a total computation time on the order of $\mathcal{O}(K)$ compared to $\mathcal{O}(K^{2})$ for COMP with descent.

In addition to OP-COMP + PGD, we develop two other acceleration techniques. First, block coordinate descent takes advantage of the sparsity of our model to reduce the computational cost of the projected gradient descent: it replaces PGD by a descent performed only on the spikes that have not yet converged. Compared with PGD, we observe a $35\%$ improvement in total computation time with no loss of accuracy. Second, sketching the linear operator compresses the observation $y$ to reduce the total number of measurements and thus speed up computation. By sketching with the Fourier transform, we solve an approximation of $(\mathcal{P})$, whose recovery quality depends on the number of compressed measurements. Compared with OP-COMP + PGD, we obtain comparable accuracy and an $80\%$ speed-up in total computation time with $15\%$ sampling.
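Both acceleration ideas can be sketched in a few lines. The snippets below are illustrative only: the function names, the convergence test on per-spike gradient norms, and the random subsampling used as a stand-in for the thesis' Fourier sketching operator are all our assumptions.

```python
import numpy as np

def bcd_step(a, t, grad_a, grad_t, step, active):
    """Block coordinate descent update: only spikes flagged `active`
    (not yet converged) are moved; converged spikes stay fixed."""
    a, t = a.copy(), t.copy()
    a[active] -= step * grad_a[active]
    t[active] -= step * grad_t[active]
    return a, t

def still_active(grad_a, grad_t, tol):
    """Illustrative convergence test: a spike's block is considered
    converged once the norm of its gradient block falls below `tol`."""
    return np.hypot(grad_a, grad_t) > tol

def sketch_measurements(y, freqs, rate, rng):
    """Keep a random fraction `rate` of the m measurements -- a simple
    stand-in for Fourier sketching.  The reduced pair (y_s, freqs_s)
    defines a cheaper approximation of (P) whose recovery quality
    depends on `rate` (e.g. 15% sampling)."""
    keep = rng.choice(len(y), size=max(1, int(rate * len(y))), replace=False)
    return y[keep], freqs[keep]
```

Masking the gradient this way skips the cost of updating spikes whose blocks have already converged, while the sketched problem shrinks every objective and gradient evaluation from $m$ to $0.15\,m$ measurements.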