Robust Sparse Regression with Non-Isotropic Designs
Abstract
We develop a technique to design efficiently computable estimators for sparse linear regression in the simultaneous presence of two adversaries: oblivious and adaptive.Consider the model $y^*=X^*\beta^*+ \eta$ where $X^*$ is an $n\times d$ random design matrix, $\beta^*\in \mathbb{R}^d$ is a $k$-sparse vector, and the noise $\eta$ is independent of $X^*$ and chosen by the \emph{oblivious adversary}. Apart from the independence of $X^*$, we only require a small fraction entries of $\eta$ to have magnitude at most $1$. The \emph{adaptive adversary} is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples $(X_1^*, y_1^*),\ldots, (X_n^*, y_n^*)$.Given the $\varepsilon$-corrupted samples $(X_1, y_1),\ldots, (X_n, y_n)$, the goal is to estimate $\beta^*$. We assume that the rows of $X^*$ are iid samples from some $d$-dimensional distribution $\mathcal{D}$ with zero mean and (unknown) covariance matrix $\Sigma$ with bounded condition number.We design several robust algorithms that outperform the state of the art even in the special case of Gaussian noise $\eta \sim N(0,1)^n$. In particular, we provide a polynomial-time algorithm that with high probability recovers $\beta^*$ up to error $O(\sqrt{\varepsilon})$ as long as $n \ge \tilde{O}(k^2/\varepsilon)$, only assuming some bounds on the third and the fourth moments of $\mathcal{D}$. In addition, prior to this work, even in the special case of Gaussian design $\mathcal{D} = N(0,\Sigma)$ and noise $\eta \sim N(0,1)$, no polynomial time algorithm was known to achieve error $o(\sqrt{\varepsilon})$ in the sparse setting $n < d^2$. We show that under some assumptions on the fourth and the eighth moments of $\mathcal{D}$, there is a polynomial-time algorithm that achieves error $o(\sqrt{\varepsilon})$ as long as $n \ge \tilde{O}(k^4 / \varepsilon^3)$. For Gaussian distribution $\mathcal{D} = N(0,\Sigma)$, this algorithm achieves error $O(\varepsilon^{3/4})$. Moreover, our algorithm achieves error $o(\sqrt{\vare