Fast and Memory Optimal Low-Rank Matrix Approximation

Se-Young Yun; marc lelarge; Alexandre Proutiere

2015 NIPS NeurIPS 2015

Fast and Memory Optimal Low-Rank Matrix Approximation

Abstract

In this paper, we revisit the problem of constructing a near-optimal rank $k$ approximation of a matrix $M\in [0,1]^{m\times n}$ under the streaming data model where the columns of $M$ are revealed sequentially. We present SLA (Streaming Low-rank Approximation), an algorithm that is asymptotically accurate, when $k s_{k+1} (M) = o(\sqrt{mn})$ where $s_{k+1}(M)$ is the $(k+1)$-th largest singular value of $M$. This means that its average mean-square error converges to 0 as $m$ and $n$ grow large (i.e., $\|\hat{M}^{(k)}-M^{(k)} \|_F^2 = o(mn)$ with high probability, where $\hat{M}^{(k)}$ and $M^{(k)}$ denote the output of SLA and the optimal rank $k$ approximation of $M$, respectively). Our algorithm makes one pass on the data if the columns of $M$ are revealed in a random order, and two passes if the columns of $M$ arrive in an arbitrary order. To reduce its memory footprint and complexity, SLA uses random sparsification, and samples each entry of $M$ with a small probability $\delta$. In turn, SLA is memory optimal as its required memory space scales as $k(m+n)$, the dimension of its output. Furthermore, SLA is computationally efficient as it runs in $O(\delta kmn)$ time (a constant number of operations is made for each observed entry of $M$), which can be as small as $O(k\log(m)^4 n)$ for an appropriate choice of $\delta$ and if $n\ge m$.

🌉 Interdisciplinary Bridge — Data Science & Analytics and Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — random sparsification

🐣 Hot Topic Early Bird — singular value decomposition

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Se-Young Yun , marc lelarge , Alexandre Proutiere

Topics

Machine Learning > Core Methods > Representation Learning Machine Learning > Application Areas > Efficient Computing Data Science & Analytics > Methods > Data Mining Mathematics & Optimization > Optimization > Stochastic Methods Machine Learning > Core Methods > Matrix Factorization Machine Learning > Optimization & Theory > Efficient Computing

Keywords

matrix factorization low-rank approximation random sampling singular value decomposition matrix completion streaming algorithm singular value low-rank matrix approximation random sparsification memory optimal

Download PDF

Related papers

Data Generation as Sequential Decision Making 2015

A Recurrent Latent Variable Model for Sequential Data 2015

Combinatorial Cascading Bandits 2015

Accelerated Mirror Descent in Continuous and Discrete Time 2015

Matrix Completion with Noisy Side Information 2015