On the Linear Convergence of the Proximal Gradient Method for Trace Norm Regularization

Ke Hou; Zirui Zhou; Anthony Man-Cho So; Zhi-Quan Luo

NIPS (NeurIPS) 2013

Abstract

Motivated by various applications in machine learning, the problem of minimizing a smooth convex loss function with trace norm regularization has received much attention lately. Currently, a popular method for solving such problems is the proximal gradient method (PGM), which is known to have a sublinear rate of convergence. In this paper, we show that for a large class of loss functions, the convergence rate of the PGM is in fact linear. Our result is established without any strong convexity assumption on the loss function. A key ingredient in our proof is a new Lipschitzian error bound for the aforementioned trace norm-regularized problem, which may be of independent interest.
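To make the method named in the abstract concrete, here is a minimal sketch of a proximal gradient iteration for a trace norm-regularized problem. It is an illustration under stated assumptions, not the authors' implementation: the loss is assumed to be the matrix-completion squared loss 0.5 * ||mask * (X - M)||_F^2, whose gradient is 1-Lipschitz, so a step size of 1 is used. The names `svt`, `objective`, `proximal_gradient`, `mask`, and `lam` are hypothetical and do not come from the paper.

```python
import numpy as np


def svt(Z, tau):
    """Soft-threshold the singular values of Z by tau.

    This is the proximal operator of tau * (trace norm) evaluated at Z.
    """
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def objective(X, M, mask, lam):
    """Assumed objective: squared loss on observed entries plus lam * trace norm."""
    resid = mask * (X - M)
    return 0.5 * np.sum(resid ** 2) + lam * np.linalg.norm(X, ord="nuc")


def proximal_gradient(M, mask, lam, step=1.0, iters=200):
    """Run the PGM from X = 0 and record the objective after each iteration."""
    X = np.zeros_like(M)
    history = []
    for _ in range(iters):
        grad = mask * (X - M)                  # gradient of the assumed smooth loss
        X = svt(X - step * grad, step * lam)   # gradient step, then proximal step
        history.append(objective(X, M, mask, lam))
    return X, history


# Illustrative data (an assumption): a random rank-2 matrix with about 60% of entries observed.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
mask = (rng.random(M.shape) < 0.6).astype(float)

X_hat, history = proximal_gradient(M, mask, lam=0.1)
```

With a step size no larger than the reciprocal of the gradient's Lipschitz constant, the recorded objective values are non-increasing. Plotting the gap between `history` and its final value on a log scale is one informal way to observe the kind of convergence behavior the abstract describes, although such a plot is only suggestive and proves nothing.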

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization
Machine Learning > Optimization & Theory > Optimization
Deep Learning > Optimization & Theory > Optimization
Mathematics & Optimization > Optimization > Convex Optimization

Keywords

convex optimization, matrix learning, matrix completion, trace norm regularization, proximal gradient method, linear convergence, Lipschitzian error bound
