A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification

Guo-Xun Yuan; Kai-Wei Chang; Cho-Jui Hsieh; Chih-Jen Lin

JMLR, 2010

Abstract

Large-scale linear classification is widely used in many areas. The L1-regularized form can be applied for feature selection; however, its non-differentiability causes more difficulties in training. Although various optimization methods have been proposed in recent years, these have not yet been compared suitably. In this paper, we first broadly review existing methods. Then, we discuss state-of-the-art software packages in detail and propose two efficient implementations. Extensive comparisons indicate that carefully implemented coordinate descent methods are very suitable for training large document data.
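To make the abstract's two technical points concrete (the non-differentiable L1 term and coordinate descent), the following is a minimal sketch, not the implementations proposed in the paper. It assumes the objective ||w||_1 + C * sum_i log(1 + exp(-y_i w^T x_i)) and updates one weight at a time with a soft-thresholding step. In place of a Newton step with line search, it uses the fixed curvature bound C/4 * sum_i x_ij^2, a simplification chosen here so that every update provably does not increase the objective. The function names, the dense list-of-lists data layout, and the parameter defaults are all illustrative assumptions.

```python
import math


def soft_threshold(z, t):
    """Shrink z toward zero by t; returns exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log1p_exp(z):
    """Numerically stable log(1 + exp(z))."""
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def objective(w, X, y, C):
    """||w||_1 + C * sum_i log(1 + exp(-y_i * w.x_i)), labels in {-1, +1}."""
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = sum(wj * xj for wj, xj in zip(w, xi))
        loss += log1p_exp(-yi * margin)
    return sum(abs(wj) for wj in w) + C * loss


def cd_l1_logreg(X, y, C=1.0, n_sweeps=100):
    """Cyclic coordinate descent with a fixed curvature bound per coordinate."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    margins = [0.0] * n  # cached w.x_i, updated incrementally
    # The logistic loss has second derivative at most 1/4, so H[j] bounds
    # the curvature of the smooth part along coordinate j.
    H = [0.25 * C * sum(X[i][j] ** 2 for i in range(n)) for j in range(d)]
    for _ in range(n_sweeps):
        for j in range(d):
            if H[j] == 0.0:
                continue  # feature is identically zero; weight stays 0
            # Partial derivative of the smooth loss term with respect to w_j.
            g = 0.0
            for i in range(n):
                g -= C * y[i] * X[i][j] * sigmoid(-y[i] * margins[i])
            # Minimise the quadratic upper bound plus |w_j| in closed form.
            new_wj = soft_threshold(w[j] - g / H[j], 1.0 / H[j])
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    margins[i] += delta * X[i][j]
                w[j] = new_wj
    return w
```

The soft-thresholding step is what handles the non-differentiability: a weight whose loss gradient at zero has magnitude at most 1 stays exactly zero, which is the feature-selection effect the abstract refers to. Caching the margins keeps each coordinate update linear in the number of examples.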

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Optimization

Keywords

feature selection, L1 regularization, large-scale classification, document classification, coordinate descent, linear classification
