Hyper-Sparse Optimal Aggregation

Stephane Gaiffas; Guillaume Lecue

JMLR 2011

Abstract

Given a finite set F of functions and a learning sample, the aim of an aggregation procedure is to achieve a risk as close as possible to the risk of the best function in F. Up to now, optimal aggregation procedures have been convex combinations of all elements of F. In this paper, we prove that optimal aggregation procedures combining only two functions in F exist. Such algorithms are of particular interest when F contains many irrelevant functions that should not appear in the aggregation procedure. Since selectors are suboptimal aggregation procedures, this proves that two is the minimal number of elements of F required for the construction of an optimal aggregation procedure in every situation. Then, we perform a numerical study for the problem of selecting the regularization parameters of the Lasso and the Elastic-net estimators. We compare, on simulated examples, our aggregation algorithms to aggregation with exponential weights, to Mallows' Cp and to cross-validation selection procedures. © JMLR 2011.
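To make the two baselines named in the abstract concrete, here is a minimal sketch of aggregation with exponential weights next to a plain selector (empirical risk minimization over F). It is not the two-function procedure of the paper. The squared loss, the `temperature` parameter, and the function names `exponential_weights` and `erm_selector` are assumptions made for illustration only.

```python
import numpy as np


def exponential_weights(preds, y, temperature=1.0):
    """Return (weights, risks) for M candidate functions.

    preds: array of shape (M, n), predictions of each candidate on n samples.
    y: array of shape (n,), observed responses.
    temperature: hypothetical tuning parameter; larger values flatten the weights.
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    # Empirical squared risk of each candidate (squared loss is an assumption).
    risks = np.mean((preds - y) ** 2, axis=1)
    # Weights proportional to exp(-n * risk / temperature).
    # Subtracting the minimum risk avoids underflow and leaves the weights unchanged.
    logits = -n * (risks - risks.min()) / temperature
    weights = np.exp(logits)
    weights /= weights.sum()
    return weights, risks


def erm_selector(risks):
    """Selector baseline: index of the candidate with the smallest empirical risk."""
    return int(np.argmin(risks))


if __name__ == "__main__":
    y = np.array([1.0, 2.0, 3.0])
    preds = np.array([
        [1.0, 2.0, 3.0],   # fits the sample exactly
        [0.0, 0.0, 0.0],   # irrelevant candidate
        [1.0, 2.0, 4.0],   # close, but not exact
    ])
    weights, risks = exponential_weights(preds, y)
    aggregate = weights @ preds  # convex combination of all candidates
    print("weights:", weights)
    print("selected index:", erm_selector(risks))
    print("aggregate predictions:", aggregate)
```

In this sketch every candidate receives a strictly positive weight, including the irrelevant one, which is the behavior the abstract contrasts with aggregates that combine only two functions of F.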

Topics

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Optimization & Theory > Statistical Learning

Keywords

model selection, lasso regression, regularization parameter, exponential weight, optimal aggregation

Related papers

MSVMpack: A Multi-Class Support Vector Machine Package 2011

Multitask Sparsity via Maximum Entropy Discrimination 2011

Training SVMs Without Offset 2011

Logistic Stick-Breaking Process 2011

Learning Multi-modal Similarity 2011