Theory of matching pursuit

Zakria Hussain; John S. Shawe-taylor

2008 NIPS NeurIPS 2008

Theory of matching pursuit

Abstract

We analyse matching pursuit for kernel principal components analysis by proving that the sparse subspace it produces is a sample compression scheme. We show that this bound is tighter than the KPCA bound of Shawe-Taylor et al swck-05 and highly predictive of the size of the subspace needed to capture most of the variance in the data. We analyse a second matching pursuit algorithm called kernel matching pursuit (KMP) which does not correspond to a sample compression scheme. However, we give a novel bound that views the choice of subspace of the KMP algorithm as a compression scheme and hence provide a VC bound to upper bound its future loss. Finally we describe how the same bound can be applied to other matching pursuit related algorithms.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — sample compression

🐝 Cross-Pollinator — Artificial Intelligence, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Mathematics & Optimization

🐣 Hot Topic Early Bird — feature learning

Authors

Zakria Hussain , John S. Shawe-taylor

Topics

Machine Learning > Core Methods > Representation Learning Machine Learning > Optimization & Theory > Learning Theory Machine Learning > Optimization & Theory > Theory Mathematics & Optimization > Mathematics > Statistics Machine Learning > Core Methods > Dimensionality Reduction Machine Learning > Core Methods > Feature Learning Machine Learning > Core Methods > Kernel Methods

Keywords

feature learning subspace learning kernel principal component analysis vc dimension kernel pca matching pursuit sample compression kernel principal components analysis sample compression scheme vc bound

Download PDF

Related papers

On the Efficient Minimization of Classification Calibrated Surrogates 2008

Hebbian Learning of Bayes Optimal Decisions 2008

Biasing Approximate Dynamic Programming with a Lower Discount Factor 2008

Counting Solution Clusters in Graph Coloring Problems Using Belief Propagation 2008

Domain Adaptation with Multiple Sources 2008