2002 JMLR JMLR 2002

Minimal Kernel Classifiers

Abstract

A finite concave minimization algorithm is proposed for constructing kernel classifiers that use a minimal number of data points both in generating and characterizing a classifier. The algorithm is theoretically justified on the basis of linear programming perturbation theory and a leave-one-out error bound as well as effective computational results on seven real world datasets. A nonlinear rectangular kernel is generated by systematically utilizing as few of the data as possible both in training and in characterizing a nonlinear separating surface. This can result in substantial reduction in kernel data-dependence (over 94% in six of the seven public datasets tested on) and with test set correctness equal to that obtained by using a conventional support vector machine classifier that depends on many more data points. This reduction in data dependence results in a much faster classifier that requires less storage. To eliminate data points, the proposed approach makes use of a novel loss function, the "pound" function () # , which is a linear combination of the 1-norm and the step function that measures both the magnitude and the presence of any error. [abs] [pdf] [ps.gz] [ps]

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
📈 Trend Setter — Optimization
🧭 Keyword Pioneer — linear programming
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio
🐣 Hot Topic Early Bird — feature selection