JMLR 2004

Dimensionality Reduction for Supervised Learning with Reproducing Kernel Hilbert Spaces

Abstract

We propose a novel method of dimensionality reduction for supervised learning problems. Given a regression or classification problem in which we wish to predict a response variable Y from an explanatory variable X, we treat the problem of dimensionality reduction as that of finding a low-dimensional "effective subspace" for X which retains the statistical relationship between X and Y. We show that this problem can be formulated in terms of conditional independence. To turn this formulation into an optimization problem, we establish a general nonparametric characterization of conditional independence using covariance operators on reproducing kernel Hilbert spaces. This characterization allows us to derive a contrast function for estimation of the effective subspace. Unlike many conventional methods for dimensionality reduction in supervised learning, the proposed method requires neither assumptions on the marginal distribution of X, nor a parametric model of the conditional distribution of Y. We present experiments that compare the performance of the method with conventional methods.

An erratum to this paper is available.
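To make the idea of a kernel-based contrast function concrete, the following is a minimal sketch and not the paper's own estimator. It assumes a candidate projection matrix B whose columns span the subspace, Gaussian kernels on both B^T X and Y, and a regularized trace criterion Tr[G_Y (G_Z + n eps I)^{-1}] built from centered Gram matrices; that criterion is one common way to score conditional dependence and may differ from the contrast function derived in the paper. The names `centered_gram` and `kdr_trace_contrast`, the kernel widths, and the regularization constant `eps` are all illustrative assumptions.

```python
import numpy as np


def centered_gram(Z, sigma):
    """Centered Gaussian-kernel Gram matrix of the rows of Z."""
    Z = np.asarray(Z, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]  # treat a 1-D array as n samples of a scalar
    sq = np.sum(Z**2, axis=1)
    # Pairwise squared distances, clipped at zero to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma**2))
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H


def kdr_trace_contrast(B, X, Y, sigma_z=1.0, sigma_y=1.0, eps=1e-3):
    """Assumed trace-form contrast; smaller values suggest that the
    projection X @ B retains more of the dependence between X and Y."""
    n = X.shape[0]
    G_z = centered_gram(X @ B, sigma_z)  # Gram matrix of projected inputs
    G_y = centered_gram(Y, sigma_y)      # Gram matrix of responses
    # Tr[G_y (G_z + n*eps*I)^{-1}] computed with a linear solve.
    return np.trace(np.linalg.solve(G_z + n * eps * np.eye(n), G_y))
```

In use, one would evaluate this score for candidate matrices B with orthonormal columns and search for a minimizer, for example by gradient descent over such matrices. The optimization scheme and the choice of kernel widths are left open here.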