Variance-Aware Estimation of Kernel Mean Embedding

Geoffrey Wolfer; Pierre Alquier

JMLR 2025

Abstract

An important feature of kernel mean embeddings (KME) is that the rate of convergence of the empirical KME to the KME of the true distribution can be bounded independently of the dimension of the space, properties of the distribution, and smoothness features of the kernel. We show how to speed up convergence by leveraging variance information in the reproducing kernel Hilbert space. Furthermore, we show that even when such information is a priori unknown, we can efficiently estimate it from the data, recovering the desideratum of a distribution-agnostic bound that enjoys acceleration in fortuitous settings. We further extend our results from independent data to stationary mixing sequences and illustrate our methods in the context of hypothesis testing and robust parametric estimation. © JMLR 2025.
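To make the idea of variance information in the reproducing kernel Hilbert space more concrete, here is a minimal sketch rather than the authors' estimator. It assumes the variance term is v = E[k(X, X)] - ||mu||^2, where mu is the kernel mean embedding, and estimates it from the Gram matrix by replacing ||mu||^2 with the average of the off-diagonal kernel values. The Gaussian kernel, the bandwidth, the function names, and the synthetic samples are all illustrative assumptions.

```python
import numpy as np


def gaussian_gram(x, bandwidth=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq_norms = np.sum(x * x, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    # Clamp tiny negative values caused by floating-point round-off.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def empirical_rkhs_variance(gram):
    """Estimate v = E[k(X, X)] - ||mu||^2 from an n x n Gram matrix.

    Assumption (not taken from the paper): ||mu||^2 is estimated by the
    mean of the off-diagonal entries, which is unbiased for i.i.d. data.
    """
    n = gram.shape[0]
    diag_mean = np.trace(gram) / n
    off_diag_mean = (gram.sum() - np.trace(gram)) / (n * (n - 1))
    return diag_mean - off_diag_mean


rng = np.random.default_rng(0)
spread = rng.normal(scale=3.0, size=(200, 2))   # widely dispersed sample
tight = rng.normal(scale=0.05, size=(200, 2))   # highly concentrated sample

v_spread = empirical_rkhs_variance(gaussian_gram(spread))
v_tight = empirical_rkhs_variance(gaussian_gram(tight))
print(f"spread sample: {v_spread:.4f}, tight sample: {v_tight:.4f}")
```

With a bounded kernel such as the Gaussian one, this quantity lies between 0 and 1. A concentrated sample yields a value near 0, which is the kind of fortuitous setting in which a variance-aware bound can improve on a worst-case one.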

Authors

Geoffrey Wolfer, Pierre Alquier

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Optimization & Theory > Stochastic Processes
Mathematics & Optimization > Mathematics > Probability
Machine Learning > Optimization & Theory > Statistics
Machine Learning > Core Methods > Kernel Methods

Keywords

hypothesis testing, reproducing kernel Hilbert space, variance estimation, kernel mean embedding, stationary mixing sequence

Related papers

On the Natural Gradient of the Evidence Lower Bound 2025

Four Axiomatic Characterizations of the Integrated Gradients Attribution Method 2025

Extending Temperature Scaling with Homogenizing Maps 2025

Ontolearn---A Framework for Large-scale OWL Class Expression Learning in Python 2025

An Axiomatic Definition of Hierarchical Clustering 2025