How Good Are Inducing Points for Dataset Distillation? (Student Abstract)

Shrutimoy Das

AAAI 2026

Abstract

Dataset distillation methods learn a representative summary of the full dataset so that training on the distilled data is more efficient in both time and space. Current state-of-the-art methods exploit the correspondence between infinitely wide neural networks (NNs) and kernel ridge regression to produce high-quality summaries of the data. In this work, we leverage the correspondence between infinitely wide networks and Gaussian processes (GPs) to learn a distilled dataset. We investigate the feasibility of using the inducing points method for GPs as a dataset distillation method. While most existing dataset distillation methods are based on loss or gradient matching, our method works through function-space approximation, which the NN-GP correspondence makes possible. Additionally, using recent theoretical results on GP regression and neural tangent kernels (NTKs), we provide an upper bound on the size of the distilled data. We empirically demonstrate the utility of inducing points as distilled data on a set of datasets.
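To make the idea of inducing points as a distilled dataset concrete, the following is a minimal sketch and not the paper's method. It assumes an RBF kernel as a stand-in for an NN-derived kernel (NNGP or NTK), places the inducing inputs `z` on a fixed grid instead of learning them, and uses the standard subset-of-regressors approximation for the GP predictive mean. The function names, toy data, and hyperparameters are all hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel (assumed stand-in for an NNGP/NTK kernel)."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def full_gp_mean(x_train, y_train, x_test, noise=1e-2):
    """Exact GP predictive mean; solves an n x n system over all training points."""
    k_nn = rbf_kernel(x_train, x_train)
    k_sn = rbf_kernel(x_test, x_train)
    return k_sn @ np.linalg.solve(k_nn + noise * np.eye(len(x_train)), y_train)


def inducing_point_mean(x_train, y_train, z, x_test, noise=1e-2):
    """Subset-of-regressors predictive mean; solves only an m x m system."""
    k_mm = rbf_kernel(z, z)
    k_mn = rbf_kernel(z, x_train)
    k_sm = rbf_kernel(x_test, z)
    system = noise * k_mm + k_mn @ k_mn.T
    system += 1e-8 * np.eye(len(z))  # small jitter for numerical stability
    weights = np.linalg.solve(system, k_mn @ y_train)
    return k_sm @ weights


# Toy 1-D regression problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(500, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(500)

z = np.linspace(-3.0, 3.0, 20)[:, None]  # 20 inducing inputs summarise 500 points
x_test = np.linspace(-2.5, 2.5, 50)[:, None]

mean_full = full_gp_mean(x_train, y_train, x_test)
mean_sparse = inducing_point_mean(x_train, y_train, z, x_test)
max_gap = float(np.max(np.abs(mean_full - mean_sparse)))
print("max |full - sparse| on test grid:", max_gap)
```

In this sketch the 20 inducing inputs, together with the solved `weights`, play the role of the distilled data: predictions depend on the 500 training points only through an m x m system. A distillation method would additionally optimise the locations `z`, which this example deliberately leaves fixed.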

Topics

Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling
Deep Learning > Models > Generative Models

Keywords

neural tangent kernel, Gaussian process, kernel ridge regression, dataset distillation, inducing point, infinite-width network
