Learning curves for multi-task Gaussian process regression

Peter Sollich; Simon Ashton

NeurIPS (NIPS) 2012

Abstract

We study the average-case performance of multi-task Gaussian process (GP) regression as captured in the learning curve, i.e. the average Bayes error for a chosen task versus the total number of examples $n$ for all tasks. For GP covariances that are the product of an input-dependent covariance function and a free-form inter-task covariance matrix, we show that accurate approximations for the learning curve can be obtained for an arbitrary number of tasks $T$. We use these to study the asymptotic learning behaviour for large $n$. Surprisingly, multi-task learning can be asymptotically essentially useless: examples from other tasks only help when the degree of inter-task correlation, $\rho$, is near its maximal value $\rho=1$. This effect is most extreme for learning of smooth target functions, as described by e.g. squared exponential kernels. We also demonstrate that when learning many tasks, the learning curves separate into an initial phase, where the Bayes error on each task is reduced down to a plateau value by "collective learning" even though most tasks have not seen examples, and a final decay that occurs only once the number of examples is proportional to the number of tasks.
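The product covariance described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' approximation method: it computes the exact posterior variance of one task (its Bayes error) for a given training set. The function names, the 1-D inputs on [0, 1], the noise level, the length scale, and the equicorrelated inter-task matrix (unit variances, every off-diagonal entry equal to $\rho$) are all assumptions chosen for illustration.

```python
import numpy as np


def se_kernel(xa, xb, length_scale=0.3):
    """Squared exponential kernel C(x, x') on 1-D inputs (assumed form)."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def task_matrix(num_tasks, rho):
    """Hypothetical inter-task covariance D: unit variances, correlation rho."""
    return (1.0 - rho) * np.eye(num_tasks) + rho * np.ones((num_tasks, num_tasks))


def bayes_error_task0(x_train, tasks, rho, num_tasks=2, noise_var=0.05,
                      x_test=None):
    """Posterior variance of task 0, averaged over a grid of test inputs."""
    if x_test is None:
        x_test = np.linspace(0.0, 1.0, 200)
    D = task_matrix(num_tasks, rho)
    # Product covariance: K[(x, t), (x', t')] = D[t, t'] * C(x, x')
    K = D[np.ix_(tasks, tasks)] * se_kernel(x_train, x_train)
    K = K + noise_var * np.eye(len(x_train))
    # Covariance between each training example and task 0 at the test inputs
    k_star = D[tasks, 0][:, None] * se_kernel(x_train, x_test)
    prior_var = D[0, 0] * np.ones(len(x_test))
    post_var = prior_var - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return float(post_var.mean())


def learning_curve(rho, n_values, num_tasks=2, repeats=20, seed=0):
    """Monte Carlo estimate of the task-0 Bayes error versus total n."""
    rng = np.random.default_rng(seed)
    curve = []
    for n in n_values:
        errs = []
        for _ in range(repeats):
            x = rng.uniform(0.0, 1.0, size=n)
            # Each example is assigned to a task uniformly at random
            t = rng.integers(0, num_tasks, size=n)
            errs.append(bayes_error_task0(x, t, rho, num_tasks))
        curve.append(np.mean(errs))
    return np.array(curve)
```

Calling `learning_curve(rho, [4, 8, 16, 32])` for several values of `rho` gives a rough empirical counterpart to the learning curves discussed above. Averaging over random training sets is only a simulation stand-in for the analytical approximations the paper derives.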

Topics

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Learning

Artificial Intelligence > Learning Paradigms > Transfer Learning

Machine Learning > Core Methods > Regression

Machine Learning > Optimization & Theory > Learning Theory

Machine Learning > Learning Types > Multi-Task Learning

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Inference

Machine Learning > Learning Paradigms > Multi-Task Learning

Machine Learning > Bayesian & Probabilistic > Gaussian Processes

Keywords

multi-task learning, bayesian inference, gaussian process, gaussian process regression, bayesian regression, bayesian optimization, inter-task correlation, learning curve, kernel methods
