Mixability is Bayes Risk Curvature Relative to Log Loss

Tim van Erven; Mark D. Reid; Robert C. Williamson

COLT 2011

Abstract

Mixability of a loss governs the best possible performance when aggregating expert predictions with respect to that loss. The determination of the mixability constant for binary losses is straightforward but opaque. In the binary case we make this transparent and simpler by characterising mixability in terms of the second derivative of the Bayes risk of proper losses. We then extend this result to multiclass proper losses, where there are few existing results. We show that mixability is governed by the Hessian of the Bayes risk, relative to the Hessian of the Bayes risk for log loss. We conclude by comparing our result to other work that bounds prediction performance in terms of the geometry of the Bayes risk. Although all calculations are for proper losses, we also show how to carry the results across to improper losses.
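As a rough illustration of the binary characterisation, the sketch below assumes the mixability constant is the infimum over p in (0, 1) of the ratio between the second derivative of the log-loss Bayes risk and that of the loss under study. The abstract does not state this formula, so the exact form, the function names, and the normalisation of square loss as (y - q)^2 are assumptions made for the example only.

```python
def log_bayes_risk_dd(p):
    """Second derivative of the log-loss Bayes risk -p ln p - (1-p) ln(1-p)."""
    return -1.0 / (p * (1.0 - p))


def square_bayes_risk_dd(p):
    """Second derivative of the square-loss Bayes risk p(1-p).

    Assumes square loss is normalised as (y - q)^2 with y in {0, 1}.
    """
    return -2.0


def curvature_ratio_min(bayes_risk_dd, grid_size=1001):
    """Approximate inf over p of L_log''(p) / L''(p) on an interior grid.

    Assumed form of the binary mixability constant; the grid only
    approximates the infimum and excludes the endpoints 0 and 1.
    """
    ps = [i / (grid_size + 1) for i in range(1, grid_size + 1)]
    return min(log_bayes_risk_dd(p) / bayes_risk_dd(p) for p in ps)


eta_square = curvature_ratio_min(square_bayes_risk_dd)  # ratio is 1 / (2 p (1 - p))
eta_log = curvature_ratio_min(log_bayes_risk_dd)        # ratio is identically 1
print(eta_square, eta_log)
```

Under these assumptions the ratio for square loss is smallest at p = 1/2, giving a constant of 2, while log loss compared with itself gives 1. Both values agree with the commonly quoted mixability constants for these two losses.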

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Bayesian Inference
Machine Learning > Optimization & Theory > Loss Functions
Machine Learning > Optimization & Theory > Statistical Learning

Keywords

multiclass classification, log loss, proper loss, Bayes risk, multiclass loss
