2003 JMLR JMLR 2003

Concentration Inequalities for the Missing Mass and for Histogram Rule Error

Abstract

This paper gives distribution-free concentration inequalities for the missing mass and the error rate of histogram rules. Negative association methods can be used to reduce these concentration problems to concentration questions about independent sums. Although the sums are independent, they are highly heterogeneous. Such highly heterogeneous independent sums cannot be analyzed using standard concentration inequalities such as Hoeffding's inequality, the Angluin-Valiant bound, Bernstein's inequality, Bennett's inequality, or McDiarmid's theorem. The concentration inequality for histogram rule error is motivated by the desire to construct a new class of bounds on the generalization error of decision trees. [abs] [ pdf ][ ps.gz ][ ps ]

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
📈 Trend Setter — Statistical Learning
🧭 Keyword Pioneer — rademacher complexity
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Mathematics & Optimization, Reinforcement Learning