2011 COLT COLT 2011

Competitive Closeness Testing

Abstract

We test whether two sequences are generated by the same distribution or by two different ones. Unlike previous work, we make no assumptions on the distributions’ support size. Additionally, we compare our performance to that of the best possible test. We describe an efficiently-computable algorithm based on pattern maximum likelihood that is near optimal whenever the best possible error probability is $\le\exp(-14n^{2/3})$ using length-$n$ sequences.

🚀 Conference Pioneer — COLT 2011
🧭 Keyword Pioneer — error probability
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Machine Learning, Mathematics & Optimization, Natural Language Processing
🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization
📈 Trend Setter — Statistics
🐣 Hot Topic Early Bird — information theory