Competitive Classification and Closeness Testing

Jayadev Acharya; Hirakendu Das; Ashkan Jafarpour; Alon Orlitsky; Shengjun Pan; Ananda Suresh

COLT 2012

Abstract

We study the problems of classification and closeness testing. A classifier associates a test sequence with the one of two training sequences that was generated by the same distribution. A closeness test determines whether two sequences were generated by the same or by different distributions. For both problems, all natural algorithms are symmetric: they make the same decision under all symbol relabelings. With no assumptions on the distributions' support size or relative distance, we construct a classifier and a closeness test that require at most O(n^{3/2}) samples to attain the n-sample accuracy of the best symmetric classifier or closeness test designed with knowledge of the underlying distributions. Both algorithms run in time linear in the number of samples. Conversely, we show that for any classifier or closeness test, there are distributions that require Ω(n^{7/6}) samples to achieve the n-sample accuracy of the best symmetric algorithm that knows the underlying distributions.
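To make the notion of a symmetric algorithm concrete, the following is a minimal sketch of a closeness test whose decision is unchanged by relabeling the symbols. It is not the construction from the paper: the statistic is a generic count-based one chosen for illustration, and the names closeness_statistic, same_distribution, relabel, and the threshold parameter are all hypothetical.

```python
from collections import Counter


def closeness_statistic(seq_a, seq_b):
    """Illustrative statistic (an assumption, not the paper's algorithm).

    It depends only on the pairs of per-symbol counts, never on the symbol
    names, so it takes the same value under any relabeling of the alphabet.
    """
    counts_a = Counter(seq_a)
    counts_b = Counter(seq_b)
    terms = []
    for symbol in set(counts_a) | set(counts_b):
        x = counts_a[symbol]
        y = counts_b[symbol]
        # Every symbol in the union appears at least once, so x + y > 0.
        terms.append(((x - y) ** 2 - x - y) / (x + y))
    # Sorting makes the floating-point sum independent of symbol order.
    return sum(sorted(terms))


def same_distribution(seq_a, seq_b, threshold):
    """Declare 'same distribution' when the statistic is at most threshold.

    The threshold is a hypothetical tuning parameter left to the caller.
    """
    return closeness_statistic(seq_a, seq_b) <= threshold


def relabel(seq, mapping):
    """Apply a one-to-one symbol relabeling to a sequence."""
    return [mapping[s] for s in seq]


if __name__ == "__main__":
    a, b = "aabbc", "abccc"
    mapping = {"a": "z", "b": "y", "c": "x"}
    original = closeness_statistic(a, b)
    relabeled = closeness_statistic(relabel(a, mapping), relabel(b, mapping))
    print(original == relabeled)
```

Because the statistic sees only count pairs, any decision rule built on it is symmetric in the sense used in the abstract. A classifier could be sketched the same way by comparing the test sequence against each training sequence and picking the one with the smaller statistic.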

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Statistical Learning
Mathematics & Optimization > Mathematics > Probability
Mathematics & Optimization > Statistics
Mathematics & Optimization > Probability

Keywords

sample complexity, distribution testing, closeness testing, symmetric algorithm
