2015 ICML ICML 2015

Faster cover trees

Abstract

The cover tree data structure speeds up exact nearest neighbor queries over arbitrary metric spaces. This paper makes cover trees even faster. In particular, we provide (1) a simpler definition of the cover tree that reduces the number of nodes from O(n) to exactly n, (2) an additional invariant that makes queries faster in practice, (3) algorithms for constructing and querying the tree in parallel on multiprocessor systems, and (4) a more cache efficient memory layout. On standard benchmark datasets, we reduce the number of distance computations by 10–50%. On a large-scale bioinformatics dataset, we reduce the number of distance computations by 71%. On a large-scale image dataset, our parallel algorithm with 16 cores reduces tree construction time from 3.5 hours to 12 minutes.

🧭 Keyword Pioneer — cache efficiency
🐣 Hot Topic Early Bird — nearest neighbor
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio