2012 NIPS NeurIPS 2012

Entropy Estimations Using Correlated Symmetric Stable Random Projections

Abstract

Methods for efficiently estimating the Shannon entropy of data streams have important applications in learning, data mining, and network anomaly detections (e.g., the DDoS attacks). For nonnegative data streams, the method of Compressed Counting (CC) based on maximally-skewed stable random projections can provide accurate estimates of the Shannon entropy using small storage. However, CC is no longer applicable when entries of data streams can be below zero, which is a common scenario when comparing two streams. In this paper, we propose an algorithm for entropy estimation in general data streams which allow negative entries. In our method, the Shannon entropy is approximated by the finite difference of two correlated frequency moments estimated from correlated samples of symmetric stable random variables. Our experiments confirm that this method is able to substantially better approximate the Shannon entropy compared to the prior state-of-the-art.

🌉 Interdisciplinary Bridge — Data Science & Analytics and Machine Learning and Mathematics & Optimization
📈 Trend Setter — Data Mining
🧭 Keyword Pioneer — shannon entropy estimation
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Machine Learning, Mathematics & Optimization, Reinforcement Learning, Security & Privacy
🐣 Hot Topic Early Bird — anomaly detection