2006
NIPS
NeurIPS 2006
Computation of Similarity Measures for Sequential Data using Generalized Suffix Trees
Abstract
We propose a generic algorithm for computation of similarity measures for se- quential data. The algorithm uses generalized suffix trees for efficient calculation of various kernel, distance and non-metric similarity functions. Its worst-case run-time is linear in the length of sequences and independent of the underlying embedding language, which can cover words, k-grams or all contained subse- quences. Experiments with network intrusion detection, DNA analysis and text processing applications demonstrate the utility of distances and similarity coeffi- cients for sequences as alternatives to classical kernel functions.
🚀
Conference Pioneer
— NIPS 2006
🌉
Interdisciplinary Bridge
— Computer Science and Data Science & Analytics
📈
Trend Setter
— Data Mining
🧭
Keyword Pioneer
— sequential data
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Robotics