INTERSPEECH 2024

Learning from memory-based models

Abstract

Recent work on the CPC2 speech intelligibility task shows promising results using an architecture inspired by memory models from the field of human psychology. This is surprising, given that previous work has shown memory models of this type to be inferior to parametric models, such as transformers, in most modern applications. This paper shows that the difference in performance is reduced or eliminated by using high-quality features. Furthermore, we show for the first time that this model, although widely used in human psychology and in speech and language tasks, is a special case of a neural network. Experimental results from different tasks and datasets (CPC2, TIMIT, GoEmotions) confirm that this type of memory model is competitive with equivalently complex parametric models given sufficiently good feature representations, suggesting that high-quality features may allow the use of simple, interpretable models without sacrificing performance.
