Learning from memory-based models

Rhiannon Mogridge; Anton Ragni

INTERSPEECH 2024

Abstract

Recent work on the CPC2 speech intelligibility task shows promising results using an architecture inspired by memory models from the field of human psychology. This is surprising, given that previous work has shown memory models of this type to be inferior to parametric models, such as transformers, in most modern applications. This paper shows that the difference in performance is reduced or eliminated by using high-quality features. Furthermore, we show for the first time that this model, despite being widely used in the field of human psychology and for speech and language tasks, is a special case of a neural network. Experimental results from different tasks and datasets (CPC2, TIMIT, GoEmotions) confirm that this type of memory model is competitive with equivalently complex parametric models given a sufficiently good feature representation, suggesting that high-quality features may allow the use of simple, interpretable models without sacrificing performance.
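
The claim that a memory model can be viewed as a special case of a neural network can be illustrated with a small sketch. The abstract does not specify the model, so the following is only a generic exemplar-style memory model and not the authors' formulation: the cosine similarity, the cubic activation, the normalisation, and all function and variable names are assumptions made for illustration. It shows that a similarity-weighted read-out over stored exemplars gives the same output as a two-layer network whose first-layer weights are the stored exemplars and whose second-layer weights are their labels.

```python
import numpy as np


def memory_predict(queries, exemplars, labels, power=3):
    """Similarity-weighted average of stored labels (hypothetical exemplar model)."""
    # Cosine similarity between each query and each stored exemplar.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    e = exemplars / np.linalg.norm(exemplars, axis=1, keepdims=True)
    sim = q @ e.T  # shape: (n_queries, n_exemplars)
    # An odd power sharpens the similarities while keeping their sign (assumed choice).
    act = sim ** power
    # Normalise by the total absolute activation, then read out the stored labels.
    weights = act / np.abs(act).sum(axis=1, keepdims=True)
    return weights @ labels


def network_predict(queries, W1, W2, power=3):
    """The same computation written as a two-layer feed-forward network."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    hidden = (q @ W1) ** power  # first layer followed by a cubic activation
    hidden = hidden / np.abs(hidden).sum(axis=1, keepdims=True)
    return hidden @ W2  # linear output layer


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
exemplars = rng.normal(size=(5, 4))  # 5 stored items with 4-dimensional features
labels = rng.normal(size=(5, 2))     # 2-dimensional target per stored item
queries = rng.normal(size=(3, 4))

# Tie the network weights to the memory contents.
W1 = (exemplars / np.linalg.norm(exemplars, axis=1, keepdims=True)).T
W2 = labels

out_mem = memory_predict(queries, exemplars, labels)
out_net = network_predict(queries, W1, W2)
```

In this sketch the two functions agree because the network's weights are fixed to the memory contents. Treating those weights as trainable parameters instead would turn the same structure into an ordinary parametric model.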

Authors

Rhiannon Mogridge, Anton Ragni

Topics

Machine Learning > Core Methods > Representation Learning
Deep Learning > Architectures > Neural Networks

Keywords

feature representation, speech intelligibility, parametric model, memory-based model
