INTERSPEECH 2024

Post-Net: A linguistically inspired sequence-dependent transformed neural architecture for automatic syllable stress detection

Abstract

Automatic syllable stress detection methods typically treat syllable-level features as independent. Linguistic studies, however, show that the syllables within a word depend on one another. In this work, we address this gap by proposing Post-Net, an approach based on Time-Delay Neural Networks (TDNNs) that exploits the syllable dependency within a word for the stress detection task. To this end, we propose a loss function that incorporates the dependency by ensuring that each word has only one stressed syllable. The proposed Post-Net builds on existing state-of-the-art (SOTA) sequence-independent stress detection models and can be trained in both supervised and unsupervised settings. We compare Post-Net with three existing SOTA sequence-independent models and with a sequential model (LSTM). Experiments on the ISLE corpus show that the proposed Post-Net achieves the highest relative accuracy improvements of 2.1% and 20.28% over the best sequence-independent SOTA model in the supervised and unsupervised settings, respectively.
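The one-stressed-syllable constraint can be illustrated with a small sketch. The abstract does not give the actual loss, so everything below is an assumption made for illustration: the function names (`word_softmax`, `supervised_word_loss`, `one_stress_penalty`) are hypothetical, the supervised variant is an ordinary cross-entropy over the syllables of one word, and the unsupervised variant is a simple penalty that vanishes only when exactly one syllable is confidently marked as stressed.

```python
import math


def word_softmax(scores):
    """Normalise per-syllable stress scores within one word so they sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_word_loss(scores, stressed_index):
    """Assumed supervised variant: cross-entropy over the syllables of a word.

    Because the softmax is taken across syllables, raising the probability
    of one syllable lowers the others, which couples the decisions in a word.
    """
    probs = word_softmax(scores)
    return -math.log(probs[stressed_index])


def one_stress_penalty(probs):
    """Assumed unsupervised variant: penalty on independent stress probabilities.

    The first term pushes the probabilities of a word to sum to 1; the second
    pushes each probability towards 0 or 1. Both are zero only when exactly
    one syllable has probability 1 and the rest have probability 0.
    """
    sum_term = (sum(probs) - 1.0) ** 2
    binary_term = sum(p * (1.0 - p) for p in probs)
    return sum_term + binary_term


if __name__ == "__main__":
    # Hypothetical three-syllable word with stress on the second syllable.
    print(supervised_word_loss([0.2, 1.5, -0.3], stressed_index=1))
    # Probabilities from a hypothetical sequence-independent model.
    print(one_stress_penalty([0.1, 0.8, 0.2]))
```

In this sketch the penalty needs no stress labels, which is why it could serve in an unsupervised setting; whether Post-Net uses a term of this form is not stated in the abstract.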
