INTERSPEECH 2024

Stress transfer in speech-to-speech machine translation

Abstract

India’s linguistic diversity poses a significant challenge for its education sector and hinders inclusivity. The dominance of English on the internet underscores the critical need to translate educational content into Indian languages to improve accessibility. Although Speech-to-Speech Machine Translation (SSMT) technologies exist, they reproduce intonation poorly, so their translations sound monotonous, which diminishes audience engagement and interest in the content. To address this issue, this paper demonstrates an SSMT application with a Text-to-Speech (TTS) architecture that can incorporate stress into synthesized speech for a more engaging experience. The SSMT pipeline also includes components such as a stress classifier, which captures the stress in the source speech so that it can be used during speech generation. The application takes a speech file as input and returns a translated speech file with the stress transferred from the source.
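The stress-transfer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works on transcripts rather than audio, and it assumes that a stress classifier has already labeled each source word, that a translation is available, and that a word alignment between source and target is given. The names `Token`, `transfer_stress`, `to_tts_markup`, and the `<stress>` tag are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Token:
    """A word plus a stress flag (hypothetical stress-classifier output)."""
    text: str
    stressed: bool = False


def transfer_stress(
    source: List[Token],
    target_words: List[str],
    alignment: Dict[int, int],
) -> List[Token]:
    """Mark target word j as stressed if an aligned source word i was stressed.

    `alignment` maps source index -> target index and is assumed to be given.
    """
    stressed_targets = {
        alignment[i]
        for i, tok in enumerate(source)
        if tok.stressed and i in alignment
    }
    return [Token(w, j in stressed_targets) for j, w in enumerate(target_words)]


def to_tts_markup(tokens: List[Token]) -> str:
    """Render tokens with a hypothetical <stress> tag for a stress-aware TTS."""
    return " ".join(
        f"<stress>{t.text}</stress>" if t.stressed else t.text for t in tokens
    )


# Toy input, hand-written for illustration (romanized Hindi target).
source = [Token("this"), Token("is"), Token("important", stressed=True)]
target_words = ["yah", "mahatvapurn", "hai"]
alignment = {0: 0, 1: 2, 2: 1}  # source index -> target index

markup = to_tts_markup(transfer_stress(source, target_words, alignment))
print(markup)
```

Because word order differs between languages, the stress label follows the alignment rather than the word position: the stressed third source word maps to the second target word.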
