Consistency is Key: On Data-Efficient Modality Transfer in Speech Translation

Hojin Lee; Changmin Lee; Seung-won Hwang

EMNLP 2023

Abstract

End-to-end approaches have shown promising results for speech translation (ST), but they suffer from data scarcity compared to machine translation (MT). To address this, progressive training, which uses external MT data during the fine-tuning phase, has become common practice. Despite its prevalence and computational overhead, its validity has not yet been extensively corroborated. This paper conducts an empirical investigation and finds that progressive training is ineffective. We identify the learning-forgetting trade-off as a critical obstacle, then hypothesize and verify that consistency learning (CL) breaks the learning-forgetting dilemma. The proposed method, which combines knowledge distillation (KD) and CL, outperforms previous methods on the MuST-C dataset even without additional data, and our proposed consistency-informed KD achieves additional improvements over KD+CL. Code and models are available at https://github.com/hjlee1371/consistency-s2tt.
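
The abstract does not spell out the loss functions, so the following is only an illustrative sketch of how a KD term and a CL term might be combined for a single target token. It assumes KD is a KL divergence from an MT teacher's output distribution to the ST student's, and that CL is a symmetric KL divergence between two stochastic (for example, dropout-perturbed) forward passes of the student. The function names, the weight `alpha`, and the averaging over the two passes are all hypothetical and are not taken from the paper or its repository.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def kd_loss(teacher_logits, student_logits):
    """Assumed KD term: pull the ST student toward the MT teacher's distribution."""
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))

def consistency_loss(logits_a, logits_b):
    """Assumed CL term: symmetric KL between two stochastic passes of the student."""
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

def combined_loss(teacher_logits, student_logits_a, student_logits_b, alpha=1.0):
    """Hypothetical KD+CL objective: average KD over both passes, plus weighted CL."""
    kd = 0.5 * (kd_loss(teacher_logits, student_logits_a)
                + kd_loss(teacher_logits, student_logits_b))
    return kd + alpha * consistency_loss(student_logits_a, student_logits_b)

if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # MT teacher logits for one token (made-up values)
    pass_a = [1.5, 0.7, -0.5]    # ST student, first stochastic pass (made-up values)
    pass_b = [1.0, 1.0, 0.0]     # ST student, second stochastic pass (made-up values)
    print(combined_loss(teacher, pass_a, pass_b))
```

In this sketch the CL term is zero when the two student passes agree exactly, so it penalizes only disagreement between passes, while the KD term penalizes disagreement with the teacher.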

Topics

Machine Learning > Application Areas > Knowledge Distillation
Deep Learning > Models > Diffusion Models
Natural Language Processing > Applications > Machine Translation
Speech & Audio > Recognition > Speech Recognition
Machine Learning > Learning Types > Transfer Learning
Deep Learning > Techniques > Knowledge Distillation
Speech & Audio > Recognition > Speech Translation

Keywords

knowledge distillation; end-to-end learning; progressive training; consistency learning; speech translation; modality transfer; learning-forgetting trade-off
