HW-TSC’s Simultaneous Speech Translation System for IWSLT 2024

Shaojun Li; Zhiqiang Rao; Bin Wei; Yuanchang Luo; Zhanglin Wu; Zongyao Li; Hengchao Shang; Jiaxin Guo; Daimeng Wei; Hao Yang

ACL 2024

Abstract

This paper describes our submission to the IWSLT 2024 Simultaneous Speech-to-Text (SimulS2T) and Speech-to-Speech (SimulS2S) Translation tasks. We participated in all four language directions and in both the SimulS2T and SimulS2S tracks: English-German (EN-DE), English-Chinese (EN-ZH), English-Japanese (EN-JA), and Czech-English (CS-EN). For the S2T track, we build on last year’s system and further refine the cascade system composed of an ASR model and an MT model. We also introduce an end-to-end (E2E) system specifically for the CS-EN direction, based primarily on the pre-trained SeamlessM4T model. For the SimulS2S track, we integrate a new TTS model into our SimulS2T system. Our final S2T submissions for EN-DE, EN-ZH, and EN-JA refine our championship system from last year. On this foundation, adding the new TTS model to our SimulS2S system yields an ASR-BLEU score that surpasses last year’s best.

Keywords

automatic speech recognition; speech-to-speech translation; end-to-end translation; simultaneous translation; speech-to-text translation
