HW-TSC’s Participation in the WMT 2021 Efficiency Shared Task

Hengchao Shang; Ting Hu; Daimeng Wei; Zongyao Li; Jianfei Feng; Zhengzhe Yu; Jiaxin Guo; Shaojun Li; Lizhi Lei; Shimin Tao; Hao Yang; Jun Yao; Ying Qin

EMNLP 2021

Abstract

This paper presents the submission of Huawei Translation Services Center (HW-TSC) to the WMT 2021 Efficiency Shared Task. We explore sentence-level teacher-student distillation and train several small models that balance efficiency and quality. Our models feature a deep encoder, a shallow decoder, and a light-weight RNN with SSRU layers. For inference we use Huawei Noah’s Bolt, an efficient and light-weight library for on-device inference. Leveraging INT8 quantization, a self-defined General Matrix Multiplication (GEMM) operator, shortlists, greedy search, and caching, we submit four small and efficient translation models with high translation quality to the one-CPU-core latency track.
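As an illustration of the INT8 quantization and GEMM step mentioned in the abstract, the following is a minimal sketch in pure Python. It assumes symmetric per-tensor quantization with a single scale per matrix, which the abstract does not specify; the function names (`quantize_int8`, `int_gemm`, `quantized_matmul`) are hypothetical and are not part of Bolt's API or the authors' operator.

```python
from typing import List, Tuple

Matrix = List[List[float]]
IntMatrix = List[List[int]]


def quantize_int8(m: Matrix) -> Tuple[IntMatrix, float]:
    """Symmetric per-tensor quantization (an assumption, not the paper's scheme).

    Maps floats to integers in [-127, 127] and returns the integer matrix
    together with the scale needed to map them back.
    """
    max_abs = max((abs(v) for row in m for v in row), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [[max(-127, min(127, round(v / scale))) for v in row] for row in m]
    return q, scale


def int_gemm(a_q: IntMatrix, b_q: IntMatrix) -> IntMatrix:
    """Integer matrix multiply; Python ints stand in for an int32 accumulator."""
    cols = list(zip(*b_q))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a_q]


def quantized_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Quantize both inputs, multiply in integers, then rescale to floats."""
    a_q, scale_a = quantize_int8(a)
    b_q, scale_b = quantize_int8(b)
    acc = int_gemm(a_q, b_q)
    return [[v * scale_a * scale_b for v in row] for row in acc]


def float_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Reference floating-point product used only for comparison."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]
```

Comparing `quantized_matmul(a, b)` with `float_matmul(a, b)` on small matrices shows the approximation error that the integer path introduces. A production kernel would typically store the quantized values in 8-bit memory and use vectorized integer instructions, which this sketch does not attempt.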

Topics

Natural Language Processing > Applications > Machine Translation
Machine Learning > Application Areas > Model Compression
Natural Language Processing > Generation > Machine Translation
Machine Learning > Learning Types > Knowledge Distillation
Deep Learning > Techniques > Knowledge Distillation
Deep Learning > Learning Types > Knowledge Distillation
Deep Learning > Optimization & Theory > Efficient Computing

Keywords

model compression, model quantization, knowledge distillation, neural machine translation, inference optimization, lightweight model, lightweight transformer, shallow decoder
