Pieces of Eight: 8-bit Neural Machine Translation

Jerry Quinn; Miguel Ballesteros

NAACL 2018

Abstract

Neural machine translation has achieved levels of fluency and adequacy that would have been surprising a short time ago. Output quality is extremely relevant for industry purposes; however, it is equally important to produce results in the shortest time possible, mainly for latency-sensitive applications and to control cloud hosting costs. In this paper we show the effectiveness of translating with 8-bit quantization for models that have been trained using 32-bit floating point values. Results show that 8-bit translation yields a non-negligible speedup with no degradation in accuracy and adequacy.
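The abstract does not say which quantization scheme is used. As a rough illustration only, the sketch below assumes symmetric linear quantization with one scale per vector; this is an assumption and not necessarily the authors' method. It maps 32-bit float values to signed 8-bit integers, accumulates a dot product in integer arithmetic, and rescales the result. The names `quantize` and `quantized_dot` are hypothetical.

```python
from typing import List, Tuple


def quantize(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumed scheme: symmetric linear quantization, scale = max|v| / 127.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def quantized_dot(a: List[float], b: List[float]) -> float:
    """Approximate a float dot product with 8-bit operands."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    # Products are accumulated as plain integers, then rescaled once.
    acc = sum(x * y for x, y in zip(qa, qb))
    return acc * scale_a * scale_b


# Toy values standing in for trained float weights and activations.
weights = [0.5, -1.0, 0.25, 0.75]
activations = [1.0, 0.5, -0.5, 0.25]

exact = sum(w * x for w, x in zip(weights, activations))
approx = quantized_dot(weights, activations)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The integer accumulation is where a speedup could come from on hardware with fast 8-bit multiply-add instructions; the final rescale returns the result to floating point so the rest of the model is unchanged.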

Topics

Machine Learning > Application Areas > Efficient Computing
Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Generation > Machine Translation
Artificial Intelligence > Core AI > Efficient Computing
Deep Learning > Optimization & Theory > Model Compression

Keywords

model compression, model quantization, neural machine translation, inference efficiency, inference speed, integer arithmetic, 8-bit quantization
