Hope and Fear for Discriminative Training of Statistical Translation Models

David Chiang

JMLR 2012

Abstract

In machine translation, discriminative models have almost entirely supplanted the classical noisy-channel model, but are standardly trained using a method that is reliable only in low-dimensional spaces. Two strands of research have tried to adapt more scalable discriminative training methods to machine translation: the first uses log-linear probability models and either maximum likelihood or minimum risk, and the other uses linear models and large-margin methods. Here, we provide an overview of the latter. We compare several learning algorithms and describe in detail some novel extensions suited to properties of the translation task: no single correct output, a large space of structured outputs, and slow inference. We present experimental results on a large-scale Arabic-English translation task, demonstrating large gains in translation accuracy.

Topics

Machine Learning > Core Methods > Classification
Natural Language Processing > Applications > Machine Translation
Machine Learning > Learning Types > Classification

Keywords

statistical machine translation, discriminative training, large-margin methods, structured output learning, large-margin learning, log-linear model
