A Convergence Analysis of Log-Linear Training

Simon Wiesler; Hermann Ney

NeurIPS (NIPS) 2011

Abstract

Log-linear models are widely used probability models for statistical pattern recognition. Typically, log-linear models are trained according to a convex criterion. In recent years, the interest in log-linear models has greatly increased. The optimization of log-linear model parameters is costly and therefore an important topic, in particular for large-scale applications. Different optimization algorithms have been evaluated empirically in many papers. In this work, we analyze the optimization problem analytically and show that the training of log-linear models can be highly ill-conditioned. We verify our findings on two handwriting tasks. By making use of our convergence analysis, we obtain good results on a large-scale continuous handwriting recognition task with a simple and generic approach.
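As a rough illustration of what "ill-conditioned" means here, the sketch below computes the condition number of the Hessian of a two-class log-linear (logistic) model at the zero parameter vector, where the Hessian of the average negative log-likelihood reduces to 0.25 · XᵀX / N. The synthetic data, the feature scales, and the standardization step are assumptions chosen for illustration only; they are not taken from the paper and do not reproduce its analysis or experiments.

```python
import numpy as np

def hessian_at_zero(X):
    """Hessian of the average negative log-likelihood of a binary
    log-linear (logistic) model at theta = 0.

    At theta = 0 every posterior equals 0.5, so p * (1 - p) = 0.25 and
    the Hessian is 0.25 * X^T X / N.
    """
    n = X.shape[0]
    return 0.25 * (X.T @ X) / n

def condition_number(H):
    """Ratio of the largest to the smallest eigenvalue of a symmetric matrix."""
    eig = np.linalg.eigvalsh(H)  # eigenvalues in ascending order
    return eig[-1] / eig[0]

# Hypothetical synthetic features: uncentered and on very different scales.
rng = np.random.default_rng(0)
n, d = 5000, 4
scales = np.array([1.0, 10.0, 100.0, 1000.0])
X_raw = rng.normal(size=(n, d)) * scales + 50.0

# Assumed preprocessing for comparison: per-feature mean and variance normalization.
X_std = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

kappa_raw = condition_number(hessian_at_zero(X_raw))
kappa_std = condition_number(hessian_at_zero(X_std))

print(f"condition number, raw features:          {kappa_raw:.3g}")
print(f"condition number, standardized features: {kappa_std:.3g}")
```

In this toy setting the raw features give a condition number several orders of magnitude larger than the standardized ones. Since the convergence rate of gradient-based methods on a convex problem degrades as the condition number grows, this kind of gap is one way such training can become slow.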

Topics

Machine Learning > Core Methods > Classification

Machine Learning > Optimization & Theory > Neural Network Optimization

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Optimization & Theory > Theory

Computer Vision > Analysis > Action Recognition

Mathematics & Optimization > Optimization > Optimization

Keywords

convex optimization, convergence analysis, handwriting recognition, parameter optimization, ill-conditioned optimization, log-linear model, statistical pattern recognition
