Efficient multiple hyperparameter learning for log-linear models

Chuong B. Do; Andrew Y. Ng; Chuan-Sheng Foo

NIPS (NeurIPS) 2007

Abstract

Using multiple regularization hyperparameters is an effective method for managing model complexity in problems where input features have varying amounts of noise. While algorithms for choosing multiple hyperparameters are often used in neural networks and support vector machines, they are not common in structured prediction tasks, such as sequence labeling or parsing. In this paper, we consider the problem of learning regularization hyperparameters for log-linear models, a class of probabilistic models for structured prediction tasks which includes conditional random fields (CRFs). Using an implicit differentiation trick, we derive an efficient gradient-based method for learning Gaussian regularization priors with multiple hyperparameters. In both simulations and the real-world task of computational RNA secondary structure prediction, we find that multiple hyperparameter learning provides a significant boost in accuracy compared to models learned using only a single regularization hyperparameter.
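The implicit differentiation idea mentioned in the abstract can be illustrated on the simplest log-linear model, binary logistic regression. The sketch below is not the authors' implementation. It assumes that each feature group k shares one Gaussian-prior strength exp(d[k]), that the hyperparameters are scored by the log-loss on a held-out set, and that the problem is small enough for a dense Hessian solve. All function and variable names (`fit_weights`, `hypergradient`, `groups`, `d`) are hypothetical. Because the optimal weights satisfy a zero-gradient condition on the training objective, differentiating that condition with respect to d gives the holdout gradient as -(d²L_train/dw dd)ᵀ H⁻¹ ∇L_hold, where H is the Hessian of the training objective.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, X, y):
    # Negative log-likelihood of binary labels y in {0, 1}.
    z = X @ w
    return np.sum(np.logaddexp(0.0, z) - y * z)

def hessian(w, X, C):
    # Hessian of the regularized training objective with respect to w.
    p = sigmoid(X @ w)
    return X.T @ (X * (p * (1.0 - p))[:, None]) + np.diag(C)

def fit_weights(X, y, C, iters=50):
    # Damped Newton's method; C[j] is the regularization strength on w[j].
    objective = lambda w: log_loss(w, X, y) + 0.5 * np.sum(C * w * w)
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ w) - y) + C * w
        step = np.linalg.solve(hessian(w, X, C), grad)
        t = 1.0
        while objective(w - t * step) > objective(w) and t > 1e-10:
            t *= 0.5  # backtrack until the objective does not increase
        w = w - t * step
    return w

def hypergradient(d, groups, X_train, y_train, X_hold, y_hold):
    # d[k]: log regularization strength shared by features with groups[j] == k
    # (an assumed parameterization for this sketch).
    C = np.exp(d)[groups]
    w = fit_weights(X_train, y_train, C)
    # Gradient of the holdout loss with respect to the weights.
    g_hold = X_hold.T @ (sigmoid(X_hold @ w) - y_hold)
    # Solve H v = g_hold instead of forming the inverse Hessian.
    v = np.linalg.solve(hessian(w, X_train, C), g_hold)
    # The training gradient's derivative with respect to d[k] is C[j] * w[j]
    # on the features of group k, so dL_hold/dd[k] = -sum_j C[j] * w[j] * v[j].
    grad_d = np.zeros_like(d)
    np.add.at(grad_d, groups, -C * w * v)
    return grad_d, w
```

The returned `grad_d` could be handed to any gradient-based optimizer over `d`. For structured models with many parameters, the dense `np.linalg.solve` call would likely be replaced by an iterative solver built on Hessian-vector products; that substitution is a general suggestion rather than something stated in the abstract.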

Authors

Chuan-Sheng Foo, Chuong B. Do, Andrew Y. Ng

Topics

Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling

Machine Learning > Core Methods > Regression

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Learning Types > Supervised Learning

Machine Learning > Core Methods > Optimization

Machine Learning > Core Methods > Structured Prediction

Machine Learning > Learning Types > Hyperparameter Optimization

Keywords

structured prediction; hyperparameter learning; hyperparameter optimization; gradient-based optimization; regularization prior; gradient-based method; gradient optimization; log-linear model; conditional random field

Related papers

Exponential Family Predictive Representations of State (2007)

Privacy-Preserving Belief Propagation and Sampling (2007)

Efficient Principled Learning of Thin Junction Trees (2007)

How SVMs can estimate quantiles and the median (2007)

Rapid Inference on a Novel AND/OR graph for Object Detection, Segmentation and Parsing (2007)