EMNLP 2021
Hyperparameter Power Impact in Transformer Language Model Training
Abstract
Training large language models can consume a large amount of energy. We hypothesize that a language model's configuration affects its energy consumption, and that there is room to optimise power consumption in modern large language models. To investigate these claims, we introduce a power consumption factor into the objective function and explore the range of models and hyperparameter configurations that affect power. We identify multiple configuration factors that can reduce power consumption during language model training while retaining model quality.
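As a rough illustration of what adding a power consumption factor to an objective can look like, the sketch below scores each candidate configuration by its validation loss plus a weighted energy term and picks the minimiser. The additive form, the names `Trial`, `power_aware_objective`, `select_trial`, and `power_weight`, and all numbers are assumptions made for illustration; the abstract does not specify the actual formulation, and the values are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One hypothetical training run: a label, its validation loss, and its measured energy."""
    name: str
    val_loss: float
    energy_kwh: float


def power_aware_objective(trial: Trial, power_weight: float) -> float:
    """Quality term plus a weighted energy penalty (an assumed additive form)."""
    return trial.val_loss + power_weight * trial.energy_kwh


def select_trial(trials: list[Trial], power_weight: float) -> Trial:
    """Return the trial that minimises the combined objective."""
    return min(trials, key=lambda t: power_aware_objective(t, power_weight))


# Placeholder numbers for illustration only; they are not results from the paper.
TRIALS = [
    Trial("config_a", val_loss=3.10, energy_kwh=12.0),
    Trial("config_b", val_loss=3.12, energy_kwh=7.0),
    Trial("config_c", val_loss=3.40, energy_kwh=4.0),
]

if __name__ == "__main__":
    for weight in (0.0, 0.01, 0.1):
        print(weight, select_trial(TRIALS, weight).name)
```

With `power_weight` set to zero the selection reduces to ordinary loss minimisation; raising it shifts the choice toward configurations that use less energy at some cost in loss, which is the trade-off the abstract describes.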
Authors
Topics
Machine Learning > Optimization & Theory > Neural Network Optimization
Machine Learning > Application Areas > Efficient Computing
Deep Learning > Architectures > Transformers
Natural Language Processing > Resources & Methods > Large Language Models
Artificial Intelligence > Core AI > Large Language Models
Deep Learning > Optimization & Theory > Optimization
Deep Learning > Optimization & Theory > Efficient Computing
Deep Learning > Application Areas > Efficient Computing
Machine Learning > Learning Types > Efficient Computing