On-the-fly Operation Batching in Dynamic Computation Graphs

Graham Neubig; Yoav Goldberg; Chris Dyer

NeurIPS (NIPS) 2017

Abstract

Dynamic neural network toolkits such as PyTorch, DyNet, and Chainer offer more flexibility for implementing models that cope with data of varying dimensions and structure, relative to toolkits that operate on statically declared computations (e.g., TensorFlow, CNTK, and Theano). However, existing toolkits, both static and dynamic, require that the developer organize the computations into the batches necessary for exploiting high-performance data-parallel algorithms and hardware. This batching task is generally difficult, and it becomes a major hurdle as architectures become complex. In this paper, we present an algorithm, and its implementation in the DyNet toolkit, for automatically batching operations. Developers simply write minibatch computations as aggregations of single-instance computations, and the batching algorithm seamlessly executes them, on the fly, in computationally efficient batches. On a variety of tasks, we obtain throughput similar to that obtained with manual batching, as well as comparable speedups over single-instance learning on architectures that are impractical to batch manually.
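The idea of writing single-instance computations and letting a scheduler batch them can be illustrated with a toy sketch. The code below is not the DyNet implementation or its API. Every name in it (Node, matvec, relu, forward_batched) is hypothetical, and the grouping rule (same graph depth, same operation, same shared parameter) is a simple heuristic assumed here for illustration only. Each instance builds its own lazy graph, and the scheduler then runs every compatible group of nodes with one batched call.

```python
from collections import defaultdict


class Node:
    """A lazily evaluated operation in a toy computation graph (hypothetical)."""

    def __init__(self, op, args=(), value=None, param=None):
        self.op = op
        self.args = tuple(args)
        self.value = value      # filled in for inputs, or after execution
        self.param = param      # shared weight matrix for "matvec" nodes
        # Depth is the longest path from any input to this node.
        self.depth = 0 if not self.args else 1 + max(a.depth for a in self.args)


def input_vec(v):
    return Node("input", value=list(v))


def matvec(W, x):
    return Node("matvec", args=(x,), param=W)


def relu(x):
    return Node("relu", args=(x,))


def batched_matvec(W, xs):
    # Stands in for one batched kernel call: W applied to every vector in xs.
    return [[sum(w * xi for w, xi in zip(row, x)) for row in W] for x in xs]


def batched_relu(xs):
    return [[max(0.0, xi) for xi in x] for x in xs]


def collect(outputs):
    """Return every node reachable from the requested outputs."""
    seen, found, stack = set(), [], list(outputs)
    while stack:
        n = stack.pop()
        if id(n) in seen:
            continue
        seen.add(id(n))
        found.append(n)
        stack.extend(n.args)
    return found


def forward_batched(outputs):
    """Evaluate the graph, running each compatible group as one batched call."""
    groups = defaultdict(list)
    for n in collect(outputs):
        if n.value is None:  # inputs already hold values
            groups[(n.depth, n.op, id(n.param))].append(n)
    calls = 0
    # Increasing depth guarantees arguments are computed before their users.
    for key in sorted(groups, key=lambda k: k[0]):
        nodes = groups[key]
        xs = [n.args[0].value for n in nodes]
        if key[1] == "matvec":
            ys = batched_matvec(nodes[0].param, xs)
        else:
            ys = batched_relu(xs)
        for n, y in zip(nodes, ys):
            n.value = y
        calls += 1
    return [o.value for o in outputs], calls


# Single-instance model code: each instance may use a different number of layers.
W = [[1.0, -1.0], [0.5, 2.0]]


def model(x, n_layers):
    h = input_vec(x)
    for _ in range(n_layers):
        h = relu(matvec(W, h))
    return h


instances = [[1.0, 2.0], [3.0, -1.0], [0.0, 1.0]]
layers = [1, 2, 1]
outputs = [model(x, n) for x, n in zip(instances, layers)]
values, calls = forward_batched(outputs)
# The 8 single-instance operations are executed as 4 batched calls.
```

The model function never mentions batches, yet the three instances share their first matvec and relu calls even though one instance goes a layer deeper. A production system would also need to handle multi-argument operations, memory layout, and backpropagation, none of which this sketch attempts.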

Topics

Machine Learning > Application Areas > Efficient Computing
Deep Learning > Architectures > Neural Networks
Computer Science > Systems > Distributed Systems
Deep Learning > Optimization & Theory > Efficient Computing

Keywords

neural network optimization; computational efficiency; throughput optimization; data parallelism; dynamic computation graph; automatic batching; neural network operation batching

Related papers

High-Order Attention Models for Visual Question Answering 2017

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization 2017

Premise Selection for Theorem Proving by Deep Graph Embedding 2017

Neural Program Meta-Induction 2017

Safe and Nested Subgame Solving for Imperfect-Information Games 2017