On Model Parallelization and Scheduling Strategies for Distributed Machine Learning

Seunghak Lee; Jin Kyu Kim; Xun Zheng; Qirong Ho; Garth A. Gibson; Eric P. Xing

NeurIPS (NIPS) 2014

Abstract

Distributed machine learning has typically been approached from a data-parallel perspective, in which big data are partitioned across multiple workers and an algorithm is executed concurrently over different data subsets under various synchronization schemes to ensure speed-up and/or correctness. A sibling problem that has received relatively less attention is how to ensure efficient and correct model-parallel execution of ML algorithms, in which the parameters of an ML program are partitioned across different workers and undergo concurrent iterative updates. We argue that model parallelism and data parallelism impose rather different challenges for system design, algorithmic adjustment, and theoretical analysis. In this paper, we develop a system for model parallelism, STRADS, that provides a programming abstraction for scheduling parameter updates by discovering and leveraging changing structural properties of ML programs. STRADS enables a flexible tradeoff between scheduling efficiency and fidelity to intrinsic dependencies within the models, and improves the memory efficiency of distributed ML. We demonstrate the efficacy of model-parallel algorithms implemented on STRADS versus popular implementations for topic modeling, matrix factorization, and Lasso.
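
To make the idea of scheduled model-parallel updates concrete, the following is a minimal, single-process sketch for Lasso coordinate descent. It is an illustration under stated assumptions, not the STRADS API: the function names (`schedule`, `parallel_round`), the correlation threshold `max_corr`, and the greedy selection rule are all hypothetical choices made for this example. The scheduler picks a batch of coefficients whose feature columns are only weakly correlated, and the batch is then updated from a shared snapshot of the residual, which stands in for concurrent workers.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def column(X, j):
    return [row[j] for row in X]


def soft_threshold(z, lam):
    """Soft-thresholding operator used in Lasso coordinate updates."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def schedule(X, candidates, max_corr, batch_size):
    """Hypothetical scheduler: greedily keep coordinates whose columns have
    absolute correlation <= max_corr with every coordinate already chosen.
    Assumes no column of X is all zeros."""
    chosen = []
    for j in candidates:
        cj = column(X, j)
        nj = dot(cj, cj) ** 0.5
        independent = True
        for k in chosen:
            ck = column(X, k)
            nk = dot(ck, ck) ** 0.5
            if abs(dot(cj, ck)) / (nj * nk) > max_corr:
                independent = False
                break
        if independent:
            chosen.append(j)
        if len(chosen) == batch_size:
            break
    return chosen


def parallel_round(X, y, beta, lam, coords):
    """Update the scheduled coordinates from one shared residual snapshot.
    Each loop iteration stands in for one worker; beta is modified in place.
    Objective assumed here: 0.5 * ||y - X beta||^2 + lam * ||beta||_1."""
    residual = [yi - dot(row, beta) for row, yi in zip(X, y)]
    new_values = {}
    for j in coords:
        cj = column(X, j)
        sq_norm = dot(cj, cj)
        # Correlation of column j with the residual that excludes beta[j].
        rho = dot(cj, residual) + sq_norm * beta[j]
        new_values[j] = soft_threshold(rho, lam) / sq_norm
    # Apply all updates together, as if the workers had synchronized.
    for j, value in new_values.items():
        beta[j] = value


if __name__ == "__main__":
    # Columns 0 and 1 are identical (fully dependent); column 2 is orthogonal.
    X = [[1, 1, 0], [0, 0, 1], [1, 1, 0], [0, 0, 1]]
    y = [2, 3, 2, 3]
    beta = [0.0, 0.0, 0.0]
    batch = schedule(X, [0, 1, 2], max_corr=0.5, batch_size=2)
    parallel_round(X, y, beta, lam=0.0, coords=batch)
    print(batch, beta)
```

In this toy input the scheduler skips coefficient 1 because its column duplicates that of coefficient 0; updating both from the same stale residual would double-count their shared contribution. Raising `max_corr` admits more coordinates per round at the cost of ignoring more of the dependency structure, which is one simple reading of the efficiency-versus-fidelity tradeoff described in the abstract.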

Topics

Machine Learning > Optimization & Theory > Distributed Learning
Machine Learning > Optimization & Theory > Optimization
Machine Learning > Application Areas > Efficient Computing
Computer Science > Systems > Distributed Systems
Deep Learning > Optimization & Theory > Efficient Computing

Keywords

stochastic optimization; online learning; matrix factorization; topic modeling; distributed learning; model parallelism; distributed machine learning; model parallelization; scheduling strategies; parameter update; distributed system; parameter scheduling
