Stochastic Methods

97 directly classified papers

Papers

On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning AAAI 2020

Bundle Adjustment on a Graph Processor CVPR 2020

An Investigation Into the Stochasticity of Batch Whitening CVPR 2020

Gradient Estimation with Stochastic Softmax Tricks NIPS 2020

Minibatch vs Local SGD for Heterogeneous Distributed Learning NIPS 2020

Faster Differentially Private Samplers via Rényi Divergence Analysis of Discretized Langevin MCMC NIPS 2020

Robustness Analysis of Non-Convex Stochastic Gradient Descent using Biased Expectations NIPS 2020

Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning AAAI 2019

Scalable and Efficient Pairwise Learning to Achieve Statistical Accuracy AAAI 2019

Communication-Efficient Stochastic Gradient MCMC for Neural Networks AAAI 2019

Making Asynchronous Stochastic Gradient Descent Work for Transformers EMNLP 2019

DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation NIPS 2019

Communication trade-offs for Local-SGD with large step size NIPS 2019

Reducing Noise in GAN Training with Variance Reduced Extragradient NIPS 2019

Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond NIPS 2019

Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback NIPS 2019

Momentum-Based Variance Reduction in Non-Convex SGD NIPS 2019

Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification and Local Computations NIPS 2019

Sampled Softmax with Random Fourier Features NIPS 2019

Training Deep Models Faster with Robust, Approximate Importance Sampling NIPS 2018

How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD NIPS 2018

Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization NIPS 2018

LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning NIPS 2018

Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation EMNLP 2018

Statistical Tomography of Microscopic Life CVPR 2018