Data Compression in Distributed Learning

Applied and Computational Mathematics Seminar
Monday, November 15, 2021 - 2:00pm (50 minutes)
Ming Yan – Michigan State University – myan@msu.edu
Wenjing Liao

Large-scale machine learning models are trained by parallel (stochastic) gradient descent algorithms on distributed systems. As the number of nodes and the model's dimension scale up, the communication required for gradient aggregation and model synchronization becomes the major obstacle to efficient learning. In this talk, I will introduce several ways to compress the transferred data and reduce the overall communication so that this obstacle is greatly mitigated. More specifically, I will introduce methods that reduce or eliminate the compression error without additional communication.
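One common family of techniques in this area pairs a lossy gradient compressor with error feedback, which carries the compression error into the next round so it is not lost. As a hedged illustration (the function names and top-k compressor here are hypothetical examples, not the specific methods of the talk), a minimal NumPy sketch might look like:

```python
import numpy as np

def topk_compress(grad, k):
    """Keep the k largest-magnitude entries of the gradient; zero the rest.

    Only the k kept values (and their indices) would need to be transmitted,
    which is the source of the communication savings.
    """
    idx = np.argsort(np.abs(grad))[-k:]
    out = np.zeros_like(grad)
    out[idx] = grad[idx]
    return out

def step_with_error_feedback(grad, residual, k):
    """One worker-side step: compensate the previous round's compression error.

    The residual (error left over from the last round) is added back to the
    fresh gradient before compressing, so the error is eventually applied
    rather than discarded.
    """
    corrected = grad + residual            # fold in the accumulated error
    compressed = topk_compress(corrected, k)
    new_residual = corrected - compressed  # error to carry into the next round
    return compressed, new_residual
```

Each worker would send only `compressed` to the server while keeping `new_residual` locally; note that the compressed message plus the stored residual together recover the corrected gradient exactly, so no information is permanently lost.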