Abstract: Distributed deep learning has been widely employed to train deep neural network over large-scale dataset. However, the commonly used parameter server architecture suffers from long ...