HOGWILD Algorithm
Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work aims to show using novel theoretical analysis, algorithms, and implementation that SGD can be implemented without any locking. We present an update scheme called Hogwild! which allows processors access to shared memory with the possibility of overwriting each other’s work. We show that when the associated optimization problem is sparse, meaning most gradient updates only modify small parts of the decision variable, then Hogwild! achieves a nearly optimal rate of convergence. We demonstrate experimentally that Hogwild! outperforms alternative schemes that use locking by an order of magnitude.
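The update scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the sparse least-squares problem, the step size, and every name (`make_sparse_data`, `hogwild_worker`, `hogwild_sgd`, `mean_loss`) are assumptions chosen for illustration. Several threads share one weight vector and update it with no lock, and each step touches only the few coordinates that its example uses. Note that standard CPython threads are serialized by the global interpreter lock, so this shows the lock-free access pattern only and will not reproduce the reported speedups.

```python
import random
import threading


def make_sparse_data(n_examples, dim, nnz, seed=0):
    """Hypothetical synthetic data: each example touches only `nnz` coordinates."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    data = []
    for _ in range(n_examples):
        idx = rng.sample(range(dim), nnz)             # sparse support
        vals = [rng.choice((-1.0, 1.0)) for _ in idx]  # feature values
        y = sum(v * w_true[j] for j, v in zip(idx, vals))
        data.append((idx, vals, y))
    return data


def mean_loss(w, data):
    """Average squared-error loss 0.5 * (x.w - y)^2 over the data set."""
    total = 0.0
    for idx, vals, y in data:
        err = sum(v * w[j] for j, v in zip(idx, vals)) - y
        total += 0.5 * err * err
    return total / len(data)


def hogwild_worker(w, data, steps, lr, seed):
    """Run SGD steps against the shared list `w` without taking any lock."""
    rng = random.Random(seed)
    for _ in range(steps):
        idx, vals, y = data[rng.randrange(len(data))]
        # Unlocked read: other threads may be writing these entries.
        err = sum(v * w[j] for j, v in zip(idx, vals)) - y
        # Unlocked write, restricted to the coordinates this example touches.
        # A concurrent update to the same entry can be overwritten.
        for j, v in zip(idx, vals):
            w[j] -= lr * err * v


def hogwild_sgd(data, dim, n_threads=4, steps_per_thread=2000, lr=0.1):
    """Start `n_threads` workers that all update one shared weight vector."""
    w = [0.0] * dim
    threads = [
        threading.Thread(
            target=hogwild_worker,
            args=(w, data, steps_per_thread, lr, 100 + t),
        )
        for t in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w


data = make_sparse_data(n_examples=200, dim=50, nnz=3)
w = hogwild_sgd(data, dim=50)
print("loss at zero weights:", mean_loss([0.0] * 50, data))
print("loss after Hogwild-style SGD:", mean_loss(w, data))
```

Because each example here touches 3 of 50 coordinates, two workers rarely write the same entry at the same moment, which is the sparsity condition the abstract relies on. Thread scheduling makes the final weights vary slightly between runs.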