Dynamically Throttleable Neural Networks (TNN)

Conditional computation for Deep Neural Networks (DNNs) reduces overall computational load and improves model accuracy by running only a subset of the network. In this work, we present a runtime throttleable neural network (TNN) that can adaptively self-regulate its own performance target and computing resources. We designed TNNs with several properties that enable more flexible dynamic execution based on runtime context. A TNN is defined as a set of throttleable modules gated by a separately trained controller that generates a single utilization control parameter. We validate our proposal in a number of experiments, including Convolutional Neural Networks (CNNs such as VGG, ResNet, ResNeXt, and DenseNet) on the CIFAR-10 and ImageNet datasets for object classification and recognition tasks. We also demonstrate the effectiveness of dynamic TNN execution on a 3D Convolutional Network (C3D) for a hand gesture task. Results show that TNNs can maintain peak accuracy compared to vanilla solutions while providing a graceful reduction in computational requirements, with up to a 74% reduction in latency and 52% energy savings.
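
The gating idea described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the gating rule, so the prefix-style gating (enable the first k groups of output units), the names `ThrottleableLinear` and `active_groups`, and the multiply-accumulate count used as a cost proxy are all assumptions made for illustration. The utilization parameter `u` is passed in directly here; in a TNN it would be produced by the separately trained controller.

```python
import math
from typing import List, Tuple


def active_groups(u: float, num_groups: int) -> int:
    """Map a utilization parameter u in [0, 1] to a number of enabled groups.

    Assumption: at least one group always stays on so the module still
    produces a usable output at the lowest utilization setting.
    """
    u = min(1.0, max(0.0, u))
    return max(1, math.ceil(u * num_groups))


class ThrottleableLinear:
    """Toy dense layer whose output units are split into gated groups (hypothetical)."""

    def __init__(self, weights: List[List[float]], num_groups: int) -> None:
        if len(weights) % num_groups != 0:
            raise ValueError("number of output units must be divisible by num_groups")
        self.weights = weights  # one row of weights per output unit
        self.num_groups = num_groups
        self.group_size = len(weights) // num_groups

    def forward(self, x: List[float], u: float) -> Tuple[List[float], int]:
        """Return (outputs, multiply-accumulate count) at utilization u."""
        enabled_units = active_groups(u, self.num_groups) * self.group_size
        out = [0.0] * len(self.weights)
        # Gated-off units are never computed; their outputs stay at zero.
        for i in range(enabled_units):
            out[i] = sum(w * xi for w, xi in zip(self.weights[i], x))
        return out, enabled_units * len(x)


if __name__ == "__main__":
    layer = ThrottleableLinear(
        [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]], num_groups=4
    )
    for u in (1.0, 0.5, 0.0):
        print(u, layer.forward([1.0, 2.0], u))
```

In this sketch the compute cost scales with the number of enabled groups, so a single scalar is enough to trade accuracy for latency at runtime, which is the property the abstract attributes to the utilization control parameter.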
