Mirror Descent View for Neural Network Quantization

Quantizing large neural networks (NNs) while maintaining performance is highly desirable for resource-limited devices due to reduced memory and tim