FixNorm: Dissecting Weight Decay for Training Deep Neural Networks




Weight decay is a widely used technique for training Deep Neural Networks (DNNs). It greatly affects generalization performance, but its underlying mechanisms are not fully understood. Recent works show that for layers followed by normalization, weight decay mainly affects the effective learning rate. However, although normalization has been extensively adopted in modern DNNs, layers such as the final fully-connected layer do not satisfy this precondition, and for these layers the effects of weight decay remain unclear. In this paper, we comprehensively investigate the mechanisms of weight decay and find that, besides influencing the effective learning rate, weight decay has another distinct and equally important mechanism: it affects generalization performance by controlling cross-boundary risk. Together, these two mechanisms give a more comprehensive explanation of the effects of weight decay. Based on this discovery, we propose a new training method called FixNorm, which discards weight decay and directly controls the two mechanisms. We also propose a practical method for tuning the hyperparameters of FixNorm, which finds near-optimal solutions 2 to 3 times faster than Bayesian Optimization. On the ImageNet classification task, training EfficientNet-B0 with FixNorm achieves 77.7%, which outperforms the original baseline by a clear margin. Surprisingly, when MobileNetV2 is scaled to the same FLOPS and trained with the same tricks as EfficientNet-B0, FixNorm achieves 77.4%, which shows the importance of well-tuned training procedures and further verifies the effectiveness of our approach. We set up more well-tuned baselines using FixNorm to facilitate fair comparisons in the community.
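The abstract's claim about layers followed by normalization rests on a well-known scale-invariance property: rescaling the weights leaves the output unchanged, so the gradient shrinks as the weight norm grows, and the effective learning rate is commonly written as lr / ||w||^2. The following is a minimal sketch of that property only, not the FixNorm method itself. The toy weight-normalized unit, the names `loss` and `num_grad`, and all numeric values are assumptions made for illustration.

```python
import math

def loss(w, x, target):
    """Squared error of a weight-normalized linear unit (hypothetical toy model).

    The output depends only on the direction of w, so the loss is
    invariant to rescaling w by any positive constant.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    out = sum(wi * xi for wi, xi in zip(w, x)) / norm
    return (out - target) ** 2

def num_grad(w, x, target, eps=1e-6):
    """Central-difference estimate of the gradient of loss with respect to w."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        up[i] += eps
        down = list(w)
        down[i] -= eps
        grad.append((loss(up, x, target) - loss(down, x, target)) / (2 * eps))
    return grad

# Illustrative values, chosen arbitrarily.
w = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 3.0]
target = 0.3

g1 = num_grad(w, x, target)                       # gradient at w
g2 = num_grad([2 * wi for wi in w], x, target)    # gradient at 2w

# Doubling the weight norm halves the gradient: each ratio is close to 2.
ratios = [a / b for a, b in zip(g1, g2)]

# The gradient has no component along w, so a plain gradient step
# cannot shrink the norm; only weight decay (or an explicit
# renormalization) does that.
radial = sum(gi * wi for gi, wi in zip(g1, w))
```

Because the gradient scales as 1/||w|| while the direction of w moves by (step size)/||w||, a fixed learning rate produces smaller angular updates as the norm grows. This is the sense in which weight decay, by holding the norm down, acts on the effective learning rate of such layers.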


