Group Whitening: Balancing Learning Efficiency and Representational Capacity
Batch normalization (BN) is an important technique commonly incorporated into deep learning models to perform standardization within mini-batches. The merits of BN in improving a model's learning efficiency can be further amplified by applying whitening, while its drawbacks in estimating population statistics for inference can be avoided through group normalization (GN). This paper proposes group whitening (GW), which exploits the advantages of the whitening operation and avoids the disadvantages of normalization within mini-batches. In addition, we analyze the constraints imposed on features by normalization, and show how the batch size (group number) affects the performance of batch (group) normalized networks, from the perspective of the model's representational capacity. This analysis provides theoretical guidance for applying GW in practice. Finally, we apply the proposed GW to ResNet and ResNeXt architectures and conduct experiments on the ImageNet and COCO benchmarks. Results show that GW consistently improves the performance of different architectures, with absolute gains of $1.02\%$ $\sim$ $1.49\%$ in top-1 accuracy on ImageNet and $1.82\%$ $\sim$ $3.21\%$ in bounding box AP on COCO.
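As a rough illustration of the operation the abstract describes, the sketch below implements one plausible form of group whitening in PyTorch: the channels of each sample are split into groups, and each group is ZCA-whitened per sample, treating spatial positions as observations. The group count, the eigendecomposition-based whitening, and the omission of learnable affine parameters are assumptions made here for illustration, not the authors' implementation.

```python
import torch

def group_whitening(x, num_groups=16, eps=1e-5):
    """Minimal sketch of per-sample group whitening for a (N, C, H, W) tensor.

    Channels are split into `num_groups` groups; within each group of each
    sample, the C // num_groups channels are jointly ZCA-whitened using the
    H * W spatial positions as observations.
    """
    N, C, H, W = x.shape
    assert C % num_groups == 0, "channel count must be divisible by num_groups"
    d = C // num_groups                       # dimensions whitened jointly

    # Each (sample, group) pair becomes a d x (H*W) data matrix.
    xg = x.reshape(N * num_groups, d, H * W)
    xc = xg - xg.mean(dim=-1, keepdim=True)   # center within each group

    # Covariance over spatial positions, regularized for numerical stability.
    cov = xc @ xc.transpose(1, 2) / (H * W)
    cov = cov + eps * torch.eye(d, device=x.device, dtype=x.dtype)

    # ZCA whitening matrix: U diag(s^{-1/2}) U^T from the eigendecomposition.
    s, U = torch.linalg.eigh(cov)
    whiten = U @ torch.diag_embed(s.clamp_min(eps).rsqrt()) @ U.transpose(1, 2)

    out = whiten @ xc
    return out.reshape(N, C, H, W)
```

Because all statistics are computed per sample, the operation involves no mini-batch dependence and needs no running estimates of population statistics at inference time, which is the property the abstract contrasts with BN.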