Label Smoothing and Adversarial Robustness

Recent studies indicate that current adversarial attack methods are flawed and prone to failure when they encounter deliberately designed defenses; sometimes even a slight modification of the model's details invalidates the attack. We find that a model trained with label smoothing can easily achieve striking accuracy under most gradient-based attacks. For instance, a WideResNet trained with label smoothing on CIFAR-10 reaches up to 75% robust accuracy under the PGD attack. To understand the reason behind this subtle robustness, we investigate the relationship between label smoothing and adversarial robustness, through both a theoretical analysis of the characteristics of networks trained with label smoothing and an experimental verification of their performance under various attacks. We demonstrate that the robustness produced by label smoothing is incomplete: its defense effect is volatile, and it cannot defend against attacks transferred from a naturally trained model. Our study encourages the research community to rethink how to evaluate model robustness appropriately.
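For readers unfamiliar with the training technique under study, the following is a minimal NumPy sketch of standard label smoothing, not the paper's actual training code: each one-hot target is replaced by a distribution that assigns 1 − ε to the true class and spreads ε uniformly over all classes, and the cross-entropy is taken against this softened target. The function names and the choice ε = 0.1 are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    """Convert integer class labels into smoothed target distributions.

    The true class receives 1 - eps, and eps is shared uniformly across
    all num_classes classes (so the true class ends up with
    1 - eps + eps / num_classes).
    """
    onehot = np.eye(num_classes)[labels]
    return onehot * (1.0 - eps) + eps / num_classes

def cross_entropy(probs, targets):
    """Mean cross-entropy between predicted probabilities and
    (smoothed) target distributions; the small constant avoids log(0)."""
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1))

# Example: one CIFAR-10-style label (class 2) with eps = 0.1 gives the
# true class probability 0.9 + 0.1/10 = 0.91 and 0.01 for every other class.
targets = smooth_labels(np.array([2]), num_classes=10, eps=0.1)
```

Training against these softened targets keeps the network's logit gaps bounded, which is the property the abstract's analysis connects to the apparent (but, as the paper argues, incomplete) gradient-based robustness.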
