Variance Based Sample Weighting for Supervised Learning

In the context of supervised learning of a function by a Neural Network (NN), we claim and empirically justify that an NN yields better results when the distribution of the data set focuses on regions where the function to learn is steeper. We first translate this assumption into a mathematically workable form using Taylor expansion. Theoretical derivations then allow us to construct a methodology that we call Variance Based Samples Weighting (VBSW). VBSW uses the local variance of the labels to weight the training points. This methodology is general, scalable, cost-effective, and significantly improves the performance of a large class of models on various classification and regression tasks involving image, text, and multivariate data. We highlight its benefits with experiments involving NNs ranging from a shallow linear NN to ResNet or BERT.
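As a rough illustration of the weighting idea described above (a minimal sketch, not code from the paper), the snippet below weights each training point by the variance of the labels among its k nearest neighbours, so that points in regions where the target function varies quickly receive larger weights. The choice of k, the use of scikit-learn's NearestNeighbors, and the normalisation of the weights are all assumptions made for this example.

```python
# Sketch of Variance Based Samples Weighting (VBSW): weight each sample by
# the local variance of the labels in its neighbourhood. Illustrative only;
# k, the neighbour search, and the normalisation are assumptions.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def vbsw_weights(X, y, k=10):
    """Return one weight per sample: the variance of the labels of its
    k nearest neighbours (larger where the function to learn is steeper)."""
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    _, idx = nn.kneighbors(X)        # indices of each point's k neighbours
    local_var = y[idx].var(axis=1)   # label variance in each neighbourhood
    # Normalise so the weights average to 1 (assumed convention).
    return local_var / local_var.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(500, 1))
    y = np.tanh(10.0 * X[:, 0])      # steep near x = 0, flat elsewhere
    w = vbsw_weights(X, y, k=10)
    # Weights should peak near x = 0, where the function is steepest.
    print("max-weight x:", X[np.argmax(w), 0])
```

The resulting weights can be passed to any estimator that accepts per-sample weights, e.g. `model.fit(X, y, sample_weight=vbsw_weights(X, y))`.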
