Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates
Annotating training data for sequence tagging tasks is usually very time-consuming. Recent advances in transfer learning for natural language processing, combined with active learning (AL), make it possible to significantly reduce the necessary annotation budget. We are the first to thoroughly investigate this powerful combination in sequence tagging. We find that taggers based on deep pre-trained models can benefit from Bayesian query strategies with the help of Monte Carlo (MC) dropout. Results of experiments with various uncertainty estimates and MC dropout variants show that the Bayesian active learning by disagreement (BALD) query strategy, coupled with MC dropout applied only in the classification layer of a Transformer-based tagger, is the best option in terms of quality. This option also has very little computational overhead. We also demonstrate that the computational overhead of AL can be reduced by using a smaller distilled version of a Transformer model to acquire instances during AL.
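To make the query strategy concrete, the following is a minimal NumPy sketch of a BALD score computed from MC dropout samples drawn only in the classification layer. It is an illustration under stated assumptions, not the authors' implementation: the function names `mc_dropout_head` and `bald_scores`, the single linear classification layer, the dropout rate, and the number of passes are all hypothetical choices made for this example.

```python
import numpy as np


def mc_dropout_head(features, weights, bias, n_passes=10, p_drop=0.1, seed=0):
    """Run several stochastic passes of a linear classification head.

    Assumption: `features` (N tokens x D) are encoder outputs computed once;
    dropout is resampled only on the input of the classification layer.
    Returns class probabilities of shape (n_passes, N, C).
    """
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_passes):
        # Inverted dropout: zero out units, rescale the survivors.
        mask = rng.random(features.shape) >= p_drop
        dropped = features * mask / (1.0 - p_drop)
        logits = dropped @ weights + bias
        # Numerically stable softmax over the class axis.
        logits = logits - logits.max(axis=-1, keepdims=True)
        exp = np.exp(logits)
        samples.append(exp / exp.sum(axis=-1, keepdims=True))
    return np.stack(samples)


def bald_scores(probs, eps=1e-12):
    """BALD score per token from MC samples of shape (T, N, C).

    Score = entropy of the mean prediction - mean entropy of each prediction,
    i.e. the disagreement between stochastic passes.
    """
    mean_probs = probs.mean(axis=0)
    entropy_of_mean = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)
    mean_entropy = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)
    return entropy_of_mean - mean_entropy


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    features = rng.normal(size=(5, 16))   # 5 tokens, 16-dim encoder output
    weights = rng.normal(size=(16, 3))    # 3 tag classes
    bias = np.zeros(3)
    probs = mc_dropout_head(features, weights, bias, n_passes=20, p_drop=0.3)
    token_scores = bald_scores(probs)
    # One possible sentence-level aggregate (an assumption): mean over tokens.
    print(token_scores, token_scores.mean())
```

Because only the classification layer is resampled in this sketch, the expensive encoder pass runs once per instance, which is consistent with the low overhead reported above. How token-level scores are aggregated into a per-sentence acquisition score (a mean over tokens here) is an assumption of this example.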