FastSal: a Computationally Efficient Network for Visual Saliency Prediction
This paper focuses on the problem of visual saliency prediction, predicting regions of an image that tend to attract human visual attention, under a constrained computational budget. We modify and test various recent efficient convolutional neural network architectures like EfficientNet and MobileNetV2, and compare them with existing state-of-the-art saliency models such as SalGAN and DeepGaze II, both in terms of standard accuracy metrics like AUC and NSS, and in terms of computational complexity and model size. We find that MobileNetV2 makes an excellent backbone for a visual saliency model and can be effective even without a complex decoder. We also show that knowledge transfer from a more computationally expensive model like DeepGaze II can be achieved via pseudo-labelling an unlabelled dataset, and that this approach gives results on par with many state-of-the-art algorithms at a fraction of the computational cost and model size. Source code is available at https://github.com/feiyanhu/FastSal.
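The knowledge-transfer step described above can be sketched as a two-stage pipeline: an expensive teacher model labels an unlabelled image set, and a cheap student model is then fit to those pseudo-labels. The toy snippet below illustrates only the workflow, not the authors' implementation: `teacher_predict`, the 1-D "images", and the linear student are all hypothetical stand-ins (for DeepGaze II and a MobileNetV2-based student respectively).

```python
# Hypothetical sketch of pseudo-label knowledge transfer. All names and the
# toy linear models are illustrative stand-ins, not the FastSal code.
import random

random.seed(0)

def teacher_predict(image):
    # Stand-in for the expensive teacher (e.g. DeepGaze II):
    # maps a tiny feature vector to a scalar "saliency" score.
    return 0.7 * image[0] + 0.3 * image[1]

# Stage 1: pseudo-label an unlabelled dataset with the teacher.
unlabelled = [[random.random(), random.random()] for _ in range(200)]
pseudo_labelled = [(x, teacher_predict(x)) for x in unlabelled]

# Stage 2: fit a small student (here a 2-parameter linear map) to the
# pseudo-labels with plain gradient descent on a mean-squared error.
def mse(data, w):
    return sum((w[0] * x[0] + w[1] * x[1] - y) ** 2 for x, y in data) / len(data)

w, lr = [0.0, 0.0], 0.1
loss_before = mse(pseudo_labelled, w)
for _ in range(500):
    g = [0.0, 0.0]
    for x, y in pseudo_labelled:
        err = w[0] * x[0] + w[1] * x[1] - y
        g[0] += 2 * err * x[0] / len(pseudo_labelled)
        g[1] += 2 * err * x[1] / len(pseudo_labelled)
    w = [w[0] - lr * g[0], w[1] - lr * g[1]]
loss_after = mse(pseudo_labelled, w)
# The student's loss against the teacher's pseudo-labels shrinks as it
# learns to mimic the teacher at far lower inference cost.
```

In the paper's setting the student is a full saliency network and the pseudo-labels are dense saliency maps, but the training loop has the same shape: no human annotations are needed once the teacher has labelled the data.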