Towards the Infeasibility of Membership Inference on Deep Models

Recent studies propose membership inference (MI) attacks on deep models. Despite the moderate accuracy of such MI attacks, we show that the way attack accuracy is reported is often misleading, and that a simple blind attack, which is highly unreliable and inefficient in practice, can often achieve similar accuracy. We show that current MI attack models can only identify the membership of misclassified samples, with mediocre accuracy at best, and such samples constitute only a very small portion of training samples. We analyze several new features that have not been explored for membership inference before, including distance to the decision boundary and gradient norms, and conclude that deep models' responses are mostly indistinguishable between train and non-train samples. Moreover, in contrast with the general intuition that deeper models have the capacity to memorize training samples and are hence more vulnerable to membership inference, we find no evidence to support this, and in some cases deeper models are in fact harder to launch membership inference attacks on. Furthermore, despite the common belief, we show that overfitting does not necessarily lead to a higher degree of membership leakage. We conduct experiments on MNIST, CIFAR-10, CIFAR-100, and ImageNet, using various model architectures, including LeNet, ResNet, DenseNet, InceptionV3, and Xception. Source code: https://github.com/shrezaei/MI-Attack.
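To make the gradient-norm feature mentioned above concrete, below is a minimal sketch, not the authors' implementation, of how a per-sample loss-gradient norm could be computed as a membership-inference feature. It assumes a PyTorch classifier; the function name, the toy model, and the input shape are illustrative assumptions only.

# Minimal sketch (assumed PyTorch API; illustrative, not the paper's code).
import torch
import torch.nn.functional as F

def gradient_norm_feature(model, x, y):
    """L2 norm of the loss gradient w.r.t. model parameters for one sample.

    The hypothesis tested in the paper is that train samples yield smaller
    norms than non-train samples; the reported finding is that the two
    distributions largely overlap.
    """
    model.zero_grad()
    logits = model(x.unsqueeze(0))              # add batch dimension
    loss = F.cross_entropy(logits, y.unsqueeze(0))
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()

# Example usage with a toy model and an MNIST-shaped input (illustrative):
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.randn(1, 28, 28)   # one 28x28 grayscale image
y = torch.tensor(3)          # its label
print(gradient_norm_feature(model, x, y))

Computing the norm one sample at a time keeps the feature well-defined per example; comparing its distribution over train versus held-out samples is one way such a feature could be evaluated for membership leakage.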