Explaining Classifiers Using Adversarial Perturbations on the Perceptual Ball

Elliott, A., Law, S. and Russell, C. (2021) Explaining Classifiers Using Adversarial Perturbations on the Perceptual Ball. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 11-17 Oct 2021, pp. 10693-10702.

Text: 253950.pdf - Accepted Version (11MB)

Publisher's URL: https://openaccess.thecvf.com/ICCV2021

Abstract

We present a simple regularization of adversarial perturbations based upon the perceptual loss. While the resulting perturbations remain imperceptible to the human eye, they differ from existing adversarial perturbations in that they are semi-sparse alterations that highlight objects and regions of interest while leaving the background unaltered. As semantically meaningful adverse perturbations, they form a bridge between counterfactual explanations and adversarial perturbations in the space of images. We evaluate our approach on several standard explainability benchmarks, namely weak localization, insertion-deletion, and the pointing game, demonstrating that perceptually regularized counterfactuals are an effective explanation for image-based classifiers.

Item Type: Conference Proceedings
Additional Information: This work was supported by the Omidyar Group and The Alan Turing Institute under the UK Engineering and Physical Sciences Research Council (EPSRC) grant no. EP/N510129/1 and Accenture Plc.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Elliott, Dr Andrew
Authors: Elliott, A., Law, S., and Russell, C.
College/School: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Copyright Holders: Copyright © 2021 IEEE
First Published: First published in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): 10693-10702
Publisher Policy: Reproduced in accordance with the publisher copyright policy