Using Psychophysical Methods to Understand Mechanisms of Face Identification in a Deep Neural Network

Xu, T., Garrod, O., Scholte, S. H., Ince, R. and Schyns, P. G. (2018) Using Psychophysical Methods to Understand Mechanisms of Face Identification in a Deep Neural Network. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, June 18-22, 2018, pp. 2057-2065. ISBN 9781538661000 (doi: 10.1109/CVPRW.2018.00266)

Deep Convolutional Neural Networks (CNNs) have been one of the most influential recent developments in computer vision, particularly for categorization [20]. The promise of CNNs is at least two-fold. First, they represent the best engineering solution to successfully tackle the foundational task of visual categorization with a performance level that even exceeds that of humans [19, 27]. Second, for computational neuroscience, CNNs provide a testable modelling platform for visual categorizations inspired by the multi-layered organization of visual cortex [7]. Here, we used a 3D generative model to control the variance of information learned to identify 2,000 face identities in one CNN architecture (10-layer ResNet [9]). We generated 25M face images to train the network by randomly sampling intrinsic (i.e. face morphology, gender, age, expression and ethnicity) and extrinsic factors of face variance (i.e. 3D pose, illumination, scale and 2D translation). At testing, the network performed with 99% generalization accuracy for face identity across variations of intrinsic and extrinsic factors. State-of-the-art information mapping techniques from psychophysics (i.e. Representational Similarity Analysis [18] and Bubbles [8]) revealed respectively the network layer at which factors of variance are resolved and the face features that are used for identity. By explicitly controlling the generative factors of face information, we provide an alternative framework based on human psychophysics to understand information processing in CNNs.
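
As a rough illustration of how Representational Similarity Analysis can indicate which factor of variance a layer represents, the sketch below compares the dissimilarity structure of some layer activations with binary model matrices for two factors (identity and pose). It is not the authors' implementation: the 1 - Pearson dissimilarity measure, the binary model matrices, the function names and the synthetic "layer" activations are all assumptions made for illustration, and published RSA analyses often use a rank correlation instead of Pearson.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def activation_rdm(activations):
    """Dissimilarity matrix: 1 - Pearson r between every pair of activation vectors."""
    n = len(activations)
    return [[1.0 - pearson(activations[i], activations[j]) for j in range(n)]
            for i in range(n)]


def factor_rdm(labels):
    """Hypothetical model matrix for one factor: 0 if two images share a label, else 1."""
    return [[0.0 if a == b else 1.0 for b in labels] for a in labels]


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, flattened row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def rsa_score(activations, labels):
    """Correlation between the activation RDM and the model RDM of one factor."""
    return pearson(upper_triangle(activation_rdm(activations)),
                   upper_triangle(factor_rdm(labels)))


# Toy stand-in for one network layer: six images, two identities, three poses.
# Activations are an identity-specific prototype plus small noise (an assumption),
# so this synthetic layer encodes identity and ignores pose.
random.seed(0)
identity = [0, 0, 0, 1, 1, 1]
pose = [0, 1, 2, 0, 1, 2]
prototypes = {k: [random.gauss(0, 1) for _ in range(20)] for k in (0, 1)}
layer = [[p + random.gauss(0, 0.1) for p in prototypes[k]] for k in identity]

identity_score = rsa_score(layer, identity)
pose_score = rsa_score(layer, pose)
print(f"identity: {identity_score:.2f}, pose: {pose_score:.2f}")
```

Repeating such a comparison at each layer of a trained network, with one model matrix per generative factor, gives a profile of where each factor is represented and where it stops influencing the activations.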

Item Type: Conference Proceedings
Additional Information: PGS is funded by the Wellcome Trust (107802/Z/15/Z) and the Multidisciplinary University Research Initiative (MURI) / Engineering and Physical Sciences Research Council (EP/N019261/1).
Glasgow Author(s) Enlighten ID: Garrod, Dr Oliver and Ince, Dr Robin and Xu, Dr Tian and Schyns, Professor Philippe
Authors: Xu, T., Garrod, O., Scholte, S. H., Ince, R., and Schyns, P. G.
College/School: College of Medical, Veterinary and Life Sciences > School of Psychology & Neuroscience
Published Online: 17 December 2018
Copyright Holders: Copyright © 2018 IEEE
First Published: First published in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW): 2057-2065
Publisher Policy: Reproduced in accordance with the publisher copyright policy

Project Code | Award No | Project Name | Principal Investigator | Funder's Name | Funder Ref | Lead Dept
698281 | | Brain Algorithmics: Reverse Engineering Dynamic Information Processing Networks from MEG time series | Philippe Schyns | Wellcome Trust (WELLCOTR) | 107802/Z/15/Z | INP - CENTRE FOR COGNITIVE NEUROIMAGING
700661 | | Visual Commonsense for Scene Understanding | Philippe Schyns | Engineering and Physical Sciences Research Council (EPSRC) | EP/N019261/1 | INP - CENTRE FOR COGNITIVE NEUROIMAGING