End-to-end novel visual categories learning via auxiliary self-supervision

Qing, Y., Zeng, Y., Cao, Q. and Huang, G.-B. (2021) End-to-end novel visual categories learning via auxiliary self-supervision. Neural Networks, 139, pp. 24-32. (doi: 10.1016/j.neunet.2021.02.015)

235959.pdf - Accepted Version (1MB)
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Semi-supervised learning has largely alleviated deep learning's strong demand for large amounts of annotations. However, most methods adopt a common assumption that labeled data are always available for the same classes as the unlabeled data, which is impractical and restrictive for real-world applications. In this work, we focus on semi-supervised learning when the categories of the unlabeled data and the labeled data are disjoint. The main challenge is how to effectively transfer knowledge from the labeled data to the unlabeled data when the two sets are independent and do not share categories. Previous state-of-the-art methods construct pairwise similarity pseudo labels as supervising signals. However, two issues are commonly inherent in these methods: (1) they comprise multiple training phases, which makes it difficult to train the model in an end-to-end fashion; and (2) strong dependence on the quality of the pairwise similarity pseudo labels limits performance, as pseudo labels are vulnerable to noise and bias. We therefore propose to exploit self-supervision as an auxiliary task during model training, so that labeled and unlabeled data share the same set of surrogate labels and the overall supervising signal is strongly regularized. All modules of the proposed algorithm can then be trained simultaneously, which boosts learning capability as end-to-end learning is achieved. Moreover, we propose to utilize local structure information in the feature space during pairwise pseudo label construction, as local properties are more robust to noise. Extensive experiments have been conducted on three frequently used visual datasets, i.e., CIFAR-10, CIFAR-100, and SVHN.
Experimental results indicate the effectiveness of the proposed algorithm: we achieve new state-of-the-art performance in novel visual categories learning on all three datasets.
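The abstract does not specify which self-supervised task is used, so the sketch below is an assumption: rotation prediction, a common choice for auxiliary self-supervision. The point it illustrates is the mechanism described above — labeled and unlabeled images receive surrogate labels drawn from the same small set (the rotation index), so a single auxiliary head can be trained on the union of both in one end-to-end phase. The function name `rotation_task` and the four-way rotation scheme are illustrative, not the paper's exact design.

```python
import numpy as np

def rotation_task(images):
    """Build a shared self-supervised task from a batch of images.

    Each image is rotated by 0/90/180/270 degrees and the rotation
    index (0-3) serves as its surrogate label. Because labeled and
    unlabeled images draw surrogate labels from the same set, one
    auxiliary classification head can supervise both jointly.
    """
    rotated, surrogate = [], []
    for img in images:
        for r in range(4):
            rotated.append(np.rot90(img, r))  # rotate by r * 90 degrees
            surrogate.append(r)               # rotation index = surrogate label
    return np.stack(rotated), np.array(surrogate)

# Usage: an 8-image augmented batch with surrogate labels from 2 inputs
batch = np.arange(32, dtype=float).reshape(2, 4, 4)
rotated, labels = rotation_task(batch)
```

Because the surrogate labels are generated mechanically, no annotation is needed, which is what allows the disjoint unlabeled categories to contribute a supervised training signal.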
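The abstract also proposes exploiting local structure in feature space when constructing pairwise similarity pseudo labels, on the grounds that local properties are more robust to noise. A minimal sketch of one such criterion, assuming k-nearest-neighbour overlap as the local-structure measure (the function `pairwise_pseudo_labels` and its parameters are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def pairwise_pseudo_labels(features, k=3, threshold=0.5):
    """Construct pairwise similarity pseudo labels from local structure.

    A pair (i, j) is marked 'same class' (label 1) when the k-nearest-
    neighbour sets of i and j in feature space overlap strongly, and
    'different class' (label 0) otherwise. Neighbourhood overlap is a
    local property, less sensitive to noise than thresholding a single
    raw pairwise similarity.
    """
    # Cosine similarity between L2-normalized features
    features = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = features @ features.T
    # k nearest neighbours of each sample, excluding the sample itself
    order = np.argsort(-sim, axis=1)
    knn = order[:, 1:k + 1]
    n = len(features)
    labels = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            overlap = len(set(knn[i]) & set(knn[j])) / k
            labels[i, j] = int(overlap >= threshold)
    return labels

# Usage: two well-separated clusters of three points each
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.95, 0.05],
                  [0.0, 1.0], [0.1, 0.9], [0.05, 0.95]])
pseudo = pairwise_pseudo_labels(feats, k=2, threshold=0.5)
```

In a full pipeline these pairwise pseudo labels would supervise a binary similarity loss on the unlabeled novel categories, trained jointly with the auxiliary self-supervised task rather than in a separate phase.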

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Cao, Dr Qi
Authors: Qing, Y., Zeng, Y., Cao, Q., and Huang, G.-B.
College/School: College of Science and Engineering > School of Computing Science
Journal Name: Neural Networks
Publisher: Elsevier
ISSN: 0893-6080
ISSN (Online): 1879-2782
Published Online: 23 February 2021
Copyright Holders: Copyright © 2021 Elsevier
First Published: First published in Neural Networks 139:24-32
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher
