Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers

Huang, Z., Dai, H., Xiang, T.-Z., Wang, S., Chen, H.-X., Qin, J. and Xiong, H. (2023) Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers. In: IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2023), Vancouver, Canada, 18-22 June 2023, pp. 5557-5566. ISBN 9798350301298 (doi: 10.1109/CVPR52729.2023.00538)

Text: 296471.pdf - Accepted Version (2MB)

Abstract

Vision transformers have recently shown strong global context modeling capabilities in camouflaged object detection. However, they suffer from two major limitations: less effective locality modeling and insufficient feature aggregation in decoders, both of which hinder camouflaged object detection, a task that relies on subtle cues from indistinguishable backgrounds. To address these issues, in this paper, we propose a novel transformer-based Feature Shrinkage Pyramid Network (FSPNet), which aims to hierarchically decode locality-enhanced neighboring transformer features through progressive shrinking for camouflaged object detection. Specifically, we propose a non-local token enhancement module (NL-TEM) that employs the non-local mechanism to let neighboring tokens interact and explores graph-based high-order relations within tokens to enhance local representations of transformers. Moreover, we design a feature shrinkage decoder (FSD) with adjacent interaction modules (AIM), which progressively aggregates adjacent transformer features through a layer-by-layer shrinkage pyramid to accumulate imperceptible but effective cues as much as possible for object information decoding. Extensive quantitative and qualitative experiments demonstrate that the proposed model significantly outperforms 24 existing competitors on three challenging COD benchmark datasets under six widely used evaluation metrics. Our code is publicly available at https://github.com/ZhouHuang23/FSPNet.
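
The layer-by-layer shrinkage described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `merge_mean`, `shrink_once`, and `shrinkage_pyramid` are hypothetical, the pairing of non-overlapping adjacent features is an assumption, and the element-wise mean is only a placeholder for the learned adjacent interaction module (AIM). It shows only the general idea of repeatedly merging adjacent features until one remains.

```python
from typing import Callable, List, Sequence

Feature = List[float]  # toy stand-in for a transformer feature map


def merge_mean(a: Feature, b: Feature) -> Feature:
    """Placeholder merge (element-wise mean); NOT the paper's AIM."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def shrink_once(
    features: Sequence[Feature],
    merge: Callable[[Feature, Feature], Feature],
) -> List[Feature]:
    """Merge non-overlapping adjacent pairs; carry an unpaired last feature."""
    merged = [
        merge(features[i], features[i + 1])
        for i in range(0, len(features) - 1, 2)
    ]
    if len(features) % 2 == 1:
        merged.append(list(features[-1]))
    return merged


def shrinkage_pyramid(
    features: Sequence[Feature],
    merge: Callable[[Feature, Feature], Feature] = merge_mean,
) -> List[List[Feature]]:
    """Return every pyramid level, from the input features down to one."""
    if not features:
        raise ValueError("at least one feature is required")
    levels = [[list(f) for f in features]]
    while len(levels[-1]) > 1:
        levels.append(shrink_once(levels[-1], merge))
    return levels


if __name__ == "__main__":
    # Four toy one-dimensional "features" standing in for layer outputs.
    toy = [[0.0], [1.0], [2.0], [3.0]]
    for depth, level in enumerate(shrinkage_pyramid(toy)):
        print(depth, level)
```

In a real decoder the merge step would be a learned module operating on feature tensors, and the pairing scheme may differ from the one assumed here; the repository linked in the abstract is the reference for the actual design.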

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Dai, Dr Hang
Authors: Huang, Z., Dai, H., Xiang, T.-Z., Wang, S., Chen, H.-X., Qin, J., and Xiong, H.
College/School: College of Science and Engineering > School of Computing Science
ISSN: 2575-7075
ISBN: 9798350301298
Copyright Holders: Copyright © 2023 IEEE
First Published: First published in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher Policy: Reproduced in accordance with the publisher copyright policy
