V2VFormer++: Multi-Modal Vehicle-to-Vehicle Cooperative Perception via Global-Local Transformer

Yin, H., Tian, D., Lin, C., Duan, X., Zhou, J., Zhao, D. and Cao, D. (2023) V2VFormer++: Multi-Modal Vehicle-to-Vehicle Cooperative Perception via Global-Local Transformer. IEEE Transactions on Intelligent Transportation Systems, 25(2), pp. 2153-2166. (doi: 10.1109/TITS.2023.3314919)

307651.pdf - Accepted Version (5MB)

Abstract

Multi-vehicle cooperative perception has recently emerged as a means of extending the long-range and large-scale perception ability of connected automated vehicles (CAVs). Nonetheless, most existing efforts formulate collaborative perception as a LiDAR-only 3D detection paradigm, neglecting the significance and complementarity of dense images. In this work, we construct the first multi-modal vehicle-to-vehicle cooperative perception framework, dubbed V2VFormer++, where individual camera-LiDAR representations are incorporated with dynamic channel fusion (DCF) in bird’s-eye-view (BEV) space, and ego-centric BEV maps from adjacent vehicles are aggregated by a global-local transformer module. Specifically, a channel-token mixer (CTM) with an MLP design is developed to capture the global response among neighboring CAVs, and position-aware fusion (PAF) further investigates the spatial correlation between each pair of ego-networked maps from a local perspective. In this manner, we can strategically determine which CAVs are desirable for collaboration and how to aggregate the foremost information from them. Quantitative and qualitative experiments are conducted on the publicly available OPV2V and V2X-Sim 2.0 benchmarks, and the proposed V2VFormer++ achieves state-of-the-art cooperative perception performance, demonstrating its effectiveness and advancement. Moreover, ablation studies and visualization analyses further demonstrate strong robustness against diverse disturbances in real-world scenarios.
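The abstract names three fusion stages (DCF for intra-vehicle camera-LiDAR fusion, CTM for global cross-CAV mixing, PAF for local spatial fusion) without implementation detail. Below is a minimal PyTorch sketch of how such a pipeline could be wired together; all module names, tensor shapes, and hyperparameters are illustrative assumptions inferred from the abstract, not the authors' released code.

```python
# Hypothetical sketch of the V2VFormer++ fusion stages described in the abstract.
# Shapes and hyperparameters are illustrative assumptions only.
import torch
import torch.nn as nn

class DynamicChannelFusion(nn.Module):
    """Intra-vehicle stage (DCF): gated channel fusion of camera and LiDAR BEV maps."""
    def __init__(self, channels: int):
        super().__init__()
        # Global pooling + 1x1 conv produces per-channel gates in [0, 1].
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid()
        )
        self.proj = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, cam_bev, lidar_bev):        # each: (B, C, H, W)
        x = torch.cat([cam_bev, lidar_bev], dim=1)
        return self.proj(x) * self.gate(x)        # channel-reweighted fused BEV map

class ChannelTokenMixer(nn.Module):
    """Global stage (CTM): MLP-Mixer-style mixing across CAV tokens and channels."""
    def __init__(self, num_cavs: int, channels: int):
        super().__init__()
        self.token_mlp = nn.Sequential(            # mixes information across CAVs
            nn.LayerNorm(num_cavs), nn.Linear(num_cavs, num_cavs), nn.GELU()
        )
        self.channel_mlp = nn.Sequential(          # mixes information across channels
            nn.LayerNorm(channels), nn.Linear(channels, channels), nn.GELU()
        )

    def forward(self, x):                          # x: (B, N_cav, C) pooled BEV descriptors
        x = x + self.token_mlp(x.transpose(1, 2)).transpose(1, 2)
        return x + self.channel_mlp(x)

class PositionAwareFusion(nn.Module):
    """Local stage (PAF): cross-attention from the ego BEV map to neighbor maps."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, ego, neighbors):
        # ego: (B, H*W, C) flattened ego BEV map; neighbors: (B, N*H*W, C)
        fused, _ = self.attn(query=ego, key=neighbors, value=neighbors)
        return ego + fused

if __name__ == "__main__":
    cam, lidar = torch.randn(2, 64, 100, 100), torch.randn(2, 64, 100, 100)
    bev = DynamicChannelFusion(64)(cam, lidar)                     # (2, 64, 100, 100)
    tokens = bev.flatten(2).mean(-1).unsqueeze(1).repeat(1, 3, 1)  # fake 3-CAV descriptors
    mixed = ChannelTokenMixer(num_cavs=3, channels=64)(tokens)     # (2, 3, 64)
    ego = bev.flatten(2).transpose(1, 2)                           # (2, 10000, 64)
    out = PositionAwareFusion(64)(ego, ego.repeat(1, 3, 1))        # (2, 10000, 64)
```

Under this reading, CTM acts on compact per-vehicle descriptors to decide globally which collaborators matter, while PAF attends over full spatial maps to pick up fine-grained local correspondences; the paper itself should be consulted for the actual layer configurations.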

Item Type: Articles
Additional Information: This work was supported in part by the National Key Research and Development Program of China under Grant 2022YFC3803700; and in part by the National Natural Science Foundation of China under Grant U20A20155, Grant 62173012, and Grant 52202391.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Zhao, Dr Dezong
Authors: Yin, H., Tian, D., Lin, C., Duan, X., Zhou, J., Zhao, D., and Cao, D.
College/School: College of Science and Engineering > School of Engineering > Autonomous Systems and Connectivity
Journal Name: IEEE Transactions on Intelligent Transportation Systems
Publisher: IEEE
ISSN: 1524-9050
ISSN (Online): 1558-0016
Copyright Holders: Copyright © 2023 IEEE
First Published: First published in IEEE Transactions on Intelligent Transportation Systems 25(2): 2153-2166
Publisher Policy: Reproduced in accordance with the publisher copyright policy
