V2VSR: Keypoints Feature Fusion-Based Cooperative Perception Method Under Communication Delays

Authors: Junxi Chen, Xiangfeng Luo, Liyan Ma, and Xue Chen
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 3509-3526
Keywords: Cooperative perception, 3D object detection, Sparse-Residual convolution, Key Feature Selection, Transformer

Abstract

Single-agent LiDAR-based perception has progressed significantly but remains constrained by factors such as sensor range and occlusion. Multi-agent point cloud cooperative perception leverages inter-agent communication to share sensory information, thereby enhancing the perception capabilities of individual agents. Existing methods often assume ideal communication conditions. In the real world, however, data transmission delays are inevitable, so the central agent can receive outdated and therefore inaccurate features, which significantly misguides the perception results. This paper proposes a novel framework for multi-agent point cloud cooperative perception that efficiently extracts key features and reduces latency. Specifically, we introduce a Sparse-Residual PointPillar (SRPP) backbone, which improves inference speed and enlarges the receptive field, and a Pillar Set Abstract Module (PSM), which abstracts scenes into compact keypoint features and significantly reduces the size of the shared feature map. Additionally, we employ an inter-agent attention module. Because the main agent's own feature map requires no transmission and thus incurs no latency, the module uses it to correct potential feature distortions and to mitigate the impact of the delays that cannot be avoided, thereby improving system robustness. Our method reduces the shared feature map size to less than 0.1 MB, approximately 40 times smaller than most state-of-the-art methods. Even with significantly smaller shared feature maps, our model still outperforms other methods under ideal communication conditions and shows a substantial advantage under delayed communication, indicating that our method significantly enhances the perception system's performance and delay robustness.
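
To make the size claim concrete, the following is a minimal sketch of the arithmetic behind sharing a small keypoint set instead of a dense feature map. It is not the authors' PSM: the top-k selection rule, the function names, and all dimensions (a 128 x 128 x 64 float32 map, 384 keypoints, 2 coordinates per keypoint) are assumptions chosen only so that the result lands in the same range as the figures quoted in the abstract. Sizes are reported in MiB.

```python
import heapq
import random

BYTES_PER_FLOAT32 = 4


def dense_map_bytes(height, width, channels):
    """Payload size of a dense float32 feature map."""
    return height * width * channels * BYTES_PER_FLOAT32


def keypoint_set_bytes(num_keypoints, channels, coord_dims=2):
    """Payload size of a keypoint set: one feature vector plus coordinates each."""
    return num_keypoints * (channels + coord_dims) * BYTES_PER_FLOAT32


def select_keypoints(scores, k):
    """Return the (row, col) indices of the k highest-scoring cells.

    Plain top-k selection is a stand-in here; the paper's actual
    selection mechanism is not described in the abstract.
    """
    cells = (
        (score, (r, c))
        for r, row in enumerate(scores)
        for c, score in enumerate(row)
    )
    return [rc for _, rc in heapq.nlargest(k, cells)]


# Hypothetical dimensions, not taken from the paper.
H, W, C, K = 128, 128, 64, 384

random.seed(0)
scores = [[random.random() for _ in range(W)] for _ in range(H)]  # placeholder scores
keypoints = select_keypoints(scores, K)

dense = dense_map_bytes(H, W, C)
sparse = keypoint_set_bytes(len(keypoints), C)

print(f"dense map:    {dense / 2**20:.2f} MiB")
print(f"keypoint set: {sparse / 2**20:.3f} MiB")
print(f"reduction:    {dense / sparse:.1f}x")
```

Under these assumed dimensions the keypoint payload stays below 0.1 MiB and is roughly 40 times smaller than the dense map. The ratio depends entirely on the chosen map size and keypoint count, so it illustrates the scale of the saving rather than reproducing the paper's measurement.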