Abstract:
To improve the efficiency and accuracy of attention-mask prediction for panoptic segmentation on point clouds, an end-to-end point cloud panoptic segmentation network guided by multi-modal bird's eye view (BEV) features is proposed. First, object queries are generated from BEV features decoded by the Transformer, and feature enhancement is achieved through confidence ranking and positional-encoding embedding. Second, a cross-attention module is constructed to fuse the object queries with learnable query features, and the query features enriched with object instance information are used to improve the accuracy of attention-mask prediction. Finally, the dimensionality of the features input to the masked-attention network is reduced to increase detection speed. Experimental results on the nuScenes dataset show that, compared with the baseline method, BEVGuide-PS improves the panoptic segmentation metrics PQ, PQ†, RQ, and SQ by 17.7%, 17.0%, 18.3%, and 20.9%, respectively, reduces inference time by 58.4%, and significantly improves training efficiency. © 2025 Universität zu Köln. All rights reserved.
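The cross-attention fusion the abstract describes can be sketched as standard scaled dot-product attention, with the learnable query features forming the queries and the BEV-derived object queries supplying the keys and values. This is a minimal illustrative sketch, not the paper's implementation; all names, shapes, and the query counts are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(learnable_queries, object_queries):
    """Fuse learnable query features with BEV-derived object queries.

    Q comes from the learnable queries; K and V come from the object
    queries, so each learnable query aggregates object instance
    information (per-head projections omitted for brevity).
    """
    d = learnable_queries.shape[-1]
    scores = learnable_queries @ object_queries.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)          # rows sum to 1
    return weights @ object_queries             # fused query features

rng = np.random.default_rng(0)
learnable = rng.normal(size=(100, 256))  # hypothetical: 100 learnable mask queries
obj = rng.normal(size=(30, 256))         # hypothetical: 30 BEV object queries
fused = cross_attention(learnable, obj)
print(fused.shape)
```

In a full model the queries, keys, and values would pass through learned linear projections and multiple heads; the sketch keeps only the attention arithmetic to show how object instance information flows into the mask queries.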
Source: Laser and Optoelectronics Progress
ISSN: 1006-4125
Year: 2025
Issue: 12
Volume: 62
Impact Factor: 0.900 (JCR@2023)
ESI Highly Cited Papers on the List: 0