VoxT-GNN: A 3D object detection approach from point cloud based on voxel-level transformer and graph neural network

Authors:

Zheng, Q. [1] | Wu, S. [2] | Wei, J. [3]

Indexed by：

Scopus

Abstract：

Recently, a variety of LiDAR-based 3D detection methods have exhibited competitive performance on single-class objects, large objects, or straightforward scenes. However, their detection performance in complex scenarios with multi-sized and multi-class objects is limited. We observe that the core problem behind this limitation is insufficient feature learning for small objects in point clouds, which makes it difficult to obtain discriminative features. To address this challenge, we propose VoxT-GNN, a point-cloud-based 3D object detection framework that explicitly accounts for small objects. The framework comprises two core components: a Voxel-Level Transformer (VoxelFormer) for local feature learning and a Graph Neural Network Feed-Forward Network (GnnFFN) for global feature learning. By embedding GnnFFN as an intermediate layer between the encoder and decoder of VoxelFormer, we achieve flexible scaling of the global receptive field while maximally preserving the original point cloud structure. This design enables effective adaptation to objects of varying sizes and categories, providing a viable solution for detection applications across diverse scenarios. Extensive experiments on KITTI and the Waymo Open Dataset (WOD) demonstrate the strong competitiveness of our method, with particularly significant improvements in small object detection. Notably, our approach achieves the second-highest mAP of 65.44% across three categories (car, pedestrian, and cyclist) on the KITTI benchmark. The source code is available at https://github.com/yujianxinnian/VoxT-GNN. © 2025 The Author(s)
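
The two-stage idea in the abstract (local features per voxel, then global features exchanged over a graph of voxels) can be illustrated with a toy sketch. This is not the authors' implementation: mean pooling stands in for the VoxelFormer attention, a single neighbour-averaging step stands in for GnnFFN, and every function name and parameter below (`voxelize`, `knn_graph`, `message_pass`, `voxel_size`, `k`) is a hypothetical chosen for illustration. Under those assumptions, raising `k` is one simple way to widen the receptive field of the graph stage.

```python
from collections import defaultdict
from math import dist, floor

def voxelize(points, voxel_size):
    """Group 3D points by the integer voxel cell that contains them."""
    voxels = defaultdict(list)
    for p in points:
        key = tuple(floor(c / voxel_size) for c in p)
        voxels[key].append(p)
    return voxels

def voxel_centroids(voxels):
    """Local stage stand-in: summarize each voxel by the mean of its points."""
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in voxels.items()
    }

def knn_graph(centroids, k):
    """Connect every voxel to its k nearest voxels by centroid distance."""
    graph = {}
    for key, c in centroids.items():
        others = sorted((o for o in centroids if o != key),
                        key=lambda o: dist(c, centroids[o]))
        graph[key] = others[:k]
    return graph

def message_pass(features, graph):
    """Global stage stand-in: average each voxel feature with its neighbours'."""
    out = {}
    for key, feat in features.items():
        group = [feat] + [features[n] for n in graph[key]]
        out[key] = tuple(sum(axis) / len(group) for axis in zip(*group))
    return out

# Hypothetical toy input: four points that fall into three voxels.
points = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (1.2, 0.1, 0.1), (5.5, 0.1, 0.1)]
voxels = voxelize(points, voxel_size=1.0)
local = voxel_centroids(voxels)      # per-voxel local features
graph = knn_graph(local, k=1)        # larger k widens the receptive field
fused = message_pass(local, graph)   # features after one graph step
```

Because the graph is built over the occupied voxels only, empty space never enters the computation, which loosely mirrors the abstract's point about preserving the original point cloud structure.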

Keywords:

3D object detection; Graph Neural Network (GNN); Point cloud; Transformer

Affiliations:

[1] Zheng, Q.: The College of Computer and Data Science, Fuzhou University, China
[2] Wu, S.: The Academy of Digital China (Fujian), Fuzhou University, China
[3] Wei, J.: The College of Computer and Data Science, Fuzhou University, China

Related Articles:

VoxT-GNN: A 3D object detection approach from point cloud based on voxel-level transformer and graph neural network
2025, Information Processing & Management
VoxTNT: A Multi-Scale Transformer-based Approach for 3D Object Detection in Point Clouds
2025, Journal of Geo-Information Science
Multi-Level Rotational Equivariant Object Detection Network Based on BEV Fusion
2024, Computer Engineering
IRBEVF-Q: Optimization of Image-Radar Fusion Algorithm Based on Bird's Eye View Features
2024, Sensors
DSC3D: Deformable Sampling Constraints in Stereo 3D Object Detection for Autonomous Driving
2025, IEEE Transactions on Circuits and Systems for Video Technology

Source ：

Information Processing and Management

ISSN： 0306-4573

Year： 2025

Issue： 4

Volume： 62

Impact Factor: 7.400 (JCR@2023)

CAS Journal Grade：1

ESI Highly Cited Papers on the List: 0
