CFRLA-Net: A Context-Aware Feature Representation Learning Anchor-Free Network for Pedestrian Detection

Authors:

Li, Jun ^[1] | Bi, Yuquan ^[2] | Wang, Sumei ^[3] | Li, Qiming ^[4]

Indexed by:

EI, Scopus, SCIE

Abstract:

High resolution and strong semantic representation are both vital for feature extraction networks of pedestrian detection. The existing high-resolution network (HRNet) has presented a promising performance for pedestrian detection. However, we observed that it still has some significant shortcomings for heavily occluded and small-scale pedestrians. In this paper, we propose to address the shortcomings by extracting semantic and spatial context from HRNet. Specifically, we propose a Context-aware Feature Representation Learning Module (CFRL-Module), which combines a Multi-scale Feature Context Extraction Parallel Block for Convolution and Self-attention (CEPCA-Block) with two parallel paths and an Equivalent FFN (EFFN) Block. The core CEPCA-Block adopts a parallel design to integrate convolution and multi-head self-attention (MHSA) with low parameter computational cost, which can obtain the deep semantic context by convolution path and precise context by MHSA path. Furthermore, to overcome the inefficiency of global MHSA in high-resolution pedestrian detection, we propose a novel local window MHSA, which can significantly reduce memory consumption but barely affect the detection performance. Cascading the proposed CFRL-Module with the anchor-free detection head constitutes our Context-aware Feature Representation Learning Anchor-Free Network (CFRLA-Net). The proposed CFRLA-Net can catch a high-level understanding of the heavily occluded and small-scale pedestrian instances based on HRNet, which can effectively solve the limitation of the insufficient feature extraction ability of HRNet for the hard samples. Experimental results show that CFRLA-Net achieves state-of-the-art performance on CityPersons, Caltech, and CrowdHuman benchmarks.
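
Note on the local window MHSA mentioned in the abstract: the sketch below is not the authors' implementation. It only illustrates, with generic arithmetic, why restricting self-attention to local windows lowers memory use. The function names, the non-overlapping window layout, the feature-map size, and the window size are all assumptions made for this example; the abstract does not specify them.

```python
def global_attention_entries(height, width):
    """Pairwise attention scores per head when every token attends to every token."""
    tokens = height * width
    return tokens * tokens


def window_attention_entries(height, width, window):
    """Pairwise attention scores per head when attention is restricted to
    non-overlapping window x window patches (an assumed layout)."""
    if height % window or width % window:
        raise ValueError("height and width must be divisible by window")
    num_windows = (height // window) * (width // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window * tokens_per_window


if __name__ == "__main__":
    # Hypothetical high-resolution feature map and window size.
    h, w, win = 256, 512, 8
    g = global_attention_entries(h, w)
    l = window_attention_entries(h, w, win)
    print("global:", g, "windowed:", l, "ratio:", g // l)
```

Under these assumptions the number of attention scores grows quadratically with the token count for global attention, but only linearly (times the squared window area) for windowed attention, which is the general reason window-based schemes save memory on high-resolution inputs.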

Keywords:

anchor-free; context; HRNet; occluded and small-scale pedestrians; pedestrian detection; self-attention

Affiliations:

[ 1 ] [Li, Jun]Fuzhou Univ, Dept Adv Mfg, Quanzhou 362200, Fujian, Peoples R China
[ 2 ] [Bi, Yuquan]Fuzhou Univ, Dept Adv Mfg, Quanzhou 362200, Fujian, Peoples R China
[ 3 ] [Li, Qiming]Fuzhou Univ, Dept Adv Mfg, Quanzhou 362200, Fujian, Peoples R China
[ 4 ] [Li, Jun]Chinese Acad Sci, Quanzhou Inst Equipment Mfg, Haixi Inst, Lab Robot & Intelligent Syst, Quanzhou 362216, Fujian, Peoples R China
[ 5 ] [Li, Qiming]Chinese Acad Sci, Quanzhou Inst Equipment Mfg, Haixi Inst, Lab Robot & Intelligent Syst, Quanzhou 362216, Fujian, Peoples R China
[ 6 ] [Wang, Sumei]Hong Kong Polytech Univ, Dept Civil & Environm Engn, Hong Kong, Peoples R China