author:

Chen, Danya [1] | Wu, Lijun [2] (Scholars: 吴丽君) | Chen, Zhicong [3] (Scholars: 陈志聪) | Lin, Xufeng [4]

Indexed by:

CPCI-S EI Scopus

Abstract:

Recently, CNN-Transformer hybrid networks have been proposed to address both the heavy computational burden of CNNs and the difficulty of training Transformer-based networks. In this work, we design an efficient and effective CNN-Transformer hybrid network for human pose estimation, named CTHPose. Specifically, a Polarized CNN Module extracts features rich in visual semantic cues, which helps the subsequent Transformer encoders converge. A Pyramid Transformer Module models the long-range relationships between human body parts with a lightweight structure and lower computational complexity. Establishing long-range relationships requires a large field of view in the Transformer, which leads to a large computational workload. Hence, instead of attending over the entire feature map, we introduce a reorganized small sliding window that provides the required large field of view. Finally, a Heatmap Generator reconstructs the 2D heatmaps from the 1D keypoint representation, balancing parameters and FLOPs while still producing accurate predictions. In quantitative comparisons with CNN estimators, CTHPose significantly reduces the number of network parameters and GFLOPs while also providing better detection accuracy. Compared with mainstream pure Transformer networks and state-of-the-art CNN-Transformer hybrid networks, it achieves competitive performance, and visual comparison shows it is more robust to interference from clothing patterns and to overlapping limbs.
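The workload argument in the abstract can be illustrated with a rough count of pairwise attention scores. This is only a sketch under stated assumptions, not the CTHPose implementation: the function names, the 64 x 48 feature map, and the window size of 8 are hypothetical, and a plain non-overlapping window partition stands in for the paper's reorganized sliding window, whose details the abstract does not give.

```python
def full_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every token attends to every token."""
    tokens = height * width
    return tokens * tokens


def windowed_attention_pairs(height: int, width: int, window: int) -> int:
    """Query-key pairs when each token attends only inside its own window.

    Assumption: the map is split into non-overlapping window x window tiles.
    """
    if height % window or width % window:
        raise ValueError("window must divide both height and width")
    tokens = height * width
    return tokens * window * window


if __name__ == "__main__":
    h, w, win = 64, 48, 8  # hypothetical sizes, chosen only for illustration
    full = full_attention_pairs(h, w)
    local = windowed_attention_pairs(h, w, win)
    print(f"full attention pairs:     {full}")
    print(f"windowed attention pairs: {local}")
    print(f"reduction factor:         {full // local}")
```

Global attention grows quadratically with the number of tokens, while window-restricted attention grows linearly for a fixed window size, which is why a window that still covers a large field of view is attractive.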

Keyword:

Human pose estimation; Long-range dependency; Transformer

Affiliations:

  • [ 1 ] [Chen, Danya]Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou, Peoples R China
  • [ 2 ] [Wu, Lijun]Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou, Peoples R China
  • [ 3 ] [Chen, Zhicong]Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou, Peoples R China
  • [ 4 ] [Lin, Xufeng]Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou, Peoples R China

Reprint Address:

  • [Wu, Lijun]Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT V

ISSN: 0302-9743

Year: 2024

Volume: 14429

Page: 327-339

0.402 (JCR@2005)

Cited Count:

WoS CC Cited Count:

SCOPUS Cited Count: 1

ESI Highly Cited Papers on the List: 0
