Integrating spatial details with long-range contexts for semantic segmentation of very high resolution remote sensing images

Author：

Long, J. [1] | Li, M. [2] | Wang, X. [3]

Indexed by：

Scopus

Abstract：

This paper presents a cross-learning network, CLCFormer, that integrates fine-grained spatial details with long-range global contexts by combining a convolutional neural network (CNN) and a transformer for semantic segmentation of very high-resolution (VHR) remote sensing images. More specifically, CLCFormer comprises two parallel encoders, one derived from a CNN and one from a transformer, and a CNN decoder. The encoders are built on SwinV2 and EfficientNet-B3 backbones, and the semantic features they extract are aggregated at multiple levels using a bilateral feature fusion module. First, we used attention gate modules to enhance feature representation, improving segmentation results for objects of various shapes and sizes. Second, we used an attention residual module to refine the learning of spatial features, alleviating boundary blurring of occluded objects. Finally, we developed a new strategy, called the auxiliary supervise strategy, for model optimization to further improve segmentation performance. Our method was tested on the WHU, Inria, and Potsdam datasets and compared with CNN-based and transformer-based methods. Results showed that our method achieved state-of-the-art performance on the WHU building dataset (92.31% IoU), the Inria building dataset (83.71% IoU), and the Potsdam dataset (80.27% mIoU). We concluded that CLCFormer is a flexible, robust, and effective method for the semantic segmentation of VHR images. The code of the proposed model is available at https://github.com/long123524/CLCFormer.
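The abstract reports results as IoU (building datasets) and mIoU (Potsdam). As a reading aid, the following is a minimal sketch of how these metrics are commonly computed from predicted and reference label maps. It is not taken from the CLCFormer repository; the function names `class_iou` and `mean_iou` and the choice to skip classes absent from both maps are assumptions made for illustration.

```python
from typing import List, Sequence


def class_iou(pred: Sequence[int], target: Sequence[int], cls: int) -> float:
    """IoU of one class over flattened label maps: |P ∩ T| / |P ∪ T|."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Pixels labelled `cls` in both maps (intersection) or in either map (union).
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    # Assumed convention: a class absent from both maps yields NaN.
    return inter / union if union else float("nan")


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean of the per-class IoU values, ignoring NaN entries."""
    scores: List[float] = [class_iou(pred, target, c) for c in range(num_classes)]
    valid = [s for s in scores if s == s]  # NaN is the only value not equal to itself
    return sum(valid) / len(valid) if valid else float("nan")


if __name__ == "__main__":
    # Toy 4-pixel example: 1 = building, 0 = background.
    pred = [1, 1, 0, 0]
    target = [1, 0, 0, 0]
    print(class_iou(pred, target, cls=1))
    print(mean_iou(pred, target, num_classes=2))
```

For a binary building dataset, the reported IoU corresponds to `class_iou` for the building class, whereas mIoU averages over all classes.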

Keyword：

auxiliary supervise; Buildings; CLCFormer; Convolution; Convolutional neural networks; convolutional neural networks (CNNs); Feature extraction; Semantics; Semantic segmentation; Tiles; transformer; Transformers; very high-resolution (VHR) images

Affiliation：

[ 1 ] [Long, J.] Key Lab of Spatial Data Mining & Information Sharing of Ministry of Education, Academy of Digital China (Fujian), Fuzhou University, China
[ 2 ] [Li, M.] Key Lab of Spatial Data Mining & Information Sharing of Ministry of Education, Academy of Digital China (Fujian), Fuzhou University, China
[ 3 ] [Wang, X.] Key Lab of Spatial Data Mining & Information Sharing of Ministry of Education, Academy of Digital China (Fujian), Fuzhou University, China

Related Articles：

Integrating Spatial Details With Long-Range Contexts for Semantic Segmentation of Very High-Resolution Remote-Sensing Images
2023，IEEE GEOSCIENCE AND REMOTE SENSING LETTERS
HSINet: A Hybrid Semantic Integration Network for Medical Image Segmentation
2025，19th Chinese Conference on Image and Graphics Technologies and Applications, IGTA 2024
Building Type Classification Using CNN-Transformer Cross-Encoder Adaptive Learning From Very High Resolution Satellite Images
2025，IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING
SegFormer-Based Cotton Planting Areas Extraction from High-Resolution Remote Sensing Images
2023，11th International Conference on Agro-Geoinformatics, Agro-Geoinformatics 2023
Attention-Guided CNN-Transformer Hybrid Network for Hyperspectral Image Classification
2023，7th Asian Conference on Artificial Intelligence Technology, ACAIT 2023

Source ：

IEEE Geoscience and Remote Sensing Letters

ISSN： 1545-598X

Year： 2023

Volume： 20

Page： 1-1

4.0 (JCR@2023)

4.000 (JCR@2023)

ESI HC Threshold：26

JCR Journal Grade：1

CAS Journal Grade：3

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 26

ESI Highly Cited Papers on the List： 0

