Authors:

Chen, Y. [1] | Zhan, L. [2] | Zhao, Y. [3] | Xiong, S. [4] | Lu, X. [5]

Indexed by:

Scopus

Abstract:

This article introduces a task named visual grounding of remote sensing ship (VGRSS) images. The goal of VGRSS is to locate ship objects in remote sensing images guided by natural language. Extensive research has been conducted on multimodal processing of remote sensing images and text to retrieve rich information from remote sensing images using natural language. However, due to the unique characteristics of remote sensing ship images, ship localization using natural language remains a challenge. Therefore, in this work, we construct datasets for the VGRSS task and explore deep learning models. Specifically, our contributions can be summarized as follows: first, we construct two remote sensing ship datasets for visual grounding. One is based on the optical remote sensing ship target detection benchmark dataset, named RSSVG, while the other is based on the synthetic aperture radar (SAR) dataset, named SARVG. Second, we propose a language-guided visual feature enhancement (LVFE) module. This module enhances visual features through language guidance before visual-linguistic fusion (VLF). Third, we propose a VLF module based on multimodal feature stacking. This module inputs the stacked language and visual features, and then performs feature fusion using a Transformer, enabling effective cross-modal interaction and integration. Fourth, we introduce a novel loss calculation method by incorporating enhanced intersection over union (EIoU) into the loss function. Finally, we benchmark extensive state-of-the-art (SOTA) natural image visual grounding (VG) methods on the constructed RSSVG and SARVG datasets, then provide insightful analysis based on the results. This work offers valuable insights for developing better VGRSS models. © 1980-2012 IEEE.
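The abstract mentions adding an enhanced intersection over union (EIoU) term to the loss but does not give its formula. The sketch below is only an illustration of how an IoU-based box regression loss of this kind can be computed. It assumes the commonly cited EIoU form (1 - IoU plus normalized center-distance, width, and height penalties), which may differ from the authors' definition. The function names `iou` and `eiou_style_loss` and the `(x1, y1, x2, y2)` box layout are hypothetical choices for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap width and height, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union


def eiou_style_loss(pred, target):
    """Assumed EIoU-style loss; not confirmed as the paper's exact definition."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    # Squared distance between box centers, normalized by the enclosing diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch)
    # Squared width and height differences, normalized by the enclosing box.
    width_term = ((px2 - px1) - (tx2 - tx1)) ** 2 / (cw * cw)
    height_term = ((py2 - py1) - (ty2 - ty1)) ** 2 / (ch * ch)
    return 1.0 - iou(pred, target) + center_term + width_term + height_term
```

Under these assumptions, a prediction that matches the ground-truth box exactly gives a loss of zero, and the penalty terms still provide a signal when the two boxes do not overlap at all.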

Keyword:

Language-guided visual feature enhancement (LVFE); Transformer; VG of remote sensing ship (VGRSS) images; visual grounding (VG) dataset

Affiliations:

  • [ 1 ] [Chen Y.] Wuhan University of Technology, Sanya Science and Education Innovation Park, Sanya, 572000, China
  • [ 2 ] [Chen Y.] Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan, 430070, China
  • [ 3 ] [Chen Y.] Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
  • [ 4 ] [Zhan L.] Wuhan University of Technology, Sanya Science and Education Innovation Park, Sanya, 572000, China
  • [ 5 ] [Zhan L.] Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan, 430070, China
  • [ 6 ] [Zhan L.] Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
  • [ 7 ] [Zhao Y.] Wuhan University of Technology, Sanya Science and Education Innovation Park, Sanya, 572000, China
  • [ 8 ] [Zhao Y.] Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan, 430070, China
  • [ 9 ] [Zhao Y.] Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
  • [ 10 ] [Xiong S.] Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
  • [ 11 ] [Xiong S.] Interdisciplinary Artificial Intelligence Research Institute, Wuhan College, Wuhan, 430212, China
  • [ 12 ] [Lu X.] Fuzhou University, College of Physics and Information Engineering, Fuzhou, 350108, China

Source:

IEEE Transactions on Geoscience and Remote Sensing

ISSN: 0196-2892

Year: 2025

Volume: 63

JCR@2023: 7.500
