Author:

Chen, Yaxiong [1] | Zhan, Liwen [2] | Zhao, Yichen [3] | Xiong, Shengwu [4] | Lu, Xiaoqiang [5]

Indexed by:

SCIE

Abstract:

This article introduces a task named visual grounding of remote sensing ship (VGRSS) images. The goal of VGRSS is to locate ship objects in remote sensing images guided by natural language. Extensive research has been conducted on multimodal processing of remote sensing images and text to retrieve rich information from remote sensing images using natural language. However, due to the unique characteristics of remote sensing ship images, ship localization using natural language remains a challenge. Therefore, in this work, we construct datasets for the VGRSS task and explore deep learning models. Specifically, our contributions can be summarized as follows: first, we construct two remote sensing ship datasets for visual grounding. One is based on the optical remote sensing ship target detection benchmark dataset, named RSSVG, while the other is based on the synthetic aperture radar (SAR) dataset, named SARVG. Second, we propose a language-guided visual feature enhancement (LVFE) module. This module enhances visual features through language guidance before visual-linguistic fusion (VLF). Third, we propose a VLF module based on multimodal feature stacking. This module inputs the stacked language and visual features, and then performs feature fusion using a Transformer, enabling effective cross-modal interaction and integration. Fourth, we introduce a novel loss calculation method by incorporating enhanced intersection over union (EIoU) into the loss function. Finally, we benchmark extensive state-of-the-art (SOTA) natural image visual grounding (VG) methods on the constructed RSSVG and SARVG datasets, then provide insightful analysis based on the results. This work offers valuable insights for developing better VGRSS models.
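The fourth contribution adds an EIoU term to the box-regression loss, but the abstract does not define it. As a rough illustration only, the sketch below assumes the widely used formulation that adds centre-distance, width, and height penalties to `1 - IoU`; the authors' "enhanced" IoU may differ. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` parameter are hypothetical choices for this example and are not taken from the paper.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU-style loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed formulation (not confirmed by the abstract):
    1 - IoU + centre-distance penalty + width penalty + height penalty,
    each penalty normalised by the smallest enclosing box.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Box sizes, union area, and plain IoU.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both inputs.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centres, over the squared enclosing diagonal.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    centre_penalty = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Squared width and height differences, over the enclosing width and height.
    width_penalty = (pw - tw) ** 2 / (cw * cw + eps)
    height_penalty = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + centre_penalty + width_penalty + height_penalty
```

Under this assumed formulation the loss is close to zero for identical boxes and stays informative for non-overlapping boxes, because the centre-distance term still changes with position when IoU is zero.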

Keyword:

Accuracy; Artificial intelligence; Benchmark testing; Feature extraction; Grounding; Language-guided visual feature enhancement (LVFE); Linguistics; Marine vehicles; Remote sensing; Transformer; Transformers; VG of remote sensing ship (VGRSS) images; visual grounding (VG) dataset; Visualization

Affiliations:

  • [ 1 ] [Chen, Yaxiong]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
  • [ 2 ] [Zhan, Liwen]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
  • [ 3 ] [Zhao, Yichen]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
  • [ 4 ] [Chen, Yaxiong]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
  • [ 5 ] [Zhan, Liwen]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
  • [ 6 ] [Zhao, Yichen]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
  • [ 7 ] [Chen, Yaxiong]Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
  • [ 8 ] [Zhan, Liwen]Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
  • [ 9 ] [Zhao, Yichen]Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
  • [ 10 ] [Xiong, Shengwu]Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
  • [ 11 ] [Xiong, Shengwu]Wuhan Coll, Interdisciplinary Artificial Intelligence Res Inst, Wuhan 430212, Peoples R China
  • [ 12 ] [Lu, Xiaoqiang]Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou 350108, Peoples R China

Reprint Address:

  • [Xiong, Shengwu]Wuhan Coll, Interdisciplinary Artificial Intelligence Res Inst, Wuhan 430212, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING

ISSN: 0196-2892

Year: 2025

Volume: 63

Impact Factor: 7.500 (JCR@2023)

ESI Highly Cited Papers on the List: 0
