Authors:

Chen, Yaxiong [1] | Zhan, Liwen [2] | Zhao, Yichen [3] | Xiong, Shengwu [4] | Lu, Xiaoqiang [5]

Indexed by:

EI

Abstract:

This article introduces a task named visual grounding of remote sensing ship (VGRSS) images. The goal of VGRSS is to locate ship objects in remote sensing images guided by natural language. Extensive research has been conducted on multimodal processing of remote sensing images and text to retrieve rich information from remote sensing images using natural language. However, due to the unique characteristics of remote sensing ship images, ship localization using natural language remains a challenge. Therefore, in this work, we construct datasets for the VGRSS task and explore deep learning models. Specifically, our contributions can be summarized as follows: first, we construct two remote sensing ship datasets for visual grounding. One is based on the optical remote sensing ship target detection benchmark dataset, named RSSVG, while the other is based on the synthetic aperture radar (SAR) dataset, named SARVG. Second, we propose a language-guided visual feature enhancement (LVFE) module. This module enhances visual features through language guidance before visual-linguistic fusion (VLF). Third, we propose a VLF module based on multimodal feature stacking. This module inputs the stacked language and visual features, and then performs feature fusion using a Transformer, enabling effective cross-modal interaction and integration. Fourth, we introduce a novel loss calculation method by incorporating enhanced intersection over union (EIoU) into the loss function. Finally, we benchmark extensive state-of-the-art (SOTA) natural image visual grounding (VG) methods on the constructed RSSVG and SARVG datasets, then provide insightful analysis based on the results. This work offers valuable insights for developing better VGRSS models. © 1980-2012 IEEE.

Keyword:

Deep learning; Image enhancement; Linguistics; Modeling languages; Natural language processing systems; Problem oriented languages; Ships; Visual languages

Affiliations:

  • [ 1 ] [Chen, Yaxiong] Wuhan University of Technology, Sanya Science and Education Innovation Park, Sanya; 572000, China
  • [ 2 ] [Chen, Yaxiong] Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan; 430070, China
  • [ 3 ] [Chen, Yaxiong] Shanghai Artificial Intelligence Laboratory, Shanghai; 200232, China
  • [ 4 ] [Zhan, Liwen] Wuhan University of Technology, Sanya Science and Education Innovation Park, Sanya; 572000, China
  • [ 5 ] [Zhan, Liwen] Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan; 430070, China
  • [ 6 ] [Zhan, Liwen] Shanghai Artificial Intelligence Laboratory, Shanghai; 200232, China
  • [ 7 ] [Zhao, Yichen] Wuhan University of Technology, Sanya Science and Education Innovation Park, Sanya; 572000, China
  • [ 8 ] [Zhao, Yichen] Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan; 430070, China
  • [ 9 ] [Zhao, Yichen] Shanghai Artificial Intelligence Laboratory, Shanghai; 200232, China
  • [ 10 ] [Xiong, Shengwu] Shanghai Artificial Intelligence Laboratory, Shanghai; 200232, China
  • [ 11 ] [Xiong, Shengwu] Interdisciplinary Artificial Intelligence Research Institute, Wuhan College, Wuhan; 430212, China
  • [ 12 ] [Lu, Xiaoqiang] Fuzhou University, College of Physics and Information Engineering, Fuzhou; 350108, China

Reprint Address:

  • [Xiong, Shengwu] Interdisciplinary Artificial Intelligence Research Institute, Wuhan College, Wuhan; 430212, China
  • [Xiong, Shengwu] Shanghai Artificial Intelligence Laboratory, Shanghai; 200232, China

Source:

IEEE Transactions on Geoscience and Remote Sensing

ISSN: 0196-2892

Year: 2025

Volume: 63

Impact Factor: 7.500 (JCR@2023)

CAS Journal Grade: 1
