author:

Ke, X. [1] (Scholars: 柯逍) | Chen, B. [2] | Cai, Y. [3] | Liu, H. [4] | Guo, W. [5] | Chen, W. [6]

Indexed by:

Scopus

Abstract:

Different modalities differ greatly in data distribution and feature representation, so retrieving data flexibly and accurately across modalities is a challenging problem. Mainstream common subspace methods focus only on the heterogeneity gap and use a unified method to jointly learn the common representation of different modalities, which can make it difficult to fit multiple modalities in a unified way. In this work, we first introduce the concept of multi-modal information density discrepancy and propose a modality-specific adaptive scaling method incorporating prior knowledge, which adaptively learns the most suitable network for each modality. Second, to address efficient semantic fusion and interference features, we propose a multi-level modal feature attention mechanism, which fuses text semantics efficiently through attention and explicitly captures and shields interference features at multiple scales. In addition, to address the bottleneck in cross-modal retrieval caused by the insufficient quality of the multimodal common subspace and the defects of the Transformer structure, we propose a cross-level interaction injection mechanism that fuses multi-level patch interactions without affecting the pre-trained model, in order to construct higher-quality latent representation spaces and multimodal common subspaces. Comprehensive experiments on four widely used cross-modal retrieval datasets show that the proposed MASAN achieves state-of-the-art results and significantly outperforms existing methods. © 2024 Elsevier B.V.

Keyword:

Attention mechanism; Common representation learning; Cross-modal retrieval

Affiliations:

  • [ 1 ] [Ke X.]Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Computer and Data Science, Fuzhou University, Fuzhou, 350116, China
  • [ 2 ] [Ke X.]Key Laboratory of Spatial Data Mining and Information Sharing, Ministry of Education, Fuzhou University, Fuzhou, 350116, China
  • [ 3 ] [Chen B.]Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Computer and Data Science, Fuzhou University, Fuzhou, 350116, China
  • [ 4 ] [Chen B.]Key Laboratory of Spatial Data Mining and Information Sharing, Ministry of Education, Fuzhou University, Fuzhou, 350116, China
  • [ 5 ] [Cai Y.]Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Computer and Data Science, Fuzhou University, Fuzhou, 350116, China
  • [ 6 ] [Cai Y.]Key Laboratory of Spatial Data Mining and Information Sharing, Ministry of Education, Fuzhou University, Fuzhou, 350116, China
  • [ 7 ] [Liu H.]Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Computer and Data Science, Fuzhou University, Fuzhou, 350116, China
  • [ 8 ] [Liu H.]Key Laboratory of Spatial Data Mining and Information Sharing, Ministry of Education, Fuzhou University, Fuzhou, 350116, China
  • [ 9 ] [Guo W.]Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Computer and Data Science, Fuzhou University, Fuzhou, 350116, China
  • [ 10 ] [Guo W.]Key Laboratory of Spatial Data Mining and Information Sharing, Ministry of Education, Fuzhou University, Fuzhou, 350116, China
  • [ 11 ] [Chen W.]Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Computer and Data Science, Fuzhou University, Fuzhou, 350116, China
  • [ 12 ] [Chen W.]Key Laboratory of Spatial Data Mining and Information Sharing, Ministry of Education, Fuzhou University, Fuzhou, 350116, China

Source:

Neurocomputing

ISSN: 0925-2312

Year: 2025

Volume: 612

Impact factor: 5.500 (JCR@2023)

