Abstract:
Multimodal Knowledge Graph Completion (MMKGC) integrates information from multiple modalities, such as text and images, into traditional knowledge graphs to improve their completeness and accuracy. This approach leverages the complementary nature of multimodal data to strengthen the expressive power of knowledge graphs, thereby achieving better performance in tasks such as knowledge reasoning and information retrieval. However, knowledge graph completion models designed around the structural information of triples, when applied directly to the multimodal domain, yield suboptimal performance. To address this challenge, this study introduces a novel model, the Multimodal Knowledge Graph Completion Model Based on Modal Hierarchical Fusion (MHF). The MHF model employs a phased fusion strategy that first learns from the structural, visual, and textual modalities independently, and then combines the structural embeddings with the text and image representations through a specially designed neural network fusion layer that captures the interactions among the different modalities. Additionally, the MHF model incorporates a semantic constraint layer with a Factor Interaction Regularizer, which enhances the model's generalization ability by exploiting the semantic equivalence between the head and tail entities of triples. Experimental results on three real-world multimodal benchmark datasets demonstrate that the MHF model achieves excellent performance in link prediction, surpassing the current state-of-the-art baselines with an average gain of more than 5.4% across MRR, Hit@1, and Hit@10. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
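The abstract describes a phased fusion of structural, visual, and textual embeddings plus a semantic constraint on head/tail representations. The following is a minimal, hypothetical PyTorch sketch of that idea; the class name, layer sizes, and the exact form of the fusion and regularization term are assumptions, not the paper's published implementation.

```python
import torch
import torch.nn as nn


class ModalHierarchicalFusion(nn.Module):
    """Illustrative sketch (assumed design): modality-specific entity
    embeddings are learned separately, then combined by a small neural
    fusion layer into a shared entity representation."""

    def __init__(self, dim: int = 200):
        super().__init__()
        # Fuse the three per-entity modality embeddings and project back
        # to the shared entity space (layer sizes are assumptions).
        self.fuse = nn.Sequential(
            nn.Linear(3 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, struct_emb, text_emb, img_emb):
        # Concatenate structural, textual, and visual embeddings.
        return self.fuse(torch.cat([struct_emb, text_emb, img_emb], dim=-1))


def factor_interaction_penalty(head: torch.Tensor,
                               tail: torch.Tensor,
                               weight: float = 1e-2) -> torch.Tensor:
    """Hypothetical semantic-constraint term: penalizes divergence between
    head- and tail-entity factor representations of a triple, reflecting
    the semantic equivalence the paper's regularizer exploits."""
    return weight * (head - tail).pow(2).sum(dim=-1).mean()
```

Under these assumptions, the fused entity embeddings would feed a standard link-prediction scoring function, with the penalty term added to the training loss.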
ISSN: 1865-0929
Year: 2025
Volume: 2344 CCIS
Page: 381-395
Language: English