Abstract:
Underwater Image Enhancement (UIE) is critical for numerous marine applications; however, existing methods often fall short in addressing severe color distortion, detail loss, and a lack of semantic understanding, particularly under spatially varying degradation conditions. While Generative AI (GenAI), particularly diffusion models and multimodal large language models (MLLMs), offers new prospects for UIE, effectively leveraging their capabilities for fine-grained, semantics-aware enhancement remains a challenge. We propose a LLaVA-based semantic feature modulation diffusion model (LSFM-Diff), which collaboratively integrates multi-level semantic guidance into the backbone network of the diffusion model. Specifically, an optimized prompt-learning strategy is first employed to obtain concise, UIE-relevant textual descriptions from LLaVA. These semantics then guide the enhancement process in two key stages: (1) the windowed text-image fusion for condition refinement (WTIF-CR) module spatially aligns and fuses textual semantics with local image features, generating fine-grained external conditions that provide an initial spatially aware semantic blueprint for the diffusion model; (2) the semantic-guided deformable attention (SGDA) mechanism leverages gradient-based image-text interaction to generate a semantic navigation map that guides attention within the denoising network toward key semantic regions. Experiments on several challenging benchmark datasets demonstrate that LSFM-Diff outperforms current state-of-the-art methods. Our work highlights the effectiveness of deeply integrating multi-level semantic guidance in advancing GenAI-based UIE. © 2025
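The window-wise fusion of a global text embedding with local image features described in stage (1) can be illustrated with a minimal numpy sketch. This is not the paper's implementation; the function name, shapes, and softmax-pooling scheme are all illustrative assumptions about how a text vector might produce a spatially varying condition map.

```python
import numpy as np

def windowed_text_image_fusion(img_feats, txt_emb, window=4):
    """Hedged sketch of window-wise text-image fusion (WTIF-CR idea).

    Within each spatial window, pixels are weighted by their similarity
    to the text embedding, and the weighted summary is broadcast back,
    yielding a per-window semantic condition map. Shapes and names are
    hypothetical, not the paper's API.
    """
    H, W, C = img_feats.shape
    cond = np.zeros_like(img_feats)
    for y in range(0, H, window):
        for x in range(0, W, window):
            win = img_feats[y:y + window, x:x + window].reshape(-1, C)
            # similarity of each window pixel to the text semantics
            scores = win @ txt_emb
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()
            # window-level semantic summary, broadcast spatially
            summary = (weights[:, None] * win).sum(axis=0)
            cond[y:y + window, x:x + window] = summary
    return cond

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 8, 16))  # toy feature map (H, W, C)
text = rng.standard_normal(16)           # toy text embedding
cond = windowed_text_image_fusion(feats, text)
print(cond.shape)  # (8, 8, 16)
```

In the paper this conditioning signal is fed to the diffusion backbone as an external condition; here it is only returned for inspection.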
Source:
Information Fusion
ISSN: 1566-2535
Year: 2026
Volume: 126
Impact Factor: 14.800 (JCR@2023)
ESI Highly Cited Papers on the List: 0