Abstract:
Multi-view learning exploits the complementary nature of multiple modalities to enhance performance across diverse tasks. While deep learning has significantly advanced this field by enabling sophisticated modeling of intra-view and cross-view interactions, many existing approaches still rely on heuristic architectures and lack a principled framework to capture the dynamic evolution of feature representations. This limitation hampers interpretability and theoretical understanding. To address these challenges, this paper introduces the Multi-view State-Space Model (MvSSM), which formulates multi-view representation learning as a continuous-time dynamical system inspired by control theory. In this framework, view-specific features are treated as external inputs, and a shared latent representation evolves as the internal system state, driven by learnable dynamics. This formulation unifies feature integration and label prediction within a single interpretable model, enabling theoretical analysis of system stability and representational transitions. Two variants, MvSSM-Lap and MvSSM-iLap, are further developed using Laplace and inverse Laplace transformations to derive system dynamics representations. These solutions exhibit structural similarities to graph convolution operations in deep networks, supporting efficient feature propagation and theoretical interpretability. Experiments on benchmark datasets such as IAPR-TC12 and ESP demonstrate the effectiveness of the proposed method, achieving up to a 4.31% improvement in accuracy and 4.27% in F1-score over existing state-of-the-art approaches. © 2025 Elsevier Ltd
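
The abstract does not give the model's equations, so the following is only a minimal illustrative sketch of the kind of continuous-time state-space formulation it describes: view features x_v act as external inputs driving a shared latent state h via dh/dt = A h(t) + sum_v B_v x_v(t), with labels read out as y(t) = C h(t). The dimensions, Euler discretization, and random parameter placeholders below are assumptions for illustration, not the paper's MvSSM implementation.

    # Illustrative sketch only (assumed linear state-space form, not the paper's exact MvSSM):
    #   dh/dt = A h(t) + sum_v B_v x_v(t),   y(t) = C h(t)
    import numpy as np

    rng = np.random.default_rng(0)

    state_dim, num_classes = 16, 5
    view_dims = [32, 24]                            # two hypothetical views

    # Dynamics parameters (random placeholders standing in for learnable weights).
    A = -0.5 * np.eye(state_dim)                    # stable internal dynamics
    B = [0.1 * rng.standard_normal((state_dim, d)) for d in view_dims]
    C = 0.1 * rng.standard_normal((num_classes, state_dim))

    def mvssm_forward(x_views, steps=20, dt=0.1):
        """Euler-discretized rollout of the continuous-time state equation."""
        h = np.zeros(state_dim)
        for _ in range(steps):
            drive = sum(Bv @ xv for Bv, xv in zip(B, x_views))
            h = h + dt * (A @ h + drive)            # dh/dt = A h + sum_v B_v x_v
        return C @ h                                # label logits y = C h

    x_views = [rng.standard_normal(d) for d in view_dims]
    print(mvssm_forward(x_views))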
Source:
Neural Networks
ISSN: 0893-6080
Year: 2026
Volume: 194
Impact Factor: 6.000 (JCR@2023)