Abstract:
In recent years, sound source localization methods based on Direct Path Interchannel Phase Difference (DP-IPD) estimation have gained considerable attention, owing to their outstanding performance in noisy environments. However, existing methods struggle with data from multi-element arrays, because the network must learn the complex mappings between signals and features across multiple microphone pairs. These mappings are similar to one another, which makes it hard for the network to tell them apart and ultimately reduces localization accuracy. Furthermore, the time-varying spatial cues produced by a moving sound source can further degrade localization performance. To tackle these challenges, this paper introduces a Time-Frequency Feature Enhanced Convolutional Recurrent Neural Network. By incorporating a frequency attention convolution module and a gated convolution module, the proposed network adaptively handles the mapping of signals to features across different microphone pairs while enhancing its ability to extract local temporal context. This approach improves localization accuracy for moving sound sources in noisy environments. Extensive experimental results show that the proposed method significantly outperforms state-of-the-art approaches on both simulated and real-world datasets. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
Source:
ISSN: 0302-9743
Year: 2025
Volume: 15858 LNCS
Page: 376-387
Language: English
0.402 (JCR@2005)