Author:

Duan, Haiwei | Bao, Changchun | Zhou, Jing

Indexed by:

EI

Abstract:

Deep learning (DL) based direction-of-arrival (DOA) estimation is an active research topic, and many methods have been proposed recently. However, most of these methods suffer serious performance degradation under the adverse effects of overlapping sources, noise, and reverberation. One primary cause is that performance depends heavily on the pre-extracted features, which often exhibit spectral aliasing and peak confusion in complex scenarios. In this paper, a new feature that stacks the log-Mel spectrum with the noise subspace of the covariance matrix of the relative sound pressure is proposed and used for DL-based DOA estimation; it is referred to as the log-Mel spectrum augmented noise subspace (LMNS). The LMNS is more robust than conventional features because it represents both spectral and spatial information effectively. The LMNS is fed as the input feature to a Conformer-based residual network that maps it to the spatial pseudo-spectrum, from which the DOAs of the sound sources are obtained. Experimental results show that the proposed method achieves better DOA estimation performance, which verifies that the LMNS feature is more robust and effective in scenarios with multiple sources, noise, and reverberation. © 2023 IEEE.
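
The idea of stacking a spectral feature with a noise subspace can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses ordinary multichannel snapshots on a toy linear array rather than the relative sound pressure described in the abstract, and the helper names `noise_subspace` and `stack_feature`, the array geometry, the 40-band placeholder log-Mel vector, and the known source count are all assumptions made for illustration.

```python
import numpy as np


def noise_subspace(snapshots, num_sources):
    """Return an orthonormal basis of the noise subspace.

    snapshots: complex array of shape (channels, frames), e.g. the STFT
    coefficients of every channel at one frequency bin.
    """
    channels, frames = snapshots.shape
    # Sample covariance matrix (Hermitian, channels x channels).
    cov = snapshots @ snapshots.conj().T / frames
    # eigh returns eigenvalues in ascending order, so the first
    # (channels - num_sources) eigenvectors span the noise subspace.
    _, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, : channels - num_sources]


def stack_feature(log_mel, subspace):
    """Concatenate a log-Mel vector with the real and imaginary parts of the subspace."""
    spatial = np.concatenate([subspace.real.ravel(), subspace.imag.ravel()])
    return np.concatenate([np.ravel(log_mel), spatial])


# Toy data (assumption): one far-field source at 30 degrees on a
# 4-element, half-wavelength-spaced linear array with weak sensor noise.
rng = np.random.default_rng(0)
channels, frames, theta = 4, 2000, np.deg2rad(30.0)
a = np.exp(-1j * np.pi * np.arange(channels) * np.sin(theta))  # steering vector
s = rng.standard_normal(frames) + 1j * rng.standard_normal(frames)
noise = 0.01 * (rng.standard_normal((channels, frames))
                + 1j * rng.standard_normal((channels, frames)))
x = np.outer(a, s) + noise

E = noise_subspace(x, num_sources=1)
log_mel = np.zeros(40)  # placeholder for a real 40-band log-Mel frame
feat = stack_feature(log_mel, E)

print(E.shape, feat.shape)
# Close to zero: the source steering vector is nearly orthogonal to the noise subspace.
print(np.linalg.norm(E.conj().T @ a))
```

The near-orthogonality between the noise subspace and the source steering vector is the spatial cue that subspace methods exploit; in the setting described by the abstract, a network rather than a closed-form search would map the stacked feature to a spatial pseudo-spectrum.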

Keyword:

Deep learning; Reverberation; Direction of arrival; Covariance matrix; Acoustic noise

Author Community:

  • [ 1 ] [Duan, Haiwei] Beijing University of Technology, Speech and Audio Signal Processing Laboratory, Faculty of Information Technology, Beijing, China
  • [ 2 ] [Bao, Changchun] Beijing University of Technology, Speech and Audio Signal Processing Laboratory, Faculty of Information Technology, Beijing, China
  • [ 3 ] [Zhou, Jing] Beijing University of Technology, Speech and Audio Signal Processing Laboratory, Faculty of Information Technology, Beijing, China

Year: 2023

Language: English

Cited Count:

WoS CC Cited Count: 0

ESI Highly Cited Papers on the List: 0