Abstract:
[Objective] Direction-of-arrival (DOA) estimation is critical in spatial audio coding, speech enhancement, sound field synthesis, and sound source imaging. Commonly used signal-model-based DOA estimation methods, such as the multiple signal classification method, can effectively estimate DOA information in noise-free and anechoic scenarios. However, real-world environments contain noise and reverberation, particularly far-field speech communication scenarios, which are characterized by low signal-to-noise ratios and strong reverberation. Furthermore, the sound source may be in motion. These factors considerably impair the performance of DOA estimation methods based on signal models. To address this issue, this paper introduces a real-time method for estimating and tracking the DOA of a single sound source, using Kalman filtering and frequency focusing. [Methods] The proposed method consists of three procedures: denoising, dereverberation, and DOA estimation. For the denoising procedure, an objective function that minimizes the error of the denoised signal is established. This function is solved using a Kalman filter, which yields the denoised signal through Kalman-gain-based posterior estimation. For the dereverberation procedure, based on the autoregressive coefficients of the late reverberation components, an objective function that minimizes the error of the multichannel linear prediction (MCLP) coefficients is established. This function is solved by a second Kalman filter to obtain the MCLP coefficients. The DOA estimation procedure is implemented using a frequency-focusing-based steered response power (FF-SRP) method, which can circumvent signal component diffusion within subspace decomposition. In particular, a structure is designed that effectively intertwines these three procedures, enhancing the contribution of the denoising and dereverberation results to DOA estimation.
In this structure, a propagation matrix is used to integrate the denoising and dereverberation procedures, creating a causal iteration between them. A minimum variance distortionless response (MVDR) beamforming method is then used in place of the multichannel Wiener filtering method to obtain a prior estimate of the covariance matrix of the target signal. The MVDR beamforming method offers two advantages: it reduces the distortion of the target signal, and it integrates the DOA estimation procedure with the denoising procedure, thereby promoting a causal and orderly iteration among the three procedures. [Results] Experiments were conducted using a microphone array signal simulator and the TIMIT corpus. The mean absolute error (MAE) of the estimated DOA, along with the DOA track of the moving speaker, served as the evaluation measures. The experimental results revealed several key findings: (1) As the reverberation time (RT60) increased, the MAE of all methods increased, showing that reverberation significantly degrades DOA estimation performance. (2) Compared with the reference methods, the proposed method consistently delivered the lowest MAE values across the tested RT60s and signal-to-noise ratios (SNRs), which suggests higher DOA estimation accuracy. (3) In terms of DOA trajectory, the proposed method again outperformed the reference methods by producing the smallest error, which indicates better DOA tracking performance. [Conclusions] By integrating denoising, dereverberation, and DOA estimation through a causal and recursive iteration structure, the performance of DOA estimation and tracking can be significantly enhanced. The proposed method effectively mitigates the detrimental impact of noise and reverberation on DOA estimation and tracking accuracy in single sound source scenarios. © 2024 Tsinghua University. All rights reserved.
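As a rough illustration of three building blocks named in the abstract (steered response power, MVDR beamforming, and the MAE measure), the following Python sketch may help. It is not the authors' FF-SRP method and contains neither the Kalman filters nor the frequency focusing step: it is a plain SRP scan summed over frequency bins. The uniform linear array, the microphone spacing, the frequencies, the synthetic covariance matrices, and all function names are assumptions made for this example only.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s

def steering_vector(theta_deg, freq_hz, mic_x):
    """Far-field steering vector for microphones on a line (positions in m)."""
    delays = mic_x * np.cos(np.deg2rad(theta_deg)) / C
    return np.exp(-2j * np.pi * freq_hz * delays)

def srp_doa(cov_by_freq, freqs_hz, mic_x, grid_deg):
    """Plain SRP: pick the grid angle with the largest power summed over bins."""
    power = np.zeros(len(grid_deg))
    for R, f in zip(cov_by_freq, freqs_hz):
        for i, theta in enumerate(grid_deg):
            a = steering_vector(theta, f, mic_x)
            power[i] += np.real(a.conj() @ R @ a)
    return grid_deg[int(np.argmax(power))]

def mvdr_weights(R, a, loading=1e-6):
    """MVDR weights w = R^-1 a / (a^H R^-1 a), with light diagonal loading."""
    n = len(a)
    R_loaded = R + loading * np.trace(R).real / n * np.eye(n)
    Ri_a = np.linalg.solve(R_loaded, a)
    return Ri_a / (a.conj() @ Ri_a)

def mean_absolute_error(est_deg, true_deg):
    """MAE between estimated and true DOAs, in degrees."""
    return float(np.mean(np.abs(np.asarray(est_deg) - np.asarray(true_deg))))

# Synthetic check: one source at 60 degrees, 4 mics spaced 4 cm apart.
mic_x = np.arange(4) * 0.04
freqs = [1000.0, 2000.0, 3000.0]  # below the spatial aliasing limit of this array
true_doa = 60
covs = []
for f in freqs:
    a0 = steering_vector(true_doa, f, mic_x)
    covs.append(np.outer(a0, a0.conj()) + 0.1 * np.eye(4))  # source + white noise

grid = np.arange(0, 181)
est_doa = srp_doa(covs, freqs, mic_x, grid)

# MVDR weights steered at the estimate pass that direction undistorted.
a_est = steering_vector(est_doa, freqs[0], mic_x)
w = mvdr_weights(covs[0], a_est)
gain = w.conj() @ a_est

mae = mean_absolute_error([58, 62], [60, 60])
```

In this idealized setting the SRP peak falls on the true direction and the MVDR response toward the steered direction is unity, which is the distortionless constraint the abstract alludes to. Real recordings with noise and reverberation would not behave this cleanly.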
Source:
Journal of Tsinghua University
ISSN: 1000-0054
Year: 2024
Issue: 11
Volume: 64
Page: 1902-1910
ESI Highly Cited Papers on the List: 0