Indexed by:
Abstract:
Speech enhancement is a long-standing problem that remains only partly solved. Current research attempts to improve speech enhancement performance with microphone arrays, where the minimum variance distortionless response (MVDR) beamformer performs well and has been widely studied and applied. In this paper, a convolutional neural network (CNN) is used to estimate time-frequency (T-F) masks for the target speech and the noise, from which the covariance matrices required by the MVDR beamformer are obtained. The steering vector is then estimated from the principal eigenvector of the covariance matrix, instead of from an estimate of the direction of arrival or the time difference of arrival. The network is trained to minimize the mean square error (MSE) between the enhanced speech and the clean reference speech, and the loss function is built on this MSE. Finally, a microphone array speech simulator is used to generate multi-channel speech, and simulation experiments are carried out on it. The experimental results show that the CNN-based MVDR beamforming method improves on the performance of a conventional MVDR beamformer. © 2021 IEEE.
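The mask-based MVDR step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the CNN is not modeled (the masks are simply passed in), the steering vector is assumed to be the principal eigenvector of the speech covariance matrix, processing is shown for a single frequency bin, and the function names `masked_covariance`, `principal_eigenvector`, and `mvdr_weights` as well as the diagonal-loading parameter are hypothetical.

```python
import numpy as np


def masked_covariance(stft_bin, mask):
    """Mask-weighted spatial covariance for one frequency bin.

    stft_bin: complex array, shape (channels, frames)
    mask:     real array in [0, 1], shape (frames,); assumed to come
              from a mask estimator such as a CNN (not modeled here)
    """
    weighted = stft_bin * mask  # broadcast the mask over channels
    return (weighted @ stft_bin.conj().T) / max(mask.sum(), 1e-8)


def principal_eigenvector(phi):
    """Eigenvector of a Hermitian matrix with the largest eigenvalue."""
    _, eigvecs = np.linalg.eigh(phi)  # eigenvalues in ascending order
    return eigvecs[:, -1]


def mvdr_weights(phi_speech, phi_noise, diag_load=1e-6):
    """MVDR weights w = Phi_n^{-1} d / (d^H Phi_n^{-1} d).

    The steering vector d is taken from the speech covariance matrix
    (an assumption of this sketch). diag_load is a hypothetical
    regularization added only for numerical stability.
    """
    d = principal_eigenvector(phi_speech)
    m = phi_noise.shape[0]
    loading = diag_load * np.trace(phi_noise).real / m
    phi_n = phi_noise + loading * np.eye(m)
    numerator = np.linalg.solve(phi_n, d)
    return numerator / (d.conj() @ numerator)
```

With weights `w` for a given bin, the enhanced output for that bin would be `w.conj() @ stft_bin`, and the distortionless constraint can be checked numerically as `np.vdot(w, d)` being close to 1.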
Keyword:
Reprint Author's Address:
Email:
Source:
Year: 2021
Language: English
Cited Count:
WoS CC Cited Count: 0
SCOPUS Cited Count: 3
ESI Highly Cited Papers on the List: 0
WanFang Cited Count:
Chinese Cited Count:
30 Days PV: 7
Affiliated Colleges: