Abstract:
Speech enhancement in noisy and reverberant environments remains a challenging task. Acoustic beamforming with the minimum variance distortionless response (MVDR) criterion has been shown to be effective in this setting. The crucial issue in MVDR-based speech enhancement is obtaining accurate estimates of the speech and noise spatial covariance matrices (SCMs). Time-frequency mask-based estimation is a reliable way to obtain the SCMs and can therefore improve the performance of the MVDR beamformer in speech enhancement. In this paper, an optimal ratio mask-based method for MVDR beamforming is proposed. Specifically, the proposed method uses a convolutional neural network (CNN) that operates on the magnitude and phase components of the short-time Fourier transform (STFT) of the microphone signals to estimate the optimal ratio masks, and these masks are then used to compute the SCMs from which the MVDR beamformer is constructed. Experiments are conducted on simulated data. The results show that the proposed method is more robust than the reference methods under adverse acoustic conditions.
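The mask-to-SCM-to-beamformer chain described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the speech and noise masks are already available (for example, from a trained network), and it uses a common reference-microphone MVDR formulation, w(f) = Φn⁻¹Φs u / tr(Φn⁻¹Φs), which the abstract does not confirm. The function names, the array layout (channels, frames, frequencies), and the diagonal-loading constant are all assumptions chosen for the example.

```python
import numpy as np

def masked_scm(stft, mask, eps=1e-8):
    """Mask-weighted spatial covariance matrix per frequency bin.

    stft: complex array, shape (channels, frames, freqs)   (assumed layout)
    mask: real array in [0, 1], shape (frames, freqs)
    returns: complex array, shape (freqs, channels, channels)
    """
    weighted = stft * mask[None, :, :]
    # Phi[f] = sum_t mask[t, f] * x[t, f] x[t, f]^H
    scm = np.einsum("ctf,dtf->fcd", weighted, stft.conj())
    # Normalise by the total mask weight in each frequency bin.
    return scm / (mask.sum(axis=0)[:, None, None] + eps)

def mvdr_weights(scm_speech, scm_noise, ref_mic=0, diag_load=1e-6):
    """Reference-microphone MVDR weights, shape (freqs, channels)."""
    n_ch = scm_noise.shape[-1]
    # Diagonal loading keeps the noise SCM invertible (assumed constant).
    loaded = scm_noise + diag_load * np.eye(n_ch)
    # Solve Phi_n^{-1} Phi_s for every frequency bin at once.
    numerator = np.linalg.solve(loaded, scm_speech)
    trace = np.trace(numerator, axis1=-2, axis2=-1)[:, None]
    # Select the column that corresponds to the reference microphone.
    return numerator[..., ref_mic] / (trace + 1e-12)

def apply_beamformer(weights, stft):
    """Enhanced STFT y[t, f] = w[f]^H x[t, f], shape (frames, freqs)."""
    return np.einsum("fc,ctf->tf", weights.conj(), stft)
```

In use, `masked_scm` would be called once with the speech mask and once with the noise mask, the two results passed to `mvdr_weights`, and the weights applied to the multichannel STFT with `apply_beamformer`.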
Source:
CONFERENCE PROCEEDINGS OF 2019 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2019)
Year: 2019
Language: English
Cited Count:
WoS CC Cited Count: 0
ESI Highly Cited Papers on the List: 0