
Author:

Zhang, Xu | Bao, Changchun | Zhou, Jing | Yang, Xue

Indexed by:

EI

Abstract:

Recently, beamforming methods based on the time-domain audio separation network (Beam-TasNet) have shown satisfactory performance. For example, the performance of the minimum variance distortionless response (MVDR) beamformer can be effectively improved by using the time-domain audio separation network (TasNet). However, reverberation causes significant performance degradation in Beam-TasNet, since the multiple reflected sounds damage the time-domain features extracted by TasNet. Fortunately, recent studies show that frequency-domain features have better anti-interference capability in reverberant environments. Therefore, this paper proposes an MVDR beamforming method based on a time-frequency-domain Dual-Path Recurrent Neural Network (TFDPRNN) for the task of speech separation, which we call Beam-TFDPRNN. In this method, the TFDPRNN uses a path-scanning mechanism to capture time-frequency features more comprehensively by repeatedly scanning the input speech signal in both the time and frequency dimensions. The scanned time-frequency features describe the characteristics of the speech sources in reverberant environments more robustly, so the TFDPRNN produces a better pre-separation result. Furthermore, by using the pre-separated signals to calculate the spatial covariance matrices, a more robust MVDR beamformer is obtained, so that speech separation in reverberant environments is achieved efficiently. Experimental results on the WSJ0-2mix corpus show that the proposed method achieves superior performance compared to the reference methods. © 2023 IEEE.
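The pipeline the abstract describes — estimating per-source spatial covariance matrices from pre-separated signals, then deriving MVDR weights from them — can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name, the mask-based interface (where the mask would come from a separator such as TFDPRNN), and the reference-channel MVDR formulation are all assumptions.

```python
import numpy as np

def mvdr_from_preseparation(mix_stft, mask, ref_ch=0, eps=1e-8):
    """Mask-driven MVDR beamformer (illustrative sketch, not the paper's code).

    mix_stft: (C, F, T) complex multichannel mixture STFT
    mask:     (F, T) real-valued mask for the target source, assumed to come
              from a pre-separation network (e.g. a TFDPRNN-style separator)
    Returns the beamformed (F, T) STFT of the target source.
    """
    C, F, T = mix_stft.shape
    out = np.zeros((F, T), dtype=complex)
    for f in range(F):
        X = mix_stft[:, f, :]                       # (C, T) per-bin snapshots
        m = mask[f]                                 # (T,) target mask
        # Spatial covariance matrices from mask-weighted observations
        phi_s = (m * X) @ X.conj().T / max(m.sum(), eps)
        phi_n = ((1.0 - m) * X) @ X.conj().T / max((1.0 - m).sum(), eps)
        phi_n += eps * np.eye(C)                    # regularize the inversion
        # Reference-channel MVDR: w = Phi_n^{-1} Phi_s u / tr(Phi_n^{-1} Phi_s)
        num = np.linalg.solve(phi_n, phi_s)
        w = num[:, ref_ch] / max(abs(np.trace(num)), eps)
        out[f] = w.conj() @ X
    return out
```

The better the pre-separation mask, the more cleanly the speech and interference covariances are disentangled, which is the mechanism by which a stronger separator improves the beamformer.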

Keyword:

Beamforming; Reverberation; Source separation; Recurrent neural networks; Speech analysis; Frequency domain analysis; Covariance matrix; Time domain analysis

Author Community:

  • [ 1 ] [Zhang, Xu] Speech and Audio Signal Processing Laboratory, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 2 ] [Bao, Changchun] Speech and Audio Signal Processing Laboratory, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 3 ] [Zhou, Jing] Speech and Audio Signal Processing Laboratory, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 4 ] [Yang, Xue] Speech and Audio Signal Processing Laboratory, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

Source :

Year: 2023

Language: English

Cited Count:

WoS CC Cited Count: 0

SCOPUS Cited Count: 2

ESI Highly Cited Papers on the List: 0
