Author:

Zhang, Xu | Bao, Changchun | Yang, Xue | Zhou, Jing

Indexed by:

EI Scopus SCIE

Abstract:

Combining neural networks with beamforming has proven very effective for multi-channel speech separation, but performance degrades in complex environments. This paper proposes an iteratively refined multi-channel speech separation method to address this problem. The method consists of an initial separation stage and an iterative separation stage. In the initial separation, a time-frequency domain dual-path recurrent neural network (TFDPRNN), a minimum variance distortionless response (MVDR) beamformer, and a post-separation network are cascaded to produce the first additional input for the iterative stage. In the iterative separation, the MVDR beamformer and the post-separation network are applied repeatedly: the output of the MVDR beamformer serves as an additional input to the post-separation network, and the final output is taken from the post-separation module. Iterating between the beamformer and post-separation lets each improve the other, which ultimately improves overall performance. Experiments on the spatialized version of the WSJ0-2mix corpus showed that the proposed method achieved a signal-to-distortion ratio (SDR) improvement of 24.17 dB, significantly better than current popular methods. The method also achieved an SDR of 20.2 dB on joint separation and dereverberation tasks. These results indicate the method's effectiveness for multi-channel speech separation.
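
The abstract names the MVDR beamformer but does not give its equations. As a rough illustration only, the sketch below computes MVDR weights for a single frequency bin using the textbook steering-vector form, w = R⁻¹d / (dᴴR⁻¹d). It is not the authors' implementation: the function names, the random noise data, and the steering vector are all assumptions made for the example, and the neural networks that would supply the covariance estimates in the paper are not modeled.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights for one frequency bin: w = R^-1 d / (d^H R^-1 d)."""
    # Solve R x = d rather than forming the matrix inverse explicitly.
    rinv_d = np.linalg.solve(noise_cov, steering)
    return rinv_d / (steering.conj() @ rinv_d)

def apply_beamformer(weights, mixture):
    """Apply w^H y to a (channels, frames) slice of one frequency bin."""
    return weights.conj() @ mixture

# Hypothetical data: 4 microphones, 200 STFT frames of complex noise.
rng = np.random.default_rng(0)
n_ch, n_frames = 4, 200
steering = np.exp(-1j * np.pi * 0.3 * np.arange(n_ch))  # assumed steering vector
noise = rng.standard_normal((n_ch, n_frames)) + 1j * rng.standard_normal((n_ch, n_frames))
noise_cov = noise @ noise.conj().T / n_frames  # sample noise covariance (Hermitian)

w = mvdr_weights(noise_cov, steering)
beamformed = apply_beamformer(w, noise)
```

The weights satisfy the distortionless constraint wᴴd = 1, so a signal arriving from the assumed direction passes unchanged while the noise power at the output is minimized. In an iterative scheme like the one the abstract describes, the covariance estimate would presumably be refreshed from each round's separation output before the weights are recomputed.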

Keyword:

microphone array | iterative separation | speech separation | minimum variance distortionless response (MVDR) | beamforming

Author Community:

  • [ 1 ] [Zhang, Xu] Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China
  • [ 2 ] [Bao, Changchun] Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China
  • [ 3 ] [Yang, Xue] Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China
  • [ 4 ] [Zhou, Jing] Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China

Reprint Author's Address:

  • [Bao, Changchun] Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China

Source:

APPLIED SCIENCES-BASEL

Year: 2024

Issue: 14

Volume: 14

Impact Factor: 2.700 (JCR@2022)

ESI Highly Cited Papers on the List: 0