Iteratively Refined Multi-Channel Speech Separation

Author：

Zhang, Xu | Bao, Changchun | Yang, Xue | Zhou, Jing

Indexed by：

EI Scopus SCIE

Abstract：

The combination of neural networks and beamforming has proven very effective in multi-channel speech separation, but its performance faces a challenge in complex environments. In this paper, an iteratively refined multi-channel speech separation method is proposed to meet this challenge. The proposed method is composed of initial separation and iterative separation. In the initial separation, a time-frequency domain dual-path recurrent neural network (TFDPRNN), minimum variance distortionless response (MVDR) beamformer, and post-separation are cascaded to obtain the first additional input in the iterative separation process. In iterative separation, the MVDR beamformer and post-separation are iteratively used, where the output of the MVDR beamformer is used as an additional input to the post-separation network and the final output comes from the post-separation module. This iteration of the beamformer and post-separation is fully employed for promoting their optimization, which ultimately improves the overall performance. Experiments on the spatialized version of the WSJ0-2mix corpus showed that our proposed method achieved a signal-to-distortion ratio (SDR) improvement of 24.17 dB, which was significantly better than the current popular methods. In addition, the method also achieved an SDR of 20.2 dB on joint separation and dereverberation tasks. These results indicate our method's effectiveness and significance in the multi-channel speech separation field.
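The beamformer/post-separation loop described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the neural networks (TFDPRNN and the post-separation network) are replaced by a simple magnitude-ratio mask, the MVDR weights use the common reference-channel formulation w = (Φn⁻¹ Φs) u / tr(Φn⁻¹ Φs), and all function names, the diagonal loading value, and the iteration count are assumptions made for illustration only.

```python
import numpy as np

def spatial_covariance(stft, mask):
    """Mask-weighted spatial covariance per frequency bin.
    stft: (C, F, T) complex mixture; mask: (F, T) in [0, 1]. Returns (F, C, C)."""
    weighted = stft * mask[None]
    cov = np.einsum("cft,dft->fcd", weighted, stft.conj())
    return cov / np.maximum(mask.sum(axis=-1), 1e-8)[:, None, None]

def mvdr_weights(phi_s, phi_n, ref=0, eps=1e-6):
    """Reference-channel MVDR: w = (Phi_n^-1 Phi_s) u_ref / trace(Phi_n^-1 Phi_s)."""
    n_ch = phi_n.shape[-1]
    phi_n = phi_n + eps * np.eye(n_ch)          # diagonal loading (assumed value)
    num = np.linalg.solve(phi_n, phi_s)         # (F, C, C)
    trace = np.trace(num, axis1=-2, axis2=-1)[:, None]
    return num[:, :, ref] / (trace + eps)       # (F, C)

def apply_beamformer(w, stft):
    """y[f, t] = w[f]^H x[f, t]."""
    return np.einsum("fc,cft->ft", w.conj(), stft)

def ratio_masks(estimates, eps=1e-8):
    """Hypothetical stand-in for a separation network:
    magnitude-ratio masks computed from the current source estimates (S, F, T)."""
    mag = np.abs(estimates)
    return mag / (mag.sum(axis=0, keepdims=True) + eps)

def iterative_refine(mix, init_estimates, n_iter=2):
    """mix: (C, F, T) mixture STFT; init_estimates: (S, F, T) from an initial separator.
    Each pass runs MVDR per source, then a post-separation step that uses the
    beamformer outputs as additional input (here: re-masking the reference mic)."""
    estimates = init_estimates
    for _ in range(n_iter):
        masks = ratio_masks(estimates)
        beamformed = []
        for s in range(masks.shape[0]):
            phi_s = spatial_covariance(mix, masks[s])
            phi_n = spatial_covariance(mix, 1.0 - masks[s])
            beamformed.append(apply_beamformer(mvdr_weights(phi_s, phi_n), mix))
        beamformed = np.stack(beamformed)
        # Stand-in post-separation; the final output comes from this step.
        estimates = ratio_masks(beamformed) * mix[0][None]
    return estimates
```

In this sketch, `iterative_refine(mix, init_estimates)` returns an array of shape `(S, F, T)`. In the method described above, the mask stand-in would be replaced by trained networks, which is where the reported gains would come from.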

Keyword：

microphone array | iterative separation | speech separation | minimum variance distortionless response (MVDR) beamforming

Author Community：

[ 1 ] [Zhang, Xu]Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China
[ 2 ] [Bao, Changchun]Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China
[ 3 ] [Yang, Xue]Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China
[ 4 ] [Zhou, Jing]Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China

Reprint Author's Address：

[Bao, Changchun]Beijing Univ Technol, Inst Speech & Audio Informat Proc, Sch Informat Sci & Technol, Beijing 100124, Peoples R China

Email：

xuzhang223@emails.bjut.edu.cn |
baochch@bjut.edu.cn |
yangx11@emails.bjut.edu.cn |
zhoujing@emails.bjut.edu.cn


Related Articles：

Effective Dereverberation with a Lower Complexity at Presence of the Noise
2022，APPLIED SCIENCES-BASEL
Suppression Method of the Interference Sound Sources by Estimated Steering Vector Based on the Focusing Signal Subspace; [基于聚焦信号子空间估计导向矢量的干扰声源抑制方法]
2023，Acta Electronica Sinica
A Beam-TFDPRNN Based Speech Separation Method in Reverberant Environments
2023，2023 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2023
Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources
2022，IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

Source ：

APPLIED SCIENCES-BASEL

Year： 2024

Issue： 14

Volume： 14

Impact Factor： 2.700 (JCR@2022)


ESI Highly Cited Papers on the List： 0
