
Author:

Zhang, Yu | Jia, Maoshen | Gao, Shang | Wang, Shusen

Indexed by:

EI, Scopus

Abstract:

Sound source separation is an active topic in array signal processing, and separating multiple sound sources in reverberant environments remains difficult. To address this problem, a two-stage multiple sound source separation method is proposed based on the fully-convolutional time-domain audio separation network (Conv-TasNet) and a deep neural network (DNN). In the first stage, the end-to-end separation network (i.e., Conv-TasNet) separates the sound sources from the signal recorded in a reverberant environment. The encoder generates a time-domain representation of the recorded signal, the Conv-TasNet model estimates a mask, and each separated source signal is recovered by multiplying the encoder output by the mask. In the second stage, each separated signal is enhanced with a single DNN. The DNN's training target is the ideal enhancement mask, computed by combining the magnitudes of the frequency-domain coefficients of the separated signal and the clean signal. The magnitude of the frequency-domain coefficients of the separated signal is used as the DNN input to predict the ideal ratio mask (IRM). The IRM is multiplied by the original magnitude to obtain the enhanced magnitude of the frequency-domain coefficients, which is combined with the phase to reconstruct the enhanced source signal. Subjective and objective evaluations show that, compared with the reference methods, the proposed method achieves better separation quality in both reverberant and anechoic acoustic environments. © 2021 IEEE.
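For intuition, the sketch below illustrates the stage-1 pattern the abstract describes (encode the waveform, estimate one mask per source, multiply, decode). It is a minimal stand-in written for this record, not the authors' Conv-TasNet implementation; the `TinyMaskSeparator` name, layer choices, and filter sizes are all assumptions.

```python
import torch
import torch.nn as nn

class TinyMaskSeparator(nn.Module):
    """Minimal sketch of the stage-1 pattern (encode -> mask -> multiply ->
    decode). A hypothetical stand-in, not the actual Conv-TasNet layers."""
    def __init__(self, n_src=2, n_filters=64, kernel=16, stride=8):
        super().__init__()
        self.n_src = n_src
        self.encoder = nn.Conv1d(1, n_filters, kernel, stride=stride)
        self.masker = nn.Sequential(
            nn.Conv1d(n_filters, n_src * n_filters, 1), nn.Sigmoid())
        self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel, stride=stride)

    def forward(self, mix):                      # mix: (batch, samples)
        rep = self.encoder(mix.unsqueeze(1))     # time-domain representation
        masks = self.masker(rep)                 # one mask per source
        masks = masks.view(-1, self.n_src, rep.size(1), rep.size(2))
        masked = masks * rep.unsqueeze(1)        # apply mask to encoder output
        srcs = [self.decoder(masked[:, s]) for s in range(self.n_src)]
        return torch.cat(srcs, dim=1)            # (batch, n_src, samples)

mix = torch.randn(1, 16000)                      # 1 s placeholder mixture
est = TinyMaskSeparator()(mix)                   # two estimated sources
```

The second stage can be sketched in the same spirit: compute the separated signal's magnitude spectrum, predict an IRM, scale the magnitude, and reuse the original phase. The sketch below assumes a `predict_irm` callable standing in for the trained DNN and uses `scipy.signal` for the STFT/ISTFT; the sample rate, window length, and the exact IRM formula are assumptions, since the abstract does not specify them.

```python
import numpy as np
from scipy.signal import stft, istft

FS = 16000        # sample rate in Hz (assumed)
NPERSEG = 512     # STFT window length (assumed)

def ideal_ratio_mask(separated, clean, eps=1e-8):
    """DNN training target: a ratio of clean to separated magnitude spectra
    (one common IRM variant; the paper's exact formula may differ)."""
    _, _, S_sep = stft(separated, fs=FS, nperseg=NPERSEG)
    _, _, S_clean = stft(clean, fs=FS, nperseg=NPERSEG)
    return np.clip(np.abs(S_clean) / (np.abs(S_sep) + eps), 0.0, 1.0)

def enhance(separated, predict_irm):
    """Stage 2: predict an IRM from the separated signal's magnitude
    spectrum, scale the magnitude, and reuse the original phase."""
    _, _, S_sep = stft(separated, fs=FS, nperseg=NPERSEG)
    mag, phase = np.abs(S_sep), np.angle(S_sep)
    irm = predict_irm(mag)                   # stand-in for the trained DNN
    S_enh = irm * mag * np.exp(1j * phase)   # enhanced coefficients + phase
    _, x_enh = istft(S_enh, fs=FS, nperseg=NPERSEG)
    return x_enh

# Toy usage: an identity "predictor" leaves the signal unchanged.
x = np.random.randn(FS)
y = enhance(x, predict_irm=lambda mag: np.ones_like(mag))
```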

Keyword:

Deep neural networks; Frequency domain analysis; Audio acoustics; Time domain analysis; Signal encoding; Reverberation; Array processing; Separation; Source separation; Acoustic generators

Author Community:

  • [1] Zhang, Yu, Faculty of Information Technology, Beijing University of Technology, Beijing, China
  • [2] Jia, Maoshen, Faculty of Information Technology, Beijing University of Technology, Beijing, China
  • [3] Gao, Shang, Faculty of Information Technology, Beijing University of Technology, Beijing, China
  • [4] Wang, Shusen, Audio Engineering Society (AES) Beijing Section, Beijing, China



Year: 2021

Pages: 264-269

Language: English

Cited Count:

WoS CC Cited Count: 0

SCOPUS Cited Count: 1

ESI Highly Cited Papers on the List: 0

