Author:

Yang, Xue | Bao, Changchun | Zhou, Jing | Zhang, Xu | Duan, Haiwei | Zhao, Yunhao | Li, Wenwen

Indexed by:

EI

Abstract:

Multichannel speech processing has been widely studied because the spatial information contained in multichannel signals can be exploited. To enable efficient transmission while preserving this spatial information, a multichannel speech coding technique is needed. Recently, a multichannel speech coding method based on the Opus codec and spatial parameters was proposed. In the encoding stage, the speech signal of the reference channel is encoded with the Opus codec. The multichannel speech signals are decomposed by a Gammatone filter bank, and the spatial parameters are extracted and quantized for each sub-band signal. In the decoding stage, the encoded signal of the reference channel is decoded with the Opus codec. This decoded signal is then combined with the quantized spatial parameters to recover the speech signals of the remaining channels. This paper details an improved implementation of this coding method. Specifically, a framing pattern better suited to multichannel speech coding with the Gammatone filter bank is proposed. In addition, a newly designed window is applied to each sub-band signal for precise extraction of the spatial parameters in the frequency domain. The extracted spatial parameters are then quantized non-uniformly. Experimental results show the effectiveness of the proposed implementation: it achieves high speech quality at a reduced bitrate and better preserves the spatial information contained in the multichannel speech signals. © 2024 IEEE.
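
Illustrative note: the abstract states that the spatial parameters are quantized non-uniformly but does not give the parameters or the quantizer design. The following is only a minimal sketch of what non-uniform scalar quantization can look like, assuming a level-difference parameter in dB and a hand-picked codebook; the codebook values and the names CODEBOOK_DB, quantize, and dequantize are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left

# Hypothetical codebook for a level-difference parameter in dB (assumed values,
# not from the paper): fine steps near 0 dB, coarser steps toward the extremes.
CODEBOOK_DB = [-18.0, -12.0, -8.0, -5.0, -3.0, -1.5, 0.0,
               1.5, 3.0, 5.0, 8.0, 12.0, 18.0]


def quantize(value_db, codebook=CODEBOOK_DB):
    """Return the index of the codeword nearest to value_db.

    The codebook must be sorted in ascending order. Values outside the
    codebook range are clamped to the first or last codeword.
    """
    pos = bisect_left(codebook, value_db)
    if pos == 0:
        return 0
    if pos == len(codebook):
        return len(codebook) - 1
    # Choose the closer of the two neighbouring codewords; ties go to the lower one.
    lower, upper = codebook[pos - 1], codebook[pos]
    return pos - 1 if value_db - lower <= upper - value_db else pos


def dequantize(index, codebook=CODEBOOK_DB):
    """Map a transmitted index back to its codeword value."""
    return codebook[index]
```

In this sketch only the index would be transmitted per sub-band, and the decoder would look the value up in the same codebook. Spacing the codewords unevenly spends resolution where small changes are assumed to matter most, which is the general motivation for non-uniform quantization; the actual design in the paper may differ.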

Keyword:

Audio signal processing; Channel coding; Decoding; Frequency domain analysis; Quantization (signal); Speech enhancement; Signal encoding; Encoding (symbols); Microphone array; Image coding

Author Affiliations:

  • [ 1 ] [Yang, Xue] Institute of Speech and Audio Information Processing, Beijing University of Technology, Faculty of Information Technology, China
  • [ 2 ] [Bao, Changchun] Institute of Speech and Audio Information Processing, Beijing University of Technology, Faculty of Information Technology, China
  • [ 3 ] [Zhou, Jing] Institute of Speech and Audio Information Processing, Beijing University of Technology, Faculty of Information Technology, China
  • [ 4 ] [Zhang, Xu] Institute of Speech and Audio Information Processing, Beijing University of Technology, Faculty of Information Technology, China
  • [ 5 ] [Duan, Haiwei] Institute of Speech and Audio Information Processing, Beijing University of Technology, Faculty of Information Technology, China
  • [ 6 ] [Zhao, Yunhao] Institute of Speech and Audio Information Processing, Beijing University of Technology, Faculty of Information Technology, China
  • [ 7 ] [Li, Wenwen] Institute of Speech and Audio Information Processing, Beijing University of Technology, Faculty of Information Technology, China

Year: 2024

Language: English
