Abstract:
In this paper, we propose a neural speech coding method based on the dual-path conformer, which consists of three main components: (1) the time-frequency spectrum is encoded and decoded by a structure that combines a CNN with the dual-path conformer, (2) residual vector quantization is applied to the encoder's output features to form a compact discrete representation, and (3) multi-period and multi-scale discriminators are used during adversarial training to improve the perceptual quality of the speech. Experimental results from both subjective and objective evaluations demonstrate that the proposed codec outperforms the state-of-the-art neural codec AudioDEC and the leading conventional codec Opus. ©2024 IEEE.
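The residual vector quantization step in (2) can be illustrated with a minimal sketch. This is the generic technique, not the paper's implementation: the function names `rvq_encode` and `rvq_decode`, the toy codebooks, and the use of fixed (untrained) codebooks are all assumptions made for illustration. Each stage picks the codeword nearest to the residual left by the previous stage, so the list of indices forms the compact discrete representation.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Quantize vector x stage by stage; return one codeword index per stage.

    Hypothetical helper: codebooks is a list of (num_codewords, dim) arrays.
    """
    residual = np.asarray(x, dtype=float)
    indices = []
    for cb in codebooks:
        # Squared Euclidean distance from the current residual to every codeword.
        dists = np.sum((cb - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        indices.append(idx)
        # The next stage quantizes what this stage failed to capture.
        residual = residual - cb[idx]
    return indices

def rvq_decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))

# Toy two-stage example with hand-picked codebooks (assumed values).
codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0]]),
    np.array([[0.0, 0.0], [0.5, -0.5]]),
]
codes = rvq_encode(np.array([1.4, 0.6]), codebooks)
approx = rvq_decode(codes, codebooks)
```

In a trained codec the codebooks would be learned jointly with the encoder and decoder, and the number of stages would set the bitrate; neither detail is specified in this record.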
Year: 2024
Page: 661-665
Language: English