Speech coding based on pitch synchrony and two-stage transformation

Author：

Li, Xiao-Ming | Bao, Chang-Chun (鲍长春) | Kleijn, W. Bastiaan

Indexed by：

EI Scopus

Abstract：

In this paper, an effective speech coder that is based on a sparse representation of speech by exploiting the strong dependencies between adjacent pitch cycles is proposed. In the proposed coder, a pitch-synchronous processing that consists of pitch warping and a two-stage transformation is used to achieve a compact representation of the voiced speech. Power spectral density preserving quantization (PSD-PQ) is adopted for quantizing the transform coefficients. The result is a coder that is efficient over a wide range of bit rates: it approaches perfect reconstruction with increasing rate, and has a parametric signal representation at low rates. Both objective PESQ results and subjective A/B listening tests show that the proposed coder outperforms the ITU-T G.722.1 codec. © 2013 IEEE.

Keyword：

Speech communication | Audio signal processing | Spectral density | Speech coding | Continuous speech recognition

Author Community：

[ 1 ] [Li, Xiao-Ming] Speech and Audio Signal Processing Laboratory, School of Electronic Information and Control Engineering, Beijing University of Technology, China
[ 2 ] [Bao, Chang-Chun] Speech and Audio Signal Processing Laboratory, School of Electronic Information and Control Engineering, Beijing University of Technology, China
[ 3 ] [Kleijn, W. Bastiaan] Speech and Audio Signal Processing Laboratory, School of Electronic Information and Control Engineering, Beijing University of Technology, China
[ 4 ] [Kleijn, W. Bastiaan] School of Engineering and Computer Science, Victoria University of Wellington, New Zealand

Source ：

ISSN： 1520-6149

Year： 2013

Page： 8159-8163

Language： English

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 1

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

School of Information Science and Technology (records not explicitly attributed to a department within this school)
