Abstract:
This paper proposes an approach that improves both the target speaker's individuality and the quality of the converted speech through careful preparation of the training data. In voice conversion based on mixture Gaussian spectral mapping (MGM), spectral feature representations are analyzed to find the correct feature associations between the source and target characteristics. A voiced/unvoiced (V/UV) decision scheme for time alignment is introduced to select correctly aligned data for training the MGM function and to remove misaligned data. Experiments apply different spectral representation methods and V/UV decision strategies to the MGM functions. When linear predictive cepstral coefficients (LPCC) are used for time alignment and V/UV decisions are used to remove bad data, the results show that the conversion function is more accurate and that the proposed method effectively improves the overall performance of voice conversion. ©2004 IEEE.
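The data-preparation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the alignment algorithm or the exact V/UV rule, so the code below assumes plain dynamic time warping (DTW) over Euclidean frame distances and assumes that an aligned frame pair is kept only when the source and target V/UV flags agree. The function names `dtw_path` and `filter_by_vuv` are hypothetical, and the feature vectors stand in for per-frame features such as LPCC.

```python
import math


def dtw_path(src, tgt):
    """Align two feature sequences with plain DTW (assumed here, not
    confirmed by the abstract). Returns a list of (src_idx, tgt_idx) pairs."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(src[i - 1], tgt[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    return path[::-1]


def filter_by_vuv(path, src_voiced, tgt_voiced):
    """Drop aligned pairs whose V/UV flags disagree (assumed rule:
    a voiced frame paired with an unvoiced frame counts as misaligned)."""
    return [(i, j) for i, j in path if src_voiced[i] == tgt_voiced[j]]


# Toy usage with one-dimensional "features" and hand-set V/UV flags.
src = [(0.0,), (1.0,), (2.0,)]
tgt = [(0.0,), (1.0,), (2.0,)]
path = dtw_path(src, tgt)
kept = filter_by_vuv(path, [True, True, False], [True, False, False])
```

Only the pairs in `kept` would then be passed on as training data for the mapping function; how the paper itself makes the V/UV decision is not specified in the abstract.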
Year: 2004
Page: 181-184
Language: English
WoS CC Cited Count: 0
ESI Highly Cited Papers on the List: 0