Overlapping speech detection using high-level information features

Author：

Ma, Yong | Bao, Changchun (Scholars：Bao, Changchun (鲍长春))

Indexed by：

EI Scopus PKU CSCD

Abstract：

Overlapping speech is one of the main factors influencing the performance of speaker segmentation. This paper presents an overlapping speech detection method that uses a high-level information feature to improve speaker segmentation results. A linguistic high-level information feature of the speech is extracted using the universal background model (UBM). Then, a hidden Markov model (HMM) is trained using the Mel frequency cepstral coefficients (MFCC) and the high-level information to detect overlapping speech. The result is then used for the speaker segmentation of the pre-processed speech. Tests on a dataset generated from the TIMIT database show that the error ratio for overlapping speech detection is significantly lower than that of the reference method using just the MFCC feature. The speaker segmentation is also significantly improved. © 2017, Tsinghua University Press. All rights reserved.
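
The abstract does not give the HMM topology or the feature dimensions, so the following is only a minimal sketch of the decoding step under stated assumptions, not the authors' implementation. It assumes a two-state HMM (state 0 for single-speaker speech, state 1 for overlapping speech) and that per-frame log-likelihoods for each state have already been computed from the MFCC and high-level features. The function name `viterbi`, the transition probabilities, and the frame scores are all hypothetical values chosen for illustration.

```python
import math


def viterbi(log_emissions, log_trans, log_init):
    """Return the most likely state sequence for per-frame log-likelihoods.

    log_emissions: list of frames, each a list of per-state log-likelihoods.
    log_trans[i][j]: log probability of moving from state i to state j.
    log_init[i]: log probability of starting in state i.
    """
    n_states = len(log_init)
    # Best score of any path ending in each state at the current frame.
    scores = [log_init[s] + log_emissions[0][s] for s in range(n_states)]
    backpointers = []
    for frame in log_emissions[1:]:
        new_scores, pointers = [], []
        for j in range(n_states):
            # Best predecessor state for arriving in state j.
            best_i = max(range(n_states),
                         key=lambda i: scores[i] + log_trans[i][j])
            pointers.append(best_i)
            new_scores.append(scores[best_i] + log_trans[best_i][j] + frame[j])
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-state model: 0 = single speaker, 1 = overlapping speech.
STAY, SWITCH = math.log(0.9), math.log(0.1)
LOG_TRANS = [[STAY, SWITCH], [SWITCH, STAY]]
LOG_INIT = [math.log(0.5), math.log(0.5)]

# Hypothetical per-frame log-likelihoods [single, overlap]. In a real system
# these would come from emission models over MFCC plus high-level features.
frames = [
    [-1.0, -3.0],
    [-1.0, -3.0],
    [-2.2, -2.0],  # isolated weak evidence for overlap
    [-1.0, -3.0],
    [-3.0, -1.0],
    [-3.0, -1.0],
    [-3.0, -1.0],
]

labels = viterbi(frames, LOG_TRANS, LOG_INIT)
print(labels)
```

With these illustrative numbers, the third frame slightly favours the overlap state on its own, but the high self-transition probability keeps the decoded path in the single-speaker state until the sustained overlap evidence begins. This temporal smoothing is the usual reason to decode with an HMM rather than classify each frame independently.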

Keyword：

Statistical tests | Speech | Hidden Markov models | Speech recognition | Feature extraction | Information use

Author Community：

[ 1 ] [Ma, Yong] School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing; 100124, China
[ 2 ] [Ma, Yong] School of Physics and Electronic Engineering, Jiangsu Normal University, Xuzhou; 221009, China
[ 3 ] [Bao, Changchun] School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing; 100124, China

Reprint Author's Address：

Bao, Changchun (鲍长春)
[Bao, Changchun] School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing; 100124, China

Email：

baochch@bjut.edu.cn

Source：

Journal of Tsinghua University

ISSN： 1000-0054

Year： 2017

Issue： 1

Volume： 57

Page： 79-83

Cited Count：

SCOPUS Cited Count： 2

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

School of Information Science and Technology | Data not explicitly attributed within this school/department
