Query:
Scholar name: 鲍长春 (Bao Changchun)
Abstract :
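To address the sensitivity of the minimum variance distortionless response (MVDR) beamformer to steering vector mismatch, this paper proposes an effective interference source suppression method. First, the speech band is divided into multiple subbands, the direction of arrival (DOA) of the sources in each subband is estimated with the focused signal subspace method, and a statistical histogram is used to estimate the initial DOA of each source. Second, to reduce steering vector mismatch, the spatial sparsity of the sources is exploited: a cost function for estimating the target steering vector is built from the Capon power, constraining the target steering vector to stay away from the interference source space. Finally, the interference-plus-noise covariance matrix is estimated from the estimated steering vectors to obtain the weights of the MVDR beamformer. Experiments on the TIMIT corpus show that the proposed method outperforms the reference methods in output signal-to-interference-plus-noise ratio (SINR) and perceptual evaluation of speech quality (PESQ), and is more robust to steering vector mismatch.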
Keyword :
Minimum variance distortionless response; Speech enhancement; Focused signal subspace; Microphone array; Beamforming
Cite:
GB/T 7714 | 周静 , 鲍长春 , 张旭 . 基于聚焦信号子空间估计导向矢量的干扰声源抑制方法 [J]. | 电子学报 , 2023 , 51 (1) : 76-85 . |
MLA | 周静 et al. "基于聚焦信号子空间估计导向矢量的干扰声源抑制方法" . | 电子学报 51 . 1 (2023) : 76-85 . |
APA | 周静 , 鲍长春 , 张旭 . 基于聚焦信号子空间估计导向矢量的干扰声源抑制方法 . | 电子学报 , 2023 , 51 (1) , 76-85 . |
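The final step of the abstract above, computing MVDR weights from a steering vector and an interference-plus-noise covariance matrix, can be illustrated with the textbook formula w = R^{-1} a / (a^H R^{-1} a). The sketch below is not the authors' implementation: the uniform linear array, the 1 kHz frequency, the source angles, and the covariance powers are all assumptions chosen for illustration, and the paper's steering vector estimation (focused signal subspace plus Capon-power cost function) is not reproduced. It uses only the Python standard library.

```python
import cmath
import math

def steering_vector(theta_deg, n_mics, spacing_m, freq_hz, speed=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    tau = spacing_m * math.sin(math.radians(theta_deg)) / speed  # inter-mic delay
    return [cmath.exp(-2j * math.pi * freq_hz * m * tau) for m in range(n_mics)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= f * M[k][col]
    x = [0j] * n
    for k in reversed(range(n)):  # back substitution
        s = sum(M[k][col] * x[col] for col in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def mvdr_weights(R_in, a):
    """MVDR weights w = R^{-1} a / (a^H R^{-1} a)."""
    Ri_a = solve(R_in, a)
    d = inner(a, Ri_a)
    return [x / d for x in Ri_a]

# Toy scene (assumed): target at 0 degrees, interferer at 40 degrees.
N = 6
a_t = steering_vector(0.0, N, 0.05, 1000.0)
a_i = steering_vector(40.0, N, 0.05, 1000.0)
# Interference-plus-noise covariance: rank-one interferer plus white noise.
R_in = [[10.0 * a_i[r] * a_i[q].conjugate() + (0.1 if r == q else 0.0)
         for q in range(N)] for r in range(N)]
w = mvdr_weights(R_in, a_t)
target_gain = abs(inner(w, a_t))   # distortionless constraint: close to 1
interf_gain = abs(inner(w, a_i))   # much smaller than the target gain
```

If the steering vector passed to `mvdr_weights` points slightly away from the true target direction, the unit-gain constraint is enforced in the wrong direction, which is the mismatch sensitivity the abstract refers to.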
Abstract :
Most existing beamforming methods assume that all sources are point sources and that the angular separation between the directions of arrival (DOA) of the target source and the interference is large enough to ensure good performance. In this paper, we consider a difficult scenario in which the target source and the interference are both spatially distributed and overlap each other. To improve beamforming performance in this scenario, we propose two approaches: the first exploits the non-Gaussianity as well as the spectrogram sparsity of the microphone array output; the second exploits the generalized sparsity, with overlapped groups, of the beampattern. The proposed criteria are solved by methods based on the linearized preconditioned alternating direction method of multipliers (LPADMM) with high accuracy and high computational efficiency. Numerical simulations and real-data experiments show the advantages of the proposed approaches over previously proposed beamforming methods for signal enhancement.
Keyword :
Speech enhancement; microphone array; DOA; Correlation; Interference; Arrays; Array signal processing; Signal to noise ratio; MVDR; Microphone arrays
Cite:
GB/T 7714 | Xiong Wenmeng , Bao Changchun , Jia Maoshen et al. Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources [J]. | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING , 2022 , 30 : 2778-2790 . |
MLA | Xiong Wenmeng et al. "Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources" . | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING 30 (2022) : 2778-2790 . |
APA | Xiong Wenmeng , Bao Changchun , Jia Maoshen , Picheral, Jose . Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources . | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING , 2022 , 30 , 2778-2790 . |
Abstract :
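A speech recognition method based on a multi-layer CTC loss, belonging to the fields of pattern recognition and acoustics. The method regularizes the outputs of different layers of the speech recognition network so that each layer's output is as close as possible to the desired recognition result, thereby improving recognition performance. It comprises a model training stage and a model testing stage. In the training stage, the preprocessed training set is fed into the constructed multi-layer speech recognition network; the loss and the weight of each layer are computed, the per-layer losses are combined by weighted summation into a multi-layer loss, and the loss computation and network parameter update are repeated until convergence. In the testing stage, the preprocessed test set is fed into the trained multi-layer speech recognition network, which outputs the recognition result. The invention changes only the loss function used in the training stage of a CTC speech recognition model, not the structure of the model or the recognition process, and thus improves recognition accuracy with low complexity and low overhead.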
Cite:
GB/T 7714 | 陈仙红 , 罗德雨 , 鲍长春 . 一种基于CTC多层损失的语音识别方法 : CN202210619908.5[P]. | 2022-06-02 . |
MLA | 陈仙红 et al. "一种基于CTC多层损失的语音识别方法" : CN202210619908.5. | 2022-06-02 . |
APA | 陈仙红 , 罗德雨 , 鲍长春 . 一种基于CTC多层损失的语音识别方法 : CN202210619908.5. | 2022-06-02 . |
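The weighted summation of per-layer losses described in the abstract above can be written out as follows. This is only a sketch of the general form: the symbols K (number of supervised layers), \lambda_k (layer weights), and p_t^{(k)} (per-frame output distribution of layer k) are notation introduced here, and the abstract does not say how the weights are computed.

```latex
% Multi-layer loss: weighted sum of the CTC losses of K supervised layers.
\mathcal{L}_{\text{multi}} = \sum_{k=1}^{K} \lambda_k \, \mathcal{L}_{\text{CTC}}^{(k)}

% Standard CTC loss for layer k: negative log-likelihood of the label
% sequence y, summed over all frame-level alignments \pi that collapse
% to y under the CTC mapping \mathcal{B}.
\mathcal{L}_{\text{CTC}}^{(k)} = -\ln \sum_{\pi \in \mathcal{B}^{-1}(y)} \prod_{t=1}^{T} p_t^{(k)}(\pi_t \mid x)
```

Because only the training objective changes, decoding at test time would use the final layer's output as in an ordinary CTC model, consistent with the abstract's statement that the model structure and recognition process are unchanged.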
Abstract :
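The autoregressive (AR) model is an effective way to describe correlation in time series. Classical AR coefficient estimation methods make simple assumptions about the residual signal and have difficulty estimating AR coefficients accurately in complex scenarios such as noise interference, while AR coefficient estimation based on deep neural networks (DNN-AR) is easily affected during training by the numerical stability of the Levinson-Durbin recursion (LDR) solution. To improve the training stability and overall performance of DNN-AR coefficient estimation while preserving system stability, this paper draws on the idea of using precision conversion to speed up computation and proposes a method for improving the deep network structure based on the generalized analysis-by-synthesis (GABS) model, which improves both the accuracy of AR coefficient estimation in noisy environments and the stability of network training. The GABS model combined with a DNN (GABS-DNN) consists of three main parts: the spectral enhancement network of the modifier, the encoder's ...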
Keyword :
Deep neural network; Generalized analysis-by-synthesis; AR coefficients; Levinson-Durbin recursion solution
Cite:
GB/T 7714 | 崔子豪 , 鲍长春 . 基于广义合成分析和深度神经网络的自回归系数估计方法 [J]. | 电子学报 , 2021 , 49 (01) : 29-39 . |
MLA | 崔子豪 et al. "基于广义合成分析和深度神经网络的自回归系数估计方法" . | 电子学报 49 . 01 (2021) : 29-39 . |
APA | 崔子豪 , 鲍长春 . 基于广义合成分析和深度神经网络的自回归系数估计方法 . | 电子学报 , 2021 , 49 (01) , 29-39 . |
Abstract :
Autoregressive coefficient estimation method based on generalized analysis-by-synthesis and deep neural networks
Keyword :
Generalized analysis-by-synthesis; Levinson-Durbin recursion solution; Deep neural network; AR coefficients
Cite:
GB/T 7714 | 崔子豪 , 鲍长春 . 基于广义合成分析和深度神经网络的自回归系数估计方法 [J]. | 电子学报 , 2021 , 49 (1) : 29-39 . |
MLA | 崔子豪 et al. "基于广义合成分析和深度神经网络的自回归系数估计方法" . | 电子学报 49 . 1 (2021) : 29-39 . |
APA | 崔子豪 , 鲍长春 . 基于广义合成分析和深度神经网络的自回归系数估计方法 . | 电子学报 , 2021 , 49 (1) , 29-39 . |
Abstract :
Multiple sound source localization has attracted considerable attention in recent years. Single Source Zone (SSZ) based localization methods perform well because they detect and exploit Time-Frequency (T-F) zones in which only one source is dominant. However, the detected SSZs sometimes also include T-F points that contain components from multiple sources. A T-F point in an SSZ to which multiple sources contribute is defined as an outlier. The existence of outliers within the detected SSZs is usually unavoidable for SSZ-based methods. To solve this problem, a multi-source localization method using offset residual weights is proposed in this paper. The method rests on an assumption: the direction estimated from the T-F points within a detected SSZ differs from the actual source direction, but this difference is much smaller than the difference between the directions estimated from the outliers and the actual source directions. After this assumption is verified experimentally, the Point Offset Residual Weight (PORW) and the Source Offset Residual Weight (SORW) are proposed to reduce the influence of outliers on the localization results. A composite weight is then formed by combining PORW and SORW, which can effectively distinguish the outliers from the desired points, and the outliers are removed using this composite weight. Finally, a statistical histogram of the DOA estimates with outliers removed is used for multi-source localization. The proposed method is evaluated objectively in various simulated environments. The results show that it localizes sources more accurately than the reference methods.
Keyword :
Multiple sound sources localization; Direction of arrival estimation; Soundfield microphone; Reverberation
Cite:
GB/T 7714 | Jia, Maoshen , Gao, Shang , Bao, Changchun . Multi-source localization by using offset residual weight [J]. | EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING , 2021 , 2021 (1) . |
MLA | Jia, Maoshen et al. "Multi-source localization by using offset residual weight" . | EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING 2021 . 1 (2021) . |
APA | Jia, Maoshen , Gao, Shang , Bao, Changchun . Multi-source localization by using offset residual weight . | EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING , 2021 , 2021 (1) . |
Abstract :
Multiple sound source localization has attracted considerable attention in recent years. In this paper, a multi-source localization method based on weight clustering and outlier removal is proposed for environments with long reverberation times. In such environments, some T-F points containing components from multiple sources are always mixed into the detected sparse components. These T-F points, called outliers, usually carry incorrect localization information and can reduce localization accuracy. To solve this problem, the Point Offset Residual Weight (PORW) and the Source Offset Residual Weight (SORW) are introduced to measure the contribution of each T-F point to the localization. Binary clustering is proposed to distinguish and remove the outliers. After that, a statistical histogram of the DOA estimates is built using the composite weight to weaken the effect of components that interfere with the localization. Finally, multi-source localization is carried out by peak searching. The proposed method is evaluated objectively in various simulated environments. The results show that it localizes sources more accurately than the reference methods.
Keyword :
reverberation; direction of arrival estimation; multiple sources localization; sound field microphone; sparsity
Cite:
GB/T 7714 | Gao, Shang , Jia, Maoshen , Bao, Changchun . A multi-source localization method based on clustering and outlier removal [J]. | 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) , 2021 : 950-955 . |
MLA | Gao, Shang et al. "A multi-source localization method based on clustering and outlier removal" . | 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) (2021) : 950-955 . |
APA | Gao, Shang , Jia, Maoshen , Bao, Changchun . A multi-source localization method based on clustering and outlier removal . | 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) , 2021 , 950-955 . |
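The last two steps of the abstract above, a weighted DOA histogram followed by peak searching, can be sketched as below. This is a minimal illustration, not the authors' method: the 5-degree bin width, the function names, and the example weights are assumptions, and the weights here merely stand in for the composite PORW/SORW weight, whose computation the abstract does not give.

```python
def weighted_doa_histogram(doas_deg, weights, bin_width=5.0):
    """Accumulate per-point weights into azimuth bins covering [0, 360)."""
    n_bins = int(round(360.0 / bin_width))
    hist = [0.0] * n_bins
    for theta, w in zip(doas_deg, weights):
        hist[int((theta % 360.0) // bin_width) % n_bins] += w
    return hist

def pick_peaks(hist, n_sources, bin_width=5.0):
    """Return the bin centres of the n_sources largest circular local maxima."""
    n = len(hist)
    peaks = [i for i in range(n)
             if hist[i] > 0.0
             and hist[i] >= hist[(i - 1) % n]
             and hist[i] > hist[(i + 1) % n]]
    peaks.sort(key=lambda i: hist[i], reverse=True)
    return sorted((i + 0.5) * bin_width for i in peaks[:n_sources])

# Toy per-point DOA estimates (degrees): two clusters plus two stray points.
doas = [61, 62, 63, 64, 201, 202, 203, 120, 300]
# Placeholder weights: stray points get a small weight instead of being removed.
weights = [1, 1, 1, 1, 1, 1, 1, 0.1, 0.1]

hist = weighted_doa_histogram(doas, weights)
estimated = pick_peaks(hist, n_sources=2)
```

With these toy inputs the two heavily weighted clusters dominate the histogram, so the down-weighted stray points do not produce competing peaks.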
Abstract :
Individual recognition among instruments of the same type is a challenging problem and it has been rarely investigated. In this study, the individual recognition of violins is explored. Based on the source-filter model, the spectrum can be divided into tonal content and nontonal content, which reflects the timbre from complementary aspects. The tonal/nontonal gammatone frequency cepstral coefficients (GFCC) are combined to describe the corresponding spectrum contents in this study. In the recognition system, Gaussian mixture models-universal background model (GMM-UBM) is employed to parameterize the distribution of the combined features. In order to evaluate the recognition task of violin individuals, a solo dataset including 86 violins is developed in this study. Compared with other features, the combined features show a better performance in both individual violin recognition and violin grade classification. Experimental results also show the GMM-UBM outperforms the CNN, especially when the training data are limited. Finally, the effect of players on the individual violin recognition is investigated.
Keyword :
tonal/nontonal content; violin grade classification; individual violin recognition; Gaussian mixture models-universal background model
Cite:
GB/T 7714 | Wang, Qi , Bao, Changchun . Individual Violin Recognition Method Combining Tonal and Nontonal Features [J]. | ELECTRONICS , 2020 , 9 (6) . |
MLA | Wang, Qi et al. "Individual Violin Recognition Method Combining Tonal and Nontonal Features" . | ELECTRONICS 9 . 6 (2020) . |
APA | Wang, Qi , Bao, Changchun . Individual Violin Recognition Method Combining Tonal and Nontonal Features . | ELECTRONICS , 2020 , 9 (6) . |
Abstract :
In this paper, a method for estimating the autoregressive parameters from a signal segment is proposed. The method is based on a deep neural network (DNN) in combination with the classical Levinson-Durbin recursion (LDR). The DNN acts as a pre-processor for the LDR and can be trained on different metrics commonly encountered in speech processing using a generalized analysis-by-synthesis (GABS) structure where the LDR acts as the encoder. Unlike end-to-end data-driven approaches, this structure ensures that the DNN is easy to train and initialize since the DNN only has to learn a simple mapping. The results confirm this and show that the proposed method produces an AR-spectrum that efficiently represents the speech spectrum in terms of the Itakura-Saito divergence, Kullback-Leibler divergence, log-spectral distortion, and speech distortion. © 2020 IEEE.
Keyword :
Deep neural networks; Time series analysis; Speech processing; Parameter estimation
Cite:
GB/T 7714 | Cui, Zihao , Bao, Changchun , Nielsen, Jesper Kjar et al. Autoregressive Parameter Estimation with DNN-Based Pre-Processing [C] . 2020 : 6759-6763 . |
MLA | Cui, Zihao et al. "Autoregressive Parameter Estimation with DNN-Based Pre-Processing" . (2020) : 6759-6763 . |
APA | Cui, Zihao , Bao, Changchun , Nielsen, Jesper Kjar , Grasboll Christensen, Mads . Autoregressive Parameter Estimation with DNN-Based Pre-Processing . (2020) : 6759-6763 . |
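For readers unfamiliar with the Levinson-Durbin recursion (LDR) that the abstract above uses as the encoder, a minimal sketch of the classical recursion follows. It covers only the LDR step: the DNN pre-processor and the GABS training structure from the paper are not shown, and the function name and the sign convention (a[0] = 1, prediction-error filter A(z) = 1 + a[1] z^-1 + ...) are choices made here for illustration.

```python
def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for AR coefficients.

    r: autocorrelation values r[0], ..., r[order].
    Returns (a, err): a[0] = 1 and a[1..order] are the coefficients of the
    prediction-error filter; err is the final prediction-error power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        # Correlation between the current prediction error and lag m.
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= (1.0 - k * k)                # error power shrinks each order
    return a, err

# Autocorrelation of an AR(1) process x[n] = 0.5 x[n-1] + e[n], normalized
# so that r[0] = 1: r[k] = 0.5 ** k.
a, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
# a[1] comes out as -0.5 and a[2] as zero, matching the AR(1) model.
```

Each step divides by the running error power `err`, so the recursion becomes numerically delicate when the autocorrelation estimate is nearly singular, which is one reason pre-conditioning the input before the LDR can matter.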
Abstract :
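The invention proposes a speech emotion recognition method based on an interactive attention model, belonging to the fields of speech signal processing, emotion recognition, and machine learning. Textual information and acoustic information are the two important kinds of information carried by speech, and both play an important role in emotion recognition. Compared with existing speech emotion recognition techniques, the invention uses both the text and the acoustic modality for emotion recognition, through the following steps: speech preprocessing, speech recognition, word-vector extraction, forced alignment, word-level acoustic feature extraction, representation learning, modality fusion, and emotion classification. The representation learning stage introduces an interactive attention model that, at the word level, uses the information of one modality to help learn the emotion representation of the other. The modality fusion stage learns the complementary information of the two modalities at the utterance level. The invention makes full use of the complementary information of the two modalities at different levels and effectively improves the accuracy of speech emotion recognition.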
Cite:
GB/T 7714 | 陈仙红 , 鲍长春 . 一种基于交互式注意力模型的语音情感识别方法 : CN202011521398.5[P]. | 2020-12-21 . |
MLA | 陈仙红 et al. "一种基于交互式注意力模型的语音情感识别方法" : CN202011521398.5. | 2020-12-21 . |
APA | 陈仙红 , 鲍长春 . 一种基于交互式注意力模型的语音情感识别方法 : CN202011521398.5. | 2020-12-21 . |