Query:
Scholar name: Bao Changchun (鲍长春)
Abstract :
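To address the sensitivity of the minimum variance distortionless response (MVDR) beamformer to steering vector mismatch, this paper proposes an effective method for suppressing interfering sound sources. First, the frequency band of the speech signal is divided into multiple subbands, the direction of arrival (DOA) of the sources in each subband is estimated with the focused signal subspace method, and a statistical histogram is used to estimate the initial DOA of each source. Second, to reduce the steering vector mismatch, the spatial sparsity of the sources is exploited: a cost function for estimating the steering vector of the target source is built from the Capon power, constraining the target steering vector to stay away from the space of the interfering sources. Finally, the interference-plus-noise covariance matrix is estimated from the estimated steering vectors to obtain the weights of the MVDR beamformer. Experimental results on the TIMIT corpus show that the proposed method achieves a higher output signal-to-interference-plus-noise ratio (SINR) and a higher perceptual evaluation of speech quality (PESQ) score than the reference methods, and that it is more robust to steering vector mismatch.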
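The last step of this abstract, computing MVDR weights from a steering vector and an interference-plus-noise covariance matrix, can be illustrated with the standard closed form w = R^{-1} a / (a^H R^{-1} a). The sketch below is only an illustration under stated assumptions: the uniform linear array, the 1 kHz frequency, the source angles, and the power values are hypothetical, and the paper's subspace-based DOA estimation and Capon-power cost function are not implemented.

```python
import cmath
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs (complex) by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0j] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def steering_vector(num_mics, spacing_m, freq_hz, doa_deg, speed=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    delay = spacing_m * math.cos(math.radians(doa_deg)) / speed
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay) for m in range(num_mics)]


def interference_plus_noise_cov(steer_i, power_i, noise_power):
    """R = power_i * a_i a_i^H + noise_power * I for a single interferer."""
    n = len(steer_i)
    return [[power_i * steer_i[p] * steer_i[q].conjugate()
             + (noise_power if p == q else 0.0)
             for q in range(n)] for p in range(n)]


def mvdr_weights(cov, steer):
    """MVDR weights w = R^{-1} a / (a^H R^{-1} a)."""
    r_inv_a = solve(cov, steer)
    denom = sum(a.conjugate() * x for a, x in zip(steer, r_inv_a))
    return [x / denom for x in r_inv_a]


def response(weights, steer):
    """Beamformer response w^H a toward a given steering vector."""
    return sum(w.conjugate() * a for w, a in zip(weights, steer))


# Hypothetical scenario: 4 microphones, 5 cm spacing, one narrowband bin at 1 kHz.
target = steering_vector(4, 0.05, 1000.0, 90.0)
interferer = steering_vector(4, 0.05, 1000.0, 30.0)
cov = interference_plus_noise_cov(interferer, 10.0, 0.01)
w = mvdr_weights(cov, target)
```

With these values the response toward the target stays at unit gain while the response toward the interferer is strongly attenuated. If the steering vector passed to `mvdr_weights` is mismatched, the unit-gain constraint is applied in the wrong direction, which is the sensitivity the abstract sets out to reduce.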
Keyword :
Minimum variance distortionless response; speech enhancement; focused signal subspace; microphone array; beamforming
Cite:
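GB/T 7714 | Zhou Jing , Bao Changchun , Zhang Xu . Interfering sound source suppression method based on steering vector estimation with the focused signal subspace (基于聚焦信号子空间估计导向矢量的干扰声源抑制方法) [J]. | Acta Electronica Sinica , 2023 , 51 (1) : 76-85 . |
MLA | Zhou Jing et al. "Interfering sound source suppression method based on steering vector estimation with the focused signal subspace" . | Acta Electronica Sinica 51 . 1 (2023) : 76-85 . |
APA | Zhou Jing , Bao Changchun , Zhang Xu . Interfering sound source suppression method based on steering vector estimation with the focused signal subspace . | Acta Electronica Sinica , 2023 , 51 (1) , 76-85 . |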
Abstract :
Most existing beamforming methods assume that all sources are point sources and that the angular separation between the directions of arrival (DOAs) of the target source and the interference is large enough to ensure good performance. In this paper, we consider a tough scenario in which the target source and the interference are both spatially distributed and overlap. To improve beamforming performance in this scenario, we propose two approaches: the first exploits the non-Gaussianity and the spectrogram sparsity of the microphone array output; the second exploits the generalized sparsity of the beampattern with overlapping groups. The proposed criteria are solved by methods based on the linearized preconditioned alternating direction method of multipliers (LPADMM) with high accuracy and high computational efficiency. Numerical simulations and real-data experiments show the advantages of the proposed approaches over previously proposed beamforming methods for signal enhancement.
Keyword :
Speech enhancement; Microphone arrays; DOA; Correlation; Interference; Arrays; Array signal processing; Signal to noise ratio; MVDR
Cite:
GB/T 7714 | Xiong Wenmeng , Bao Changchun , Jia Maoshen et al. Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources [J]. | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING , 2022 , 30 : 2778-2790 . |
MLA | Xiong Wenmeng et al. "Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources" . | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING 30 (2022) : 2778-2790 . |
APA | Xiong Wenmeng , Bao Changchun , Jia Maoshen , Picheral, Jose . Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources . | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING , 2022 , 30 , 2778-2790 . |
Abstract :
In this paper, a multi-channel speech coding method is proposed that combines down-mixing with inter-channel amplitude ratio (ICAR) decoding by a generative adversarial network (GAN). First, the spatial parameter inter-channel time difference (ICTD) is extracted. In the short-time Fourier transform (STFT) domain, the amplitude of the down-mixed mono signal is obtained by averaging the amplitudes of the multi-channel speech signals, and its phase is taken from the reference channel, which gives the STFT of the down-mixed mono signal. The inverse STFT then yields the down-mixed mono signal. The amplitude ratio between the multi-channel speech signals and the down-mixed signal (ICAR) is extracted. The down-mixed mono signal is coded by the Speex codec, and the ICTD is quantized by a uniform scalar quantizer. The ICAR does not need to be encoded: at the decoder, it is decoded by a well-trained GAN from the decoded mono signal. Finally, the multi-channel speech signals are recovered from the decoded down-mixed mono signal, the decoded ICTD, and the decoded ICAR. The experimental results show that the proposed method can recover multi-channel speech signals with spatial information. © 2021 IEEE.
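The down-mixing rule described above can be sketched for a single STFT bin. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the ICAR is assumed to be the channel amplitude divided by the down-mix amplitude, and the GAN that predicts the ICAR at the decoder is not shown.

```python
import cmath


def downmix_bin(channel_bins, ref_index=0):
    """Down-mix one STFT bin: mean amplitude across channels, phase of the reference channel."""
    mean_amp = sum(abs(c) for c in channel_bins) / len(channel_bins)
    ref_phase = cmath.phase(channel_bins[ref_index])
    return cmath.rect(mean_amp, ref_phase)


def icar(channel_bins, mono_bin, eps=1e-12):
    """Amplitude ratio of each channel to the down-mixed bin (assumed direction of the ratio)."""
    floor = max(abs(mono_bin), eps)  # avoid division by zero in silent bins
    return [abs(c) / floor for c in channel_bins]


def quantize_uniform(value, step):
    """Uniform scalar quantizer: returns (index, reconstructed value)."""
    index = round(value / step)
    return index, index * step


# Hypothetical two-channel bin given as (amplitude, phase) pairs.
bins = [cmath.rect(2.0, 0.3), cmath.rect(4.0, -1.0)]
mono = downmix_bin(bins)
ratios = icar(bins, mono)

# Channel amplitudes are recovered by scaling the down-mix amplitude with the ICAR.
recovered_amps = [r * abs(mono) for r in ratios]
```

In the scheme the abstract describes, only the mono signal and the quantized ICTD are transmitted; `quantize_uniform` stands in for the ICTD quantizer with an arbitrary step size.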
Keyword :
Signal reconstruction; Speech communication; Decoding; Inverse problems; Speech coding
Cite:
GB/T 7714 | Zhu, Jinru , Bao, Changchun . GAN-Based Inter-Channel Amplitude Ratio Decoding in Multi-Channel Speech Coding [C] . 2021 . |
MLA | Zhu, Jinru et al. "GAN-Based Inter-Channel Amplitude Ratio Decoding in Multi-Channel Speech Coding" . (2021) . |
APA | Zhu, Jinru , Bao, Changchun . GAN-Based Inter-Channel Amplitude Ratio Decoding in Multi-Channel Speech Coding . (2021) . |
Abstract :
The auto-regressive (AR) model is an effective method for describing the correlation of time series. The classic AR coefficient estimation method makes a simple assumption about the residual signal, so it is a challenge to estimate the AR coefficients accurately in a complex environment with noise or interference. Although the deep neural network (DNN) based AR (DNN-AR) coefficient estimation method can estimate the AR coefficients in a complex environment, it is easily affected by the numerical stability of the Levinson-Durbin recursion (LDR) during the training stage. The main goal is to improve the stability and overall performance of the DNN-AR based method. In this paper, the precision transform method is used to improve computational efficiency while keeping the system stable, and the generalized analysis-by-synthesis combining DNN (GABS-DNN) model is proposed to improve the accuracy of AR coefficient estimation and the stability of DNN training in noisy environments. The GABS-DNN model consists of three main parts: the spectrum enhancement network in the modifier, the DNN preprocessing and LDR parameter estimation at the encoder, and the conversion from AR coefficients to power spectrum at the decoder. In optimizing the objective function, the error between the enhanced spectrum and the observed spectrum is added to reduce the influence of the LDR gradient on the enhancement network during back-propagation, which results in a stable estimation of the AR coefficients of noisy speech. © 2021, Chinese Institute of Electronics. All rights reserved.
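Two building blocks named in this abstract, the Levinson-Durbin recursion at the encoder and the conversion from AR coefficients to a power spectrum at the decoder, have standard textbook forms. The sketch below shows those forms in plain Python under the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p; it is not the GABS-DNN model, and the function names and the error guard are illustrative choices.

```python
import cmath
import math


def levinson_durbin(autocorr, order):
    """Return (coeffs, err): prediction-error filter [1, a_1..a_p] and residual power."""
    coeffs = [1.0] + [0.0] * order
    err = autocorr[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            # The recursion divides by err; a non-positive value signals an
            # ill-conditioned autocorrelation sequence.
            raise ValueError("non-positive prediction error")
        acc = autocorr[i] + sum(coeffs[j] * autocorr[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        updated = coeffs[:]
        for j in range(1, i):
            updated[j] = coeffs[j] + k * coeffs[i - j]
        updated[i] = k
        coeffs = updated
        err *= 1.0 - k * k
    return coeffs, err


def ar_power_spectrum(coeffs, err, num_bins):
    """Evaluate err / |A(e^{jw})|^2 on num_bins (>= 2) frequencies from 0 to pi."""
    spectrum = []
    for b in range(num_bins):
        omega = math.pi * b / (num_bins - 1)
        a_val = sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(coeffs))
        spectrum.append(err / abs(a_val) ** 2)
    return spectrum


# Autocorrelation of an AR(1)-like process with lag-one correlation 0.5.
a, e = levinson_durbin([1.0, 0.5, 0.25], 2)
spec = ar_power_spectrum(a, e, 5)
```

The division by `err` in each step is where small or negative residual power makes the recursion fragile, which gives some intuition for why its numerical stability matters when gradients are propagated through it.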
Keyword :
Backpropagation; Deep neural networks; Numerical methods; System stability; Complex networks; Computational efficiency
Cite:
GB/T 7714 | Cui, Zi-Hao , Bao, Chang-Chun . Auto-Regressive Coefficient Estimation Based on the GABS and DNN [J]. | Acta Electronica Sinica , 2021 , 49 (1) : 29-39 . |
MLA | Cui, Zi-Hao et al. "Auto-Regressive Coefficient Estimation Based on the GABS and DNN" . | Acta Electronica Sinica 49 . 1 (2021) : 29-39 . |
APA | Cui, Zi-Hao , Bao, Chang-Chun . Auto-Regressive Coefficient Estimation Based on the GABS and DNN . | Acta Electronica Sinica , 2021 , 49 (1) , 29-39 . |
Abstract :
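The autoregressive (AR) model is an effective way to describe the correlation of time series. Classical AR coefficient estimation methods make a simple assumption about the residual signal and struggle to estimate the AR coefficients accurately in complex scenarios such as noise and interference, while AR coefficient estimation based on deep neural networks (DNN-AR) is easily affected during training by the numerical stability of the Levinson-Durbin recursion (LDR). To improve the training stability and overall performance of DNN-AR coefficient estimation, and following the idea of using precision conversion to raise computation speed while preserving system stability, this paper proposes a method for improving the deep network structure based on the generalized analysis-by-synthesis (GABS) model, which improves both the accuracy of AR coefficient estimation in noisy environments and the stability of network training. The GABS model combined with a DNN (GABS-DNN) consists of three main parts: the spectrum enhancement network in the modifier, the encoder's ...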
Keyword :
Deep neural networks; generalized analysis-by-synthesis; AR coefficients; Levinson-Durbin recursion
Cite:
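GB/T 7714 | Cui Zihao , Bao Changchun . Auto-regressive coefficient estimation method based on generalized analysis-by-synthesis and deep neural networks (基于广义合成分析和深度神经网络的自回归系数估计方法) [J]. | Acta Electronica Sinica , 2021 , 49 (01) : 29-39 . |
MLA | Cui Zihao et al. "Auto-regressive coefficient estimation method based on generalized analysis-by-synthesis and deep neural networks" . | Acta Electronica Sinica 49 . 01 (2021) : 29-39 . |
APA | Cui Zihao , Bao Changchun . Auto-regressive coefficient estimation method based on generalized analysis-by-synthesis and deep neural networks . | Acta Electronica Sinica , 2021 , 49 (01) , 29-39 . |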
Abstract :
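To address the poor performance of time-frequency masking methods in multi-source scenarios, this paper proposes a multi-source separation method that estimates ideal ratio masks with a probabilistic mixture model. First, a von Mises distribution is fitted to the azimuth estimates at the time-frequency points and a Laplace distribution is fitted to the normalized sound pressure gradient signal vectors, which together define the probabilistic mixture model. Second, the expectation-maximization algorithm is used to solve for the model parameters and to estimate the ideal ratio mask of each source. Finally, the estimated ideal ratio masks are used to separate each source signal from the microphone recordings. Experimental results show that, compared with existing time-frequency masking methods for multi-source separation, the proposed method achieves better separation in underdetermined scenarios.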
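One way to picture how a mixture model yields ratio masks is to treat the posterior probability of each source at a time-frequency point as that source's mask. The sketch below does this for the von Mises azimuth term only. It rests on several assumptions that are not from the paper: a single concentration `kappa` shared by all sources (so the von Mises normalizing constant cancels), fixed rather than EM-estimated parameters, and no Laplace term for the pressure-gradient vectors.

```python
import math


def von_mises_masks(angle_rad, means_rad, weights, kappa):
    """Posterior of each source given an azimuth estimate, used here as a ratio mask.

    With a shared kappa, the factor 1 / (2 * pi * I0(kappa)) is common to all
    components and cancels in the normalization.
    """
    scores = [w * math.exp(kappa * math.cos(angle_rad - mu))
              for w, mu in zip(weights, means_rad)]
    total = sum(scores)
    return [s / total for s in scores]


def apply_masks(mixture_bin, masks):
    """Split one mixture STFT bin into per-source estimates."""
    return [m * mixture_bin for m in masks]


# Hypothetical setup: two sources at 0 and 90 degrees with equal priors.
masks = von_mises_masks(0.0, [0.0, math.pi / 2], [0.5, 0.5], 10.0)
separated = apply_masks(1.0 + 2.0j, masks)
```

Because the masks sum to one at every point, the separated bins add back up to the mixture bin, which is the defining property of a ratio mask as opposed to a binary one.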
Keyword :
Probabilistic mixture model; multi-source separation; ideal ratio mask
Cite:
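GB/T 7714 | Jia Yitian , Yang Qishan , Jia Maoshen et al. Multi-source separation method using ideal ratio masks with a probabilistic mixture model (利用概率混合模型的理想比率掩蔽多声源分离方法) [J]. | Journal of Signal Processing , 2021 , 37 (10) : 1806-1815 . |
MLA | Jia Yitian et al. "Multi-source separation method using ideal ratio masks with a probabilistic mixture model" . | Journal of Signal Processing 37 . 10 (2021) : 1806-1815 . |
APA | Jia Yitian , Yang Qishan , Jia Maoshen , Xu Wenjie , Bao Changchun . Multi-source separation method using ideal ratio masks with a probabilistic mixture model . | Journal of Signal Processing , 2021 , 37 (10) , 1806-1815 . |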
Abstract :
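In real-time voice over IP communication, speech quality degrades severely when packets are lost. To recover the speech information lost in transmission, this paper proposes a packet loss concealment (PLC) method based on the instantaneous phase deviation (IPD) and a deep neural network (DNN). In the training stage, the log power spectrum (LPS) and the IPD of the speech are used as input features to train the DNN, which learns the mapping from received packets to lost packets. In the reconstruction stage, the packets received before the loss are fed into the trained DNN to recover the speech of the lost packet. Experimental results show that, at different packet loss rates, the proposed ...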
Keyword :
Phase features; packet loss concealment; deep neural networks
Cite:
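GB/T 7714 | Huang Jinwei , Bao Changchun . Packet loss concealment method based on instantaneous phase deviation and deep learning (基于瞬时相位差和深度学习的丢包隐藏方法) [J]. | Journal of Signal Processing , 2021 , 37 (10) : 1791-1798 . |
MLA | Huang Jinwei et al. "Packet loss concealment method based on instantaneous phase deviation and deep learning" . | Journal of Signal Processing 37 . 10 (2021) : 1791-1798 . |
APA | Huang Jinwei , Bao Changchun . Packet loss concealment method based on instantaneous phase deviation and deep learning . | Journal of Signal Processing , 2021 , 37 (10) , 1791-1798 . |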
Abstract :
Multiple sound source localization has attracted considerable attention in recent years. Single Source Zone (SSZ) based localization methods perform well because they detect and use the Time-Frequency (T-F) zones in which only one source is dominant. However, the detected SSZ sometimes also includes T-F points that contain components from multiple sources. A T-F point in an SSZ that is contributed by multiple components is defined as an outlier. The existence of outliers within the detected SSZ is usually an unavoidable problem for SSZ-based methods. To solve this problem, a multi-source localization method using offset residual weights is proposed in this paper. The method rests on an assumption: the directions estimated from the T-F points within the detected SSZ deviate from the actual source directions, but this deviation is much smaller than the deviation between the directions estimated from the outliers and the actual source directions. After this assumption is verified experimentally, the Point Offset Residual Weight (PORW) and the Source Offset Residual Weight (SORW) are proposed to reduce the influence of outliers on the localization results. A composite weight is then formed by combining PORW and SORW, which can effectively distinguish the outliers from the desired points, and the outliers are removed using this composite weight. Finally, a statistical histogram of the DOA estimates with outliers removed is used for multi-source localization. The proposed method is evaluated objectively in various simulated environments. The results show that it localizes sources better than the reference methods.
Keyword :
Multiple sound sources localization; Direction of arrival estimation; Soundfield microphone; Reverberation
Cite:
GB/T 7714 | Jia, Maoshen , Gao, Shang , Bao, Changchun . Multi-source localization by using offset residual weight [J]. | EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING , 2021 , 2021 (1) . |
MLA | Jia, Maoshen et al. "Multi-source localization by using offset residual weight" . | EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING 2021 . 1 (2021) . |
APA | Jia, Maoshen , Gao, Shang , Bao, Changchun . Multi-source localization by using offset residual weight . | EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING , 2021 , 2021 (1) . |
Abstract :
Multiple sound source localization has attracted considerable attention in recent years. In this paper, a multi-source localization method based on weight clustering and outlier removal is proposed for environments with a high reverberation time. In such environments, some T-F points that contain components from multiple sources are always mixed into the detected sparse components. These T-F points, called outliers, usually carry wrong localization information and can reduce localization accuracy. To solve this problem, the Point Offset Residual Weight (PORW) and the Source Offset Residual Weight (SORW) are introduced to measure the contribution of each T-F point to the localization. Binary clustering is proposed to distinguish and remove the outliers. After that, a statistical histogram of the DOA estimates is built using the composite weight to weaken the effect of components that interfere with the localization. Finally, multi-source localization is performed through peak searching. The proposed method is evaluated objectively in various simulated environments. The results show that it localizes sources better than the reference methods.
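The final two steps of this abstract, a weighted DOA histogram followed by peak searching, can be sketched generically. The code below is an assumption-laden illustration: the per-point weights are supplied directly rather than computed from PORW and SORW (whose definitions are not given here), and the 5-degree bin width and simple circular local-maximum rule are arbitrary choices.

```python
def weighted_doa_histogram(doas_deg, weights, bin_width=5):
    """Accumulate per-point weights into a circular histogram over 0..360 degrees."""
    num_bins = 360 // bin_width
    hist = [0.0] * num_bins
    for doa, w in zip(doas_deg, weights):
        index = int((doa % 360.0) // bin_width) % num_bins
        hist[index] += w
    return hist


def find_peaks(hist, num_sources, bin_width=5):
    """Return the centers (degrees) of the strongest circular local maxima."""
    n = len(hist)
    peaks = [i for i in range(n)
             if hist[i] > 0.0
             and hist[i] >= hist[i - 1]          # hist[-1] wraps around for i == 0
             and hist[i] > hist[(i + 1) % n]]
    peaks.sort(key=lambda i: hist[i], reverse=True)
    return sorted(i * bin_width + bin_width / 2 for i in peaks[:num_sources])


# Hypothetical per-point DOA estimates; the point at 250 degrees plays the role
# of an outlier and therefore carries a small weight.
doas = [42.0, 44.0, 43.0, 131.0, 133.0, 250.0]
weights = [1.0, 1.0, 1.0, 1.0, 1.0, 0.1]
hist = weighted_doa_histogram(doas, weights)
estimates = find_peaks(hist, num_sources=2)
```

Down-weighting suspected outliers keeps their histogram bin below the true source peaks, so the peak search returns the two dominant directions.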
Keyword :
reverberation; direction of arrival estimation; multiple sources localization; sound field microphone; sparsity
Cite:
GB/T 7714 | Gao, Shang , Jia, Maoshen , Bao, Changchun . A multi-source localization method based on clustering and outlier removal [J]. | 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) , 2021 : 950-955 . |
MLA | Gao, Shang et al. "A multi-source localization method based on clustering and outlier removal" . | 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) (2021) : 950-955 . |
APA | Gao, Shang , Jia, Maoshen , Bao, Changchun . A multi-source localization method based on clustering and outlier removal . | 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) , 2021 , 950-955 . |