Query:
Scholar name: Jia Maoshen (贾懋珅)
Abstract :
Audio coding has made significant progress with the development of deep neural networks. Recently, neural speech codecs based on the vector quantized variational autoencoder have become increasingly popular among researchers due to their elegant design and superior performance, but their application to high-bitrate audio coding remains largely unexplored. In this paper, we propose a novel high-fidelity end-to-end neural audio codec, the time-frequency fusion codec (TFF-Codec), which reconstructs 32 kHz audio with high quality in the time-frequency domain at 48 and 64 kbps. A dual-path time-frequency filtering module is proposed to capture the local structure of the spectrogram and the long-term temporal dependencies between consecutive frames. The proposed codec is composed of an encoder, the time-frequency filtering module, a vector quantizer, and a decoder. First, the input audio is fed into the encoder to obtain its latent representation. Then, this representation is modeled in the frequency domain by the time-frequency filtering module. Subsequently, it is further compressed by the vector quantizer. Finally, the reconstructed audio is obtained from the decoder. TFF-Codec is trained with a combination of loss functions to balance the reconstructed audio between objective metrics and subjective listening experience. To evaluate the performance of TFF-Codec, comparative experiments are conducted against the traditional audio codec Opus and several recent neural audio codecs. Both subjective and objective evaluations demonstrate the superiority of the proposed method.
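Editor's note: the pipeline described above (encoder, time-frequency filtering, vector quantization, decoder) can be made concrete with a minimal PyTorch sketch. The layer sizes, the LSTM stand-in for the dual-path time-frequency filtering module, and the straight-through vector quantizer are illustrative assumptions, not the TFF-Codec architecture.

```python
# Minimal sketch of the encoder -> TF filtering -> VQ -> decoder pipeline.
# All module choices here are assumptions for illustration only.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=1024, dim=128):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):                                   # z: (batch, frames, dim)
        d = torch.cdist(z, self.codebook.weight.unsqueeze(0))  # distances to all codes
        idx = d.argmin(-1)                                  # nearest-code indices
        q = self.codebook(idx)
        return z + (q - z).detach(), idx                    # straight-through estimator

class ToyCodec(nn.Module):
    def __init__(self, n_fft=512, dim=128):
        super().__init__()
        self.n_fft = n_fft
        self.enc = nn.Linear(n_fft // 2 + 1, dim)           # per-frame encoder
        # LSTM as a placeholder for the dual-path time-frequency filtering module
        self.tf_filter = nn.LSTM(dim, dim, batch_first=True)
        self.vq = VectorQuantizer(dim=dim)
        self.dec = nn.Linear(dim, n_fft // 2 + 1)

    def forward(self, wav):                                 # wav: (batch, samples)
        spec = torch.stft(wav, self.n_fft,
                          window=torch.hann_window(self.n_fft),
                          return_complex=True)
        mag = spec.abs().transpose(1, 2)                    # (batch, frames, bins)
        z = self.enc(mag)                                   # latent representation
        z, _ = self.tf_filter(z)                            # time-frequency modeling
        q, _ = self.vq(z)                                   # quantized latents
        return self.dec(q)                                  # reconstructed magnitudes

wav = torch.randn(2, 32000)                                 # 1 s of 32 kHz audio
print(ToyCodec()(wav).shape)
```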
Keyword :
Audio codec; End-to-end neural network; High-fidelity audio generation; Autoencoder
Cite:
GB/T 7714: Zhao, Yuhao, Jia, Maoshen, Ru, Jiawei, et al. TFF-Codec: A High Fidelity End-to-End Neural Audio Codec [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2025.
MLA: Zhao, Yuhao, et al. "TFF-Codec: A High Fidelity End-to-End Neural Audio Codec." CIRCUITS SYSTEMS AND SIGNAL PROCESSING (2025).
APA: Zhao, Yuhao, Jia, Maoshen, Ru, Jiawei, Wang, Lizhong, & Wen, Liang. TFF-Codec: A High Fidelity End-to-End Neural Audio Codec. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2025.
Abstract :
In recent years, the speaker-independent, single-channel speech separation problem has seen significant progress with the development of deep neural networks (DNNs). However, separating the speech of each speaker of interest from an environment that also contains the speech of other speakers, background noise, and room reverberation remains challenging. To address this problem, a speech separation method for noisy reverberant environments is proposed. First, a time-domain end-to-end network, the deep encoder/decoder dual-path neural network, is introduced for speech separation. Second, to prevent the model from falling into a local optimum during training, a loss function called the stretched optimal scale-invariant signal-to-noise ratio (SOSISNR) is proposed, inspired by the scale-invariant signal-to-noise ratio (SISNR). In addition, to make training better match the human auditory system, the joint loss function is extended with the short-time objective intelligibility (STOI) measure. Third, an alignment operation is proposed to reduce the influence of the time delay caused by reverberation on separation performance. Combining these methods, subjective and objective evaluation metrics show that the proposed approach achieves better separation performance in complex sound fields than the baseline methods.
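Editor's note: the paper's SOSISNR and STOI-extended joint loss are not reproduced here; the sketch below shows only the standard SISNR that SOSISNR builds on, so the stretching and STOI terms are omitted.

```python
# Minimal NumPy sketch of the standard scale-invariant signal-to-noise
# ratio (SISNR); the paper's SOSISNR stretching and STOI terms are omitted.
import numpy as np

def sisnr(est, ref, eps=1e-8):
    est = est - est.mean()                                   # zero-mean, as is conventional
    ref = ref - ref.mean()
    target = np.dot(est, ref) / (np.dot(ref, ref) + eps) * ref  # projection onto reference
    noise = est - target
    return 10 * np.log10(np.dot(target, target) / (np.dot(noise, noise) + eps))

rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
print(sisnr(clean + 0.1 * rng.standard_normal(16000), clean))  # high SISNR for a good estimate
```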
Keyword :
Speech enhancement; Deep learning; Speech separation; SISNR
Cite:
GB/T 7714: Wang, Chunxi, Jia, Maoshen, Zhang, Xinfeng. Deep encoder/decoder dual-path neural network for speech separation in noisy reverberation environments [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2023, 2023 (1).
MLA: Wang, Chunxi, et al. "Deep encoder/decoder dual-path neural network for speech separation in noisy reverberation environments." EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING 2023.1 (2023).
APA: Wang, Chunxi, Jia, Maoshen, & Zhang, Xinfeng. Deep encoder/decoder dual-path neural network for speech separation in noisy reverberation environments. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2023, 2023 (1).
Abstract :
Multisource localization occupies an important position in the field of acoustic signal processing and is widely applied in scenarios such as human-machine interaction and spatial acoustic parameter acquisition. The direction of arrival (DOA) of a sound source is convenient for rendering spatial sound in the audio metaverse. A multisource localization method for reverberant environments is proposed based on the angle distribution of time-frequency (TF) points recorded by a first-order ambisonics (FOA) microphone. The method proceeds in three steps. 1) By exploring the angle distribution of TF points, a single-source zone (SSZ) detection method is proposed using a standard deviation-based measure that reveals how strongly the TF-point angles within a zone converge. 2) To reduce the effect of outliers on localization, an outlier removal method is designed that discards TF points whose angles are far from the real DOAs, where the median angle of each detected zone is used to construct the outlier set. 3) DOA estimates of the multiple sources are obtained by postprocessing the angle histogram. Experimental results in both simulated and real scenarios verify the effectiveness of the proposed method in reverberant environments and show that it outperforms the reference methods.
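Editor's note: step 1 above reduces to a convergence test on the angles within a zone. The sketch below uses a circular standard deviation and an assumed 5-degree threshold as one plausible instance of the paper's standard deviation-based measure.

```python
# Illustrative sketch of SSZ detection: flag a TF zone as single-source
# when the circular standard deviation of its TF-point angles is small.
# The threshold is an assumed value, not taken from the paper.
import numpy as np

def is_single_source_zone(angles_deg, threshold_deg=5.0):
    """angles_deg: DOA angles of the TF points in one zone."""
    a = np.deg2rad(angles_deg)
    # circular statistics avoid wrap-around artifacts at 0/360 degrees
    R = np.hypot(np.mean(np.sin(a)), np.mean(np.cos(a)))
    circ_std = np.sqrt(-2.0 * np.log(max(R, 1e-12)))
    return np.rad2deg(circ_std) < threshold_deg

print(is_single_source_zone([40, 42, 41, 39]))    # True: angles converge
print(is_single_source_zone([40, 120, 41, 250]))  # False: mixed sources
```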
Keyword :
Speech processing; Signal processing
Cite:
GB/T 7714: Tao, Liang, Jia, Maoshen, Li, Lu, et al. Multisource localization based on angle distribution of time-frequency points using an FOA microphone [J]. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2023, 8 (3): 807-823.
MLA: Tao, Liang, et al. "Multisource localization based on angle distribution of time-frequency points using an FOA microphone." CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY 8.3 (2023): 807-823.
APA: Tao, Liang, Jia, Maoshen, Li, Lu, Wang, Jing, & Xiang, Yang. Multisource localization based on angle distribution of time-frequency points using an FOA microphone. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2023, 8 (3), 807-823.
Abstract :
This paper proposes a diffuseness estimation-based single-source time-frequency point (SSTP) detection method for multisource direction of arrival (DOA) estimation. According to their composition, time-frequency (TF) points are divided into three types: single-source, multisource, and interference TF points. SSTPs and multisource TF points together are defined as weak interference time-frequency points (WITPs). An SSTP is a TF point consisting only of the direct component of one sound source, which makes it well suited to DOA estimation; multisource DOA estimation is therefore transformed into single-source DOA estimation by SSTP detection. Diffuseness estimation is introduced for a sound field microphone array, and WITPs are detected with a diffuseness estimation-based detection method. Phase similarity determination is then adopted to identify SSTPs among the detected WITPs. Finally, multiple sound source localization is completed by searching for peaks in the normalized histogram of DOA estimates at the detected SSTPs. Experiments demonstrate that the proposed method detects SSTPs precisely, and evaluations show that it achieves superior accuracy in counting and localizing multiple sound sources in reverberant and noisy environments.
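Editor's note: a common DirAC-style diffuseness estimate for first-order (B-format) signals is sketched below; low diffuseness marks TF points dominated by a direct path, i.e., candidates for the WITPs the paper detects. The channel scaling and the averaging zone are simplifying assumptions, not the paper's exact estimator.

```python
# Hedged sketch of DirAC-style diffuseness over a short TF averaging zone:
# psi = 1 - ||<I>|| / <E>, with intensity I from the omni channel W and the
# dipoles X, Y, Z. Scaling conventions are simplified for illustration.
import numpy as np

def diffuseness(W, X, Y, Z, eps=1e-12):
    """W, X, Y, Z: complex STFT coefficients over one TF averaging zone."""
    V = np.stack([X, Y, Z])                        # (3, npoints) velocity proxy
    I = np.real(np.conj(W) * V)                    # active intensity per point
    E = 0.5 * (np.abs(W) ** 2 + np.sum(np.abs(V) ** 2, axis=0))  # energy density
    return 1.0 - np.linalg.norm(I.mean(axis=1)) / (E.mean() + eps)

rng = np.random.default_rng(1)
W = rng.standard_normal(64) + 1j * rng.standard_normal(64)
X, Y, Z = 0.9 * W, 0.1 * W, 0.05 * W               # one dominant direction
print(diffuseness(W, X, Y, Z))                     # low value -> near-direct field
```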
Keyword :
Sparsity component analysis; Reverberation; Diffuseness estimation; Direction of arrival
Cite:
GB/T 7714: Zhang, Yu, Jia, Maoshen, Gao, Shang, et al. Diffuseness Estimation-Based SSTP Detection for Multiple Sound Source Localization in Reverberant Environments [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (8): 4713-4739.
MLA: Zhang, Yu, et al. "Diffuseness Estimation-Based SSTP Detection for Multiple Sound Source Localization in Reverberant Environments." CIRCUITS SYSTEMS AND SIGNAL PROCESSING 42.8 (2023): 4713-4739.
APA: Zhang, Yu, Jia, Maoshen, Gao, Shang, & Wang, Jing. Diffuseness Estimation-Based SSTP Detection for Multiple Sound Source Localization in Reverberant Environments. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (8), 4713-4739.
Abstract :
Multiple speech source separation plays an important role in many applications, such as automatic speech recognition, acoustic surveillance, and teleconferencing. In this study, we propose a method for separating multiple speech sources in a reverberant environment based on sparse component enhancement. In a recorded signal (i.e., a mixture of multiple speech sources), there are always time-frequency points where only one source is active or dominant; this property is the sparsity of speech signals, and such time-frequency points are called sparse component points. In a reverberant environment, however, the sparsity of the speech signal is degraded, reducing the number of sparse component points in the recorded signal and thus the quality of the separated source signals. In this study, for mixture signals recorded by a soundfield microphone (a microphone array), we first experimentally analyze the negative impact of reverberation on sparse components and then develop a sparse component enhancement method that increases the number of these points. The sparse components are then identified and classified according to the DOA estimates of the sources. Next, the sparse components are used to guide the recovery of the non-sparse components. Finally, multiple source separation is achieved by jointly restoring the sparse and non-sparse components of each source. The proposed method has low computational complexity and applies to underdetermined scenarios. Its effectiveness is verified through a series of subjective and objective evaluation experiments.
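Editor's note: the classification step above, assigning each detected sparse point to a source by DOA, can be sketched as a nearest-DOA rule. The circular angular distance and the example DOAs below are assumptions for illustration.

```python
# Illustrative sketch: each sparse TF point is assigned to the source
# whose estimated DOA is angularly closest. Example values are assumed.
import numpy as np

def classify_sparse_points(point_doas_deg, source_doas_deg):
    """Return, for each sparse TF point, the index of the closest source DOA."""
    diff = np.abs(np.subtract.outer(point_doas_deg, source_doas_deg))
    diff = np.minimum(diff, 360.0 - diff)          # circular angular distance
    return diff.argmin(axis=1)

points = np.array([38.0, 41.5, 119.0, 355.0])      # DOAs of sparse TF points
sources = np.array([40.0, 120.0, 350.0])           # DOA estimates of three sources
print(classify_sparse_points(points, sources))     # -> [0 0 1 2]
```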
Keyword :
Multiple source separation; Sparse component; Reverberation; Soundfield microphone
Cite:
GB/T 7714: Li, Lu, Jia, Maoshen, Liu, Jinxiang, et al. Separation of Multiple Speech Sources in Reverberant Environments Based on Sparse Component Enhancement [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (10): 6001-6028.
MLA: Li, Lu, et al. "Separation of Multiple Speech Sources in Reverberant Environments Based on Sparse Component Enhancement." CIRCUITS SYSTEMS AND SIGNAL PROCESSING 42.10 (2023): 6001-6028.
APA: Li, Lu, Jia, Maoshen, Liu, Jinxiang, & Pai, Tun-Wen. Separation of Multiple Speech Sources in Reverberant Environments Based on Sparse Component Enhancement. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (10), 6001-6028.
Abstract :
Estimating the direction of arrival (DOA) is an important topic in array signal processing. This paper addresses multisource localization in a closed environment. We propose a single source zone (SSZ) detection method based on the first-order relative harmonic coefficient (RHC) and design a dynamic SSZ detection rule. Finally, 2-D kernel density estimation (KDE) and peak search are used to obtain multisource DOA estimates. The proposed method is evaluated in simulation experiments and compared with reference methods, verifying its effectiveness.
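Editor's note: the final KDE-plus-peak-search step is sketched below in one azimuth dimension for brevity (the paper uses 2-D KDE); the bandwidth, peak threshold, and synthetic DOA clusters are assumptions.

```python
# Hedged 1-D sketch of the last step: kernel density estimation over DOA
# estimates from detected single-source zones, then a peak search.
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import find_peaks

rng = np.random.default_rng(2)
# synthetic DOA estimates clustered around two true sources at 60 and 150 deg
doas = np.concatenate([rng.normal(60, 3, 200), rng.normal(150, 3, 150)])

grid = np.arange(0.0, 360.0, 1.0)
density = gaussian_kde(doas, bw_method=0.05)(grid)
peaks, _ = find_peaks(density, height=0.2 * density.max())
print(grid[peaks])                                 # expected near [60, 150]
```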
Cite:
GB/T 7714: Tao, Liang, Jia, Maoshen, Bu, Bing, et al. Single Source Zone Detection in the Spherical Harmonic Domain for Multisource Localization [J]. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023: 996-1001.
MLA: Tao, Liang, et al. "Single Source Zone Detection in the Spherical Harmonic Domain for Multisource Localization." 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC (2023): 996-1001.
APA: Tao, Liang, Jia, Maoshen, Bu, Bing, & Yao, Dingding. Single Source Zone Detection in the Spherical Harmonic Domain for Multisource Localization. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, 996-1001.
Abstract :
This paper presents a method for direction of arrival (DOA) estimation of multiple speech sources based on the temporal correlation and local-frequency stationarity of speech signals. A distribution analysis of single-source points (SSPs) in a recorded signal shows that, in the time-frequency (T-F) domain, SSPs occur in small clusters. Based on this distribution, a DOA estimation method for multiple sound sources is developed that exploits the continuity between adjacent T-F points. In addition, low-reverberation single-source (LRSS) points are detected based on phase consistency and used as guidance for deciding whether adjacent T-F points are SSPs. The direction deviations between adjacent frequency points and between adjacent frames serve as the SSP detection criteria, reflecting the temporal correlation and local-frequency stationarity. Kernel density estimation and peak search are performed to obtain a dynamic DOA estimation range for each source. Finally, DOA estimates of each source are obtained by statistical weighting-based fine localization. Experiments under both simulated and real conditions show that the proposed method achieves better localization performance than several existing methods.
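Editor's note: the adjacency criterion above can be read as region growing from reliable seed points. The sketch below accepts a neighboring TF point as an SSP when its DOA deviates little from the current point along frequency (same frame) or time (same bin); both tolerances and the toy DOA map are assumptions.

```python
# Illustrative region-growing sketch of the SSP detection criterion.
import numpy as np

def grow_ssps(doa_map, seeds, freq_tol=6.0, time_tol=6.0):
    """doa_map: (frames, bins) DOA per TF point; seeds: list of (t, f) LRSS points."""
    ssp = set(seeds)
    frontier = list(seeds)
    while frontier:
        t, f = frontier.pop()
        # neighbors along frequency (same frame) and time (same bin)
        for (nt, nf, tol) in [(t, f - 1, freq_tol), (t, f + 1, freq_tol),
                              (t - 1, f, time_tol), (t + 1, f, time_tol)]:
            if 0 <= nt < doa_map.shape[0] and 0 <= nf < doa_map.shape[1] \
                    and (nt, nf) not in ssp \
                    and abs(doa_map[nt, nf] - doa_map[t, f]) < tol:
                ssp.add((nt, nf))
                frontier.append((nt, nf))
    return ssp

doa_map = np.full((4, 6), 200.0)                    # interfered points by default
doa_map[1, 1:5] = [60.0, 61.0, 59.5, 62.0]          # a small single-source cluster
print(sorted(grow_ssps(doa_map, seeds=[(1, 2)])))   # recovers the whole cluster
```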
Keyword :
Direction of arrival estimation; Single-source point detection; Temporal correlation
Cite:
GB/T 7714: Li, Lu, Jia, Maoshen, Wang, Jing. DOA estimation of multiple speech sources based on the single-source point detection using an FOA microphone [J]. APPLIED ACOUSTICS, 2022, 195.
MLA: Li, Lu, et al. "DOA estimation of multiple speech sources based on the single-source point detection using an FOA microphone." APPLIED ACOUSTICS 195 (2022).
APA: Li, Lu, Jia, Maoshen, & Wang, Jing. DOA estimation of multiple speech sources based on the single-source point detection using an FOA microphone. APPLIED ACOUSTICS, 2022, 195.
Abstract :
Multiple sound source separation in reverberant environments has attracted growing attention in recent years. To improve the quality of the separated signals under reverberation, this paper proposes a separation method based on a DOA cue and a deep neural network (DNN). First, a pre-processing model based on non-negative matrix factorization (NMF) dereverberates the recorded signal, making source separation more efficient. We then propose a multi-source separation algorithm that combines the recovery of sparse and non-sparse component points to obtain each source signal from the dereverberated signal. For sparse component points, the dominant sound source at each point is determined by a DOA cue; for non-sparse component points, a DNN is used to recover each source signal. Finally, the signals separated from the sparse and non-sparse component points are matched by temporal correlation to obtain each sound source signal. Both objective and subjective evaluation results indicate that, compared with the existing methods, the proposed approach performs better in highly reverberant environments.
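Editor's note: the DNN stage for non-sparse component points is sketched below as a generic mask-estimation network in PyTorch. The network shape, the softmax mask formulation, and the two-source setup are stand-in assumptions, not the paper's model.

```python
# Minimal PyTorch stand-in: a small network mapping mixture magnitude
# spectra to per-source TF masks for recovering non-sparse points.
import torch
import torch.nn as nn

n_bins, n_sources = 257, 2
net = nn.Sequential(
    nn.Linear(n_bins, 256), nn.ReLU(),
    nn.Linear(256, n_bins * n_sources),
)

mix_mag = torch.rand(8, n_bins)                        # 8 frames of mixture magnitudes
masks = net(mix_mag).view(8, n_sources, n_bins).softmax(dim=1)  # masks sum to 1 across sources
separated = masks * mix_mag.unsqueeze(1)               # per-source magnitude estimates
print(separated.shape)                                 # torch.Size([8, 2, 257])
```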
Keyword :
Direction of arrival; Multi-source separation; Deep neural network; Dereverberation
Cite:
GB/T 7714: Zhang, Yu, Jia, Maoshen, Jia, Xinyu, et al. A Multi-Source Separation Approach Based on DOA Cue and DNN [J]. APPLIED SCIENCES-BASEL, 2022, 12 (12).
MLA: Zhang, Yu, et al. "A Multi-Source Separation Approach Based on DOA Cue and DNN." APPLIED SCIENCES-BASEL 12.12 (2022).
APA: Zhang, Yu, Jia, Maoshen, Jia, Xinyu, & Pai, Tun-Wen. A Multi-Source Separation Approach Based on DOA Cue and DNN. APPLIED SCIENCES-BASEL, 2022, 12 (12).
Abstract :
Speech emotion recognition (SER) is a hot topic in speech signal processing. When the training data and the test data come from different corpora, their feature distributions differ, which degrades recognition performance. To address this problem, this paper proposes a cross-corpus speech emotion recognition method based on subspace learning and domain adaptation. Specifically, the training set and the test set form the source domain and target domain, respectively. The Hessian matrix is then introduced to obtain a subspace for the features extracted in both domains. In addition, an information entropy-based domain adaptation method is introduced to construct a common space in which the difference between the source-domain and target-domain feature distributions is reduced as much as possible. To evaluate the proposed method, extensive cross-corpus speech emotion recognition experiments are conducted. The results show that the proposed method outperforms several existing subspace learning and domain adaptation methods.
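Editor's note: the paper's Hessian-based subspace learning and entropy-based adaptation are not reproduced here. As a generic illustration of shrinking the feature-distribution gap between a source (training) corpus and a target (test) corpus, the sketch below applies CORAL, a different and simpler technique: whiten the source features, then re-color them with the target covariance (with an added mean shift).

```python
# CORAL-style domain alignment as a generic stand-in, not the paper's method.
import numpy as np

def coral(Xs, Xt, eps=1e-6):
    """Align source features Xs (n_s, d) to target features Xt (n_t, d)."""
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    def mat_pow(C, p):
        # matrix (inverse) square root via eigendecomposition of a symmetric matrix
        w, V = np.linalg.eigh(C)
        return V @ np.diag(np.maximum(w, eps) ** p) @ V.T
    return (Xs - Xs.mean(0)) @ mat_pow(Cs, -0.5) @ mat_pow(Ct, 0.5) + Xt.mean(0)

rng = np.random.default_rng(3)
Xs = rng.normal(0, 1, (500, 10))            # source-corpus features
Xt = rng.normal(2, 3, (400, 10))            # target corpus with shifted statistics
Xa = coral(Xs, Xt)
print(np.round([Xa.mean(), Xa.std()], 2))   # now close to the target statistics
```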
Keyword :
Cross-corpus; Domain adaptation; Subspace learning; Speech emotion recognition
Cite:
GB/T 7714: Cao, Xuan, Jia, Maoshen, Ru, Jiawei, et al. Cross-corpus speech emotion recognition using subspace learning and domain adaption [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2022, 2022 (1).
MLA: Cao, Xuan, et al. "Cross-corpus speech emotion recognition using subspace learning and domain adaption." EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING 2022.1 (2022).
APA: Cao, Xuan, Jia, Maoshen, Ru, Jiawei, & Pai, Tun-wen. Cross-corpus speech emotion recognition using subspace learning and domain adaption. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2022, 2022 (1).
Abstract :
Multiple sound source localization has been a topic of broad concern in recent years. Single Source Zone (SSZ) based localization methods achieve good performance by detecting and exploiting the time-frequency (T-F) zones in which only one source is dominant. However, the detected SSZ sometimes also includes T-F points composed of components from multiple sources; such a point is defined as an outlier. The presence of outliers within the detected SSZ is usually an unavoidable problem for SSZ-based methods. To solve it, this paper proposes multi-source localization using offset residual weights. The method builds on the following assumption: the directions estimated from the T-F points within a detected SSZ deviate from the actual source directions, but this deviation is much smaller for the desired points than for the outliers. After verifying this assumption experimentally, the Point Offset Residual Weight (PORW) and Source Offset Residual Weight (SORW) are proposed to reduce the influence of outliers on the localization results. A composite weight formed by combining PORW and SORW effectively distinguishes outliers from desired points, and the outliers are then removed according to this weight. Finally, a statistical histogram of the DOA estimates with outliers removed is used for multi-source localization. An objective evaluation conducted in various simulated environments shows that the proposed method localizes sources better than the reference methods.
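Editor's note: the outlier-removal idea above is sketched below with a simple median-deviation rule standing in for the PORW/SORW composite weight; the tolerance and histogram resolution are assumed values.

```python
# Hedged sketch: discard TF points whose DOA estimates deviate strongly
# from the zone's median direction, then localize via the DOA histogram.
import numpy as np

def remove_outliers(doas_deg, tol_deg=15.0):
    dev = np.abs(doas_deg - np.median(doas_deg))
    dev = np.minimum(dev, 360.0 - dev)             # circular deviation
    return doas_deg[dev < tol_deg]

zone = np.array([60.0, 61.0, 59.0, 140.0, 62.0])   # 140 is a multi-source outlier
kept = remove_outliers(zone)
hist, edges = np.histogram(kept, bins=36, range=(0, 360))
print(kept, edges[hist.argmax()])                  # DOA peak near 60 degrees
```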
Keyword :
Multiple sound source localization; Direction of arrival estimation; Soundfield microphone; Reverberation
Cite:
GB/T 7714: Jia, Maoshen, Gao, Shang, Bao, Changchun. Multi-source localization by using offset residual weight [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (1).
MLA: Jia, Maoshen, et al. "Multi-source localization by using offset residual weight." EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING 2021.1 (2021).
APA: Jia, Maoshen, Gao, Shang, & Bao, Changchun. Multi-source localization by using offset residual weight. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (1).