
Query:

Scholar name: Zhang Jing (张菁)

Results: 14 pages
Spatial-specific Transformer with involution for semantic segmentation of high-resolution remote sensing images SCIE
Journal article | 2023, 44(4), 1280-1307 | INTERNATIONAL JOURNAL OF REMOTE SENSING

Abstract :

High-resolution remote sensing images (HR-RSIs) exhibit a strong dependency between geospatial objects and background. Given the complex spatial structure and multiscale objects in HR-RSIs, how fully spatial information is mined directly determines the quality of semantic segmentation. In this paper, we focus on a Spatial-specific Transformer with involution for semantic segmentation of HR-RSIs. First, we integrate a spatial-specific involution branch with a self-attention branch to form a Spatial-specific Transformer backbone that produces multilevel features with global and spatial information without additional parameters. Then, we introduce multiscale feature representation with large-window attention into the Swin Transformer to capture multiscale contextual information. Finally, we add a geospatial feature supplement branch to the semantic segmentation decoder to mitigate the loss of semantic information caused by down-sampling the multiscale features of geospatial objects. Experimental results demonstrate that our method achieves competitive semantic segmentation performance of 87.61% and 80.08% mIoU on the Potsdam and Vaihingen datasets, respectively.
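
For readers unfamiliar with involution, the following is a minimal sketch of how a spatial-specific involution branch can be paired with a self-attention branch, in the spirit of the backbone described above. It uses the generic involution formulation (Li et al., CVPR 2021); the channel sizes, head count, and additive fusion are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class Involution2d(nn.Module):
    """Minimal involution layer: the kernel is generated from the input
    itself, so it is spatial-specific and shared across channel groups."""
    def __init__(self, channels, kernel_size=3, groups=4, reduction=4):
        super().__init__()
        self.k, self.g = kernel_size, groups
        self.reduce = nn.Conv2d(channels, channels // reduction, 1)
        self.span = nn.Conv2d(channels // reduction, kernel_size ** 2 * groups, 1)
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, h, w = x.shape
        # Per-pixel kernels: (B, G, 1, K*K, H, W)
        kernel = self.span(self.reduce(x)).view(b, self.g, 1, self.k ** 2, h, w)
        # Unfolded neighbourhoods: (B, G, C//G, K*K, H, W)
        patches = self.unfold(x).view(b, self.g, c // self.g, self.k ** 2, h, w)
        return (kernel * patches).sum(dim=3).view(b, c, h, w)

class DualBranchBlock(nn.Module):
    """Hypothetical fusion of the involution branch with a self-attention
    branch; the real backbone integrates them into a Swin-style Transformer."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.involution = Involution2d(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):
        b, c, h, w = x.shape
        local = self.involution(x)                     # spatial-specific branch
        tokens = x.flatten(2).transpose(1, 2)          # (B, H*W, C)
        glob, _ = self.attn(tokens, tokens, tokens)    # global branch
        glob = self.norm(glob).transpose(1, 2).view(b, c, h, w)
        return local + glob

print(DualBranchBlock(64)(torch.randn(1, 64, 32, 32)).shape)  # (1, 64, 32, 32)
```

Because the involution kernel is predicted from the feature map rather than stored as a fixed weight, the local branch adds spatial specificity with very few extra parameters, which matches the motivation stated in the abstract.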

Cite:


GB/T 7714: Wu, Xinjia, Zhang, Jing, Li, Wensheng, et al. Spatial-specific Transformer with involution for semantic segmentation of high-resolution remote sensing images [J]. INTERNATIONAL JOURNAL OF REMOTE SENSING, 2023, 44(4): 1280-1307.
MLA: Wu, Xinjia, et al. "Spatial-specific Transformer with involution for semantic segmentation of high-resolution remote sensing images." INTERNATIONAL JOURNAL OF REMOTE SENSING 44.4 (2023): 1280-1307.
APA: Wu, Xinjia, Zhang, Jing, Li, Wensheng, Li, Jiafeng, Zhuo, Li, & Zhang, Jie. Spatial-specific Transformer with involution for semantic segmentation of high-resolution remote sensing images. INTERNATIONAL JOURNAL OF REMOTE SENSING, 2023, 44(4), 1280-1307.
Short video fingerprint extraction: from audio-visual fingerprint fusion to multi-index hashing SCIE
Journal article | 2022, 29(3), 981-1000 | MULTIMEDIA SYSTEMS

Abstract :

As one of the most prevalent forms of we-media, short video has grown exponentially and has increasingly become a target of copyright infringement. Video fingerprint extraction technology is conducive to the intelligent identification of short videos. In view of various tampering attacks, a short video fingerprint extraction method spanning audio-visual fingerprint fusion to multi-index hashing is proposed: (1) the shot-level fingerprint of a short video is extracted by audio-visual fingerprint fusion, after a consistency analysis that eliminates uncertainty at the decision-making layer; the visual fingerprint is generated by an R(2+1)D network, and the audio fingerprint is obtained by extracting audio features with masked audio spectral keypoints (MASK) and a convolutional recurrent neural network (CRNN); (2) the shot-level fingerprints are assembled into the data-level fingerprint of the short video by constructing a data-shot-key frame relationship model; (3) the short video fingerprint is matched by measuring the weighted Hamming distance over a multi-index hash of the data-level fingerprint. Five experiments are conducted on the CC_Web_Video and Moments_in_Time_Raw_v2 datasets, and the results show that our method effectively improves the overall performance of short video fingerprinting.
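
Step (3) rests on a standard multi-index hashing idea: split a binary fingerprint into substrings, index each substring in its own table, and verify candidates with a weighted Hamming distance. The sketch below is a generic illustration under assumed parameters (128-bit fingerprints, 4 substrings, uniform bit weights), not the paper's implementation.

```python
import numpy as np
from collections import defaultdict

M, BITS = 4, 128            # assumed: 4 substrings over a 128-bit fingerprint
SUB = BITS // M

def substrings(fp):
    """Split a 0/1 uint8 vector into M hashable substring keys."""
    return [fp[i * SUB:(i + 1) * SUB].tobytes() for i in range(M)]

class MultiIndexHash:
    def __init__(self, weights=None):
        self.tables = [defaultdict(list) for _ in range(M)]
        self.db = []
        # Bit weights for the weighted Hamming distance (uniform by default).
        self.w = np.ones(BITS) if weights is None else np.asarray(weights, float)

    def add(self, fp):
        self.db.append(fp)
        for table, key in zip(self.tables, substrings(fp)):
            table[key].append(len(self.db) - 1)

    def query(self, fp, max_dist=3.0):
        # By the pigeonhole principle, any fingerprint within Hamming
        # distance < M of the probe matches it exactly on >= 1 substring,
        # so exact substring lookups suffice for small distances.
        candidates = set()
        for table, key in zip(self.tables, substrings(fp)):
            candidates.update(table.get(key, []))
        hits = []
        for i in candidates:
            dist = float(self.w[self.db[i] != fp].sum())  # weighted Hamming
            if dist <= max_dist:
                hits.append((i, dist))
        return sorted(hits, key=lambda pair: pair[1])

rng = np.random.default_rng(0)
index = MultiIndexHash()
for _ in range(1000):
    index.add(rng.integers(0, 2, BITS, dtype=np.uint8))
probe = index.db[42].copy()
probe[:3] ^= 1                                # tamper with 3 bits
print(index.query(probe))                     # finds entry 42 at distance 3.0
```

Lookups touch only the substring tables, so matching stays fast even as the fingerprint database grows; the weighted distance then lets more discriminative bits count more in the final verification.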

Keyword :

Multi-index hashing; Fingerprint extraction; Data-shot-key frame; Short video; Audio-visual

Cite:


GB/T 7714: Zhang, Shuying, Zhang, Jing, Wang, Yizhou, et al. Short video fingerprint extraction: from audio-visual fingerprint fusion to multi-index hashing [J]. MULTIMEDIA SYSTEMS, 2022, 29(3): 981-1000.
MLA: Zhang, Shuying, et al. "Short video fingerprint extraction: from audio-visual fingerprint fusion to multi-index hashing." MULTIMEDIA SYSTEMS 29.3 (2022): 981-1000.
APA: Zhang, Shuying, Zhang, Jing, Wang, Yizhou, & Zhuo, Li. Short video fingerprint extraction: from audio-visual fingerprint fusion to multi-index hashing. MULTIMEDIA SYSTEMS, 2022, 29(3), 981-1000.
Streamer temporal action detection in live video by co-attention boundary matching SCIE
Journal article | 2022, 13(10), 3071-3088 | INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS
WoS CC Cited Count: 2

Abstract :

With the advent of the we-media era, live video is being sought after by more and more web users. Effectively identifying and supervising streamer activities in live video is of great significance for promoting the healthy development of the live video industry. A streamer activity can be characterized as a temporal composition of a series of actions. To improve the accuracy of streamer temporal action detection, a promising path is to combine temporal action localization with a co-attention mechanism to overcome the problem of blurred action boundaries. Therefore, a streamer temporal action detection method based on co-attention boundary matching in live video is proposed. (1) The global spatiotemporal features and action template features of live video are extracted by a two-stream convolutional network and an action spatiotemporal attention network, respectively. (2) Probability sequences are generated from the global spatiotemporal features through temporal action evaluation, and boundary matching confidence maps are produced by confidence evaluation of the global spatiotemporal features and action template features under the co-attention mechanism. (3) Streamer temporal actions are detected from the action proposals generated by the probability sequences and boundary matching maps. We establish a real-world streamer action dataset, BJUT-SAD, and conduct extensive experiments to verify that our method can boost the accuracy of streamer temporal action detection in live video. In particular, our temporal action proposal generation and streamer action detection results are competitive with prior methods, demonstrating the effectiveness of our approach.
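
As a concrete illustration of step (2), the sketch below implements a generic co-attention block in which snippet-level global features and action-template features attend to each other before confidence evaluation. The feature dimension, head count, and residual fusion are assumptions; the boundary-matching map generation itself is not reproduced.

```python
import torch
import torch.nn as nn

class CoAttention(nn.Module):
    """Generic co-attention: each stream queries the other, so global
    spatiotemporal features and action template features are mutually
    enriched before boundary-matching confidence evaluation."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.g2t = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t2g = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, glob, templ):
        # glob: (B, T, C) snippet features; templ: (B, K, C) template features
        g_enriched, _ = self.g2t(glob, templ, templ)   # snippets attend templates
        t_enriched, _ = self.t2g(templ, glob, glob)    # templates attend snippets
        return glob + g_enriched, templ + t_enriched   # residual fusion

glob = torch.randn(2, 100, 256)    # 100 temporal snippets of a live video
templ = torch.randn(2, 10, 256)    # 10 action templates
g, t = CoAttention(256)(glob, templ)
print(g.shape, t.shape)            # (2, 100, 256) (2, 10, 256)
```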

Keyword :

Boundary matching; Co-attention; Streamer; Live video; Temporal action detection

Cite:


GB/T 7714: Li, Chenhao, He, Chen, Zhang, Hui, et al. Streamer temporal action detection in live video by co-attention boundary matching [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2022, 13(10): 3071-3088.
MLA: Li, Chenhao, et al. "Streamer temporal action detection in live video by co-attention boundary matching." INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS 13.10 (2022): 3071-3088.
APA: Li, Chenhao, He, Chen, Zhang, Hui, Yao, Jiacheng, Zhang, Jing, & Zhuo, Li. Streamer temporal action detection in live video by co-attention boundary matching. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2022, 13(10), 3071-3088.
Meta-Learning Paradigm and CosAttn for Streamer Action Recognition in Live Video SCIE
Journal article | 2022, 29, 1097-1101 | IEEE SIGNAL PROCESSING LETTERS

Abstract :

As an emerging field of network content production, live video has long been in a vacuum zone of cyberspace governance. Streamer action recognition is conducive to the supervision of live video content. Given the diversity and imbalance of streamer actions, it is attractive to introduce few-shot learning to realize streamer action recognition. Therefore, a streamer action recognition method for live video based on a meta-learning paradigm and CosAttn is proposed: (1) the backbone network is improved by pretraining on training-set samples similar to the streamer actions to be recognized; (2) video-level features are extracted by an R(2+1)D-18 backbone and global average pooling within the meta-learning paradigm; (3) streamer actions are recognized by computing cosine similarity after feeding the video-level features to CosAttn to generate a prototype for each streamer action category. Experimental results on several real-world action recognition datasets demonstrate the effectiveness of our method.
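
Step (3) amounts to prototype-based few-shot classification with cosine similarity. Here is a minimal sketch of that final step under assumed shapes; the CosAttn module itself (which refines the prototypes) is omitted.

```python
import torch
import torch.nn.functional as F

def cosine_prototype_logits(support, support_y, query, n_way, scale=10.0):
    """Prototypical-network-style classification with cosine similarity.
    support: (N, D) video-level features; support_y: (N,) labels in [0, n_way);
    query: (Q, D). The scale factor is a common few-shot heuristic."""
    protos = torch.stack([support[support_y == c].mean(0) for c in range(n_way)])
    protos = F.normalize(protos, dim=-1)          # unit-norm class prototypes
    query = F.normalize(query, dim=-1)
    return scale * query @ protos.t()             # (Q, n_way) cosine logits

# Toy 5-way 3-shot episode with random stand-ins for R(2+1)D-18 features
support = torch.randn(15, 512)
labels = torch.arange(5).repeat_interleave(3)     # [0,0,0,1,1,1,...]
query = torch.randn(8, 512)
print(cosine_prototype_logits(support, labels, query, n_way=5).argmax(dim=1))
```

Using cosine rather than Euclidean distance makes the classifier insensitive to feature magnitude, which helps when the support set is tiny and imbalanced.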

Keyword :

Prototypes; meta-learning paradigm; CosAttn; Optimization; Streaming media; Training; Feature extraction; Testing; Live video; streamer action recognition; Task analysis; few-shot learning

Cite:


GB/T 7714: He, Chen, Zhang, Jing, Yao, Jiacheng, et al. Meta-Learning Paradigm and CosAttn for Streamer Action Recognition in Live Video [J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29: 1097-1101.
MLA: He, Chen, et al. "Meta-Learning Paradigm and CosAttn for Streamer Action Recognition in Live Video." IEEE SIGNAL PROCESSING LETTERS 29 (2022): 1097-1101.
APA: He, Chen, Zhang, Jing, Yao, Jiacheng, Zhuo, Li, & Tian, Qi. Meta-Learning Paradigm and CosAttn for Streamer Action Recognition in Live Video. IEEE SIGNAL PROCESSING LETTERS, 2022, 29, 1097-1101.
A Hierarchical Scheme for Video-Based Person Re-identification Using Lightweight PCANet and Handcrafted LOMO Features SCIE CSCD
Journal article | 2021, 30(2), 289-295 | CHINESE JOURNAL OF ELECTRONICS

Abstract :

A two-level hierarchical scheme for video-based person re-identification (re-id) is presented, with the aim of learning a pedestrian appearance model through more complete walking-cycle extraction. Specifically, given a video with consecutive frames, the objective of the first level is to detect the key frame with a lightweight convolutional neural network (CNN), PCANet, to reflect the summary of the video content. At the second level, on the basis of the detected key frame, the pedestrian walking cycle is extracted from the long video sequence. Local maximal occurrence (LOMO) features of the walking cycle are then extracted to represent the pedestrian's appearance information. In contrast to existing walking-cycle-based person re-id approaches, the proposed scheme relaxes the limit on the number of steps in a walking cycle, making it flexible and less affected by noisy frames. Experiments are conducted on two benchmark datasets, PRID 2011 and iLIDS-VID. The experimental results demonstrate that our proposed scheme outperforms six state-of-the-art video-based re-id methods and is more robust to severe video noise and variations in pose, lighting, and camera viewpoint.
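
To make the second level concrete, here is one plausible way to cut a walking cycle around a detected key frame using a per-frame motion-energy signal, whose local minima roughly mark stance phases. This is a hypothetical illustration: the paper's PCANet key-frame detector and its exact cycle criterion are not reproduced, and `motion_energy` is an assumed input.

```python
import numpy as np
from scipy.signal import argrelextrema

def extract_walking_cycle(motion_energy, key_idx, order=2):
    """Return (start, end) frame indices of the cycle that contains the
    key frame, delimited by the local minima surrounding it."""
    minima = argrelextrema(np.asarray(motion_energy), np.less, order=order)[0]
    before = minima[minima < key_idx]
    after = minima[minima > key_idx]
    start = int(before[-1]) if len(before) else 0
    end = int(after[0]) if len(after) else len(motion_energy) - 1
    return start, end

# Toy periodic motion-energy signal (period ~12 frames), key frame at 25
t = np.arange(60)
energy = 1.0 + np.sin(2 * np.pi * t / 12) \
       + 0.05 * np.random.default_rng(1).standard_normal(60)
print(extract_walking_cycle(energy, key_idx=25))
```

Delimiting the cycle by the minima nearest the key frame, rather than by counting a fixed number of steps, reflects the relaxed step-number limit mentioned above.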

Keyword :

Video-based person re-identification; Convolutional neural network; Walking cycle extraction; Key frame detection

Cite:


GB/T 7714: Youjiao, Li, Li, Zhuo, Jiafeng, Li, et al. A Hierarchical Scheme for Video-Based Person Re-identification Using Lightweight PCANet and Handcrafted LOMO Features [J]. CHINESE JOURNAL OF ELECTRONICS, 2021, 30(2): 289-295.
MLA: Youjiao, Li, et al. "A Hierarchical Scheme for Video-Based Person Re-identification Using Lightweight PCANet and Handcrafted LOMO Features." CHINESE JOURNAL OF ELECTRONICS 30.2 (2021): 289-295.
APA: Youjiao, Li, Li, Zhuo, Jiafeng, Li, & Jing, Zhang. A Hierarchical Scheme for Video-Based Person Re-identification Using Lightweight PCANet and Handcrafted LOMO Features. CHINESE JOURNAL OF ELECTRONICS, 2021, 30(2), 289-295.
Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit SCIE
Journal article | 2021, 2021(1) | EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING

Abstract :

With the rapid rise of online live-streaming platforms, some anchors seek profit and accumulate popularity by mixing inappropriate content into live programs. After being blacklisted, such anchors have even forged their identities and switched platforms to continue streaming, causing great harm to the network environment. Therefore, we propose an anchor voiceprint recognition method for live streaming via RawNet-SA and a gated recurrent unit (GRU), for anchor identification on live platforms. First, the anchor's speech is extracted from the live stream using voice activity detection (VAD) and speech separation. Then, the anchor voiceprint feature sequence is generated from the speech waveform with the self-attention network RawNet-SA. Finally, the voiceprint feature sequence is aggregated by a GRU into a deep voiceprint feature vector for anchor recognition. Experiments are conducted on the VoxCeleb, CN-Celeb, and MUSAN datasets, and the competitive results demonstrate that our method can effectively recognize anchor voiceprints in video streaming.
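
The final aggregation step can be sketched as follows: a GRU consumes the frame-level voiceprint feature sequence (the RawNet-SA front end is not reproduced) and its last hidden state is projected into a fixed-length embedding. All dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GRUAggregator(nn.Module):
    """Roll a variable-length voiceprint feature sequence into one
    L2-normalized utterance-level embedding for speaker matching."""
    def __init__(self, feat_dim=256, hidden=512, emb_dim=256):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, emb_dim)

    def forward(self, seq):                 # seq: (B, T, feat_dim)
        _, h = self.gru(seq)                # h: (1, B, hidden), final state
        return F.normalize(self.proj(h[-1]), dim=-1)

emb = GRUAggregator()(torch.randn(2, 300, 256))   # 300 feature frames
print(emb.shape)                                  # torch.Size([2, 256])
```

Two utterances can then be compared by the cosine similarity of their embeddings, the usual criterion in voiceprint verification.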

Keyword :

Live streaming; Voiceprint recognition; RawNet-SA; Anchor; GRU

Cite:


GB/T 7714: Yao, Jiacheng, Zhang, Jing, Li, Jiafeng, et al. Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021(1).
MLA: Yao, Jiacheng, et al. "Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit." EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING 2021.1 (2021).
APA: Yao, Jiacheng, Zhang, Jing, Li, Jiafeng, & Zhuo, Li. Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021(1).
Crowd activity recognition in live video streaming via 3D-ResNet and region graph convolution network SCIE
Journal article | 2021, 15(14), 3476-3486 | IET IMAGE PROCESSING

Abstract :

Since the advent of the we-media era, the live video industry has grown explosively. For large-scale live video streaming, especially streams containing crowd events that may cause great social impact, effectively identifying and supervising crowd activity is of great value for the healthy development of the live video industry. Existing crowd activity recognition mainly uses visual information and rarely exploits the correlations within crowd content or external knowledge. Therefore, a crowd activity recognition method for live video streaming based on 3D-ResNet and a region graph convolution network (ReGCN) is proposed. (1) After deep spatiotemporal features are extracted from live video streaming with 3D-ResNet, region proposals are generated by a region proposal network. (2) A weakly supervised ReGCN is constructed by taking region proposals as graph nodes and their correlations as edges. (3) Crowd activity in live video streaming is recognized by combining the output of ReGCN, the deep spatiotemporal features, and crowd motion intensity as external knowledge. Four experiments are conducted on the public collective activity extended dataset and a real-world dataset, BJUT-CAD. The competitive results demonstrate that our method can effectively recognize crowd activity in live video streaming.
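
Step (2) can be illustrated with a generic graph convolution over region-proposal nodes, where the adjacency comes from pairwise feature similarity. This is a plain GCN-style layer under assumed dimensions, not the paper's exact ReGCN.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionGCN(nn.Module):
    """One graph-convolution step over region proposals: edges are
    softmax-normalized feature similarities, so information propagates
    between correlated regions before classification."""
    def __init__(self, dim):
        super().__init__()
        self.weight = nn.Linear(dim, dim, bias=False)

    def forward(self, nodes):                         # nodes: (N, D)
        adj = F.softmax(nodes @ nodes.t() / nodes.shape[-1] ** 0.5, dim=-1)
        return F.relu(self.weight(adj @ nodes))       # propagate, then transform

regions = torch.randn(12, 256)   # e.g., 12 proposals from a region proposal network
print(RegionGCN(256)(regions).shape)                  # torch.Size([12, 256])
```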

Cite:


GB/T 7714: Kang, Junpeng, Zhang, Jing, Li, Wensheng, et al. Crowd activity recognition in live video streaming via 3D-ResNet and region graph convolution network [J]. IET IMAGE PROCESSING, 2021, 15(14): 3476-3486.
MLA: Kang, Junpeng, et al. "Crowd activity recognition in live video streaming via 3D-ResNet and region graph convolution network." IET IMAGE PROCESSING 15.14 (2021): 3476-3486.
APA: Kang, Junpeng, Zhang, Jing, Li, Wensheng, & Zhuo, Li. Crowd activity recognition in live video streaming via 3D-ResNet and region graph convolution network. IET IMAGE PROCESSING, 2021, 15(14), 3476-3486.
Streamer action recognition in live video with spatial-temporal attention and deep dictionary learning SCIE
Journal article | 2021, 453, 383-392 | NEUROCOMPUTING
WoS CC Cited Count: 1

Abstract :

Live video hosted by streamers is being sought after by more and more Internet users. A few streamers exhibit inappropriate actions within otherwise normal live video content for profit and popularity, causing great harm to the network environment. To effectively regulate streamer behavior in live video, a streamer action recognition method with spatial-temporal attention and deep dictionary learning is proposed. First, after sampling video frames from the live video, deep features with spatial context are extracted by a spatial attention network that focuses on the streamer's action region. Then, the deep features of the video are fused by assigning weights with a temporal attention network, which learns per-frame attention for an action. Finally, deep dictionary learning is used to sparsely represent the deep features for streamer action recognition. Four experiments are conducted on a real-world dataset, and the competitive results demonstrate that our method improves both the accuracy and the speed of streamer action recognition in live video.
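
The temporal-attention fusion step can be sketched as a learned weighting of frame-level features, as below. The spatial attention network and the deep dictionary learning stage are omitted, and the feature dimension is an assumption.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Score each frame, softmax the scores over time, and sum the
    weighted frame features into a single video-level feature."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(dim, dim // 4), nn.Tanh(), nn.Linear(dim // 4, 1))

    def forward(self, frames):                         # frames: (B, T, D)
        w = torch.softmax(self.score(frames), dim=1)   # (B, T, 1) frame weights
        return (w * frames).sum(dim=1)                 # (B, D) fused feature

video_feat = TemporalAttention(512)(torch.randn(4, 16, 512))  # 16 sampled frames
print(video_feat.shape)                                # torch.Size([4, 512])
```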

Keyword :

Streamer; Action recognition; Live video; Spatial-temporal attention; Deep dictionary learning

Cite:


GB/T 7714: Li, Chenhao, Zhang, Jing, Yao, Jiacheng. Streamer action recognition in live video with spatial-temporal attention and deep dictionary learning [J]. NEUROCOMPUTING, 2021, 453: 383-392.
MLA: Li, Chenhao, et al. "Streamer action recognition in live video with spatial-temporal attention and deep dictionary learning." NEUROCOMPUTING 453 (2021): 383-392.
APA: Li, Chenhao, Zhang, Jing, & Yao, Jiacheng. Streamer action recognition in live video with spatial-temporal attention and deep dictionary learning. NEUROCOMPUTING, 2021, 453, 383-392.
Porn Streamer Recognition in Live Video Based on Multimodal Knowledge Distillation SCIE CSCD
Journal article | 2021, 30(6), 1096-1102 | CHINESE JOURNAL OF ELECTRONICS

Abstract :

Although deep learning has reached high accuracy in video content analysis, it does not satisfy the practical demands of porn streamer recognition in live video because of the large parameter counts and complex structures of deep network models. To improve the efficiency of porn streamer recognition in live video, a deep network model compression method based on multimodal knowledge distillation is proposed. First, the teacher model, a visual-speech deep network, is trained to obtain porn video prediction scores. Second, a lightweight student model constructed with MobileNetV2 and Xception transfers knowledge from the teacher model via a multimodal knowledge distillation strategy. Finally, porn streamers in live video are recognized by combining the lightweight visual-speech student model with a bullet-screen text recognition network. Experimental results demonstrate that the proposed method can effectively reduce computation cost and improve recognition speed while maintaining suitable accuracy.
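
The transfer step is, at its core, a knowledge-distillation loss. Below is the standard Hinton-style formulation as a sketch: the teacher's softened predictions supervise the student alongside the hard labels. The temperature and mixing weight are illustrative, and the multimodal pairing of visual and speech streams is not reproduced.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """KL(student || teacher) at temperature T, mixed with the usual
    cross-entropy on the hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T                      # rescale so gradients match the CE term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(8, 2, requires_grad=True)   # porn / normal logits
teacher = torch.randn(8, 2)                       # frozen teacher logits
labels = torch.randint(0, 2, (8,))
print(distillation_loss(student, teacher, labels))
```

Raising the temperature spreads the teacher's probability mass over both classes, exposing the "dark knowledge" that the compact student could not learn from hard labels alone.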

Keyword :

Lightweight student model; Knowledge distillation; Live video; Multimodal; Porn streamer recognition

Cite:


GB/T 7714: Wang, Liyuan, Zhang, Jing, Yao, Jiacheng, et al. Porn Streamer Recognition in Live Video Based on Multimodal Knowledge Distillation [J]. CHINESE JOURNAL OF ELECTRONICS, 2021, 30(6): 1096-1102.
MLA: Wang, Liyuan, et al. "Porn Streamer Recognition in Live Video Based on Multimodal Knowledge Distillation." CHINESE JOURNAL OF ELECTRONICS 30.6 (2021): 1096-1102.
APA: Wang, Liyuan, Zhang, Jing, Yao, Jiacheng, & Zhuo, Li. Porn Streamer Recognition in Live Video Based on Multimodal Knowledge Distillation. CHINESE JOURNAL OF ELECTRONICS, 2021, 30(6), 1096-1102.
Multiscale object detection in high-resolution remote sensing images via rotation invariant deep features driven by channel attention SCIE
Journal article | 2021, 42(15), 5754-5773 | INTERNATIONAL JOURNAL OF REMOTE SENSING
WoS CC Cited Count: 1

Abstract :

Complex backgrounds and spatial distributions pose great challenges for object detection in high-resolution remote sensing images. In view of the characteristics of varied scales, arbitrary orientations, shape variations, and dense arrangement, a multiscale object detection method for high-resolution remote sensing images is proposed using rotation-invariant deep features driven by channel attention. First, a channel attention module is added to our feature fusion and scaling-based single shot detector (FS-SSD) to strengthen long-range semantic dependence between objects and improve the discriminative ability of the deep features. Then, an oriented response convolution generates feature maps with orientation channels to produce rotation-invariant deep features. Finally, multiscale objects are predicted by fusing feature maps of various scales with the multiscale feature module in FS-SSD. Five experiments are conducted on the NWPU VHR-10 dataset, achieving better detection performance than state-of-the-art methods.
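
The channel attention module described above behaves like a squeeze-and-excitation block: globally pool each channel, pass the result through a small bottleneck MLP, and reweight the feature map. The sketch below is that generic form; the exact module inside FS-SSD may differ.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention: per-channel global
    statistics gate the feature map to emphasize informative channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                    # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))      # squeeze: global average pool
        return x * w[:, :, None, None]       # excite: channel-wise reweight

fmap = torch.randn(2, 256, 38, 38)           # a typical SSD feature map size
print(ChannelAttention(256)(fmap).shape)     # torch.Size([2, 256, 38, 38])
```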

Cite:


GB/T 7714: Zhao, Xiaolei, Zhang, Jing, Tian, Jimiao, et al. Multiscale object detection in high-resolution remote sensing images via rotation invariant deep features driven by channel attention [J]. INTERNATIONAL JOURNAL OF REMOTE SENSING, 2021, 42(15): 5754-5773.
MLA: Zhao, Xiaolei, et al. "Multiscale object detection in high-resolution remote sensing images via rotation invariant deep features driven by channel attention." INTERNATIONAL JOURNAL OF REMOTE SENSING 42.15 (2021): 5754-5773.
APA: Zhao, Xiaolei, Zhang, Jing, Tian, Jimiao, Zhuo, Li, & Zhang, Jie. Multiscale object detection in high-resolution remote sensing images via rotation invariant deep features driven by channel attention. INTERNATIONAL JOURNAL OF REMOTE SENSING, 2021, 42(15), 5754-5773.