Query:
Scholar Name: Zhuo Li
Abstract :
Automatic Tea Bud Detection (TBD) is one of the core technologies in intelligent tea-picking systems. Since tea buds are small, dense, and highly overlapped, and their colors are close to the background, accurate tea bud detection faces great challenges. In this paper, a tea bud detection method, named YOLO-TBD, is proposed, which adopts YOLOv8 as the basic framework. First, the Path Aggregation Feature Pyramid Network (PAFPN) in YOLOv8 is improved by incorporating features from the 2nd layer into the PAFPN. This modification makes better use of low-level features, such as texture and color information, thereby enhancing the network's feature representation ability. Second, a Triple-Branch Attention Mechanism (TBAM) is designed and integrated into the output of the backbone network and the C2f module. This attention mechanism strengthens the features of tea bud objects and suppresses background noise through feature channel interactions, without increasing the model parameters. Finally, a Self-Correction Group Convolution (SCGC) is proposed to replace the conventional convolution in the C2f module. This convolution establishes long-range spatial and channel dependencies around each spatial position, enabling a larger receptive field and better contextual information capture with fewer parameters, thereby mitigating false and missed detections of tea bud objects. The proposed modules are integrated into the YOLOv8 architecture to construct three detection models of different sizes, namely YOLO-TBD-L, YOLO-TBD-M, and YOLO-TBD-S. Experimental results on our self-built tea bud detection dataset and the publicly available GWHD_2021 dataset demonstrate that, compared with current methods, the proposed YOLO-TBD-L model attains state-of-the-art accuracy, with mAP values reaching 87.04% and 94.5%, respectively. The proposed YOLO-TBD-S model achieves detection accuracy comparable to the YOLOv8-L model with much lower model parameters and computational complexity.
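The abstract does not spell out TBAM's internal design. As a hedged illustration only, the sketch below implements a nearly parameter-free triple-branch attention in the spirit of the well-known triplet-attention pattern, where three branches capture (H, W), (C, W), and (C, H) interactions via axis permutation; all class and function names here are illustrative assumptions, not the paper's code.

```python
# Minimal sketch of a parameter-light triple-branch attention; the paper's
# TBAM is described only at a high level, so this is an assumption-laden
# stand-in modeled on the triplet-attention pattern.
import torch
import torch.nn as nn

class ZPool(nn.Module):
    # Concatenate max- and mean-pooled maps along the channel axis.
    def forward(self, x):
        return torch.cat([x.max(dim=1, keepdim=True)[0],
                          x.mean(dim=1, keepdim=True)], dim=1)

class AttentionBranch(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.pool = ZPool()
        self.conv = nn.Conv2d(2, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(self.pool(x)))

class TripleBranchAttention(nn.Module):
    """Three branches model (H,W), (C,W), and (C,H) interactions by
    permuting axes before a shared spatial-attention pattern."""
    def __init__(self):
        super().__init__()
        self.branch_hw = AttentionBranch()
        self.branch_cw = AttentionBranch()
        self.branch_ch = AttentionBranch()

    def forward(self, x):                                    # x: (B, C, H, W)
        y1 = self.branch_hw(x)
        y2 = self.branch_cw(x.permute(0, 2, 1, 3)).permute(0, 2, 1, 3)  # C<->H
        y3 = self.branch_ch(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)  # C<->W
        return (y1 + y2 + y3) / 3.0
```

Each branch adds only one small two-to-one convolution, so the parameter overhead stays negligible, consistent with the lightweight intent described above.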
Keyword :
Tea Bud Detection; Triple-Branch Attention Mechanism; YOLOv8; Self-Correction Group Convolution
Cite:
GB/T 7714 | Liu, Zhongyuan, Zhuo, Li, Dong, Chunwang, et al. YOLO-TBD: Tea Bud Detection with Triple-Branch Attention Mechanism and Self-Correction Group Convolution [J]. INDUSTRIAL CROPS AND PRODUCTS, 2025, 226.
MLA | Liu, Zhongyuan, et al. "YOLO-TBD: Tea Bud Detection with Triple-Branch Attention Mechanism and Self-Correction Group Convolution." INDUSTRIAL CROPS AND PRODUCTS 226 (2025).
APA | Liu, Zhongyuan, Zhuo, Li, Dong, Chunwang, Li, Jiafeng. YOLO-TBD: Tea Bud Detection with Triple-Branch Attention Mechanism and Self-Correction Group Convolution. INDUSTRIAL CROPS AND PRODUCTS, 2025, 226.
Abstract :
The remarkable performance of Bird's Eye View (BEV) representations in perception tasks has made BEV a focal point of attention in both industry and academia. Environmental perception is a core challenge in autonomous driving, and traditional perception algorithms typically perform tasks such as detection, segmentation, and tracking from a frontal or other specific viewpoint. As the sensor configurations on vehicles grow more complex, it has become crucial to integrate multi-source information from different sensors and present features in a unified view. BEV perception is favored because it is an intuitive and user-friendly way to fuse information about the surrounding environment and provides an ideal object representation for subsequent planning and control modules. However, BEV perception also faces some key challenges: how to convert from a perspective view to a BEV view while reconstructing the lost 3D information, how to obtain accurate ground-truth annotations in the BEV grid, and how to design effective methods to integrate features from different sources. In this paper, we first discuss the inherent advantages of BEV perception and introduce the mainstream datasets and performance evaluation criteria. Furthermore, we present a comprehensive examination of recent research on BEV perception from four distinct perspectives, covering BEV camera, BEV LiDAR, BEV fusion, and V2V multi-vehicle cooperative BEV perception. Finally, we identify prospective research directions and challenges in this field, with the aim of providing inspiration to related researchers.
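As a hedged illustration of the perspective-to-BEV conversion discussed above, the sketch below shows the classic geometric baseline, inverse perspective mapping (IPM), which assumes a flat ground plane and known camera calibration; the learned view transformers surveyed in the paper replace this fixed geometry. The function names and the availability of K, R, t are assumptions.

```python
# IPM under a flat-ground (Z = 0) assumption: a ground-plane point (X, Y, 1)
# maps to image pixel coordinates through the homography K [r1 r2 t].
import numpy as np

def ipm_homography(K, R, t):
    """K: 3x3 intrinsics; R, t: camera extrinsics w.r.t. the ground frame.
    Returns the 3x3 plane-to-image homography."""
    return K @ np.column_stack([R[:, 0], R[:, 1], t])

def pixel_to_ground(H, u, v):
    """Back-project pixel (u, v) to ground-plane coordinates (X, Y)."""
    p = np.linalg.inv(H) @ np.array([u, v, 1.0])
    return p[:2] / p[2]
```

Warping every pixel this way yields a BEV image, but the flat-ground assumption distorts anything above the ground, which is one motivation for the learned depth-based and attention-based view transformations the survey covers.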
Keyword :
Bird's Eye View; Autonomous driving; Vehicle-to-vehicle communication; 3D detection and segmentation
Cite:
GB/T 7714 | Zhao, Junhui, Shi, Jingyue, Zhuo, Li. BEV perception for autonomous driving: State of the art and future perspectives [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258.
MLA | Zhao, Junhui, et al. "BEV perception for autonomous driving: State of the art and future perspectives." EXPERT SYSTEMS WITH APPLICATIONS 258 (2024).
APA | Zhao, Junhui, Shi, Jingyue, Zhuo, Li. BEV perception for autonomous driving: State of the art and future perspectives. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258.
Abstract :
In recent years, Transformer-based change detection (CD) in remote sensing images has achieved significant advances, making it an emerging research hotspot. However, current CD methods suffer from problems such as incomplete detection of change regions and missed detection of small change regions. In this article, a global context-aware Transformer named GCFormer is proposed for CD tasks, addressing the above issues by efficiently enhancing global context information. This is accomplished in two ways within a hybrid convolutional neural network (CNN) + Transformer framework. First, a multireceptive-field Conv-Attention (MRFCA) mechanism is designed, which combines dilated convolutions with multiple rates (DCMRs) and Conv-Attention, fully leveraging the advantages of the convolution operation and the self-attention mechanism. It is embedded at the highest layer of the CNN to extract multireceptive-field global context information. Second, a context-aware relative position encoding (CRPE) mode is proposed to replace the absolute position encoding (APE) of the Transformer. As a result, it captures long-range dependency more efficiently and further enhances the network's ability to extract and represent global context information. Experimental results on three public benchmark datasets, LEVIR-CD, WHU-CD, and DSIFN-CD, show that the proposed GCFormer achieves superior detection performance with lower model complexity than state-of-the-art (SOTA) Transformer-based CD methods. The source code is available at https://github.com/yuwanting828/yuwanting828.github.io.
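The abstract does not give CRPE's exact formulation; as a hedged sketch of the general pattern that relative position encodings build on, the block below adds a learned relative-position bias to windowed self-attention (the scheme popularized by Swin Transformer). The class name and the window-based setup are assumptions, not the paper's design.

```python
# Self-attention over an N = window*window token grid with a learned bias per
# relative offset, replacing absolute position encodings.
import torch
import torch.nn as nn

class RelPosAttention(nn.Module):
    def __init__(self, dim, window, heads=4):
        super().__init__()
        self.heads, self.scale = heads, (dim // heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # One learnable bias per relative offset and head.
        self.bias = nn.Parameter(torch.zeros((2 * window - 1) ** 2, heads))
        coords = torch.stack(torch.meshgrid(
            torch.arange(window), torch.arange(window),
            indexing="ij")).flatten(1)                       # (2, N)
        rel = coords[:, :, None] - coords[:, None, :]        # (2, N, N)
        rel = rel + window - 1                               # shift to >= 0
        self.register_buffer("idx", rel[0] * (2 * window - 1) + rel[1])

    def forward(self, x):                                    # x: (B, N, C)
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.heads, C // self.heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)                 # each (B, h, N, d)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn + self.bias[self.idx].permute(2, 0, 1)   # add rel-pos bias
        attn = attn.softmax(dim=-1)
        return self.proj((attn @ v).transpose(1, 2).reshape(B, N, C))
```

The "context-aware" refinement in CRPE presumably conditions this bias on the features themselves, which plain lookup-table biases like the one above do not.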
Keyword :
Convolutional neural networks; global context-aware Transformer; Task analysis; Computer architecture; Remote sensing; Feature extraction; long-range dependency; Semantics; remote sensing images; Change detection (CD); Transformers
Cite:
GB/T 7714 | Yu, Wanting, Zhuo, Li, Li, Jiafeng. GCFormer: Global Context-Aware Transformer for Remote Sensing Image Change Detection [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62.
MLA | Yu, Wanting, et al. "GCFormer: Global Context-Aware Transformer for Remote Sensing Image Change Detection." IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING 62 (2024).
APA | Yu, Wanting, Zhuo, Li, Li, Jiafeng. GCFormer: Global Context-Aware Transformer for Remote Sensing Image Change Detection. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62.
Abstract :
Due to the small number of electroencephalogram (EEG) samples and the relatively high dimensionality of EEG features, feature selection plays an essential role in EEG-based emotion recognition. However, current EEG-based emotion recognition studies use a problem-transformation approach that converts multi-dimension emotional labels into single-dimension labels and then apply commonly used single-label feature selection methods to search for feature subsets, which ignores the relations between different emotional dimensions. To tackle this problem, we propose an efficient EEG feature selection method for multi-dimension emotion recognition (EFSMDER) via local and global label relevance. First, to capture local label correlations, EFSMDER applies orthogonal regression to map the original EEG feature space into a low-dimension space. Then, it uses the global label correlations in the original multi-dimension emotional label space to construct the label information in the low-dimension space. With the aid of local and global relevance information, EFSMDER can select a representative EEG feature subset. Three EEG emotional databases with multi-dimension emotional labels were used for performance comparison between EFSMDER and fourteen state-of-the-art methods; EFSMDER achieves the best multi-dimension classification accuracies of 86.43%, 84.80%, and 97.86% on the DREAMER, DEAP, and HDED datasets, respectively.
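As a hedged sketch of the generic orthogonal-regression step the abstract builds on, the block below fits a column-orthonormal projection W and ranks features by the row norms of W. It assumes whitened features (X^T X = I), under which the problem reduces to orthogonal Procrustes with a closed-form SVD solution; EFSMDER's full objective, including its label-relevance terms, is not reproduced here, and the function names are illustrative.

```python
# Orthogonal regression: min ||X W - Y||_F subject to W^T W = I.
# With whitened X (X^T X = I) this is orthogonal Procrustes: W = U V^T
# from the SVD of X^T Y.
import numpy as np

def orthogonal_regression(X, Y):
    """X: (n_samples, n_features), assumed whitened; Y: (n_samples, n_labels).
    Returns W (n_features, n_labels) with orthonormal columns."""
    U, _, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    return U @ Vt

def rank_features(X, Y, k):
    W = orthogonal_regression(X, Y)
    scores = np.linalg.norm(W, axis=1)      # one importance score per feature
    return np.argsort(scores)[::-1][:k]     # indices of the top-k features
```

Row norms are a standard embedded-selection criterion: a feature whose projection row is near zero contributes little to reconstructing the (multi-dimension) label space.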
Keyword :
global relevance; feature selection; Termination of employment; Task analysis; Indexes; Emotion recognition; Electroencephalogram; multi-dimension emotional labels; Electroencephalography; Correlation; Feature extraction
Cite:
GB/T 7714 | Xu, Xueyuan, Wei, Fulin, Jia, Tianyuan, et al. Embedded EEG Feature Selection for Multi-Dimension Emotion Recognition via Local and Global Label Relevance [J]. IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2024, 32: 514-526.
MLA | Xu, Xueyuan, et al. "Embedded EEG Feature Selection for Multi-Dimension Emotion Recognition via Local and Global Label Relevance." IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING 32 (2024): 514-526.
APA | Xu, Xueyuan, Wei, Fulin, Jia, Tianyuan, Zhuo, Li, Zhang, Hui, Li, Xiaoguang, et al. Embedded EEG Feature Selection for Multi-Dimension Emotion Recognition via Local and Global Label Relevance. IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2024, 32, 514-526.
Abstract :
Visual tracking is widely used in industrial systems such as visual servo systems and intelligent robots. However, most tracking algorithms are designed without balancing algorithmic efficiency and accuracy for system applications, making them less suitable for deployment. This paper proposes a Siamese global location-aware object tracking algorithm (SiamGLA) to address this issue. First, because efficient lightweight backbone networks have limited representational power, this study designs an internal feature combination (IFC) module that improves feature representation with almost no additional parameters. Second, a global-aware (GA) attention module is proposed to improve the classification of foreground and background, which is especially important for trackers. Finally, a location-aware (LA) attention module is designed to improve the regression performance of the tracking framework. Comprehensive experiments show that SiamGLA is effective and overcomes the drawbacks of poor robustness and weak generalization ability. While reaching state-of-the-art performance, SiamGLA requires fewer computations and parameters, making it better suited to practical applications.
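For context on where such attention modules plug in, the sketch below shows the depthwise cross-correlation at the heart of Siamese trackers of this family: template features act as per-channel kernels over the search-region features. This is a hedged illustration of the common pattern, not SiamGLA's code; its IFC/GA/LA modules are not reproduced.

```python
# Depthwise cross-correlation between exemplar (template) and search features,
# as used by modern Siamese trackers to produce a response map per channel.
import torch
import torch.nn.functional as F

def depthwise_xcorr(search, template):
    """search: (B, C, Hs, Ws) search-region features;
    template: (B, C, Ht, Wt) target exemplar features."""
    B, C, Hs, Ws = search.shape
    x = search.reshape(1, B * C, Hs, Ws)              # fold batch into channels
    kernel = template.reshape(B * C, 1, *template.shape[2:])
    out = F.conv2d(x, kernel, groups=B * C)           # depthwise correlation
    return out.reshape(B, C, out.shape[2], out.shape[3])
```

The classification and regression heads (which GA and LA attention would refine) then consume this response map.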
Keyword :
Global location-aware; Vision-based object tracking; Lightweight network
Cite:
GB/T 7714 | Li, Jiafeng, Li, Bin, Ding, Guodong, et al. Siamese global location-aware network for visual object tracking [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (10): 3607-3620.
MLA | Li, Jiafeng, et al. "Siamese global location-aware network for visual object tracking." INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS 14.10 (2023): 3607-3620.
APA | Li, Jiafeng, Li, Bin, Ding, Guodong, Zhuo, Li. Siamese global location-aware network for visual object tracking. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (10), 3607-3620.
Abstract :
Tongue color is an important diagnostic index in traditional Chinese medicine (TCM). Because annotation depends on the individual experience of TCM experts and the boundaries among tongue color categories are ambiguous, annotated samples often contain noisy labels. Deep neural networks trained on noisily labeled samples often generalize poorly because they easily overfit the noisy labels. A novel framework named confident-learning-assisted knowledge distillation (CLA-KD) is proposed for tongue color classification with noisy labels. In this framework, the teacher network plays two important roles. On the one hand, it performs confident learning to identify, cleanse, and correct noisy labels. On the other hand, it learns knowledge from the clean labels, which is then transferred to the student network to guide its training. Moreover, we design the teacher network in an ensemble manner, named E-CA(2)-ResNet18, to address the unreliability and instability caused by insufficient data samples. E-CA(2)-ResNet18 adopts ResNet18 as the backbone and integrates a channel attention (CA) mechanism with the activate-or-not activation function, which helps yield better performance. The experimental results on three self-established TCM tongue datasets demonstrate that our proposed CLA-KD obtains superior classification accuracy and good robustness with lower network model complexity, reaching 94.49%, 92.21%, and 93.43% on the three tongue image datasets, respectively.
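As a hedged sketch of the distillation objective such a framework builds on, the block below combines cross-entropy on the (teacher-cleansed) labels with a temperature-softened KL term toward the teacher. The confident-learning cleansing step is assumed to have happened upstream; the function name and hyperparameter defaults are illustrative, not CLA-KD's exact loss.

```python
# Standard knowledge-distillation loss: hard CE on cleansed labels plus a
# softened KL divergence to the teacher's predictions.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, clean_labels, T=4.0, alpha=0.7):
    """T: softening temperature; alpha: weight on the distillation term."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                               # rescale gradients for temperature
    hard = F.cross_entropy(student_logits, clean_labels)
    return alpha * soft + (1.0 - alpha) * hard
```

The T*T factor keeps the gradient magnitude of the soft term comparable to the hard term as the temperature changes, a standard detail from Hinton-style distillation.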
Keyword :
Image color analysis; Neural networks; Traditional Chinese medicine; Tongue; Knowledge engineering; Robustness; Tongue color classification; Knowledge distillation; Training; Channel attention mechanism; Deep learning; Learning from noisy labels; Confident learning; ResNet18
Cite:
GB/T 7714 | Li, Yanping, Zhuo, Li, Sun, Liangliang, et al. Tongue Color Classification in TCM with Noisy Labels via Confident-Learning-Assisted Knowledge Distillation [J]. CHINESE JOURNAL OF ELECTRONICS, 2023, 32 (1): 140-150.
MLA | Li, Yanping, et al. "Tongue Color Classification in TCM with Noisy Labels via Confident-Learning-Assisted Knowledge Distillation." CHINESE JOURNAL OF ELECTRONICS 32.1 (2023): 140-150.
APA | Li, Yanping, Zhuo, Li, Sun, Liangliang, Zhang, Hui, Li, Xiaoguang, Yang, Yang, et al. Tongue Color Classification in TCM with Noisy Labels via Confident-Learning-Assisted Knowledge Distillation. CHINESE JOURNAL OF ELECTRONICS, 2023, 32 (1), 140-150.
Abstract :
Prohibited Object Detection (POD) in X-ray images plays an important role in protecting public safety. Automatic and accurate POD is required to relieve the working pressure of security inspectors. However, existing methods cannot achieve satisfactory detection accuracy, and in particular, the problem of object occlusion has not been solved well. Therefore, in this paper, according to the specific characteristics of X-ray images as well as the low-level and high-level features of Convolutional Neural Networks (CNNs), different feature enhancement strategies are elaborately designed for occluded POD. First, a learnable Gabor convolutional layer is designed and embedded into the low layer of the network to enhance its capability to capture the edge and contour information of objects. A Spatial Attention (SA) mechanism then weights the output features of the Gabor convolutional layer to enhance the spatial structure information of objects while suppressing background noise. For the high-level features, a Global Context Feature Extraction (GCFE) module is proposed to extract multi-scale global contextual information, and a Dual Scale Feature Aggregation (DSFA) module is proposed to fuse these global features with those of another layer. To verify their effectiveness, the proposed modules are embedded into typical two-stage and one-stage object detection frameworks, i.e., Faster R-CNN and YOLO v5L, yielding the POD-F and POD-Y methods, respectively. The proposed methods are extensively evaluated on three publicly available benchmark datasets, namely SIXray, OPIXray, and WIXray. The experimental results show that, compared with existing methods, the proposed POD-Y method achieves state-of-the-art detection accuracy, and POD-F achieves competitive detection performance among two-stage detection methods.
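The sketch below illustrates one way a learnable Gabor convolutional layer of the kind described can be built: kernels are synthesized from learnable Gabor parameters (orientation, envelope width, wavelength) on every forward pass, so gradients flow into the filter shape itself. This is a hedged, simplified isotropic variant; the paper's exact parameterization may differ, and the class name is an assumption.

```python
# Gabor kernels generated from learnable parameters and applied with conv2d.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableGaborConv2d(nn.Module):
    def __init__(self, out_channels, kernel_size=7):
        super().__init__()
        self.theta = nn.Parameter(torch.rand(out_channels) * math.pi)  # orientation
        self.sigma = nn.Parameter(torch.full((out_channels,), 2.0))    # envelope width
        self.lambd = nn.Parameter(torch.full((out_channels,), 4.0))    # wavelength
        r = kernel_size // 2
        ys, xs = torch.meshgrid(torch.arange(-r, r + 1).float(),
                                torch.arange(-r, r + 1).float(), indexing="ij")
        self.register_buffer("xs", xs)
        self.register_buffer("ys", ys)

    def forward(self, x):              # x: (B, 1, H, W), e.g. a grayscale X-ray
        t = self.theta[:, None, None]
        xr = self.xs * torch.cos(t) + self.ys * torch.sin(t)   # rotated grid
        yr = -self.xs * torch.sin(t) + self.ys * torch.cos(t)
        s, l = self.sigma[:, None, None], self.lambd[:, None, None]
        g = torch.exp(-(xr ** 2 + yr ** 2) / (2 * s ** 2)) \
            * torch.cos(2 * math.pi * xr / l)                  # Gabor kernels
        return F.conv2d(x, g.unsqueeze(1), padding=self.xs.shape[-1] // 2)
```

Because Gabor filters are band-pass edge and texture detectors, constraining low-layer kernels to this family biases the network toward the contour cues that matter for overlapping objects in X-ray imagery.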
Keyword :
Dual Scale Feature Aggregation; Gabor Convolution; X-ray Image; Global Context Feature Extraction; Occluded Prohibited Object Detection
Cite:
GB/T 7714 | Ma, Chunjie, Zhuo, Li, Li, Jiafeng, et al. Occluded prohibited object detection in X-ray images with global Context-aware Multi-Scale feature Aggregation [J]. NEUROCOMPUTING, 2023, 519: 1-16.
MLA | Ma, Chunjie, et al. "Occluded prohibited object detection in X-ray images with global Context-aware Multi-Scale feature Aggregation." NEUROCOMPUTING 519 (2023): 1-16.
APA | Ma, Chunjie, Zhuo, Li, Li, Jiafeng, Zhang, Yutong, Zhang, Jing. Occluded prohibited object detection in X-ray images with global Context-aware Multi-Scale feature Aggregation. NEUROCOMPUTING, 2023, 519, 1-16.
Abstract :
The visual navigation system is an important module in intelligent unmanned aerial vehicle (UAV) systems, as it helps guide them autonomously by tracking visual targets. In recent years, tracking algorithms based on Siamese networks have demonstrated outstanding performance, but applying them to UAV systems is challenging because of the limited resources available on such platforms. This paper proposes a simple and efficient tracking network called Siamese Pruned ResNet Attention (SiamPRA) that can be deployed on the embedded platforms carried by UAVs. SiamPRA is based on the SiamFC framework and adopts ResNet-24 as its backbone. It also utilizes a spatial-channel attention mechanism, achieving higher accuracy while reducing computation. Further, sparse training and pruning are used to reduce the model size while maintaining high precision. Experimental results on the challenging benchmarks VOT2018, UAV123, and OTB100 show that SiamPRA attains higher accuracy and lower complexity than other tracking networks.
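As a hedged sketch of the sparse-train-then-prune recipe the abstract mentions, the block below follows the widely used network-slimming pattern: L1-penalize BatchNorm scale factors during training, then keep only channels with large |gamma|. SiamPRA's exact sparsity target and pruning criterion may differ; the function names and ratios are assumptions.

```python
# Network-slimming-style channel pruning: sparsify BN gammas, then rank them.
import torch
import torch.nn as nn

def bn_sparsity_penalty(model, lam=1e-4):
    """L1 penalty on BatchNorm scale factors; add this to the task loss
    during the sparse-training phase."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules()
                     if isinstance(m, nn.BatchNorm2d))

def channels_to_keep(bn: nn.BatchNorm2d, keep_ratio=0.5):
    """Indices of channels that survive pruning, ranked by |gamma|."""
    k = max(1, int(bn.num_features * keep_ratio))
    return torch.argsort(bn.weight.abs(), descending=True)[:k]
```

After pruning, the surviving channel indices are used to slice the adjacent convolution weights, and a short fine-tuning pass typically recovers most of the lost accuracy.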
Keyword :
embedded vision system; attention mechanism; object tracking; network compression
Cite:
GB/T 7714 | Li, Jiafeng, Zhang, Kang, Gao, Zheng, et al. SiamPRA: An Effective Network for UAV Visual Tracking [J]. ELECTRONICS, 2023, 12 (11).
MLA | Li, Jiafeng, et al. "SiamPRA: An Effective Network for UAV Visual Tracking." ELECTRONICS 12.11 (2023).
APA | Li, Jiafeng, Zhang, Kang, Gao, Zheng, Yang, Liheng, Zhuo, Li. SiamPRA: An Effective Network for UAV Visual Tracking. ELECTRONICS, 2023, 12 (11).
Abstract :
In clinical practice, automatic polyp segmentation from colonoscopy images is an effective aid to the early detection and prevention of colorectal cancer. This paper proposes a new deep model for accurate polyp segmentation based on an encoder-decoder framework. ResNet50 is adopted as the encoder, and three functional modules are introduced to improve performance. First, a hybrid channel-spatial attention module reweights the encoder features spatially and channel-wise, enhancing features critical to the segmentation task while suppressing irrelevant ones. Second, a global context pyramid feature extraction module and a series of global context flows are proposed to extract and deliver global context information: the former captures multi-scale, multi-receptive-field global context, while the latter explicitly transmits it to each decoder level. Finally, a feature fusion module is designed to effectively incorporate the high-level features, low-level features, and global context information, accounting for the gaps between different features. These modules help the model fully exploit global context information to deduce complete polyp regions. Extensive experiments on five public colorectal polyp datasets demonstrate that the proposed network has powerful learning and generalization capability, significantly improving segmentation accuracy and outperforming state-of-the-art methods.
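The sketch below gives a hedged illustration of a multi-receptive-field context pyramid of the kind the abstract describes: parallel dilated branches plus a global-pooling branch, fused by a 1x1 convolution (the ASPP pattern). The paper's exact module and the class name here are assumptions.

```python
# ASPP-style context pyramid: parallel dilated convolutions at several rates
# plus a global-average branch, concatenated and fused.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextPyramid(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates)
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, out_ch, 1))
        self.fuse = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1)

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        g = F.interpolate(self.global_branch(x), size=x.shape[2:],
                          mode="bilinear", align_corners=False)
        return self.fuse(torch.cat(feats + [g], dim=1))
```

Broadcasting the fused output to every decoder level, as the "global context flows" do, keeps coarse shape information available even after repeated upsampling.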
Keyword :
Hybrid channel-spatial attention; Polyp segmentation; Feature fusion; Global context-aware pyramid feature extraction
Cite:
GB/T 7714 | Huang, Xiaodong, Zhuo, Li, Zhang, Hui, et al. Polyp segmentation network with hybrid channel-spatial attention and pyramid global context guided feature fusion [J]. COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2022, 98.
MLA | Huang, Xiaodong, et al. "Polyp segmentation network with hybrid channel-spatial attention and pyramid global context guided feature fusion." COMPUTERIZED MEDICAL IMAGING AND GRAPHICS 98 (2022).
APA | Huang, Xiaodong, Zhuo, Li, Zhang, Hui, Yang, Yang, Li, Xiaoguang, Zhang, Jing, et al. Polyp segmentation network with hybrid channel-spatial attention and pyramid global context guided feature fusion. COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2022, 98.
Abstract :
Background and Objective: Automatic skin lesion segmentation plays an important role in computer-aided diagnosis of skin diseases. However, current segmentation networks cannot accurately detect the boundaries of skin lesion areas. Methods: In this paper, a boundary learning assisted network for skin lesion segmentation, namely BLA-Net, is proposed, which adopts ResNet34 as the backbone under an encoder-decoder framework. The overall architecture comprises two key components: the Primary Segmentation Network (PSNet) and the Auxiliary Boundary Learning Network (ABLNet). PSNet locates the skin lesion areas. Dynamic Deformable Convolution is introduced into the lower layers of the encoder so that the network can effectively handle complex lesion objects, and a Global Context Information Extraction Module is embedded into the higher layers of the encoder to capture multi-receptive-field, multi-scale global context features. ABLNet finely detects the boundaries of the lesion area from the low-level encoder features, using a proposed object regional attention mechanism to enhance the features of the lesion area and suppress those of irrelevant regions. ABLNet thereby assists PSNet in achieving accurate skin lesion segmentation. Results: We verified the segmentation performance of the proposed method on two public dermoscopy datasets, ISBI 2016 and ISIC 2018. The experimental results show that our method achieves Jaccard Indices of 86.6% and 84.8% and Dice Coefficients of 92.4% and 91.2% on the ISBI 2016 and ISIC 2018 datasets, respectively. Conclusions: Compared with existing methods, the proposed method achieves state-of-the-art segmentation accuracy with fewer model parameters and can assist dermatologists in clinical diagnosis and treatment.
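As a hedged sketch of auxiliary boundary supervision of the kind ABLNet provides, the block below derives boundary targets from the lesion mask with a morphological-gradient (max-pool) trick and adds a weighted BCE term to the main mask loss. The derivation of boundary ground truth and the loss weighting here are assumptions, not the paper's exact training recipe.

```python
# Auxiliary boundary loss: boundary targets come from dilation minus erosion
# of the ground-truth mask, implemented with max-pooling.
import torch
import torch.nn.functional as F

def mask_to_boundary(mask, width=3):
    """mask: (B, 1, H, W) float binary mask. Returns a boundary band of the
    given width via morphological gradient (dilation - erosion)."""
    pad = width // 2
    dilated = F.max_pool2d(mask, width, stride=1, padding=pad)
    eroded = -F.max_pool2d(-mask, width, stride=1, padding=pad)
    return (dilated - eroded).clamp(0, 1)

def seg_with_boundary_loss(mask_logits, boundary_logits, gt_mask, w=0.5):
    """Main mask BCE plus an auxiliary BCE on the derived boundary map."""
    gt_boundary = mask_to_boundary(gt_mask)
    main = F.binary_cross_entropy_with_logits(mask_logits, gt_mask)
    aux = F.binary_cross_entropy_with_logits(boundary_logits, gt_boundary)
    return main + w * aux
```

Supervising a dedicated boundary head on low-level features pushes the encoder to keep the fine edge detail that plain region losses tend to wash out.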
Keyword :
Auxiliary boundary learning network; Skin lesion segmentation; Dynamic deformable convolution; Global context information extraction module
Cite:
GB/T 7714 | Feng, Ruiqi, Zhuo, Li, Li, Xiaoguang, et al. BLA-Net: Boundary learning assisted network for skin lesion segmentation [J]. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2022, 226.
MLA | Feng, Ruiqi, et al. "BLA-Net: Boundary learning assisted network for skin lesion segmentation." COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 226 (2022).
APA | Feng, Ruiqi, Zhuo, Li, Li, Xiaoguang, Yin, Hongxia, Wang, Zhenchang. BLA-Net: Boundary learning assisted network for skin lesion segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2022, 226.