Query:
Scholar name: 孔德慧
Abstract :
Thanks to the development of 2D keypoint detectors, monocular 3D human pose estimation (HPE) via 2D-to-3D lifting approaches has achieved remarkable improvements. However, monocular 3D HPE remains challenging due to inherent depth ambiguity and occlusions. Recently, diffusion models have achieved great success in image generation. Inspired by this, we transform the 3D human pose estimation problem into a reverse diffusion process and propose a dual-branch diffusion model that handles the indeterminacy and uncertainty of 3D poses and fully explores the global and local correlations between joints. Furthermore, we propose a conditional dual-branch diffusion model to further enhance 3D human pose estimation, in which joint-level semantic information is regarded as the condition of the diffusion model and integrated into the joint-level representations of the 2D pose to enrich the joint representations. The proposed method is verified on two widely used datasets, and the experimental results demonstrate its superiority.
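A minimal sketch (in PyTorch) of the kind of conditional reverse-diffusion sampling described above, assuming a plain DDPM formulation: the 3D pose is recovered from Gaussian noise while the 2D pose acts as the condition. The simple MLP denoiser, the linear noise schedule and all names are illustrative assumptions, not the paper's dual-branch architecture.

```python
import torch
import torch.nn as nn

class PoseDenoiser(nn.Module):
    """Predicts the noise added to a 3D pose, conditioned on the 2D pose and timestep."""
    def __init__(self, num_joints=17, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_joints * 3 + num_joints * 2 + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_joints * 3),
        )

    def forward(self, noisy_pose3d, pose2d, t):
        # t is normalized to [0, 1] and appended as a scalar condition
        x = torch.cat([noisy_pose3d.flatten(1), pose2d.flatten(1), t[:, None]], dim=1)
        return self.net(x).view_as(noisy_pose3d)

@torch.no_grad()
def sample_pose(model, pose2d, steps=50, num_joints=17):
    """DDPM-style ancestral sampling: from Gaussian noise to a 3D pose, conditioned on 2D."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    x = torch.randn(pose2d.shape[0], num_joints, 3)            # start from pure noise
    for i in reversed(range(steps)):
        t = torch.full((pose2d.shape[0],), i / steps)
        eps = model(x, pose2d, t)                               # predicted noise
        x = (x - betas[i] / torch.sqrt(1 - alpha_bars[i]) * eps) / torch.sqrt(alphas[i])
        if i > 0:
            x = x + torch.sqrt(betas[i]) * torch.randn_like(x)  # keep stochasticity except at the last step
    return x

pose3d = sample_pose(PoseDenoiser(), torch.randn(2, 17, 2))     # (2, 17, 3)
```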
Keyword :
Human pose estimation; Diffusion model; Joint semantics; Dual-branch
Cite:
GB/T 7714 | Li, Jinghua, Bai, Zhuowei, Kong, Dehui, et al. 3D human pose estimation based on conditional dual-branch diffusion [J]. MULTIMEDIA SYSTEMS, 2025, 31(1).
MLA | Li, Jinghua, et al. "3D human pose estimation based on conditional dual-branch diffusion." MULTIMEDIA SYSTEMS 31.1 (2025).
APA | Li, Jinghua, Bai, Zhuowei, Kong, Dehui, Chen, Dongpan, Li, Qianxing, Yin, Baocai. 3D human pose estimation based on conditional dual-branch diffusion. MULTIMEDIA SYSTEMS, 2025, 31(1).
Abstract :
Human pose estimation from monocular video has long been a focus of research in the human-computer interaction community, and it suffers mainly from depth ambiguity and self-occlusion. While recently proposed learning-based approaches have demonstrated promising performance, they do not fully explore the complementarity of features. In this paper, the authors propose a novel multi-feature and multi-level fusion network (MMF-Net), which extracts and combines joint features, bone features and trajectory features at multiple levels to estimate 3D human poses. In MMF-Net, the bone length estimation module and the trajectory multi-level fusion module first extract the geometric size information of the human body and the multi-level trajectory information of human motion, respectively. Then, the fusion attention-based combination (FABC) module extracts multi-level topological structure information of the human body and effectively fuses the topological structure, geometric size and trajectory information. Extensive experiments show that MMF-Net achieves competitive results on the Human3.6M, HumanEva-I and MPI-INF-3DHP datasets.
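A minimal sketch of attention-weighted fusion of joint, bone and trajectory features, loosely in the spirit of the fusion described above; the single-score attention and all layer names are assumptions, and the actual FABC module is more elaborate.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuses three feature branches with softmax-normalized per-branch weights."""
    def __init__(self, dim=128):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # one scalar score per feature branch

    def forward(self, joint_feat, bone_feat, traj_feat):
        feats = torch.stack([joint_feat, bone_feat, traj_feat], dim=1)  # (batch, 3, dim)
        weights = torch.softmax(self.score(feats), dim=1)               # (batch, 3, 1)
        return (weights * feats).sum(dim=1)                             # fused (batch, dim)

fusion = AttentionFusion(dim=128)
fused = fusion(torch.randn(4, 128), torch.randn(4, 128), torch.randn(4, 128))
print(fused.shape)  # torch.Size([4, 128])
```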
Keyword :
image processing; pose estimation; computer vision; image reconstruction
Cite:
GB/T 7714 | Li, Qianxing, Kong, Dehui, Li, Jinghua, et al. MMF-Net: A novel multi-feature and multi-level fusion network for 3D human pose estimation [J]. IET COMPUTER VISION, 2025, 19(1).
MLA | Li, Qianxing, et al. "MMF-Net: A novel multi-feature and multi-level fusion network for 3D human pose estimation." IET COMPUTER VISION 19.1 (2025).
APA | Li, Qianxing, Kong, Dehui, Li, Jinghua, Yin, Baocai. MMF-Net: A novel multi-feature and multi-level fusion network for 3D human pose estimation. IET COMPUTER VISION, 2025, 19(1).
Abstract :
This invention discloses a hierarchical human-object interaction detection method based on human interaction intention information, consisting of two stages: 1) object detection, which detects all object instances in the input image; and 2) human-object interaction detection, which performs interaction detection on all human-object pair instances in the image. The method abstracts human gaze information from designed visual features to model the contextual regions attended to by interaction participants; it constructs a human pose graph oriented toward interaction intention to refine the discriminative cues that body motion provides for interaction detection; and it uses the distance feature between humans and objects to guide the optimization of visual distance features, improving the performance of the human-object interaction detection algorithm.
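A minimal sketch of one cue mentioned above, namely using the human-object distance to re-weight pairwise interaction features; the Gaussian weighting and all function names are assumptions, not the patented method.

```python
import torch

def distance_guided_weight(human_boxes, object_boxes, sigma=0.5):
    """Gaussian weight from the normalized center distance of each human-object pair."""
    h_centers = (human_boxes[:, :2] + human_boxes[:, 2:]) / 2   # (N, 2) box centers
    o_centers = (object_boxes[:, :2] + object_boxes[:, 2:]) / 2
    dist = torch.norm(h_centers - o_centers, dim=1)             # (N,)
    return torch.exp(-dist ** 2 / (2 * sigma ** 2))             # closer pairs weigh more

pair_feat = torch.randn(8, 256)                                 # pooled visual features of 8 pairs
human_boxes, object_boxes = torch.rand(8, 4), torch.rand(8, 4)  # (x1, y1, x2, y2), normalized
weighted = distance_guided_weight(human_boxes, object_boxes)[:, None] * pair_feat
```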
Cite:
GB/T 7714 | 孔德慧, 王帅, 李敬华, et al. 一种基于人体交互意图信息的层级人物交互检测方法: CN202310266335.7 [P]. 2023-03-20.
MLA | 孔德慧, et al. "一种基于人体交互意图信息的层级人物交互检测方法": CN202310266335.7. 2023-03-20.
APA | 孔德慧, 王帅, 李敬华, 尹宝才. 一种基于人体交互意图信息的层级人物交互检测方法: CN202310266335.7. 2023-03-20.
Abstract :
This invention discloses a human mesh reconstruction method based on hypergraph attention. It proposes a hypergraph-based hierarchical representation of the human mesh to form a part-semantic human mesh representation model, which provides the structural basis for mesh reconstruction. A Body2Parts feature-transfer module aggregates features across parts and fuses them with image information, so that information is exchanged and fused at the part level, supporting high-quality part-level human reconstruction. A Part2Vertices feature-transfer module transfers part features to vertex features and uses hypergraph attention to refine vertex-level features; with vertices as the representation unit, features are propagated within each part, supporting fine-grained vertex-level reconstruction. Through this hierarchical reconstruction built on the hierarchical human mesh representation model, the invention achieves a favorable trade-off between 3D human mesh reconstruction accuracy and computational cost.
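A minimal sketch of part-to-vertex feature transfer through a hypergraph incidence matrix with a simple masked attention, illustrating the general Part2Vertices idea; the toy vertex-to-part assignment, the tensor shapes and the attention form are assumptions, not the patented module.

```python
import torch

num_vertices, num_parts, dim = 6890, 24, 64
H = torch.zeros(num_vertices, num_parts)        # hypergraph incidence: vertex -> part
H[torch.arange(num_vertices), torch.randint(0, num_parts, (num_vertices,))] = 1.0  # toy assignment

vertex_feat = torch.randn(num_vertices, dim)
# mean-pool vertex features into part (hyperedge) features
part_feat = (H.t() @ vertex_feat) / H.sum(dim=0, keepdim=True).t().clamp(min=1)

# attention from each vertex to its own part(s), then transfer part features back to vertices
attn = torch.softmax((vertex_feat @ part_feat.t()) / dim ** 0.5 + (H - 1) * 1e9, dim=1)
refined = vertex_feat + attn @ part_feat        # residual vertex-level refinement
```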
Cite:
GB/T 7714 | 孔德慧, 郝晨辉, 李敬华, et al. 一种基于超图注意力的人体网格重建方法: CN202310600839.8 [P]. 2023-05-25.
MLA | 孔德慧, et al. "一种基于超图注意力的人体网格重建方法": CN202310600839.8. 2023-05-25.
APA | 孔德慧, 郝晨辉, 李敬华, 尹宝才. 一种基于超图注意力的人体网格重建方法: CN202310600839.8. 2023-05-25.
Abstract :
This invention discloses a 3D model reconstruction method for hand-drawn sketches based on spatial skeleton information. It proposes a spatial-skeleton-guided encoder, a domain-adaptive encoder and a self-attention decoder. The spatial skeleton encoder extracts skeleton features from the sketch, and the skeleton information serves as prior knowledge that supplies the auxiliary cues needed to reconstruct a complete 3D model; the domain-adaptive encoder transfers knowledge learned from synthetic sketches to hand-drawn sketches; and the attention-based decoder resolves ambiguity. The method improves the accuracy of 3D reconstruction from a single hand-drawn sketch. The self-attention mechanism lets the model distinguish sketch inputs with highly similar contours. Compared with domain adaptation approaches that use a discriminator and a gradient reversal layer, whose training value function amounts to minimizing the Jensen-Shannon divergence between two distributions (a divergence that may not be continuous with respect to the generator parameters), the domain adaptation constraint function of this invention can be regarded as differentiable everywhere, making training more stable.
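A minimal sketch of an everywhere-differentiable domain-alignment loss between synthetic-sketch and hand-drawn-sketch features, here a Gaussian-kernel maximum mean discrepancy. The abstract does not name MMD; the exact constraint is an assumption, used only to illustrate the contrast with discriminator-based (Jensen-Shannon) adaptation.

```python
import torch

def gaussian_mmd(source, target, bandwidth=1.0):
    """Maximum mean discrepancy with an RBF kernel; smooth in both inputs."""
    def kernel(a, b):
        dist2 = torch.cdist(a, b) ** 2
        return torch.exp(-dist2 / (2 * bandwidth ** 2))
    return kernel(source, source).mean() + kernel(target, target).mean() \
           - 2 * kernel(source, target).mean()

synthetic_feat = torch.randn(32, 128, requires_grad=True)   # encoder output on synthetic sketches
handdrawn_feat = torch.randn(32, 128, requires_grad=True)   # encoder output on hand-drawn sketches
loss = gaussian_mmd(synthetic_feat, handdrawn_feat)
loss.backward()    # gradients flow smoothly, unlike a min-max discriminator objective
```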
Cite:
GB/T 7714 | 孔德慧, 马杨, 李敬华, et al. 一种基于空间骨架信息的手绘草图三维模型重建方法: CN202310163381.4 [P]. 2023-02-24.
MLA | 孔德慧, et al. "一种基于空间骨架信息的手绘草图三维模型重建方法": CN202310163381.4. 2023-02-24.
APA | 孔德慧, 马杨, 李敬华, 尹宝才. 一种基于空间骨架信息的手绘草图三维模型重建方法: CN202310163381.4. 2023-02-24.
Abstract :
This invention discloses a 3D point cloud reconstruction method with multi-view weighted aggregation. A non-local feature extractor processes the input images to obtain feature maps; homography transformations warp the feature maps to generate multiple cost volumes; a lightweight weighted-aggregation module encodes the 3D relationships among the cost volumes into a single 3D cost volume; an edge-semantics-guided pseudo-3D convolutional regression network performs depth regression on the 3D cost volume to obtain multi-view depth maps; and finally the point cloud is generated by back-projection using the camera matrix parameters. The dilated-convolution-based non-local feature extractor improves the completeness of the point cloud; the lightweight weighted-aggregation module improves point cloud accuracy while reducing the computational cost of the network; and the edge-semantics-guided pseudo-3D convolutional network improves multi-view reconstruction accuracy and lowers hardware requirements.
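A minimal sketch of aggregating several per-view cost volumes into one volume; the patent describes a learned lightweight weighted-aggregation module, for which the simple variance-based weight below is an assumed stand-in.

```python
import torch

views, depth_planes, height, width = 4, 48, 64, 80
cost_volumes = torch.randn(views, depth_planes, height, width)   # one cost volume per source view

# lower per-view variance over the depth dimension -> sharper match -> larger weight
weights = torch.softmax(-cost_volumes.var(dim=1), dim=0)          # (views, H, W)
aggregated = (weights.unsqueeze(1) * cost_volumes).sum(dim=0)     # single volume: (depth_planes, H, W)
```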
Cite:
GB/T 7714 | 孔德慧, 张少杰, 李敬华, et al. 一种多视角加权聚合的三维点云重建方法: CN202310195559.3 [P]. 2023-02-24.
MLA | 孔德慧, et al. "一种多视角加权聚合的三维点云重建方法": CN202310195559.3. 2023-02-24.
APA | 孔德慧, 张少杰, 李敬华, 尹宝才. 一种多视角加权聚合的三维点云重建方法: CN202310195559.3. 2023-02-24.
Abstract :
Weakly supervised temporal action localization aims to classify and temporally localize actions in untrimmed videos when only video-level labels are available. Most existing neural-network-based approaches train a classifier to predict segment-level class scores and then fuse them into video-level class scores. These methods focus only on the visual features of the video and ignore its semantic structure information. To further improve localization quality, this paper proposes a weakly supervised temporal action localization method assisted by video semantic structure information. The method first uses a classification module as the base model; it then designs a smoothed attention module based on auxiliary cues such as the temporal sparsity and semantic continuity of videos to refine the classification results; in addition, a segment-level semantic label prediction module is introduced to alleviate the shortage of label information under weak supervision; finally, the three modules are trained jointly to improve temporal action localization accuracy. Experiments on the THUMOS14 and ActivityNet datasets show that the proposed method clearly outperforms existing methods.
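A minimal sketch of fusing segment-level class scores into a video-level score under a temporally smoothed attention, which is the general mechanism described above; the average-pooling smoother, the top-k pooling and all tensor names are assumptions.

```python
import torch
import torch.nn.functional as F

segments, classes = 120, 20
seg_scores = torch.randn(segments, classes)        # per-segment class activation sequence
attention = torch.sigmoid(torch.randn(segments, 1))

# temporal smoothing of the attention encourages sparse, temporally continuous proposals
smoothed = F.avg_pool1d(attention.t().unsqueeze(0), kernel_size=5, stride=1, padding=2).squeeze(0).t()

# attention-weighted top-k pooling yields the video-level class score
video_score = (smoothed * seg_scores).topk(k=8, dim=0).values.mean(dim=0)   # (classes,)
video_prob = torch.softmax(video_score, dim=0)     # trained against the video-level label
```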
Keyword :
semantic structure information; pseudo-label; attention value; action localization; weak supervision
Cite:
GB/T 7714 | 孔德慧, 许梦文, 李敬华, et al. 一种视频语义结构信息辅助的弱监督时序动作定位方法 [C] // 2021中国自动化大会论文集. 2022.
MLA | 孔德慧, et al. "一种视频语义结构信息辅助的弱监督时序动作定位方法." 2021中国自动化大会论文集. (2022).
APA | 孔德慧, 许梦文, 李敬华, 王少帆, 尹宝才. 一种视频语义结构信息辅助的弱监督时序动作定位方法. 2021中国自动化大会论文集. (2022).
Abstract :
3D pose estimation remains a challenging task since human poses exhibit high ambiguity and multi-granularity. Traditional graph convolutional networks (GCNs) accomplish the task by modeling all skeletons as an entire graph and are unable to fuse combinable part-based features. Observing that human movements are driven by parts of the human body (i.e., related skeletons and body components, known as poselets) and that these poselets contribute to each movement in a hierarchical fashion, we propose a hierarchical poselet-guided graph convolutional network (HPGCN) for 3D pose estimation from 2D poses. HPGCN takes five primitives of the human body as basic poselets and composes high-level poselets according to the kinematic configuration of the human body. Moreover, HPGCN forms a fundamental unit from a diagonally dominant graph convolution layer and a non-local layer, which cooperatively capture multi-granular features of human poses from local to global perspectives. Finally, HPGCN designs a geometric constraint loss function with constraints on the lengths and directions of bone vectors, which helps produce reasonable pose regression. We verify the effectiveness of HPGCN on three public 3D human pose benchmarks. Experimental results show that HPGCN outperforms several state-of-the-art methods.
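A minimal sketch of a bone-length and bone-direction geometric constraint loss of the kind described above; the bone list is a toy subset of a 17-joint skeleton and the loss weights are assumptions.

```python
import torch

BONES = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5), (5, 6)]   # (parent, child) joint indices (toy subset)

def geometric_loss(pred, gt, w_len=1.0, w_dir=1.0):
    """pred, gt: (batch, joints, 3) 3D poses; penalizes bone length and direction errors."""
    p = torch.stack([pred[:, c] - pred[:, a] for a, c in BONES], dim=1)   # predicted bone vectors
    g = torch.stack([gt[:, c] - gt[:, a] for a, c in BONES], dim=1)       # ground-truth bone vectors
    len_loss = (p.norm(dim=-1) - g.norm(dim=-1)).abs().mean()
    dir_loss = (1 - torch.cosine_similarity(p, g, dim=-1)).mean()
    return w_len * len_loss + w_dir * dir_loss

loss = geometric_loss(torch.randn(2, 17, 3), torch.randn(2, 17, 3))
```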
Keyword :
Graph convolutional network; Hierarchical poselet; Geometric constraint; Diagonally dominant graph convolution; Pose estimation
Cite:
GB/T 7714 | Wu, Yongpeng, Kong, Dehui, Wang, Shaofan, et al. HPGCN: Hierarchical poselet-guided graph convolutional network for 3D pose estimation [J]. NEUROCOMPUTING, 2022, 487: 243-256.
MLA | Wu, Yongpeng, et al. "HPGCN: Hierarchical poselet-guided graph convolutional network for 3D pose estimation." NEUROCOMPUTING 487 (2022): 243-256.
APA | Wu, Yongpeng, Kong, Dehui, Wang, Shaofan, Li, Jinghua, Yin, Baocai. HPGCN: Hierarchical poselet-guided graph convolutional network for 3D pose estimation. NEUROCOMPUTING, 2022, 487, 243-256.
Abstract :
Weakly supervised temporal action localization aims to classify and temporally localize actions in untrimmed videos when only video-level labels are available. Most existing neural-network-based approaches train a classifier to predict segment-level class scores and then fuse them into video-level class scores. These methods focus only on the visual features of the video and ignore its semantic structure information. To further improve localization quality, this paper proposes a weakly supervised temporal action localization method assisted by video semantic structure information. The method first uses a classification module as the base model; it then designs a smoothed attention module based on auxiliary cues such as the temporal sparsity and semantic continuity of videos to refine the classification results; in addition, a segment-level semantic label prediction module is introduced to alleviate the shortage of label information under weak supervision; finally, the three modules are trained jointly to improve temporal action localization accuracy. Experiments on the THUMOS14 and ActivityNet datasets show that the proposed method clearly outperforms existing methods.
Keyword :
语义结构信息 语义结构信息 弱监督 弱监督 动作定位 动作定位 注意力值 注意力值 伪标签 伪标签
Cite:
GB/T 7714 | 孔德慧, 许梦文, 李敬华, et al. 一种视频语义结构信息辅助的弱监督时序动作定位方法 [C] // 2021中国自动化大会——中国自动化学会60周年会庆暨纪念钱学森诞辰110周年. 2021.
MLA | 孔德慧, et al. "一种视频语义结构信息辅助的弱监督时序动作定位方法." 2021中国自动化大会——中国自动化学会60周年会庆暨纪念钱学森诞辰110周年. (2021).
APA | 孔德慧, 许梦文, 李敬华, 王少帆, 尹宝才. 一种视频语义结构信息辅助的弱监督时序动作定位方法. 2021中国自动化大会——中国自动化学会60周年会庆暨纪念钱学森诞辰110周年. (2021).
Abstract :
Multi-view human action recognition remains a challenging problem due to large view changes. In this article, we propose a transfer-learning-based framework, the transferable dictionary learning and view adaptation (TDVA) model, for multi-view human action recognition. In the transferable dictionary learning phase, TDVA learns a set of view-specific transferable dictionaries that enable the same actions from different views to share the same sparse representations, which transfers features of actions from different views to an intermediate domain. In the view adaptation phase, TDVA comprehensively analyzes global, local, and individual characteristics of samples, and jointly learns balanced distribution adaptation, locality preservation, and discrimination preservation, aiming at transferring sparse features of actions of different views from the intermediate domain to a common domain. In other words, TDVA progressively bridges the distribution gap among actions from various views through these two phases. Experimental results on the IXMAS, ACT4², and NUCLA action datasets demonstrate that TDVA outperforms state-of-the-art methods.
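A minimal sketch of view-specific dictionaries sharing one sparse code matrix, alternating an ISTA-style code update with a least-squares dictionary refit; TDVA's full objective (balanced distribution adaptation, locality and discrimination preservation) is not reproduced, and the solver and all names are assumptions.

```python
import torch

views, dim, atoms, samples = 3, 60, 40, 200
X = [torch.randn(dim, samples) for _ in range(views)]      # the same actions seen from each view
D = [torch.randn(dim, atoms) for _ in range(views)]        # one transferable dictionary per view
A = torch.zeros(atoms, samples)                            # shared sparse representation

lam, step = 0.1, 1e-3
for _ in range(100):
    # gradient step on the shared code over all views, then soft-thresholding for sparsity
    grad = sum(D[v].t() @ (D[v] @ A - X[v]) for v in range(views))
    A = A - step * grad
    A = torch.sign(A) * torch.clamp(A.abs() - step * lam, min=0)
    # closed-form least-squares refit of each view-specific dictionary
    for v in range(views):
        D[v] = X[v] @ torch.linalg.pinv(A)
```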
Keyword :
Sparse representation; Action recognition; Transfer learning; Multi-view
Cite:
GB/T 7714 | Sun, Bin, Kong, Dehui, Wang, Shaofan, et al. Joint Transferable Dictionary Learning and View Adaptation for Multi-view Human Action Recognition [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2021, 15(2).
MLA | Sun, Bin, et al. "Joint Transferable Dictionary Learning and View Adaptation for Multi-view Human Action Recognition." ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA 15.2 (2021).
APA | Sun, Bin, Kong, Dehui, Wang, Shaofan, Wang, Lichun, Yin, Baocai. Joint Transferable Dictionary Learning and View Adaptation for Multi-view Human Action Recognition. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2021, 15(2).