
Query:

Scholar Name: Ma Wei

Structured 3D gaussian splatting for novel view synthesis based on single RGB-LiDAR View SCIE
Journal Article | 2025, 55 (7) | APPLIED INTELLIGENCE

Abstract:

3D scene reconstruction is a critical task in computer vision and graphics, with recent advancements in 3D Gaussian Splatting (3DGS) demonstrating impressive novel view synthesis (NVS) results. However, most 3DGS methods rely on multi-view images, which are not always available, particularly in outdoor environments. In this paper, we explore 3D scene reconstruction using only single-view data, comprising an RGB image and sparse point clouds from a LiDAR sensor. To address the challenges posed by the limited reference view and the insufficient LiDAR point clouds, we propose a voxel-based structured 3DGS framework enhanced with depth prediction. We introduce a novel depth-prior guided voxel growing and pruning algorithm, which leverages predicted depth maps to refine scene structure and improve rendering quality. Furthermore, we design a virtual background fitting method with an adaptive voxel size to accommodate the sparse distribution of LiDAR data in outdoor scenes. Our approach surpasses existing methods, including Scaffold-GS, Gaussian-Pro, 3DGS, Mip-Splatting and UniDepth, in terms of PSNR, SSIM, LPIPS and FID metrics on the KITTI and Waymo datasets, demonstrating its effectiveness in single-viewpoint 3D reconstruction and NVS.
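
To make the depth-prior guided voxel pruning concrete, here is a minimal sketch, not the authors' code: `prune_voxels`, the tolerance `tol`, and all tensor names are hypothetical. Voxels whose camera-space depth disagrees with the predicted depth map by more than the tolerance are discarded; a growing step would analogously add voxels where the prior indicates missing geometry.

```python
# Hedged sketch of depth-prior guided voxel pruning (hypothetical helper).
import torch

def prune_voxels(centers, K, depth_pred, tol=0.5):
    """centers: (N, 3) voxel centers in camera coordinates (z forward).
    K: (3, 3) camera intrinsics. depth_pred: (H, W) predicted depth map."""
    H, W = depth_pred.shape
    z = centers[:, 2].clamp(min=1e-6)
    uv = (K @ centers.T).T                      # pinhole projection
    u = (uv[:, 0] / z).round().long().clamp(0, W - 1)
    v = (uv[:, 1] / z).round().long().clamp(0, H - 1)
    keep = (z - depth_pred[v, u]).abs() < tol   # agree with the depth prior
    return centers[keep]

centers = torch.rand(1024, 3) * torch.tensor([2.0, 2.0, 10.0])
K = torch.tensor([[500., 0., 160.], [0., 500., 120.], [0., 0., 1.]])
print(prune_voxels(centers, K, torch.full((240, 320), 5.0)).shape)
```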

Keywords:

Gaussian Splatting; 3D Reconstruction; 3DGS; Novel View Synthesis

Cite:

GB/T 7714: Liu, Libin, Zhao, Zhiqun, Ma, Wei, et al. Structured 3D gaussian splatting for novel view synthesis based on single RGB-LiDAR View [J]. APPLIED INTELLIGENCE, 2025, 55 (7).
MLA: Liu, Libin, et al. "Structured 3D gaussian splatting for novel view synthesis based on single RGB-LiDAR View." APPLIED INTELLIGENCE 55.7 (2025).
APA: Liu, Libin, Zhao, Zhiqun, Ma, Wei, Zhang, Siyuan, & Zha, Hongbin. Structured 3D gaussian splatting for novel view synthesis based on single RGB-LiDAR View. APPLIED INTELLIGENCE, 2025, 55 (7).
Dual-modal non-local context guided multi-stage fusion for indoor RGB-D semantic segmentation SCIE
Journal Article | 2024, 255 | EXPERT SYSTEMS WITH APPLICATIONS
WoS CC Cited Count: 5

Abstract:

Complementarily fusing RGB and depth images while effectively suppressing task-irrelevant noise is crucial for achieving accurate indoor RGB-D semantic segmentation. In this paper, we propose a novel deep model that leverages dual-modal non-local context to guide the aggregation of complementary features and the suppression of noise at multiple stages. Specifically, we introduce a dual-modal non-local context encoding (DNCE) module to learn global representations for each modality at each stage, which are then utilized to facilitate cross-modal complementary clue aggregation (CCA). Subsequently, the enhanced features from both modalities are merged. Additionally, we propose a semantic guided feature rectification (SGFR) module to exploit rich semantic clues in the top-level merged features for suppressing noise in the lower-stage merged features. Both the DNCE-CCA and the SGFR modules provide dual-modal global views that are essential for effective RGB-D fusion. Experimental results on two public indoor datasets, NYU Depth V2 and SUN-RGBD, demonstrate that our proposed method outperforms state-of-the-art models of similar complexity.
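
As a rough illustration of how dual-modal global context can guide cross-modal aggregation, the sketch below pools a global descriptor from each modality and uses both to gate the complementary features exchanged between the two streams. The `DualGate` module is hypothetical and much simpler than the paper's DNCE/CCA design.

```python
# Hedged sketch: global-context gated RGB-D fusion (hypothetical module).
import torch
import torch.nn as nn

class DualGate(nn.Module):
    def __init__(self, c):
        super().__init__()
        # gates for both modalities, conditioned on both global contexts
        self.fc = nn.Sequential(nn.Linear(2 * c, c), nn.ReLU(), nn.Linear(c, 2 * c))

    def forward(self, rgb, depth):               # (B, C, H, W) each
        g = torch.cat([rgb.mean((2, 3)), depth.mean((2, 3))], dim=1)
        gr, gd = torch.sigmoid(self.fc(g)).chunk(2, dim=1)
        rgb = rgb + depth * gd[..., None, None]  # aggregate complementary clues
        depth = depth + rgb * gr[..., None, None]
        return rgb + depth                       # merged features

fuse = DualGate(64)
print(fuse(torch.rand(2, 64, 30, 40), torch.rand(2, 64, 30, 40)).shape)
```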

Keywords:

Dual-modal non-local context; Semantic segmentation; Feature rectification; aggregation; Cross-modal complementary feature

Cite:

GB/T 7714: Guo, Xiangyu, Ma, Wei, Liang, Fangfang, et al. Dual-modal non-local context guided multi-stage fusion for indoor RGB-D semantic segmentation [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255.
MLA: Guo, Xiangyu, et al. "Dual-modal non-local context guided multi-stage fusion for indoor RGB-D semantic segmentation." EXPERT SYSTEMS WITH APPLICATIONS 255 (2024).
APA: Guo, Xiangyu, Ma, Wei, Liang, Fangfang, & Mi, Qing. Dual-modal non-local context guided multi-stage fusion for indoor RGB-D semantic segmentation. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255.
DSC-MDE: Dual structural contexts for monocular depth estimation SCIE
Journal Article | 2023, 263 | KNOWLEDGE-BASED SYSTEMS
WoS CC Cited Count: 3

Abstract:

Geometric and semantic contexts are essential to solving the ill-posed problem of monocular depth estimation (MDE). In this paper, we propose a deep framework that aggregates dual structural contexts for monocular depth estimation (DSC-MDE). First, a cross-shaped context (CSC) aggregation module is developed to globally encode the geometric structures in depth maps observed under the fields of vision of robots/autonomous vehicles. Next, the CSC-encoded geometric features are further modulated with semantic context in an object-regional context (ORC) aggregation module. Finally, to train the proposed network, we present a focal ordinal loss (FOL), which pays more attention to distant samples to avoid the over-relaxed constraints that the ordinal regression loss (ORL) places on these samples. We compare the proposed model to recent methods with geometric and multi-modal contexts, and show that it achieves state-of-the-art performance on both indoor and outdoor datasets, including NYU-Depth-V2, Cityscapes and KITTI.
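
The focal ordinal loss lends itself to a short worked sketch. Caveats: the `(1 - p)^gamma` modulation below is the standard focal form applied per ordinal threshold, and the log-space discretization of depth is a common choice; the paper's exact FOL weighting is not reproduced here.

```python
# Sketch of an ordinal depth loss with a focal-style modulation (assumptions
# noted in the lead-in; not the paper's exact formulation).
import math
import torch
import torch.nn.functional as F

def focal_ordinal_loss(logits, depth, d_min=0.5, d_max=80.0, gamma=2.0):
    """logits: (B, K, H, W) scores for K ordinal thresholds; depth: (B, H, W)."""
    B, K, H, W = logits.shape
    # K threshold depths, spaced uniformly in log space
    t = torch.exp(torch.linspace(math.log(d_min), math.log(d_max), K))
    target = (depth[:, None] > t.view(1, K, 1, 1)).float()  # depth > t_k ?
    p = torch.sigmoid(logits)
    p_t = p * target + (1 - p) * (1 - target)   # prob. of the correct side
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction='none')
    return ((1 - p_t) ** gamma * bce).mean()    # focal factor stresses hard bins

loss = focal_ordinal_loss(torch.randn(2, 64, 24, 32), torch.rand(2, 24, 32) * 80)
print(loss.item())
```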

Keywords:

Structural context; Context aggregation; Multi-modal context fusion; Monocular depth estimation

Cite:

GB/T 7714: Yan, Wubin, Dong, Lijun, Ma, Wei, et al. DSC-MDE: Dual structural contexts for monocular depth estimation [J]. KNOWLEDGE-BASED SYSTEMS, 2023, 263.
MLA: Yan, Wubin, et al. "DSC-MDE: Dual structural contexts for monocular depth estimation." KNOWLEDGE-BASED SYSTEMS 263 (2023).
APA: Yan, Wubin, Dong, Lijun, Ma, Wei, Mi, Qing, & Zha, Hongbin. DSC-MDE: Dual structural contexts for monocular depth estimation. KNOWLEDGE-BASED SYSTEMS, 2023, 263.
View-relation constrained global representation learning for multi-view-based 3D object recognition SCIE
Journal Article | 2022, 53 (7), 7741-7750 | APPLIED INTELLIGENCE
WoS CC Cited Count: 3

Abstract:

Multi-view observations provide complementary clues for 3D object recognition, but they also include redundant information that appears different across views due to view-dependent projection, light reflection and self-occlusions. This paper presents a view-relation constrained global representation network (VCGR-Net) for 3D object recognition that can mitigate the view interference problem at all phases, from view-level source feature generation to multi-view feature aggregation. Specifically, we implicitly determine inter-view relations via an LSTM. Based on these relations, we construct a two-stage feature selection module that filters features at each view according to their importance to the global representation and their reliability as observations at specific views. The selected features are then aggregated by referring to intra- and inter-view spatial context to generate a global representation for 3D object recognition. Experiments on the ModelNet40 and ModelNet10 datasets demonstrate that the proposed method can suppress view interference and therefore outperforms state-of-the-art methods in 3D object recognition.
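
The relation-then-select idea can be sketched minimally: an LSTM scans per-view descriptors so that each view's score reflects the other views, and the scores weight the aggregation into one global feature. `ViewRelationPool` is hypothetical; the paper uses a two-stage selection rather than a single softmax.

```python
# Hedged sketch of LSTM-based view relations + weighted aggregation.
import torch
import torch.nn as nn

class ViewRelationPool(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.lstm = nn.LSTM(c, c, batch_first=True)
        self.score = nn.Linear(c, 1)

    def forward(self, views):                      # (B, V, C) per-view features
        rel, _ = self.lstm(views)                  # relation-aware view states
        w = torch.softmax(self.score(rel), dim=1)  # per-view importance
        return (w * views).sum(dim=1)              # (B, C) global representation

pool = ViewRelationPool(256)
print(pool(torch.rand(4, 12, 256)).shape)          # 12 views of 4 objects
```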

Keywords:

3D object recognition; View-relation constraints; 3D global representation; Multi-views

Cite:

GB/T 7714: Xu, Ruchang, Mi, Qing, Ma, Wei, et al. View-relation constrained global representation learning for multi-view-based 3D object recognition [J]. APPLIED INTELLIGENCE, 2022, 53 (7): 7741-7750.
MLA: Xu, Ruchang, et al. "View-relation constrained global representation learning for multi-view-based 3D object recognition." APPLIED INTELLIGENCE 53.7 (2022): 7741-7750.
APA: Xu, Ruchang, Mi, Qing, Ma, Wei, & Zha, Hongbin. View-relation constrained global representation learning for multi-view-based 3D object recognition. APPLIED INTELLIGENCE, 2022, 53 (7), 7741-7750.
All-Higher-Stages-In Adaptive Context Aggregation for Semantic Edge Detection SCIE
Journal Article | 2022, 32 (10), 6778-6791 | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
WoS CC Cited Count: 12

Abstract:

Convolutional Neural Networks (CNNs) can reveal local variation details and multi-scale spatial context in images via low-to-high stages of feature expression; effective fusion of these raw features is key to Semantic Edge Detection (SED). The methods available in the field generally fuse features across stages in a position-aligned mode, which cannot satisfy the requirements of diverse semantic context in categorizing different pixels. In this paper, we propose a deep framework for SED, the core of which is a new multi-stage feature fusion structure called All-HiS-In ACA (All-Higher-Stages-In Adaptive Context Aggregation). All-HiS-In ACA can adaptively select semantic context from all higher stages for detailed features via a cross-stage self-attention paradigm, and thus can obtain fused features with high-resolution details for edge localization and rich semantics for edge categorization. In addition, we develop a non-parametric Inter-layer Complementary Enhancement (ICE) module to supplement the clues at each stage with their counterparts in adjacent stages. The ICE-enhanced multi-stage features are then fed into the All-HiS-In ACA module. We also construct an Object-level Semantic Integration (OSI) module to further refine the fused features by enforcing the consistency of features within the same object. Extensive experiments demonstrate the superior performance of the proposed method over state-of-the-art works.
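
The cross-stage self-attention paradigm reduces to a familiar pattern, sketched below with a hypothetical module (the real All-HiS-In ACA also handles channel alignment and other details): low-stage pixels act as queries against tokens pooled from all higher stages, so each detailed feature adaptively selects its semantic context.

```python
# Hedged sketch of "all higher stages in" cross-stage attention.
import torch
import torch.nn as nn

class CrossStageAttention(nn.Module):
    def __init__(self, c, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(c, heads, batch_first=True)

    def forward(self, low, highers):        # low: (B, C, H, W), highers: list
        B, C, H, W = low.shape
        q = low.flatten(2).transpose(1, 2)  # (B, HW, C) detail queries
        kv = torch.cat([h.flatten(2).transpose(1, 2) for h in highers], dim=1)
        ctx, _ = self.attn(q, kv, kv)       # semantic context per detail pixel
        return low + ctx.transpose(1, 2).view(B, C, H, W)

m = CrossStageAttention(64)
highers = [torch.rand(1, 64, 28, 28), torch.rand(1, 64, 14, 14)]
print(m(torch.rand(1, 64, 56, 56), highers).shape)
```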

Keywords:

Aggregates; Horses; object-level semantic integration; Semantics; multi-stage feature fusion; adaptive context aggregation; Semantic edge detection; Feature extraction; Open systems; complementary feature enhancement; Image edge detection; Image segmentation

Cite:

GB/T 7714: Bo, Qihan, Ma, Wei, Lai, Yu-Kun, et al. All-Higher-Stages-In Adaptive Context Aggregation for Semantic Edge Detection [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (10): 6778-6791.
MLA: Bo, Qihan, et al. "All-Higher-Stages-In Adaptive Context Aggregation for Semantic Edge Detection." IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 32.10 (2022): 6778-6791.
APA: Bo, Qihan, Ma, Wei, Lai, Yu-Kun, & Zha, Hongbin. All-Higher-Stages-In Adaptive Context Aggregation for Semantic Edge Detection. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (10), 6778-6791.
Deep Facade Parsing with Occlusions SCIE
Journal Article | 2022, 16 (2), 524-543 | KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS
WoS CC Cited Count: 5

Abstract:

Correct facade image parsing is essential to the semantic understanding of outdoor scenes. Unfortunately, there are often various occlusions in front of buildings, which cause many existing methods to fail. In this paper, we propose an end-to-end deep network for facade parsing with occlusions. The network learns to decompose an input image into visible and invisible parts by occlusion reasoning. Then, a context aggregation module is proposed to collect nonlocal cues for semantic segmentation of the visible part. In addition, considering the regularity of man-made buildings, a repetitive pattern completion branch is designed to infer the contents of the invisible regions by referring to the visible part. Finally, the parsing map of the input facade image is generated by fusing the visible and invisible results. Experiments on both synthetic and real datasets demonstrate that the proposed method outperforms state-of-the-art methods in parsing facades with occlusions. Moreover, we apply our method to image inpainting and 3D semantic modeling.
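
The final fusion step is simple enough for a toy sketch (tensor names hypothetical): an occlusion probability map blends the visible-part segmentation with the repetitive-pattern completion for the invisible regions.

```python
# Toy sketch of occlusion-weighted fusion of visible and completed parsings.
import torch

def fuse_parsing(visible_logits, completed_logits, occ_prob):
    """visible/completed logits: (B, classes, H, W); occ_prob: (B, 1, H, W)."""
    return visible_logits * (1 - occ_prob) + completed_logits * occ_prob

out = fuse_parsing(torch.randn(1, 9, 64, 64), torch.randn(1, 9, 64, 64),
                   torch.rand(1, 1, 64, 64))
print(out.argmax(1).shape)   # final facade parsing map
```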

Keywords:

Facade parsing; repetitive pattern; man-made structure; occlusion

Cite:

GB/T 7714: Ma, Wenguang, Ma, Wei, Xu, Shibiao. Deep Facade Parsing with Occlusions [J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2022, 16 (2): 524-543.
MLA: Ma, Wenguang, et al. "Deep Facade Parsing with Occlusions." KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS 16.2 (2022): 524-543.
APA: Ma, Wenguang, Ma, Wei, & Xu, Shibiao. Deep Facade Parsing with Occlusions. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2022, 16 (2), 524-543.
Multiview Feature Aggregation for Facade Parsing SCIE
Journal Article | 2022, 19 | IEEE GEOSCIENCE AND REMOTE SENSING LETTERS
WoS CC Cited Count: 8

Abstract:

Facade image parsing is essential to the semantic understanding and 3-D reconstruction of urban scenes. Considering the occlusion and appearance ambiguity in single-view images and the easy acquisition of multiple views, in this letter, we propose a multiview enhanced deep architecture for facade parsing. The highlight of this architecture is a cross-view feature aggregation module that can learn to choose and fuse useful convolutional neural network (CNN) features from nearby views to enhance the representation of a target view. Benefiting from the multiview enhanced representation, the proposed architecture can better deal with the ambiguity and occlusion issues. Moreover, our cross-view feature aggregation module can be straightforwardly integrated into existing single-image parsing frameworks. Extensive comparison experiments and ablation studies demonstrate the good performance of the proposed method and the validity and transferability of the cross-view feature aggregation module.
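
A simplified sketch of cross-view feature aggregation follows; it assumes the nearby-view features have already been warped into the target view (a real system would handle that alignment) and computes per-pixel softmax weights over views from feature similarity alone.

```python
# Hedged sketch: similarity-weighted aggregation of aligned nearby views.
import torch

def aggregate_views(target, nearby):            # target: (B, C, H, W)
    views = torch.stack(nearby, dim=1)          # (B, V, C, H, W), pre-aligned
    sim = (views * target.unsqueeze(1)).sum(2, keepdim=True)  # (B, V, 1, H, W)
    w = torch.softmax(sim, dim=1)               # choose useful views per pixel
    return target + (w * views).sum(1)          # enhanced target representation

t = torch.rand(2, 32, 48, 64)
print(aggregate_views(t, [torch.rand(2, 32, 48, 64) for _ in range(3)]).shape)
```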

Keywords:

feature aggregation; Three-dimensional displays; Image segmentation; Training; Data models; Facade parsing; Feature extraction; Computer architecture; Semantics; multiview; wide baseline

Cite:

GB/T 7714: Ma, Wenguang, Xu, Shibiao, Ma, Wei, et al. Multiview Feature Aggregation for Facade Parsing [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19.
MLA: Ma, Wenguang, et al. "Multiview Feature Aggregation for Facade Parsing." IEEE GEOSCIENCE AND REMOTE SENSING LETTERS 19 (2022).
APA: Ma, Wenguang, Xu, Shibiao, Ma, Wei, & Zha, Hongbin. Multiview Feature Aggregation for Facade Parsing. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19.
Progressive Feature Learning for Facade Parsing With Occlusions SCIE
Journal Article | 2022, 31, 2081-2093 | IEEE TRANSACTIONS ON IMAGE PROCESSING
WoS CC Cited Count: 9

Abstract:

Existing deep models for facade parsing often fail to classify pixels in heavily occluded regions of facade images due to the difficulty of representing the features of these pixels. In this paper, we solve facade parsing with occlusions by progressive feature learning. To this end, we locate the regions contaminated by occlusions via Bayesian uncertainty evaluation of the categorization of each pixel in these regions. Then, guided by the uncertainty, we propose an occlusion-immune facade parsing architecture in which we progressively re-express the features of pixels in each contaminated region from easy to hard. Specifically, the outside pixels, which have reliable context from visible areas, are re-expressed at early stages; the inner pixels are processed at late stages, when their surroundings have been decontaminated at the earlier stages. In addition, at each stage, instead of using regular square convolution kernels, we design a context enhancement module (CEM) with directional strip kernels, which can aggregate structural context to re-express facade pixels. Extensive experiments on popular facade datasets demonstrate that the proposed method achieves state-of-the-art performance.
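
The directional strip kernels of the CEM are easy to illustrate; the kernel size and arrangement below are assumptions rather than the paper's configuration. 1xk and kx1 convolutions gather the row- and column-aligned context typical of facades, instead of the mixed context a square kernel would sweep in.

```python
# Hedged sketch of a context enhancement module with strip kernels.
import torch
import torch.nn as nn

class StripCEM(nn.Module):
    def __init__(self, c, k=9):
        super().__init__()
        self.h = nn.Conv2d(c, c, (1, k), padding=(0, k // 2))  # along rows
        self.v = nn.Conv2d(c, c, (k, 1), padding=(k // 2, 0))  # along columns
        self.out = nn.Conv2d(2 * c, c, 1)

    def forward(self, x):
        ctx = torch.cat([self.h(x), self.v(x)], dim=1)
        return x + self.out(ctx)   # re-expressed features with strip context

cem = StripCEM(64)
print(cem(torch.rand(1, 64, 32, 32)).shape)
```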

Keywords:

feature representation; manmade structure; Context modeling; Convolutional neural networks; Training; Facade parsing; Representation learning; occlusion; Uncertainty; Buildings; Bayes methods

Cite:

GB/T 7714: Ma, Wenguang, Xu, Shibiao, Ma, Wei, et al. Progressive Feature Learning for Facade Parsing With Occlusions [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31: 2081-2093.
MLA: Ma, Wenguang, et al. "Progressive Feature Learning for Facade Parsing With Occlusions." IEEE TRANSACTIONS ON IMAGE PROCESSING 31 (2022): 2081-2093.
APA: Ma, Wenguang, Xu, Shibiao, Ma, Wei, Zhang, Xiaopeng, & Zha, Hongbin. Progressive Feature Learning for Facade Parsing With Occlusions. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31, 2081-2093.
Image Manipulation Localization Using Attentional Cross-Domain CNN Features SCIE
Journal Article | 2021, 34 (9), 5614-5628 | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
WoS CC Cited Count: 81

Abstract:

Along with the advancement of manipulation technologies, image modification is becoming increasingly convenient and imperceptible. To tackle the challenging image tampering detection problem, this article presents an attentional cross-domain deep architecture, which can be trained end-to-end. This architecture is composed of three convolutional neural network (CNN) streams that extract three types of features, namely visual perception, resampling, and local inconsistency features, from the spatial and frequency domains. The multitype and cross-domain features are then combined to formulate hybrid features that distinguish manipulated regions from nonmanipulated parts. Compared with other deep architectures, the proposed one spans a more complementary and discriminative feature space by integrating richer types of features from different domains in a unified end-to-end trainable framework, and thus can better capture artifacts caused by different types of manipulations. In addition, we design and train a module called the tampering discriminative attention network (TDA-Net) to highlight suspicious parts. These part-level representations are then integrated with the global ones to further enhance the discriminating capability of the hybrid features. To adequately train the proposed architecture, we synthesize a large dataset containing various types of manipulations based on DRESDEN and COCO. Experiments on four public datasets demonstrate that the proposed model can localize various manipulations and achieves state-of-the-art performance. We also conduct ablation studies to verify the effectiveness of each stream and of the TDA-Net module.
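
At a high level, the hybrid-feature idea amounts to fusing the three stream outputs and re-weighting the result spatially. The sketch below is illustrative only: the stream backbones are omitted, and the single-layer attention head merely stands in for TDA-Net's role of highlighting suspicious regions.

```python
# Illustrative sketch of three-stream fusion with spatial attention.
import torch
import torch.nn as nn

class HybridFusion(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.mix = nn.Conv2d(3 * c, c, 1)            # fuse the three streams
        self.attn = nn.Sequential(nn.Conv2d(c, 1, 1), nn.Sigmoid())

    def forward(self, visual, resample, noise):      # (B, C, H, W) each
        h = self.mix(torch.cat([visual, resample, noise], dim=1))
        return h * self.attn(h)                      # stress suspicious parts

f = HybridFusion(64)
print(f(*[torch.rand(2, 64, 32, 32) for _ in range(3)]).shape)
```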

Keywords:

Location awareness; Attention model; Convolutional neural networks; Frequency-domain analysis; Feature extraction; convolutional neural network (CNN); feature fusion; image forgery; Streaming media; cross domain; Deep learning; tamper localization; Transforms

Cite:

GB/T 7714: Li, Shuaibo, Xu, Shibiao, Ma, Wei, et al. Image Manipulation Localization Using Attentional Cross-Domain CNN Features [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 34 (9): 5614-5628.
MLA: Li, Shuaibo, et al. "Image Manipulation Localization Using Attentional Cross-Domain CNN Features." IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 34.9 (2021): 5614-5628.
APA: Li, Shuaibo, Xu, Shibiao, Ma, Wei, & Zong, Qiu. Image Manipulation Localization Using Attentional Cross-Domain CNN Features. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 34 (9), 5614-5628.
Pyramid ALKNet for Semantic Parsing of Building Facade Image SCIE
Journal Article | 2021, 18 (6), 1009-1013 | IEEE GEOSCIENCE AND REMOTE SENSING LETTERS
WoS CC Cited Count: 23

Abstract:

The semantic parsing of building facade images is a fundamental yet challenging task in urban scene understanding. Existing works sought to tackle this task by using facade grammars or convolutional neural networks (CNNs). The former can hardly generate parsing results coherent with real images, while the latter often fails to capture relationships among facade elements. In this letter, we propose a pyramid atrous large kernel (ALK) network (ALKNet) for the semantic segmentation of facade images. The pyramid ALKNet captures long-range dependencies among building elements by using ALK modules on multiscale feature maps. It makes full use of the regular structures of facades to aggregate useful nonlocal context information and is thereby capable of dealing with challenging image regions caused by occlusions, ambiguities, and so on. Experiments on both rectified and unrectified facade datasets show that ALKNet performs better than state-of-the-art methods.
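
An ALK-style block can be sketched as paired 1xk and kx1 dilated convolutions, a standard decomposition for approximating a large kxk receptive field along the row/column structure of facades; the kernel size and dilation below are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch of an atrous large-kernel (ALK) block.
import torch
import torch.nn as nn

class ALK(nn.Module):
    def __init__(self, c, k=7, d=2):
        super().__init__()
        p = d * (k // 2)                              # "same" padding, dilated
        self.a = nn.Sequential(                       # 1xk then kx1
            nn.Conv2d(c, c, (1, k), padding=(0, p), dilation=(1, d)),
            nn.Conv2d(c, c, (k, 1), padding=(p, 0), dilation=(d, 1)))
        self.b = nn.Sequential(                       # kx1 then 1xk
            nn.Conv2d(c, c, (k, 1), padding=(p, 0), dilation=(d, 1)),
            nn.Conv2d(c, c, (1, k), padding=(0, p), dilation=(1, d)))

    def forward(self, x):
        return x + self.a(x) + self.b(x)              # long-range context

alk = ALK(64)
print(alk(torch.rand(1, 64, 32, 32)).shape)
```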

Keywords:

large kernel; man-made structure; Semantics; Image segmentation; Kernel; nonlocal context; Shape; Measurement; Task analysis; Facade parsing; Buildings

Cite:

GB/T 7714: Ma, Wenguang, Ma, Wei, Xu, Shibiao, et al. Pyramid ALKNet for Semantic Parsing of Building Facade Image [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (6): 1009-1013.
MLA: Ma, Wenguang, et al. "Pyramid ALKNet for Semantic Parsing of Building Facade Image." IEEE GEOSCIENCE AND REMOTE SENSING LETTERS 18.6 (2021): 1009-1013.
APA: Ma, Wenguang, Ma, Wei, Xu, Shibiao, & Zha, Hongbin. Pyramid ALKNet for Semantic Parsing of Building Facade Image. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (6), 1009-1013.