RGB-D Dual Modal Information Complementary Semantic Segmentation Network; [RGB-D 双模态信息互补的语义分割网络]

Author：

Wang, L. | Gu, N. | Xin, J. | Wang, S.

Indexed by：

EI Scopus

Abstract：

To fuse RGB and depth information more fully and thereby improve the accuracy of semantic segmentation, an attention mechanism is introduced to realize complementary fusion of RGB and depth modal features. The proposed RGB-D dual-modal information complementary semantic segmentation network is built on an encoder-decoder framework: the encoder adopts a two-branch structure to extract feature maps from the RGB image and the depth image separately, and the decoder uses layer-by-layer skip connections to gradually integrate semantic information of different granularity and perform pixel-level semantic classification. For the features learned in the lower layers, the encoder uses an RGB-D information complementary module to fuse the features of each modality into the other. This module comprises two kinds of attention: the Depth-guided Attention Module (Depth-AM) and the RGB-guided Attention Module (RGB-AM). Depth-AM uses the original depth information to supplement the RGB features, addressing inaccurate RGB features caused by illumination changes; RGB-AM uses the RGB features to supplement the depth features, addressing inaccurate depth features caused by the lack of object texture information. With a backbone of the same structure, the proposed network shows clear improvements over RDF-Net. In detail, mIoU, pixel accuracy, and mean pixel accuracy improve by 1.8%, 0.5%, and 0.7% on the SUNRGB-D dataset, and by 1.8%, 1.3%, and 1.9% on the NYUv2 dataset. © 2023 Institute of Computing Technology. All rights reserved.
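
As a reading aid for the reported numbers, the following is a minimal sketch of how the three evaluation metrics are commonly computed from a confusion matrix. It is not the authors' evaluation code. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical, the third metric is read here as mean per-class pixel accuracy, and classes that never occur are simply skipped when averaging, which is an assumption rather than something the abstract states.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (pixel accuracy, mean per-class pixel accuracy, mIoU)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    pixel_acc = correct / total

    class_accs, ious = [], []
    for i in range(n):
        gt = sum(cm[i])                         # pixels labeled as class i
        pred = sum(cm[j][i] for j in range(n))  # pixels predicted as class i
        inter = cm[i][i]
        if gt > 0:                              # assumption: skip absent classes
            class_accs.append(inter / gt)
        union = gt + pred - inter
        if union > 0:
            ious.append(inter / union)

    mean_acc = sum(class_accs) / len(class_accs)
    miou = sum(ious) / len(ious)
    return pixel_acc, mean_acc, miou


# Toy example: six "pixels" flattened into lists, three classes.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
pa, mpa, miou = segmentation_metrics(confusion_matrix(labels, preds, 3))
print(f"pixel acc={pa:.3f}, mean pixel acc={mpa:.3f}, mIoU={miou:.3f}")
```

Because mIoU penalizes both missed and spurious pixels for every class, it is usually the strictest of the three, which is consistent with it being listed first in the abstract's comparison.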

Keyword：

encoder-decoder; deep learning; attention mechanism; RGB-D information complementary; RGB-D semantic segmentation

Author Affiliations：

[ 1 ] [Wang L.] Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
[ 2 ] [Gu N.] Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
[ 3 ] [Xin J.] Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
[ 4 ] [Wang S.] Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China

Source ：

Journal of Computer-Aided Design and Computer Graphics

ISSN： 1003-9775

Year： 2023

Issue： 10

Volume： 35

Page： 1489-1499

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 1

ESI Highly Cited Papers on the List： 0
