Abstract:
Self-supervised visual odometry (VO) offers clear advantages over supervised methods, as it removes the need for annotated ground truth in the training data. However, most existing self-supervised VO methods, namely scene appearance-based methods, do not fully exploit the complementary properties of cross-modal information between scene appearance and scene structure. To this end, we propose a novel self-supervised VO based on a scene appearance-structure incremental fusion scheme. Specifically, a Global-Local Context awareness-based Depth estimation Network (GLC-DN) is designed to introduce scene structural cues, laying the foundation for scene appearance-structure incremental fusion. A Dual stream Pose estimation Network based on Scene Appearance-Structure Incremental Fusion (SASIF-DPN) is then devised, consisting of a Dual Stream Network (DSN) and multiple Cross-Modal Complementary Fusion Modules (CM-CFMs). Each CM-CFM fully leverages the complementary properties of the RGB information and the predicted depth information, and the combination of multiple CM-CFMs enables information interaction between the two modalities in an incremental fusion manner. Detailed evaluations of GLC-DN and SASIF-DPN confirm the effectiveness and the design principles of each proposed component. Extensive comparison experiments further verify the superiority of our method over current counterparts.
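The abstract gives no implementation details of the dual-stream pose network or the fusion modules. The following is a minimal sketch of the general idea, assuming a PyTorch-style encoder in which each fusion block re-weights one modality's features using channel attention computed from the other; all module names, channel sizes, and shapes here are illustrative assumptions, not the authors' CM-CFM or SASIF-DPN implementation.

```python
import torch
import torch.nn as nn


class CrossModalFusion(nn.Module):
    """Illustrative cross-modal fusion block (an assumption, not the paper's CM-CFM).

    Each stream is re-weighted by channel attention derived from the other
    modality, so RGB and depth features can complement each other.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()

        def gate():
            return nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )

        self.rgb_gate = gate()    # attention computed from RGB features
        self.depth_gate = gate()  # attention computed from depth features

    def forward(self, rgb_feat, depth_feat):
        # RGB features modulated by depth-derived attention, and vice versa.
        rgb_out = rgb_feat + rgb_feat * self.depth_gate(depth_feat)
        depth_out = depth_feat + depth_feat * self.rgb_gate(rgb_feat)
        return rgb_out, depth_out


class DualStreamPoseSketch(nn.Module):
    """Toy dual-stream pose regressor with one fusion step per encoder stage."""

    def __init__(self):
        super().__init__()
        chs = [16, 32, 64]

        def stage(cin, cout):
            return nn.Sequential(
                nn.Conv2d(cin, cout, 3, stride=2, padding=1), nn.ReLU(inplace=True)
            )

        # RGB stream takes two stacked RGB frames (6 ch); depth stream takes
        # the two corresponding predicted depth maps (2 ch).
        self.rgb_stages = nn.ModuleList(
            [stage(6, chs[0]), stage(chs[0], chs[1]), stage(chs[1], chs[2])]
        )
        self.depth_stages = nn.ModuleList(
            [stage(2, chs[0]), stage(chs[0], chs[1]), stage(chs[1], chs[2])]
        )
        self.fusions = nn.ModuleList([CrossModalFusion(c) for c in chs])
        self.head = nn.Linear(2 * chs[-1], 6)  # 6-DoF relative pose (illustrative)

    def forward(self, rgb_pair, depth_pair):
        r, d = rgb_pair, depth_pair
        for rgb_stage, depth_stage, fuse in zip(
            self.rgb_stages, self.depth_stages, self.fusions
        ):
            r, d = rgb_stage(r), depth_stage(d)
            r, d = fuse(r, d)  # incremental cross-modal fusion at every stage
        feat = torch.cat([r.mean(dim=(2, 3)), d.mean(dim=(2, 3))], dim=1)
        return self.head(feat)


if __name__ == "__main__":
    rgb = torch.randn(1, 6, 128, 416)    # two stacked RGB frames
    depth = torch.randn(1, 2, 128, 416)  # two predicted depth maps
    print(DualStreamPoseSketch()(rgb, depth).shape)  # torch.Size([1, 6])
```

Fusing at every encoder stage, rather than only at the input or output, is one plausible reading of the "incremental fusion manner" described in the abstract; the actual network design is given in the paper itself.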
Source: IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS
ISSN: 1524-9050
Year: 2025
Impact Factor: 8.500 (JCR@2022)