Feature enhanced spherical transformer for spherical image compression

Author：

Hu, Hui | Shi, Yunhui | Wang, Jin | Ling, Nam | Yin, Baocai

Indexed by：

EI Scopus SCIE

Abstract：

It is well known that the wide field of view of spherical images requires high resolution, which increases the challenges of storage and transmission. Recently, a spherical learning-based image compression method called OSLO has been proposed, which leverages HEALPix's approximately uniform spherical sampling. However, HEALPix sampling can only utilize a fixed 3 × 3 convolution kernel, resulting in a limited receptive field and an inability to capture non-local information. This limitation hinders redundancy removal during the transform and texture synthesis during the inverse transform. To address this issue, we propose a feature-enhanced spherical Transformer-based image compression method that leverages HEALPix's hierarchical structure. Specifically, to reduce the computational complexity of the Transformer's attention mechanism, we divide the sphere into multiple windows using HEALPix's hierarchical structure and compute attention within these spherical windows. Since there is no communication between adjacent windows, we introduce spherical convolution to aggregate information from neighboring windows based on their local correlation. Additionally, to enhance the representational ability of features, we incorporate an inverted residual bottleneck module for feature embedding and a feedforward neural network. Experimental results demonstrate that our method outperforms OSLO, achieving lower codec time.
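The windowing idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes features are stored in HEALPix NESTED ordering, where the four children of pixel p are 4p to 4p+3, so every block of 4^k consecutive indices forms one spherical window. The function names, the `window_level` parameter, and the use of identity query/key/value projections (no learned weights) are simplifications introduced here for illustration.

```python
import numpy as np


def window_partition(x, nside, window_level):
    """Group NESTED-ordered HEALPix features (npix, C) into spherical windows.

    Assumption: nside is a power of two and 4**window_level <= nside**2, so
    each window is one contiguous block of 4**window_level pixel indices.
    """
    npix = 12 * nside * nside
    win = 4 ** window_level
    assert x.shape[0] == npix, "expected one feature row per HEALPix pixel"
    assert win <= nside * nside, "window may not exceed one base pixel"
    return x.reshape(npix // win, win, x.shape[1])


def window_self_attention(x, nside, window_level):
    """Scaled dot-product self-attention computed separately in each window.

    Hypothetical simplification: queries, keys and values are the raw
    features; a real Transformer layer would apply learned projections.
    """
    windows = window_partition(x, nside, window_level)       # (W, P, C)
    scale = 1.0 / np.sqrt(windows.shape[-1])
    scores = windows @ windows.transpose(0, 2, 1) * scale    # (W, P, P)
    scores = scores - scores.max(axis=-1, keepdims=True)     # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    out = weights @ windows                                  # (W, P, C)
    return out.reshape(x.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((12 * 4 * 4, 8))  # nside = 4, 8 channels
    out = window_self_attention(feats, nside=4, window_level=1)
    print(out.shape)
```

With P pixels per window, the attention cost scales with npix × P rather than npix². Because each window is processed independently, a change to one pixel affects only the outputs of its own window, which is the lack of cross-window communication that the abstract addresses with spherical convolution.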

Keyword：

Spherical image compression; Neural network; Feature enhancement; Spherical transformer

Author Affiliations：

[ 1 ] [Hu, Hui]Beijing Univ Technol, Sch Informat Sci & Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 2 ] [Shi, Yunhui]Beijing Univ Technol, Sch Informat Sci & Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 3 ] [Wang, Jin]Beijing Univ Technol, Sch Informat Sci & Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 4 ] [Yin, Baocai]Beijing Univ Technol, Sch Informat Sci & Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[ 5 ] [Ling, Nam]Santa Clara Univ, 500 Camino Real, Santa Clara, CA 95053 USA

Reprint Author's Address：

[Wang, Jin]Beijing Univ Technol, Sch Informat Sci & Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

Email：

ijinwang@bjut.edu.cn

Source ：

DISPLAYS

ISSN： 0141-9382

Year： 2025

Volume： 88

Impact Factor： 4.300 (JCR@2022)
