
Author:

Fu, P. | Xiao, G. | Yang, H.

Indexed by:

EI; Scopus; SCIE

Abstract:

The complex two-dimensional structure of mathematical notation poses significant challenges for handwritten mathematical expression recognition (HMER). Many researchers parse the LaTeX sequence into a tree structure and then design RNN-based tree decoders to address this issue. However, RNNs suffer from long-term dependency problems due to their structural characteristics. Although Transformers solve the long-term dependency problem, Transformer-based tree decoders are rarely used for HMER because attention coverage becomes significantly insufficient when the distance between parent and child nodes in the tree is large. In this paper, we propose SATD, a novel offline HMER model incorporating a Transformer-based tree decoder that learns the implicit structural relationships in LaTeX strings. Moreover, to address the issue of distant parent–child nodes, we introduce a multi-scale attention aggregation module that refines attention weights using contextual information from different receptive fields. Experiments on the CROHME 2014/2016/2019 and HME100K datasets demonstrate performance improvements, with accuracy rates of 63.45%/60.42%/61.05% on the CROHME 2014/2016/2019 test sets. The source code for this work will be publicly available at https://github.com/EnderXiao/SATD/. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
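The abstract's multi-scale attention aggregation idea can be illustrated with a minimal sketch: smooth a 2-D attention map with local averaging windows of several sizes (different receptive fields), combine the results, and renormalize. The kernel sizes, box-filter smoothing, and uniform averaging below are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def multiscale_attention_aggregation(attn, kernel_sizes=(3, 5, 7)):
    """Hypothetical sketch: refine a 2-D attention map by averaging
    box-filtered copies computed at several receptive-field sizes,
    then renormalizing to a probability distribution.
    The specific kernels and combination rule are assumptions."""
    h, w = attn.shape
    refined = np.zeros_like(attn, dtype=float)
    for k in kernel_sizes:
        pad = k // 2
        padded = np.pad(attn, pad, mode="edge")
        smoothed = np.zeros_like(attn, dtype=float)
        for i in range(h):
            for j in range(w):
                # local mean over a k x k window: one receptive field
                smoothed[i, j] = padded[i:i + k, j:j + k].mean()
        refined += smoothed
    refined /= len(kernel_sizes)      # average across scales
    refined /= refined.sum()          # renormalize attention weights
    return refined
```

Larger windows let a peaked attention map "see" more context, which is one plausible way to compensate when a parent node must attend to a distant child.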

Keyword:

Transformer; Attention; Tree decoder; Offline handwritten mathematical expression recognition; Coverage attention

Author Community:

  • [ 1 ] [Fu P.] Faculty of Information Technology, Beijing University of Technology, Xidawang Road, Beijing, 100124, China
  • [ 2 ] [Xiao G.] Faculty of Information Technology, Beijing University of Technology, Xidawang Road, Beijing, 100124, China
  • [ 3 ] [Yang H.] Faculty of Information Technology, Beijing University of Technology, Xidawang Road, Beijing, 100124, China


Source:

Visual Computer

ISSN: 0178-2789

Year: 2024

Issue: 2

Volume: 41

Page: 883-900

Impact Factor: 3.500 (JCR@2022)

