Abstract:
The complex two-dimensional structure of mathematical notation poses major challenges for handwritten mathematical expression recognition (HMER). Many researchers convert the LaTeX sequence into a tree structure and then design RNN-based tree decoders to address this issue. However, RNNs struggle with long-term dependencies due to their structural characteristics. Although Transformers solve the long-term dependency problem, Transformer-based tree decoders are rarely used for HMER because attention coverage becomes significantly insufficient when the distance between parent and child nodes in the tree is large. In this paper, we propose SATD, a novel offline HMER model incorporating a Transformer-based tree decoder that learns the implicit structural relationships in LaTeX strings. Moreover, to address the issue of distant parent-child nodes, we introduce a multi-scale attention aggregation module that refines attention weights using contextual information from different receptive fields. Experiments on the CROHME 2014/2016/2019 and HME100K datasets demonstrate performance improvements, with accuracy rates of 63.45%/60.42%/61.05% on the CROHME 2014/2016/2019 test sets. The source code for this work will be publicly available at https://github.com/EnderXiao/SATD/. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
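The abstract describes the multi-scale attention aggregation module only at a high level. Below is a minimal sketch of one plausible reading, assuming "contextual information with different receptive fields" is realized as parallel convolutions of different kernel sizes applied to the 2-D attention map, fused and added back as a residual refinement; the class and parameter names here are hypothetical, not taken from the paper's released code.

```python
# Hedged sketch of a multi-scale attention aggregation module (assumed design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleAttentionAggregation(nn.Module):
    def __init__(self, channels: int = 32, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One convolution branch per receptive field, applied to the attention map.
        self.branches = nn.ModuleList(
            nn.Conv2d(1, channels, k, padding=k // 2) for k in kernel_sizes
        )
        # 1x1 convolution fuses the multi-scale context into a single refinement map.
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), 1, kernel_size=1)

    def forward(self, attn: torch.Tensor) -> torch.Tensor:
        # attn: (B, 1, H, W) attention weights over the encoder feature map.
        context = torch.cat([F.relu(branch(attn)) for branch in self.branches], dim=1)
        refined = attn + self.fuse(context)  # residual refinement of the weights
        b, _, h, w = refined.shape
        # Renormalize so the refined weights again sum to 1 over the feature map.
        return F.softmax(refined.view(b, -1), dim=-1).view(b, 1, h, w)

# Usage: refine a decoder attention map of shape (B, 1, H, W).
msaa = MultiScaleAttentionAggregation()
raw = torch.rand(2, 1, 16, 64)
raw = F.softmax(raw.view(2, -1), dim=-1).view(2, 1, 16, 64)
refined = msaa(raw)
```

The residual-plus-renormalize structure is one common way such refinement modules are built; the paper may fuse scales differently (e.g., gating or weighted sums).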
Source: Visual Computer
ISSN: 0178-2789
Year: 2024
Volume: 41
Issue: 2
Pages: 883-900
Impact Factor: 3.500 (JCR@2022)