Abstract:
Most current Chinese named entity recognition models fall into two types: character-based and word-based. Both have disadvantages. The former lacks the semantic and boundary information of words, while the latter depends on the accuracy of word segmentation. In addition, Chinese characters evolved from hieroglyphs and themselves contain rich semantic information. To address these problems, this paper proposes a Chinese named entity recognition model that uses multi-granularity embeddings of Chinese strokes, characters, and words. The method first obtains the stroke list for each Chinese character in the training corpus and applies convolutional kernels with multiple window sizes to capture stroke n-gram information. Second, a convolutional neural network captures the local information of the sequence and the representation of each Chinese character within its local region. Finally, to account for the constraints between context information and labels, a BiLSTM network models the context and a CRF network serves as the label decoding layer. Experimental results show that the method achieves F1 scores of 62.41% and 95.28% on the Weibo and MSRA datasets, respectively. © 2022 IEEE.
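The first step in the abstract, sliding kernels of several window sizes over a character's stroke list, can be illustrated with a minimal sketch. It shows only the windowing that such kernels operate on; in the model itself each window would presumably be a learned filter over stroke embeddings followed by pooling, which the abstract does not detail. The stroke table, the stroke names, the window sizes, and the function name `stroke_ngrams` are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical stroke table (illustrative only); the paper's actual stroke
# inventory and encoding are not given in the abstract.
STROKES: Dict[str, List[str]] = {
    "大": ["heng", "pie", "na"],         # horizontal, left-falling, right-falling
    "木": ["heng", "shu", "pie", "na"],  # horizontal, vertical, left-falling, right-falling
}


def stroke_ngrams(
    strokes: Sequence[str],
    window_sizes: Sequence[int] = (2, 3),  # assumed sizes, not the paper's
) -> List[Tuple[str, ...]]:
    """Return every contiguous stroke n-gram for each window size.

    Each tuple is one position that a 1-D convolution kernel of that
    width would cover when sliding over the stroke sequence.
    """
    grams: List[Tuple[str, ...]] = []
    for n in window_sizes:
        # A window of width n fits at len(strokes) - n + 1 positions.
        for start in range(len(strokes) - n + 1):
            grams.append(tuple(strokes[start:start + n]))
    return grams


if __name__ == "__main__":
    for char, strokes in STROKES.items():
        print(char, stroke_ngrams(strokes))
```

A character with fewer strokes than a given window size simply contributes no n-grams for that size; a real implementation would more likely pad the stroke sequence so that every kernel produces at least one output.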
ISSN: 2693-2865
Year: 2022
Volume: 2022-June
Page: 969-973
Language: English
WoS CC Cited Count: 0
SCOPUS Cited Count: 1
ESI Highly Cited Papers on the List: 0