Author:

Liu, Bo | Su, Zhuo | Qu, Guangzhi

Indexed by:

SCIE

Abstract:

In recent years, a large number of Chinese electronic texts have been produced through informatization efforts in various fields. Identifying specific entities in these texts has become a major research focus. Most existing methods use radicals to extract the glyph features of Chinese characters, but this approach has limitations. This paper extracts the features of Chinese characters from three aspects: glyph features, phonetic features, and character features, and improves on the conventional extraction method for each kind of feature. A new named entity recognition method (AIP) is proposed: it transforms Chinese characters into corresponding images for glyph feature extraction, divides pinyin into initials, vowels, and tones for phonetic feature extraction, and fine-tunes the ALBERT (A Lite BERT) model for character feature extraction. The paper compares the AIP model with mainstream neural network models on Chinese named entity recognition tasks, using commonly used data sets and data sets from specific domains. AIP achieved better results than the related work. The F1 values on the two data sets are 94.4% and 80.5%, respectively, which supports the model's versatility.
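
The phonetic step described in the abstract, dividing pinyin into initials, vowels, and tones, can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `split_pinyin`, the numbered-tone input format (for example `zhong1`), the choice to treat `y` and `w` as initials, and the use of 5 for the neutral tone are all assumptions made for illustration.

```python
from typing import Tuple

# Assumed initial inventory; two-letter initials come first so that
# "zh", "ch", and "sh" are matched before "z", "c", and "s".
# Treating "y" and "w" as initials is a simplifying assumption.
INITIALS = (
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l",
    "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w",
)


def split_pinyin(syllable: str) -> Tuple[str, str, int]:
    """Split a numbered-tone pinyin syllable into (initial, final, tone).

    Hypothetical helper: a trailing digit is read as the tone, and a
    syllable without a digit is assigned tone 5 (neutral) by convention.
    """
    syllable = syllable.strip().lower()
    tone = 5
    if syllable and syllable[-1].isdigit():
        tone = int(syllable[-1])
        syllable = syllable[:-1]
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):], tone
    # Zero-initial syllables such as "an" have no consonant onset.
    return "", syllable, tone


if __name__ == "__main__":
    for s in ["zhong1", "guo2", "an1", "de"]:
        print(s, split_pinyin(s))
```

Under these assumptions, `split_pinyin("zhong1")` returns `("zh", "ong", 1)`. Each of the three parts could then be embedded separately, which is one plausible reading of how a phonetic feature is built from the split.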

Keyword:

phonetic features; glyph features; named entity recognition

Author Community:

  • [ 1 ] [Liu, Bo]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 2 ] [Su, Zhuo]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 3 ] [Liu, Bo]Massey Univ, Sch Math & Computat Sci, Palmerston North 4472, New Zealand
  • [ 4 ] [Qu, Guangzhi]Oakland Univ, Comp Sci & Engn Dept, Rochester, MI 48309 USA

Source :

ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING

ISSN: 2375-4699

Year: 2022

Issue: 6

Volume: 21

2.0 (JCR@2022)

2.000 (JCR@2022)

JCR Journal Grade:4

CAS Journal Grade:4

Cited Count:

WoS CC Cited Count: 0

ESI Highly Cited Papers on the List: 0