Abstract:
Biomedical named entity recognition (BNER) is a critical task in biomedical information extraction. Most popular deep learning approaches to BNER use words and characters as features to represent medical texts. However, many medical terms consist of multiple words and characters. Splitting a term into individual words (or characters) and weighting each one with a standard attention mechanism can disperse the attention scores, so the term as a whole receives a lower weight. This paper proposes a Dictionary-guided Attention Network (DGAN) for BNER in Chinese electronic medical records (EMRs). First, medical concepts are extracted as large-size words by matching the EMR text against a biomedical dictionary, which supplements the full semantic information of each medical term. Then, based on the dictionary matches, an optimized attention strategy focuses on each medical concept and adaptively assigns higher weights to the characters it contains. Furthermore, semi-supervised learning is introduced to reduce manual data labeling and to handle entities not defined in the medical dictionary. To validate the model, we conduct comprehensive experiments on a real-world Chinese EMR dataset and the CCKS2017 dataset. The results show that our method not only achieves state-of-the-art performance in BNER but also reduces manual data annotation. © 2023 Elsevier Ltd
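The two steps the abstract describes (dictionary matching, then favoring characters inside a matched concept) can be illustrated with a toy sketch. This is not the DGAN model from the paper. The matching rule (forward maximum matching), the additive `bias`, the function names, and the sample sentence are all assumptions made for illustration only.

```python
import math


def match_concepts(text, dictionary, max_len=8):
    """Return (start, end) character spans of dictionary terms found in text.

    Assumption: simple forward maximum matching; the paper's actual matching
    procedure is not specified in the abstract.
    """
    spans = []
    i = 0
    while i < len(text):
        matched = False
        # Try the longest candidate first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in dictionary:
                spans.append((i, i + length))
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return spans


def guided_attention(scores, spans, bias=1.0):
    """Softmax over per-character scores after boosting matched characters.

    Assumption: a fixed additive bias stands in for the paper's adaptive
    weighting, which is learned and not described in the abstract.
    """
    boosted = list(scores)
    for start, end in spans:
        for k in range(start, end):
            boosted[k] += bias
    peak = max(boosted)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in boosted]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: "the patient has a history of diabetes"
text = "患者有糖尿病史"
dictionary = {"糖尿病"}  # "diabetes"
spans = match_concepts(text, dictionary)
weights = guided_attention([0.0] * len(text), spans)
```

With uniform input scores, the three characters of the matched term end up with higher attention weights than the surrounding characters, which is the qualitative effect the abstract attributes to dictionary guidance.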
Source:
Expert Systems with Applications
ISSN: 0957-4174
Year: 2023
Volume: 231
Impact Factor: 8.500 (JCR@2022)
ESI Discipline: ENGINEERING;
ESI HC Threshold:19
SCOPUS Cited Count: 18
ESI Highly Cited Papers on the List: 0
30 Days PV: 9