Chinese Word Segmentation in Electronic Medical Record Text via Graph Neural Network-Bidirectional LSTM-CRF Model

Author：

Du, Jinlian | Mi, Wei | Du, Xiaolin

Indexed by：

CPCI-S EI Scopus

Abstract：

Word segmentation of electronic medical record (EMR) text is the basis of natural language processing in medicine. EMR text is highly specialized, expensive to annotate, written in a distinctive style, and subject to a steadily growing terminology, so current Chinese word segmentation (CWS) methods cannot fully meet the requirements of EMR applications. To address this problem, this paper designs an EMR word segmentation model based on a Graph Neural Network (GNN), a bidirectional Long Short-Term Memory network (Bi-LSTM), and a conditional random field (CRF), with the aim of improving segmentation quality and reducing dependence on the data set. In the model, a GNN built on a domain lexicon learns local composition features, the Bi-LSTM captures long-term dependencies and contextual sequence information, and the CRF obtains the optimal annotation sequence from sentence-level label information. Through the interaction of these features, the model effectively resolves ambiguity and recognizes new words in EMR word segmentation. Compared with CWS tools such as Jieba and Pkuseg, as well as baseline models and state-of-the-art methods, the model in this paper significantly improves precision and recall.
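
The final decoding step described in the abstract, where a CRF picks the best label sequence for a whole sentence, can be illustrated with a small Viterbi sketch. This is not the authors' implementation. It assumes a BMES tagging scheme (Begin, Middle, End, Single), which the record does not state, and it replaces learned CRF transition scores with hard validity constraints. The emission scores in the example are invented for illustration; in the described model they would come from the GNN and Bi-LSTM layers. The names `viterbi` and `tags_to_words` are hypothetical.

```python
from typing import Dict, List

TAGS = ["B", "M", "E", "S"]  # assumed BMES scheme, not confirmed by the record

# Valid tag bigrams: every word is either S or B M* E.
# A trained CRF would use learned transition scores instead of this hard set.
ALLOWED = {
    ("B", "M"), ("B", "E"),
    ("M", "M"), ("M", "E"),
    ("E", "B"), ("E", "S"),
    ("S", "B"), ("S", "S"),
}
NEG_INF = float("-inf")


def viterbi(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring valid BMES tag sequence for one sentence."""
    if not emissions:
        return []
    # A sentence can only start with B or S.
    score = {t: (emissions[0][t] if t in ("B", "S") else NEG_INF) for t in TAGS}
    backptrs = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for cur in TAGS:
            best_prev, best = None, NEG_INF
            for prev in TAGS:
                if (prev, cur) in ALLOWED and score[prev] > best:
                    best_prev, best = prev, score[prev]
            new_score[cur] = best + em[cur] if best_prev is not None else NEG_INF
            ptr[cur] = best_prev
        score = new_score
        backptrs.append(ptr)
    # A sentence can only end with E or S.
    last = max(("E", "S"), key=lambda t: score[t])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Cut the character string after every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # keep a trailing fragment if the tags end mid-word
        words.append(current)
    return words


# Invented per-character scores for "患者发热" ("patient has a fever").
example = [
    {"B": 2.0, "M": 0.0, "E": 0.0, "S": 1.0},
    {"B": 0.0, "M": 0.5, "E": 2.0, "S": 0.5},
    {"B": 2.0, "M": 0.0, "E": 0.5, "S": 1.0},
    {"B": 0.0, "M": 0.5, "E": 2.0, "S": 1.0},
]
print(tags_to_words("患者发热", viterbi(example)))  # ['患者', '发热']
```

Decoding over the whole sentence, rather than choosing the best tag per character, is what rules out invalid sequences such as an M tag with no preceding B.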

Keywords：

Deep Learning | CWS | GNN | EMR

Author Affiliations：

[ 1 ] [Du, Jinlian]Beijing Univ Technol, Dept Informat, Beijing, Peoples R China
[ 2 ] [Mi, Wei]Beijing Univ Technol, Dept Informat, Beijing, Peoples R China
[ 3 ] [Du, Xiaolin]Beijing Univ Technol, Dept Informat, Beijing, Peoples R China

Reprint Author's Address：

[Mi, Wei]Beijing Univ Technol, Dept Informat, Beijing, Peoples R China

Email：

dujinlian@bjut.edu.cn | wmi@emails.bjut.edu.cn | du_xiaolin@bjut.edu.cn

Source：

2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE

ISSN： 2156-1125

Year： 2020

Page： 985-989

Language： English

Cited Count：

WoS CC Cited Count： 6

SCOPUS Cited Count： 8

ESI Highly Cited Papers on the List： 0
