Research on Named Entity Recognition in Chinese EMR Based on Semi-Supervised Learning with Dual Selected Strategy

Author：

Indexed by：

EI Scopus

Abstract：

With the construction of electronic medical record systems, medical record data has begun to accumulate, and how to extract essential information from these resources has become a concern. Named entity recognition (NER) is the first step. With the help of doctors, we built a small annotated corpus of Chinese electronic medical records. However, supervised NER methods require a large amount of manually labeled data. To reduce this cost and make better use of the unlabeled corpus, this paper proposes CEMRNER, a semi-supervised NER model for Chinese electronic medical records based on ALBERT-BiLSTM-CRF. The model uses a Bidirectional Long Short-Term Memory network (BiLSTM) and a Conditional Random Field (CRF) to train on the data, and introduces the pre-trained language model ALBERT to address the problem of Chinese representation. At the same time, we propose a dual selected strategy to select high-confidence samples and expand the training set. The dual strategy helps ensure the accuracy of the automatically labeled data and reduces the accumulation of errors over iterations in semi-supervised learning. The experiments and analysis show that, compared with other models, this method is more accurate and comprehensive. The precision, recall, and F1 score are 85.45%, 87.81%, and 86.61%, respectively. The paper shows that using a semi-supervised method and the pre-trained ALBERT model can improve recognition accuracy when little labeled data is available. © 2020 ACM.
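The abstract does not say which two criteria make up the dual selected strategy. The sketch below is therefore only an illustration of confidence-filtered self-training, not the authors' method. It assumes two hypothetical checks (mean token confidence and minimum token confidence), and the names `select_confident`, `self_train`, `train`, `predict`, and both thresholds are invented for this example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical shapes: a prediction is (tag sequence, per-token confidences).
Prediction = Tuple[List[str], List[float]]
Example = Tuple[str, List[str]]


def select_confident(
    sentences: Sequence[str],
    predictions: Sequence[Prediction],
    min_mean_conf: float = 0.9,   # assumed threshold, not from the paper
    min_token_conf: float = 0.6,  # assumed threshold, not from the paper
) -> List[Example]:
    """Keep pseudo-labeled sentences that pass BOTH assumed checks."""
    selected: List[Example] = []
    for sentence, (tags, confs) in zip(sentences, predictions):
        if not confs:
            continue  # nothing to judge, so skip the sentence
        mean_ok = sum(confs) / len(confs) >= min_mean_conf
        token_ok = min(confs) >= min_token_conf
        if mean_ok and token_ok:
            selected.append((sentence, tags))
    return selected


def self_train(
    labeled: Sequence[Example],
    unlabeled: Sequence[str],
    train: Callable,    # train(examples) -> model (stand-in for the tagger)
    predict: Callable,  # predict(model, sentence) -> Prediction
    rounds: int = 3,
):
    """Retrain repeatedly, growing the training set with confident samples."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train(labeled)
    for _ in range(rounds):
        if not pool:
            break
        preds = [predict(model, s) for s in pool]
        new_examples = select_confident(pool, preds)
        if not new_examples:
            break  # no sample passed both checks; stop early
        labeled.extend(new_examples)
        chosen = {s for s, _ in new_examples}
        pool = [s for s in pool if s not in chosen]
        model = train(labeled)
    return model, labeled
```

In this sketch, requiring both checks means a sentence with a high average confidence is still rejected if any single token is uncertain. This is one plausible way to limit the label noise that self-training feeds back into later rounds.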

Keyword：

Medical information systems Iterative methods Medical computing Learning algorithms Supervised learning Random processes Medical informatics

Author Community：

[ 1 ] [Yan, Jianzhuo]Beijing University of Technology, China
[ 2 ] [Geng, Yanan]Beijing University of Technology, China
[ 3 ] [Xu, Hongxia]Beijing University of Technology, China
[ 4 ] [Yu, Yongchuan]Beijing University of Technology, China
[ 5 ] [Tan, Shaofeng]Beijing University of Technology, China
[ 6 ] [He, Dongdong]Beijing University of Technology, China

Reprint Author's Address：

Email：


Related Keywords：

Chinese Medical Event Extraction Based on Hybrid Neural Network
2022，46th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2022
Efficient Medical Big Data Management With Keyword-Searchable Encryption in Healthchain
2022，IEEE Systems Journal
A DWT-Utilized Classifier for UPJO Diagnosis Using Ultrasound Images
2022，
Medical Dialogue Generation via Extracting Heterogenous Information
2022，24th IEEE International Conference on High Performance Computing and Communications, 8th IEEE International Conference on Data Science and Systems, 20th IEEE International Conference on Smart City and 8th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2022

Source ：

Year： 2020

Language： English

Cited Count：

WoS CC Cited Count：

SCOPUS Cited Count：

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 5

Affiliated Colleges：
