Abstract:
In recent years, deep neural networks have significantly improved performance on extractive summarization tasks compared with traditional methods. However, in biomedical extractive summarization, existing methods cannot make good use of domain-aware external knowledge; furthermore, existing deep neural network models ignore the structural features of documents. In this paper, we propose a novel model called BioBERTSum to better capture token-level and sentence-level contextual representations. It uses a domain-aware bidirectional language model pre-trained on large-scale biomedical corpora as the encoder, and further fine-tunes the language model for the extractive summarization task on single biomedical documents. In particular, we adopt a sentence position embedding mechanism, which enables the model to learn the position information of sentences and capture the structural features of the document. To the best of our knowledge, this is the first work to use a pre-trained language model and a fine-tuning strategy for the extractive summarization task in the biomedical domain. Experiments on the PubMed dataset show that our proposed model outperforms the recent state-of-the-art (SOTA) model on ROUGE-1/2/L. (C) 2020 Elsevier B.V. All rights reserved.
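The abstract describes a BERTSum-style extractive setup: a biomedical pre-trained encoder yields one vector per sentence, a learned sentence position embedding injects document structure, and a classifier scores each sentence for inclusion in the summary. The following is a minimal PyTorch sketch of that idea only; the checkpoint name `dmis-lab/biobert-base-cased-v1.1`, the class name `BioBERTSumSketch`, and the single linear scorer are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BioBERTSumSketch(nn.Module):
    """Sketch: score each sentence of a document for extractive summarization."""
    def __init__(self, encoder_name="dmis-lab/biobert-base-cased-v1.1",
                 max_sentences=128):
        super().__init__()
        # Domain-aware pre-trained encoder (assumed BioBERT checkpoint).
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Learned sentence position embedding (the structural feature).
        self.sent_pos = nn.Embedding(max_sentences, hidden)
        # Illustrative scorer; the paper's actual head may differ.
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask, cls_positions):
        # Token-level contextual representations from the encoder.
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Gather the [CLS] vector that precedes each sentence (batch size 1).
        sent_vecs = hidden[0, cls_positions]            # (n_sents, hidden)
        pos_ids = torch.arange(cls_positions.shape[0])
        sent_vecs = sent_vecs + self.sent_pos(pos_ids)  # add sentence position
        return torch.sigmoid(self.scorer(sent_vecs)).squeeze(-1)

# Usage sketch: prepend [CLS] ... [SEP] to every sentence, as in BERTSum.
tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-base-cased-v1.1")
sents = ["BRCA1 mutations increase cancer risk.", "We review screening methods."]
text = "".join(f"[CLS] {s} [SEP]" for s in sents)
enc = tokenizer(text, return_tensors="pt", add_special_tokens=False)
cls_pos = (enc.input_ids[0] == tokenizer.cls_token_id).nonzero().squeeze(-1)
model = BioBERTSumSketch()
scores = model(enc.input_ids, enc.attention_mask, cls_pos)  # one score per sentence
```

In a full system the top-scoring sentences would be selected (subject to a length budget) and the model trained with a binary cross-entropy loss against oracle sentence labels; those details are not specified in the abstract.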
Source:
KNOWLEDGE-BASED SYSTEMS
ISSN: 0950-7051
Year: 2020
Volume: 199
Impact Factor: 8.800 (JCR@2022)
ESI Discipline: COMPUTER SCIENCE;
ESI HC Threshold: 132
Cited Count:
WoS CC Cited Count: 36
SCOPUS Cited Count: 52
ESI Highly Cited Papers on the List: 0