AGI-P: A Gender Identification Framework for Authorship Analysis Using Customized Fine-Tuning of Multilingual Language Model

Indexed by：

EI Scopus SCIE

Abstract：

In this investigation, we propose a solution for the author's gender identification task called AGI-P. This task has several real-world applications across different fields, such as marketing and advertising, forensic linguistics, sociology, recommendation systems, language processing, historical analysis, education, and language learning. We created a new dataset to evaluate our proposed method. The dataset is balanced in terms of gender using a random sampling method and consists of 1944 samples in total. We use accuracy as an evaluation measure and compare the performance of the proposed solution (AGI-P) against state-of-the-art machine learning classifiers and fine-tuned pre-trained multilingual language models such as DistilBERT, mBERT, XLM-RoBERTa, and Multilingual DEBERTa. In this regard, we also propose a customized fine-tuning strategy that improves the accuracy of the pre-trained language models for the author gender identification task. Our extensive experimental studies reveal that our solution (AGI-P) outperforms the well-known machine learning classifiers and fine-tuned pre-trained multilingual language models with an accuracy level of 92.03%. Moreover, the pre-trained multilingual language models, fine-tuned with the proposed customized strategy, outperform the fine-tuned pre-trained language models using an out-of-the-box fine-tuning strategy. The codebase and corpus can be accessed on our GitHub page at: https://github.com/mumairhassan/AGI-P
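The abstract mentions two mechanical steps, balancing the dataset by gender with random sampling and scoring with accuracy, without giving details. The following is only a minimal sketch of how such steps might look. It assumes random undersampling to the smallest class, which the abstract does not confirm, and the names `balance_by_label` and `accuracy` and the toy data are hypothetical, not taken from the AGI-P codebase.

```python
import random
from collections import defaultdict


def balance_by_label(samples, seed=0):
    """Randomly undersample every class to the size of the smallest class.

    `samples` is a list of (text, label) pairs. This is an assumed balancing
    scheme for illustration, not the authors' documented procedure.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append((text, label))
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for label in sorted(by_label):
        # Draw `smallest` items without replacement from each class.
        balanced.extend(rng.sample(by_label[label], smallest))
    rng.shuffle(balanced)
    return balanced


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 5 "female" and 3 "male" samples.
toy = [(f"text {i}", "female") for i in range(5)]
toy += [(f"text {i}", "male") for i in range(5, 8)]
balanced = balance_by_label(toy)
```

On this toy input each class is reduced to three samples. The reported 92.03% figure would correspond to `accuracy` returning roughly 0.9203 on the authors' test split.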

Keyword：

Business analytics; tourism industry; gender identification; language models

Author Community：

[ 1 ] [Sarwar, Raheem]Manchester Metropolitan Univ, Dept Operat Events & Hospitality Management, Manchester M15 6BH, Lancs, England
[ 2 ] [Teh, Pin Shen]Manchester Metropolitan Univ, Dept Operat Events & Hospitality Management, Manchester M15 6BH, Lancs, England
[ 3 ] [An Ha, Le]Univ Wolverhampton, Res Grp Computat Linguist, RIILP, Wolverhampton WV1 1LY, England
[ 4 ] [Sabah, Fahad]Beijing Univ Technol, Fac Informat Technol, Beijing 100021, Peoples R China
[ 5 ] [Nawaz, Raheel]Staffordshire Univ, Execut Off, Stoke On Trent ST4 2DE, England
[ 6 ] [Hameed, Ibrahim A.]Norwegian Univ Sci & Technol, Dept ICT & Nat Sci, N-6009 Alesund, Norway
[ 7 ] [Hassan, Muhammad Umair]Norwegian Univ Sci & Technol, Dept ICT & Nat Sci, N-6009 Alesund, Norway

Reprint Author's Address：

[Hassan, Muhammad Umair]Norwegian Univ Sci & Technol, Dept ICT & Nat Sci, N-6009 Alesund, Norway

Email：

muhammad.u.hassan@ntnu.no

Source ：

IEEE ACCESS

ISSN： 2169-3536

Year： 2024

Volume： 12

Page： 15399-15409

JCR@2022： 3.900

Cited Count：

WoS CC Cited Count： 1

SCOPUS Cited Count： 4

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 5
