Abstract:
Many science, technology and innovation (STI) resources carry several labels at once, such as IPC and CPC codes for patents and PACS (Physics and Astronomy Classification Scheme) numbers for scientific publications. Assigning such labels automatically is known as multi-label classification. Although the literature offers a number of approaches and open-source tools that perform well on benchmark datasets, real-world data are more complex in both the number of labels and their hierarchy. This work comprehensively compares the performance of three state-of-the-art tools, Dependency LDA, Scikit-Multilearn and Neural Classifier, on academic resource data from SciGraph. Neural Classifier is found to outperform the other two tools in terms of Micro F1, Macro F1 and Hamming Loss on a dataset with an unbalanced label distribution, a more complex hierarchical structure and a larger label set. On the basis of these comparisons, several directions for future work are suggested. © 2020 ACM.
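For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how Micro F1 and Hamming Loss are commonly computed for multi-label predictions, with Macro F1 included as the usual companion measure. It is not taken from the paper or from any of the three tools. The function names, the binary indicator-matrix input (one row per document, one column per label) and the convention that F1 is 0 for a label with no true or predicted positives are all assumptions made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[int]]  # rows = documents, columns = labels, entries 0 or 1


def label_counts(y_true: Matrix, y_pred: Matrix, label: int) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one label."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        t, p = t_row[label], p_row[label]
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
    return tp, fp, fn


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); assumed to be 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(y_true: Matrix, y_pred: Matrix) -> float:
    """Pool the counts over all labels, then compute a single F1."""
    counts = [label_counts(y_true, y_pred, j) for j in range(len(y_true[0]))]
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f1(tp, fp, fn)


def macro_f1(y_true: Matrix, y_pred: Matrix) -> float:
    """Compute F1 per label, then take the unweighted mean over labels."""
    n_labels = len(y_true[0])
    counts = [label_counts(y_true, y_pred, j) for j in range(n_labels)]
    return sum(f1(*c) for c in counts) / n_labels


def hamming_loss(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of (document, label) cells that are predicted incorrectly."""
    wrong = sum(
        t != p
        for t_row, p_row in zip(y_true, y_pred)
        for t, p in zip(t_row, p_row)
    )
    return wrong / (len(y_true) * len(y_true[0]))
```

Micro averaging is dominated by frequent labels, while macro averaging weights every label equally, so the two can diverge noticeably on an unbalanced label distribution. Hamming Loss is an error rate, so lower values are better, unlike the two F1 scores.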
Year: 2020
Page: 8-12
Language: English
ESI Highly Cited Papers on the List: 0