Abstract:
Feature extraction is essential for text classification. This paper discusses the basic ideas behind word-clustering-based feature extraction and presents a text classification method that extracts features by means of word clustering. The method employs an improved tree-structured growing self-organizing map (TGSOM) to cluster words. A new weighting formula is also developed that accounts for the distinction between clustered word features and plain word features. Finally, the SPRINT decision tree is applied to complete the text classification. Experiments show that the proposed method improves the precision of text classification by 4.32%.
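The core idea of word-clustering-based feature extraction can be illustrated with a minimal sketch. It assumes a word-to-cluster mapping is already available (in the paper this comes from the improved TGSOM, which is not reproduced here) and uses raw counts as weights, because the abstract does not state the paper's weighting formula. The function name `extract_features`, the `cluster:`/`word:` prefixes, and the sample vocabulary are all hypothetical.

```python
from collections import Counter


def extract_features(tokens, word_to_cluster):
    """Count features, merging clustered words into one feature per cluster.

    Words found in word_to_cluster contribute to a shared cluster feature;
    all other words remain plain word features. Raw counts stand in for the
    paper's weighting formula, which the abstract does not give.
    """
    counts = Counter()
    for tok in tokens:
        cluster = word_to_cluster.get(tok)
        if cluster is not None:
            counts["cluster:" + cluster] += 1  # clustered word feature
        else:
            counts["word:" + tok] += 1  # plain word feature
    return dict(counts)


# Hypothetical mapping; a clustering step such as TGSOM would produce it.
word_to_cluster = {"soccer": "sports", "football": "sports", "goal": "sports"}
tokens = ["soccer", "goal", "election", "football", "goal"]
features = extract_features(tokens, word_to_cluster)
```

Here the four sports-related tokens collapse into a single cluster feature while "election" stays a plain word feature, which reduces dimensionality before a classifier such as a decision tree is trained.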
Source:
Journal of Harbin Engineering University
ISSN: 1006-7043
Year: 2008
Issue: 11
Volume: 29
Page: 1205-1209
Cited Count:
WoS CC Cited Count: 0
ESI Highly Cited Papers on the List: 0