Author:

Yang, Zhen (Yang, Zhen.) (Scholars:杨震) | Lei, Jianjun (Lei, Jianjun.) | Wang, Jian (Wang, Jian.) | Zhang, Xing (Zhang, Xing.) | Guo, Jun (Guo, Jun.)

Indexed by:

CPCI-S

Abstract:

As a simple classification method, the vector space model (VSM) has been widely applied in the text information processing field. Traditional VSM, however, has difficulty selecting a refined vector representation that strikes a good tradeoff between complexity and performance, especially for incremental text mining. To address these problems, this paper discusses several improvements, such as VSM based on improved TF, TFIDF and BM25. Maximum mutual information feature selection is then introduced to obtain a low-dimensional VSM with less complexity while keeping acceptable precision. Experimental results on spam filtering and short message classification show that the algorithm achieves higher precision than existing algorithms under the same conditions.
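
To make the terms in the abstract concrete, here is a minimal Python sketch of TF-IDF term weighting and of ranking terms by mutual information with a class label. It is an illustration under stated assumptions, not the authors' algorithm: the record does not give the paper's improved TF or BM25 variants or its exact selection criterion, so textbook definitions are used, and the function names `tfidf_weights`, `mutual_information` and `select_features` are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: tf * idf} dict per tokenized document.

    Assumption: textbook TF-IDF (relative term frequency times log(N / df)),
    not the improved weighting discussed in the paper.
    """
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    return [
        {t: c / len(d) * math.log(n / df[t]) for t, c in Counter(d).items()}
        for d in docs
    ]


def mutual_information(docs, labels, term, positive):
    """Mutual information (bits) between term presence and class membership."""
    n = len(docs)
    # Joint counts over (term present?, document in positive class?).
    counts = Counter((term in d, y == positive) for d, y in zip(docs, labels))
    mi = 0.0
    for (t, c), n_tc in counts.items():
        n_t = sum(v for (a, _), v in counts.items() if a == t)  # marginal: term
        n_c = sum(v for (_, b), v in counts.items() if b == c)  # marginal: class
        mi += n_tc / n * math.log2(n * n_tc / (n_t * n_c))
    return mi


def select_features(docs, labels, positive, k):
    """Keep the k terms with the highest mutual information (hypothetical helper)."""
    vocab = sorted({t for d in docs for t in d})
    return sorted(
        vocab,
        key=lambda t: mutual_information(docs, labels, t, positive),
        reverse=True,
    )[:k]


# Toy corpus, invented purely for illustration.
docs = [["free", "prize", "now"], ["free", "cash"],
        ["meeting", "at", "noon"], ["lunch", "at", "noon"]]
labels = ["spam", "spam", "ham", "ham"]
weights = tfidf_weights(docs)
top_terms = select_features(docs, labels, "spam", 3)
```

Keeping only the top-scoring terms shrinks the dimension of each document vector, which is the complexity and precision tradeoff the abstract refers to.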

Keyword:

Incremental Text Classification; VSM; Spam Filtering; Short Messages Classification

Author Community:

  • [ 1 ] [Yang, Zhen]Beijing Univ Technol, Sch Comp, Beijing 100022, Peoples R China
  • [ 2 ] [Lei, Jianjun]Beijing Univ Technol, Sch Comp, Beijing 100022, Peoples R China
  • [ 3 ] [Zhang, Xing]Beijing Univ Technol, Sch Comp, Beijing 100022, Peoples R China
  • [ 4 ] [Wang, Jian]Cent Univ Finance & Econo, Beijing 100081, Peoples R China
  • [ 5 ] [Guo, Jun]Beijing Univ Post & Telecommun, Sch Informat Engn, Beijing 100876, Peoples R China

Reprint Author's Address:

  • 杨震

    [Yang, Zhen]Beijing Univ Technol, Sch Comp, Beijing 100022, Peoples R China

Source:

INTERNATIONAL ELECTRONIC CONFERENCE ON COMPUTER SCIENCE

ISSN: 0094-243X

Year: 2008

Volume: 1060

Page: 369-,

Language: English

Cited Count:

WoS CC Cited Count: 0

ESI Highly Cited Papers on the List: 0
