Author:

Wang, Aiqing (Wang, Aiqing.) | Zhang, Sen (Zhang, Sen.)

Indexed by:

EI Scopus

Abstract:

In Chinese and many other Asian languages whose scripts are not based on the ASCII alphabet, words are not delimited by whitespace (spaces, tabs, etc.), so word boundaries must be reconstructed. Further syntactic analysis depends on the output of word segmentation. Ambiguity and unregistered words are the most important problems in Chinese word segmentation. In this paper we analyze the causes of ambiguity and present a one-pass scan method for detecting and correcting ambiguous cases. To deal with unregistered words and special words (such as names), we propose a combined method that can recognize new words and thereby increase accuracy. In the implementation, we use bisection search to look up words in a large dictionary (more than 40,000 entries); the average search cost for a word is less than 16 operations, so the speed is satisfactory if the system is embedded in Chinese understanding systems or Chinese speech processing systems. © 2007 IEEE.
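The lookup cost quoted in the abstract is consistent with ordinary bisection (binary) search over a sorted word list: a dictionary of 40,000 entries needs at most floor(log2(40000)) + 1 = 16 probes. The following is only a minimal sketch of that idea, not the authors' implementation; the function name `bisect_lookup`, the probe counter, and the synthetic placeholder entries are assumptions made for illustration.

```python
import math


def bisect_lookup(sorted_words, target):
    """Search a sorted list by bisection.

    Returns (found, probes), where probes is the number of entries inspected.
    Hypothetical helper for illustration only.
    """
    lo, hi = 0, len(sorted_words)  # search the half-open range [lo, hi)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        entry = sorted_words[mid]
        if entry == target:
            return True, probes
        if entry < target:
            lo = mid + 1  # target can only be in the upper half
        else:
            hi = mid      # target can only be in the lower half
    return False, probes


# Synthetic stand-in for a 40,000-entry dictionary (placeholder strings,
# not real vocabulary); the list must be sorted for bisection to work.
words = sorted(f"w{i:05d}" for i in range(40000))

# Upper bound on probes for a list of this size: floor(log2(n)) + 1.
worst_case = math.floor(math.log2(len(words))) + 1

found, probes = bisect_lookup(words, "w12345")
missing, _ = bisect_lookup(words, "not-in-dictionary")
```

With this bound, no lookup in a 40,000-entry list inspects more than 16 entries, whether or not the word is present.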

Keyword:

Syntactics; Image segmentation; Character recognition; Natural language processing systems; Speech processing

Author Community:

  • [ 1 ] [Wang, Aiqing]Department of Mathematics, Qingdao Technological University, Qingdao, China
  • [ 2 ] [Zhang, Sen]Information and Computing Sci. Lab., Beijing University of Technology, China

Year: 2007

Volume: 3

Page: 738-743

Language: English

Cited Count:

WoS CC Cited Count: 0

ESI Highly Cited Papers on the List: 0

30 Days PV: 2
