Author:

Jiang, Zong-Li (蒋宗礼) | Lu, Guo-Xiang

Indexed by:

EI; Scopus; PKU; CSCD

Abstract:

Finding what a user wants in the tremendous amount of information on the Web is a great challenge for web search engines. By restricting downloads to web pages in a given domain, focused crawlers can save a great deal of work and improve the quality of the information they provide. We put forward a focused crawling method, MatchLink. It uses a document vector model to evaluate the topic relevance of an anchor, and it uses the Naive Bayes algorithm together with a multilayer classification method to compute the topic relevance of the web page containing the anchor. Based on these two relevance scores, topic-relevant web pages are given priority for download. Experiments show that the results are better than those of BestFirst and BreadthFirst.
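The prioritisation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two relevance scores are combined, so the linear mix and the `anchor_weight` parameter are assumptions, as are the names `cosine_similarity` and `Frontier`. The page relevance is passed in as a number in [0, 1], standing in for the output of a Naive Bayes classifier that is not reproduced here.

```python
import heapq
import math
from collections import Counter


def cosine_similarity(text, topic_terms):
    """Cosine similarity between a text's term-frequency vector and a topic vector.

    topic_terms maps a term to its weight (hypothetical topic description).
    """
    counts = Counter(text.lower().split())
    dot = sum(counts[term] * weight for term, weight in topic_terms.items())
    norm_text = math.sqrt(sum(c * c for c in counts.values()))
    norm_topic = math.sqrt(sum(w * w for w in topic_terms.values()))
    if norm_text == 0 or norm_topic == 0:
        return 0.0
    return dot / (norm_text * norm_topic)


class Frontier:
    """Crawl frontier that pops the URL with the highest combined relevance first."""

    def __init__(self, topic_terms, anchor_weight=0.5):
        self.topic_terms = topic_terms
        # Assumed mixing weight; the abstract does not specify the combination.
        self.anchor_weight = anchor_weight
        self._heap = []
        self._seen = set()
        self._order = 0  # tie-breaker so equal scores pop in insertion order

    def push(self, url, anchor_text, page_relevance):
        """Queue a link found on a page whose topic relevance is page_relevance."""
        if url in self._seen:
            return
        self._seen.add(url)
        anchor_relevance = cosine_similarity(anchor_text, self.topic_terms)
        score = (self.anchor_weight * anchor_relevance
                 + (1 - self.anchor_weight) * page_relevance)
        # heapq is a min-heap, so store the negated score.
        heapq.heappush(self._heap, (-score, self._order, url))
        self._order += 1

    def pop(self):
        """Return (url, score) for the most relevant queued link."""
        neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)
```

In this sketch, a link whose anchor text matches the topic and which sits on a highly relevant page is downloaded before a link with an off-topic anchor on a weakly relevant page, in contrast to a breadth-first crawler, which ignores both signals.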

Keyword:

Algorithms; Search engines; Websites; Classification (of information)

Author Community:

  • [ 1 ] [Jiang, Zong-Li] College of Computer Science, Beijing University of Technology, Beijing 100022, China
  • [ 2 ] [Lu, Guo-Xiang] College of Computer Science, Beijing University of Technology, Beijing 100022, China

Source:

Journal of Beijing University of Technology

ISSN: 0254-0037

Year: 2007

Issue: 11

Volume: 33

Page: 1227-1232
