Feature Selection and Classification Methods for Predicting Search Engine Ranking

Author：

Portier, Willy K. | Li, Yujian | Kouassi, Bonzou A.

Indexed by：

EI Scopus

Abstract：

Over the past two decades, machine learning methods have improved the accuracy of computer-aided tasks. Search engines (Google, Baidu, Bing, ...) use classification methods to rank the billions of pages available on the World Wide Web. Rankings are produced by algorithms that use various features to classify each page for a given search query. The purpose of this paper is to analyze the performance of various machine learning models applied to features selected through different techniques. A dataset of 28,000 observations with 31 features was evaluated, considering only the features with the highest correlation. To that end, three filter methods (Chi-square, Gini index, and Fisher) and three wrapper methods (Forward Selection, Backward Elimination, and Bidirectional Elimination) were used. Various classification algorithms were then tested in combination with these filter and wrapper methods. A comparison was then made to determine the optimal feature combinations for improving the prediction of whether a URL appears in the Google top-10 SERP. The research concludes that, for this dataset, the Random Forest model combined with either the Fisher filter method or the Backward Elimination wrapper method can produce the best results among the combinations tested. © 2020 ACM.
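
As an illustration of the filter-style selection named in the abstract, the following is a minimal sketch of Fisher-score feature ranking. It assumes the commonly used definition (between-class scatter divided by within-class scatter, per feature); the record does not state which formulation the authors used. The function names `fisher_scores` and `top_k_features` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature (column) of X.

    X is a list of rows (lists of floats), y a list of class labels.
    Assumed definition: sum_k n_k * (mean_k - mean)^2 / sum_k n_k * var_k.
    """
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        overall_mean = fmean([row[j] for row in X])
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for rows in rows_by_class.values():
            column = [row[j] for row in rows]
            between += len(column) * (fmean(column) - overall_mean) ** 2
            within += len(column) * pvariance(column)
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Return the indices of the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: feature 0 separates the two classes, the others do not.
X = [
    [1.0, 5.0, 0.3],
    [1.2, 1.0, 0.1],
    [0.8, 3.0, 0.2],
    [3.0, 4.0, 0.2],
    [3.2, 2.0, 0.3],
    [2.8, 3.0, 0.1],
]
y = [0, 0, 0, 1, 1, 1]
selected = top_k_features(X, y, k=1)
```

In a pipeline like the one the abstract describes, the selected column indices would then be used to subset the dataset before training a classifier such as a Random Forest.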

Keyword：

Classification (of information); Decision trees; Search engines; Computer aided instruction; Feature Selection

Author Affiliations：

[1] Portier, Willy K.: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
[2] Li, Yujian: School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin 541004, China
[3] Kouassi, Bonzou A.: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

Related Keywords：

Research on the eight classifications of traditional Chinese medicine biased constitution based on random forest feature selection and SMOTE+ENN algorithm
2022，2022 3rd International Conference on Computer Information and Big Data Applications, CIBDA 2022
Handling over-fitting in test cost-sensitive decision tree learning by feature selection, smoothing and pruning
2010，Journal of Systems and Software
A new search engine filtering scheme based on improved neural network and ontology
2010，
Feature Selection Algorithm for Dynamically Weighted Conditional Mutual Information
2021，Journal of Electronics and Information Technology

Source ：

Year： 2020

Page： 84-90

Language： English

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 2

ESI Highly Cited Papers on the List： 0

