Undo the codebook bias by linear transformation for visual applications

Indexed by：

EI Scopus

Abstract：

The bag-of-visual-words (BoW) model and its variants have demonstrated their effectiveness in visual applications and are widely used by researchers. The BoW model first extracts local features and generates a corresponding codebook, whose elements are viewed as visual words. The local features within each image are then encoded to obtain the final histogram representation. However, the codebook is dataset dependent and has to be generated for each image dataset. This costs a lot of computational time and weakens the generalization power of the BoW model. To solve these problems, we propose to undo the dataset bias by codebook linear transformation. To represent every point within the local feature space using Euclidean distance, the number of bases should be no less than the number of space dimensions. Hence, each codebook can be viewed as a linear transformation of these bases. In this way, we can transform pre-learned codebooks for a new dataset. However, not all visual words are equally important for the new dataset; it is more effective to make a selection using sparsity constraints and choose the most discriminative visual words for transformation. We propose an alternating optimization algorithm to jointly search for the optimal linear transformation matrices and the encoding parameters. Image classification experiments on several image datasets show the effectiveness of the proposed method. Copyright © 2013 ACM.

Keyword：

Linear transformations; Mathematical transformations; Classification (of information)

Author Affiliations：

[ 1 ] [Zhang, Chunjie]School of Computer and Control Engineering, University of Chinese Academy of Sciences, 100049, Beijing, China
[ 2 ] [Zhang, Yifan]National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
[ 3 ] [Wang, Shuhui]Key Lab of Intell. Info. Process, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
[ 4 ] [Pang, Junbiao]College of Computer Science and Technology, Beijing University of Technology, 100124 Beijing, China
[ 5 ] [Liang, Chao]School of Computer, National Engineering Research Center for Multimedia Software, Wuhan University, 430072, Wuhan, China
[ 6 ] [Huang, Qingming]School of Computer and Control Engineering, University of Chinese Academy of Sciences, 100049, Beijing, China
[ 7 ] [Huang, Qingming]Key Lab of Intell. Info. Process, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
[ 8 ] [Tian, Qi]Department of Computer Sciences, University of Texas at San Antonio, San Antonio, TX 78249, United States

Year： 2013

Page： 533-536

Language： English

SCOPUS Cited Count： 4

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

College of Information Science and Technology (data not explicitly attributed to a department within this college)
