Abstract:
With the growth of data scale, machine learning has moved from centralized to distributed training. Distributed machine learning typically uses a parameter-server architecture and trains in synchronous mode: data samples are statically and evenly allocated to each computing node according to the batch size, and the workers train synchronously and iterate until the model converges. However, in mixed-load scenarios the computing nodes have different amounts of resources, while the traditional data partition strategy either configures the batch size statically or requires it to be set manually. This makes distributed model training computationally inefficient, and ad hoc data adjustment on individual nodes can affect model accuracy. To address this problem, while preserving the accuracy of the distributed training task, this paper proposes an optimal configuration scheme for the batch size of distributed machine learning training data: a data partition strategy for distributed machine learning (DQ-DPS). DQ-DPS solves the low computational efficiency caused by static data partitioning, improves the computational efficiency of distributed machine learning tasks, and preserves the accuracy of the trained model. Extensive experiments demonstrate the effectiveness of DQ-DPS: compared with the traditional data partition strategy, DQ-DPS improves the computing efficiency of each training round by 38%. © 2021 ACM.
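The abstract does not spell out the DQ-DPS algorithm, but the core idea it describes is allocating each worker a share of a fixed global batch in proportion to its compute capacity rather than splitting the batch evenly. The sketch below is a minimal illustration of that general idea under assumed inputs (measured per-worker throughput in samples/sec); the function name partition_batch and the example numbers are hypothetical and not taken from the paper.

```python
def partition_batch(global_batch_size, worker_throughputs):
    """Split a fixed global batch across workers in proportion to their
    measured throughput (samples/sec), so faster nodes receive larger
    local batches while the global batch size, and hence the gradient
    statistics, stays unchanged.

    Illustrative sketch only; DQ-DPS itself is not specified in the
    abstract and may differ.
    """
    total = sum(worker_throughputs)
    # Proportional share, rounded down; the remainder is handled below.
    shares = [int(global_batch_size * t / total) for t in worker_throughputs]
    # Hand any leftover samples to the fastest workers so the shares
    # still sum exactly to the global batch size.
    leftover = global_batch_size - sum(shares)
    for i in sorted(range(len(shares)),
                    key=lambda i: worker_throughputs[i],
                    reverse=True)[:leftover]:
        shares[i] += 1
    return shares


# Example: three heterogeneous workers, global batch of 256.
# A static partition would give each worker 85-86 samples; the
# proportional split assigns more work per iteration to the fast node.
print(partition_batch(256, [400.0, 200.0, 100.0]))  # -> [147, 73, 36]
```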
Year: 2021
Page: 20-26
Language: English
ESI Highly Cited Papers on the List: 0