Author:

Tang, Y. | Yuan, T. | Cao, F. | Wang, L. | Guo, Z. | Zhao, Y. | Li, R.

Indexed by:

CPCI-S | EI | Scopus

Abstract:

Training large language models (LLMs) is time-consuming and expensive, and training efficiency is often limited by the heterogeneous computing devices and heterogeneous communication networks in a computing cluster. New computing devices and technologies proposed in recent years, such as the NVIDIA H200 and Compute Express Link (CXL) 3.0, bring new opportunities for improving LLM training efficiency. However, deploying these devices and technologies is difficult and costly, which makes it hard for researchers to evaluate their impact on LLM training. To address this problem, this paper introduces a simulation tool named HeterSim and proposes using it to simulate and evaluate LLM training in CXL-based heterogeneous computing clusters. Taking the LLaMA model as a simulation example, the paper simulates and analyzes the impact of heterogeneous computing and CXL technologies on LLM training. We hope this work gives researchers new ideas for simulating and analyzing LLM training and helps them explore the impact of emerging technologies on LLM training at low cost. © 2024 IEEE.

Keyword:

CXL | Distributed training | Simulation | Heterogeneous computing | Large language model

Author Affiliations:

  • [1] [Tang Y.] Inspur Electronic Information Industry Co., Ltd, China
  • [2] [Yuan T.] Beijing University of Technology, China
  • [3] [Cao F.] Inspur Electronic Information Industry Co., Ltd, China
  • [4] [Wang L.] Inspur Electronic Information Industry Co., Ltd, China
  • [5] [Guo Z.] Inspur Electronic Information Industry Co., Ltd, China
  • [6] [Zhao Y.] Inspur Electronic Information Industry Co., Ltd, China
  • [7] [Li R.] Inspur Electronic Information Industry Co., Ltd, China

Year: 2024

Language: English
