
Author:

Qu, Zhijie

Indexed by:

EI

Abstract:

Since GPT-style models became the mainstream approach to dialog generation, a central question has been how to let a model extend its prediction length and generate longer dialog texts while the training context length stays fixed. Rotary position embedding (RoPE) incorporates absolute position information in the form of relative position encoding, an approach that has been shown to help Transformer models extrapolate. In practice, however, when the prediction length grows too long, perplexity climbs sharply. To address this problem, this paper proposes a method of scaling the frequency basis. In contrast to previous work, which selected the optimal scaling parameter by manual testing, this method lets the model train the parameter dynamically according to the input sequence length. Using this method, we demonstrate that a GPT model can effectively extrapolate to contexts longer than its original pre-training length. To verify the method's effectiveness in practical applications, this paper further applies it to medical dialog generation, a more complex scenario; the experimental results show an extrapolation length several times the training context length, greatly improving extrapolation performance over the original absolute position encoding. © 2023 ACM.
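The record does not include the paper's implementation. The Python sketch below illustrates one published way to scale RoPE's frequency base ("NTK-aware" scaling), with a length-dependent scale factor standing in for the paper's trained parameter; the function names and the 2048-token training length are illustrative assumptions, not the authors' code.

import torch

def rope_angles(head_dim: int, seq_len: int, train_len: int = 2048,
                base: float = 10000.0) -> torch.Tensor:
    """Return the RoPE angle table of shape (seq_len, head_dim // 2)."""
    # Scale the frequency base when the input exceeds the training length,
    # so every rotary frequency is stretched and positions beyond train_len
    # stay within the angle range seen during training. The paper trains
    # this parameter; here it is derived from the length ratio (assumption).
    scale = max(seq_len / train_len, 1.0)
    scaled_base = base * scale ** (head_dim / (head_dim - 2))
    inv_freq = 1.0 / scaled_base ** (torch.arange(0, head_dim, 2).float() / head_dim)
    positions = torch.arange(seq_len).float()
    return torch.outer(positions, inv_freq)  # angles theta_{t, i}

def apply_rope(x: torch.Tensor, angles: torch.Tensor) -> torch.Tensor:
    """Rotate query/key vectors x of shape (seq_len, head_dim) by the angles."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = angles.cos(), angles.sin()
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

For example, with train_len=2048 and an 8192-token input, scale is 4 and the base is enlarged accordingly, which compresses the rotation angles of long-range positions back into the trained range.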

Keyword:

Embeddings; Extrapolation

Author Community:

  • [1] [Qu, Zhijie] Beijing University of Technology, Beijing, China

Reprint Author's Address:

Email:


Related Keywords:

Related Article:

Source:

Year: 2023

Page: 166-170

Language: English

Cited Count:

WoS CC Cited Count:

SCOPUS Cited Count: 2

ESI Highly Cited Papers on the List: 0

WanFang Cited Count:

Chinese Cited Count:


Affiliated Colleges:
