
Author:

Chen, Mengyao | Li, Yong | Li, Runqi

Indexed by:

EI Scopus

Abstract:

In neural machine translation (NMT), recurrent neural networks, especially long short-term memory (LSTM) networks and gated recurrent networks, have long been regarded as the state-of-the-art approach to sequence modeling and transduction problems such as language modeling and machine translation. A recurrent neural network processes a sequence one element at a time, strictly in left-to-right or right-to-left order, one word per step. Because this computation cannot be parallelized, it runs slowly. With the rapid development of NMT network architectures, recurrent neural networks have been effectively replaced by convolutional networks and self-attention. Convolutional neural networks have displaced recurrent neural networks because convolutions can be computed in parallel. The Transformer model replaces the LSTM network with a structure built entirely on self-attention: it abandons the convention that an encoder-decoder model must be built on convolutional or recurrent neural networks and uses only the self-attention mechanism. Although the biggest innovation of the Transformer architecture is its use of full self-attention, several other factors contribute as well, such as multi-head attention and residual connections. Our model flexibly combines several common building blocks of the Transformer architecture with a recurrent neural network. Experiments show that, by borrowing the framework of the Transformer architecture without using full self-attention, the recurrent model can come very close to the performance of the Transformer. Our model achieved 26.7 BLEU on the WMT 2014 English-to-German translation task and 37.8 BLEU on the WMT 2014 English-to-French translation task. These two scores are very close to those of the Transformer architecture with full attention, so a recurrent neural network used in place of full self-attention can still perform well on these data sets. © 2019 IOP Publishing Ltd. All rights reserved.
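
The abstract does not give the model's layer definitions, so the following is only a rough sketch of the general idea it describes: a recurrent sublayer placed inside a Transformer-style wrapper (residual connection followed by layer normalization) where self-attention would normally sit. The function names, the plain tanh recurrence, and the weight shapes are assumptions made for illustration and are not taken from the paper.

```python
import math

def layer_norm(x, eps=1e-6):
    """Normalize one vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def rnn_sublayer(inputs, w_in, w_rec):
    """Plain tanh recurrence (assumed here; the paper may use LSTM or GRU cells).

    Step t needs the hidden state from step t-1, so the loop over
    time steps has to run sequentially.
    """
    d = len(inputs[0])
    h = [0.0] * d
    outputs = []
    for x in inputs:
        h = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(d))
                + sum(w_rec[i][j] * h[j] for j in range(d))
            )
            for i in range(d)
        ]
        outputs.append(h)
    return outputs

def recurrent_block(inputs, w_in, w_rec):
    """Transformer-style wrapper: LayerNorm(x + Sublayer(x)), with an RNN as the sublayer."""
    hidden = rnn_sublayer(inputs, w_in, w_rec)
    return [
        layer_norm([x_j + h_j for x_j, h_j in zip(x, h)])
        for x, h in zip(inputs, hidden)
    ]

# Hypothetical toy input: two time steps, model dimension 3.
seq = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.0]]
w = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]
out = recurrent_block(seq, w, w)
```

The residual path lets each position's input bypass the recurrence, which is one of the Transformer building blocks the abstract mentions alongside multi-head attention.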

Keyword:

Recurrent neural networks | Memory architecture | Signal processing | Convolution | Brain | Network architecture | Intelligent computing | Computational linguistics | Convolutional neural networks | Modeling languages | Computer aided language translation

Author Community:

  • [1] [Chen, Mengyao] Institute of Information, Beijing University of Technology, No. 100 Pingleyuan, Beijing, China
  • [2] [Li, Yong] Institute of Information, Beijing University of Technology, No. 100 Pingleyuan, Beijing, China
  • [3] [Li, Runqi] Institute of Information, Beijing University of Technology, No. 100 Pingleyuan, Beijing, China

Source:

ISSN: 1742-6588

Year: 2019

Issue: 5

Volume: 1237

Language: English

Cited Count:

WoS CC Cited Count: 0

SCOPUS Cited Count: 8

ESI Highly Cited Papers on the List: 0
