Research on Neural Machine Translation Model

Author：

Chen, Mengyao | Li, Yong | Li, Runqi

Indexed by：

EI Scopus

Abstract：

In neural machine translation (NMT), recurrent neural networks, especially long short-term memory (LSTM) networks and gated recurrent neural networks, have long been regarded as state-of-the-art methods for sequence modeling and transduction problems such as language modeling and machine translation. A recurrent neural network processes sequence information one element at a time, strictly from left to right or from right to left, one word per step, so it cannot be parallelized and runs slowly. With the rapid development of NMT network architectures, recurrent neural networks have been effectively replaced by convolutional networks and self-attention. Convolutional neural networks have replaced recurrent neural networks because convolutions can be computed in parallel. The Transformer model replaces the LSTM network with a complete self-attention structure and abandons the convention that an encoder-decoder model must be built on a convolutional or recurrent neural network, relying solely on the self-attention mechanism. Although the biggest innovation of the Transformer architecture is its use of full self-attention, several other factors also contribute, such as multi-head attention and residual connections. The model flexibly combines several common building blocks of the Transformer architecture with a recurrent neural network. By borrowing the framework of the Transformer architecture without using full self-attention, experiments show that the recurrent model can come very close to the performance of the Transformer. Our model achieved 26.7 BLEU on the WMT 2014 English-to-German translation task and 37.8 BLEU on the WMT 2014 English-to-French translation task. These two scores are very close to those of the Transformer architecture with full self-attention, so a recurrent neural network used in place of full self-attention can still perform well on these datasets. © 2019 IOP Publishing Ltd. All rights reserved.
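
The contrast the abstract draws between step-by-step recurrent processing and parallel self-attention can be illustrated with a minimal sketch. This is not the authors' model: the function names, the scalar "tokens", the weights `w_in` and `w_rec`, and the use of identity projections for queries, keys and values are all assumptions made only for illustration.

```python
import math


def rnn_states(inputs, w_in=0.5, w_rec=0.8):
    """Toy scalar recurrence: each hidden state needs the previous one.

    w_in and w_rec are hypothetical weights chosen for illustration.
    """
    h = 0.0
    states = []
    for x in inputs:  # strictly left to right; step t waits for step t-1
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def self_attention(inputs):
    """Toy dot-product self-attention over scalar tokens.

    Queries, keys and values are the inputs themselves (identity
    projections, dimension 1, so the usual 1/sqrt(d) scale equals 1).
    """
    outputs = []
    for q in inputs:  # iterations are independent, so they could run in parallel
        scores = [q * k for k in inputs]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append(sum(w / total * v for w, v in zip(weights, inputs)))
    return outputs


if __name__ == "__main__":
    tokens = [1.0, 2.0, 3.0]
    print(rnn_states(tokens))
    print(self_attention(tokens))
```

In the recurrent function the loop body reads the state written by the previous iteration, which is the dependency that prevents parallel execution. In the attention function each output position reads only the original inputs, so all positions can in principle be computed at once.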

Keyword：

Recurrent neural networks; Memory architecture; Signal processing; Convolution; Brain; Network architecture; Intelligent computing; Computational linguistics; Convolutional neural networks; Modeling languages; Computer aided language translation

Author Affiliations：

[ 1 ] [Chen, Mengyao] Institute of Information, Beijing University of Technology, No. 100 Pingleyuan, Beijing, China
[ 2 ] [Li, Yong] Institute of Information, Beijing University of Technology, No. 100 Pingleyuan, Beijing, China
[ 3 ] [Li, Runqi] Institute of Information, Beijing University of Technology, No. 100 Pingleyuan, Beijing, China

Related Articles：

Decoding with value networks for neural machine translation
2017, 31st Annual Conference on Neural Information Processing Systems, NIPS 2017
Epileptic Seizure Prediction Based on Convolutional Recurrent Neural Network with Multi-Timescale
2019, 9th International Conference on Intelligence Science and Big Data Engineering, IScIDE 2019
Research on Task-oriented Dialogue Based on Modified Transformer
2020, 2020 5th International Conference on Intelligent Computing and Signal Processing, ICSP 2020
Research on recognition method of transportation modes based on deep learning
2019, Journal of Harbin Institute of Technology

Source ：

ISSN： 1742-6588

Year： 2019

Issue： 5

Volume： 1237

Language： English

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 8

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 11

Affiliated Colleges：

College pending claim