Multi-agent Deep Reinforcement Learning based on Maximum Entropy

Author：

Wang, Zihao | Zhang, Yanxin | Yin, Chenkun | Huang, Zhiqing

Indexed by：

EI Scopus

Abstract：

Deep reinforcement learning combines the perception capability of deep learning with the decision-making capability of reinforcement learning and is currently a hot research topic in artificial intelligence. Multi-agent deep reinforcement learning applies the ideas and algorithms of deep reinforcement learning to the learning and control of multi-agent systems, and is an important method for developing multi-agent systems with swarm agents. Multi-agent deep deterministic policy gradient (MADDPG) is the most popular model-free multi-agent reinforcement learning algorithm. Because its policy network outputs a single deterministic action, MADDPG suffers from low learning and training efficiency and slow convergence. To address this, this paper incorporates the maximum entropy reinforcement learning algorithm soft actor-critic so that each agent's policy network outputs actions from a stochastic policy, and proposes MASAC, a multi-agent deep reinforcement learning algorithm based on maximum entropy. Experimental results show that MASAC trains faster than MADDPG, and that the trained agents perform well, behave stably, and are robust to interference. © 2021 IEEE.
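
The contrast the abstract draws between a deterministic actor and a maximum-entropy stochastic actor can be illustrated with a minimal sketch. Everything below is an assumption for illustration, based on the standard single-agent soft actor-critic formulation rather than on the paper's own code: the function names, the tanh-squashed Gaussian policy, and the temperature `alpha` are hypothetical, and the multi-agent critic is omitted.

```python
import math
import random


def deterministic_action(mean):
    """DDPG-style actor output: the same squashed action for a given mean."""
    return math.tanh(mean)


def stochastic_action(mean, log_std, rng):
    """SAC-style actor output: sample a tanh-squashed Gaussian action.

    Returns the action and its log-probability under the policy.
    """
    std = math.exp(log_std)
    u = rng.gauss(mean, std)  # sample before squashing
    action = math.tanh(u)
    # Log-density of u under the Gaussian N(mean, std^2).
    log_prob = (-0.5 * ((u - mean) / std) ** 2
                - log_std - 0.5 * math.log(2.0 * math.pi))
    # Change-of-variables correction for the tanh squashing
    # (small constant added for numerical safety).
    log_prob -= math.log(1.0 - action ** 2 + 1e-6)
    return action, log_prob


def soft_value(q_value, log_prob, alpha):
    """Entropy-regularised value estimate: Q(s, a) - alpha * log pi(a | s)."""
    return q_value - alpha * log_prob


rng = random.Random(0)  # fixed seed so the sketch is reproducible
fixed = deterministic_action(0.3)
sampled, log_prob = stochastic_action(0.3, -1.0, rng)
```

The entropy term `-alpha * log_prob` rewards the policy for keeping its action distribution spread out, which is the exploration mechanism a deterministic actor lacks.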

Keyword：

Multi agent systems; Reinforcement learning; Deep learning; Decision making; Learning systems; Learning algorithms; Entropy

Author Affiliations：

[ 1 ] [Wang, Zihao] Beijing Jiaotong University, China
[ 2 ] [Zhang, Yanxin] Beijing Jiaotong University, China
[ 3 ] [Yin, Chenkun] Beijing Jiaotong University, China
[ 4 ] [Huang, Zhiqing] Beijing University of Technology, China

Related Keywords：

Distributed Optimization of Regional Traffic Signals via Deep Reinforcement Learning
2021, 40th Chinese Control Conference, CCC 2021
Reinforcement Learning-based Service Assurance of Microservice Systems
2023, Joint of the 5th International Workshop on Experience with SQuaRE Series and its Future Direction and the 11th International Workshop on Quantitative Approaches to Software Quality, IWESQ-QuASoQ 2023
Multi-agent cooperative confrontation with proximal policy optimization in urban environments
2025, Journal on Communications
Cooperative strategy learning in multi-agent environment with continuous state space
2006, 2006 International Conference on Machine Learning and Cybernetics

Source ：

ISSN： 2693-2814

Year： 2021

Page： 1402-1406

Language： English

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 18

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 15
