Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization

Abstract：

Large Language Models (LLMs) exhibit robust problem-solving capabilities for diverse tasks. However, most LLM-based agents are designed as specific task solvers with sophisticated prompt engineering, rather than agents capable of learning and evolving through interactions. These task solvers necessitate manually crafted prompts to inform task rules and regulate LLM behaviors, leaving them inherently unable to address complex dynamic scenarios, e.g., large interactive games. In light of this, we propose Agent-Pro: an LLM-based Agent with Policy-level Reflection and Optimization that can learn a wealth of expertise from interactive experiences and progressively elevate its behavioral policy. Specifically, it involves a dynamic belief generation and reflection process for policy evolution. Rather than action-level reflection, Agent-Pro iteratively reflects on past trajectories and beliefs, 'fine-tuning' its irrational beliefs for a better policy. Moreover, a depth-first search is employed for policy optimization, ensuring continual enhancement in policy payoffs. Agent-Pro is evaluated across two games: Blackjack and Texas Hold'em, outperforming vanilla LLM and specialized models. Our results show Agent-Pro can learn and evolve in complex and dynamic scenes, which also benefits numerous LLM-based applications. © 2024 Association for Computational Linguistics.

Keyword：

Contrastive Learning; Computational linguistics; Problem oriented languages; Behavioral research

Author Community：

[ 1 ] [Zhang, Wenqi]College of Computer Science and Technology, Zhejiang University, China
[ 2 ] [Tang, Ke]Institute of Software, Chinese Academy of Sciences, China
[ 3 ] [Tang, Ke]Nanjing Institute of Software Technology, China
[ 4 ] [Tang, Ke]Nanjing University of Posts and Telecommunications, China
[ 5 ] [Tang, Ke]University of Chinese Academy of Sciences, Nanjing, China
[ 6 ] [Wu, Hai]Institute of Software, Chinese Academy of Sciences, China
[ 7 ] [Wu, Hai]Nanjing Institute of Software Technology, China
[ 8 ] [Wu, Hai]Nanjing University of Information Science and Technology, China
[ 9 ] [Wu, Hai]University of Chinese Academy of Sciences, Nanjing, China
[ 10 ] [Wang, Mengna]Institute of Software, Chinese Academy of Sciences, China
[ 11 ] [Wang, Mengna]Beijing University of Technology, China
[ 12 ] [Shen, Yongliang]College of Computer Science and Technology, Zhejiang University, China
[ 13 ] [Hou, Guiyang]College of Computer Science and Technology, Zhejiang University, China
[ 14 ] [Tan, Zeqi]College of Computer Science and Technology, Zhejiang University, China
[ 15 ] [Li, Peng]Institute of Software, Chinese Academy of Sciences, China
[ 16 ] [Li, Peng]Nanjing Institute of Software Technology, China
[ 17 ] [Li, Peng]University of Chinese Academy of Sciences, Nanjing, China
[ 18 ] [Zhuang, Yueting]College of Computer Science and Technology, Zhejiang University, China
[ 19 ] [Lu, Weiming]College of Computer Science and Technology, Zhejiang University, China

ISSN： 0736-587X

Year： 2024

Volume： 1

Page： 5348-5375

Language： English
