Abstract:
Predicting the future trajectories of dynamic traffic actors is a cornerstone task in autonomous driving. Although existing notable efforts have yielded impressive performance improvements, a gap persists in scene cognition and the understanding of complex traffic semantics. This paper proposes Traj-LLM, the first to investigate the potential of using pre-trained Large Language Models (LLMs) without explicit prompt engineering to generate future motions from vehicular past trajectories and traffic scene semantics. Traj-LLM starts with sparse context joint encoding to dissect the agent and scene features into a form that LLMs understand. On this basis, we creatively explore LLMs' strong understanding capability to capture a spectrum of high-level scene knowledge and interactive information. To emulate the human-like lane-focus cognitive function and enhance Traj-LLM's scene comprehension, we introduce lane-aware probabilistic learning powered by the Mamba module. Finally, a multi-modal Laplace decoder is designed to achieve scene-compliant predictions. Extensive experiments demonstrate that Traj-LLM, fueled by the prior knowledge and understanding prowess of LLMs, together with lane-aware probabilistic learning, surpasses state-of-the-art methods on most evaluation metrics. Moreover, a few-shot analysis further substantiates Traj-LLM's performance: even with merely 50% of the dataset, it outperforms the majority of benchmarks that rely on the complete data. This study explores endowing the trajectory prediction task with the advanced capabilities inherent in LLMs, furnishing a more universal and adaptable solution for forecasting agent movements in a new way. © IEEE
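The abstract outlines a pipeline of sparse context joint encoding, a pre-trained LLM backbone, Mamba-powered lane-aware probabilistic learning, and a multi-modal Laplace decoder. The sketch below is only a minimal PyTorch-style illustration of how such a pipeline could be wired together; every module name, dimension, and design choice here is an assumption for illustration (a plain Transformer encoder stands in for the frozen LLM backbone, and a linear scorer stands in for the Mamba lane module), not the authors' implementation.

    # Illustrative sketch only; not the Traj-LLM reference code.
    import torch
    import torch.nn as nn

    class TrajLLMSketch(nn.Module):
        def __init__(self, in_dim=4, hidden=256, num_modes=6, horizon=30):
            super().__init__()
            # Sparse context joint encoding: project agent/scene features
            # into the hidden space the language backbone operates on.
            self.context_encoder = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
            )
            # Placeholder for the pre-trained LLM backbone (assumed frozen);
            # a small Transformer encoder is used here purely as a stand-in.
            self.llm_backbone = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True),
                num_layers=2,
            )
            # Stand-in for the Mamba-based lane-aware probability head:
            # scores each context token (e.g. candidate lane segment).
            self.lane_prob_head = nn.Linear(hidden, 1)
            # Multi-modal Laplace decoder: per mode, predict location (mu)
            # and scale (b) for every future step, plus a mode probability.
            self.num_modes, self.horizon = num_modes, horizon
            self.traj_head = nn.Linear(hidden, num_modes * horizon * 4)  # mu_x, mu_y, b_x, b_y
            self.mode_head = nn.Linear(hidden, num_modes)

        def forward(self, context_feats):
            # context_feats: (batch, seq_len, in_dim) past-trajectory / scene tokens
            tokens = self.context_encoder(context_feats)
            fused = self.llm_backbone(tokens)                      # high-level interaction features
            lane_logits = self.lane_prob_head(fused).squeeze(-1)   # per-token lane scores
            pooled = fused.mean(dim=1)
            params = self.traj_head(pooled).view(-1, self.num_modes, self.horizon, 4)
            mu, b = params[..., :2], nn.functional.softplus(params[..., 2:])
            mode_logits = self.mode_head(pooled)
            return mu, b, mode_logits, lane_logits

Under these assumptions, the Laplace parameters (mu, b) would be trained with a negative log-likelihood over the best-matching mode, and the lane and mode logits with classification losses; the actual objectives used by the paper are not specified in this record.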
Source: IEEE Transactions on Intelligent Vehicles
ISSN: 2379-8858
Year: 2024
Page: 1-14
Impact Factor: 8.200 (JCR@2022)
SCOPUS Cited Count: 9
ESI Highly Cited Papers on the List: 0