
Trajectory prediction of pedestrians around autonomous vehicles based on CrossFormer

Open access (OA); Peking University Core Journal


Existing methods for predicting the trajectories of pedestrians around autonomous vehicles perform poorly on long-term prediction tasks and adapt poorly to complex scenes. To address this, a CrossFormer-based pedestrian trajectory prediction method is proposed, together with a modeling workflow for the prediction problem. The method uses dimension-segment-wise (DSW) embedding to explicitly learn the correlations between adjacent time frames, and a two-stage attention (TSA) layer to learn the long-term dependencies of pedestrian trajectories. A hierarchical encoder-decoder structure adaptively captures trajectory dependencies at different time scales, which improves the scalability of the model in long-term prediction. The method combines multi-modal information fusion, self-attention, and scalability optimization. Experiments were conducted on two datasets, the ETH trajectory dataset and a pedestrian trajectory dataset collected on the Jiangsu University campus (JDD), and included a time series segmentation analysis as well as quantitative and qualitative evaluations. On the ETH dataset, the proposed method achieves an average displacement error (ADE) of 0.627 and a final displacement error (FDE) of 1.32, both lower than those of LSTM (0.895 and 1.74) and SR-LSTM (0.728 and 1.66). On the JDD dataset, it achieves an ADE of 0.281 and an FDE of 0.53, both lower than those of GAN (0.562 and 1.01) and STGAT (0.673 and 1.43). The results also verify the robustness and generalization ability of the proposed method in complex scenes.
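The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is that distance at the final step. The abstract does not state how the paper averages over pedestrians or scenes, so the function below handles a single trajectory only; the function name and the sample coordinates are hypothetical.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory.

    pred and truth are equal-length sequences of (x, y) positions,
    one per predicted time step. This is an illustrative helper,
    not the paper's evaluation code.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each step.
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all steps
    fde = dists[-1]                # error at the final step only
    return ade, fde


# Hypothetical three-step trajectory (values chosen for easy arithmetic).
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 3.0), (2.0, 4.0)]
ade, fde = displacement_errors(pred, truth)
print(f"ADE={ade:.3f}, FDE={fde:.3f}")
```

Because FDE looks only at the last step, it tends to exceed ADE when errors accumulate over the prediction horizon, which matches the pattern in the figures quoted above.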

Cao Ruiyang (曹瑞阳); Li Shiyu (李诗雨); Liu Qingchao (刘擎超); Ding Yanchao (丁延超)

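Jingjiang College, Jiangsu University, Zhenjiang, Jiangsu 212028 || Faculty of Engineering, The Hong Kong Polytechnic University, Hong Kong 999077
Automotive Engineering Research Institute, Jiangsu University, Zhenjiang, Jiangsu 212013
Automotive Engineering Research Institute, Jiangsu University, Zhenjiang, Jiangsu 212013
Suzhou Automotive Research Institute, Tsinghua University, Suzhou, Jiangsu 215200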

Computer science and automation

autonomous driving; pedestrian trajectory prediction; CrossFormer; Transformer; attention mechanism; deep learning; complex scene analysis; multi-modal data fusion; prediction accuracy

Journal of Jiangsu University (Natural Science Edition), 2025 (3)

pp. 276-283, 308 (9 pages)

Supported by the National Natural Science Foundation of China (52372413)

10.3969/j.issn.1671-7775.2025.03.004
