计算机工程与科学2025,Vol.47Issue(1):140-149,10.DOI:10.3969/j.issn.1007-130X.2025.01.015
融合强化学习的分阶段策略求解旅行背包问题
A staged strategy incorporating reinforcement learning to solve the travelling thief problem
摘要
Abstract
The travelling thief problem(TTP)is a combination of the traditional traveling salesman problem(TSP)and the knapsack problem(KP),which is an NP-hard problem.Compared with the inde-pendent TSP and KP,the TTP is more realistic and has higher research value.Previous TTP solving al-gorithms are mainly heuristic algorithms with limited performance,and other types of algorithms are less studied.To acquire better solution for TTP,a staged strategy of incorporating reinforcement learn-ing is proposed.The first stage generates an item selection plan based on the properties of items.The second stage uses a reinforcement learning algorithm(Actor-Critic algorithm)to solve the travel path.The third stage introduces neighborhood search strategy to optimize the obtained solution.Experiments show that the proposed algorithm achieves good results on most test cases and,in some cases,outper-forms the compared algorithms in terms of solution quality,demonstrating the superior performance of the proposed algorithm.关键词
强化学习/旅行背包问题/演员-评论家算法/组合优化Key words
reinforcement learning/travelling thief problem(TTP)/Actor-Critic algorithm/combina-torial optimization分类
信息技术与安全科学引用本文复制引用
章政,夏小云,陈泽丰,向毅..融合强化学习的分阶段策略求解旅行背包问题[J].计算机工程与科学,2025,47(1):140-149,10.基金项目
国家重点研发计划(2023YFC3305900,2023YFC3305903) (2023YFC3305900,2023YFC3305903)
国家自然科学基金(62206313,61703183) (62206313,61703183)
中央高校基本科研业务费专项资金(2024ZYGXZR097) (2024ZYGXZR097)
广东省基础与应用基础研究基金(2024A1515030022) (2024A1515030022)
浙江省自然科学基金(LGG19F030010) (LGG19F030010)
嘉兴大学"勤慎"青年学者培养计划(嘉院人字[2023]12号) (嘉院人字[2023]12号)