Retrieval-oriented storage relocation problem of an automated storage and retrieval system and its solving algorithm

Open access (OA); indexed in PKU Core (北大核心) and CSTPCD

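This paper addresses the relocation of products that remain on a pallet after retrieval operations in an automated storage and retrieval system (AS/RS). Taking minimization of the total energy consumption of the stacker crane as the objective and the assignment of relocation rack locations as the decision variable, an optimization model for retrieval-oriented storage relocation is established, and an optimization framework based on deep reinforcement learning is proposed. Within this framework, the multi-dimensional state is built from the real-time storage information of the warehouse and the retrieval operation information, the action is the choice of relocation rack location, and a Markov decision process model of retrieval-oriented storage relocation is established. The multi-dimensional state features are fed into a dueling network, and the dueling double deep Q-network (D3QN) algorithm is used to train the network model and predict the target value of each relocation action, so as to determine the agent's optimal policy. Experimental results show that the D3QN algorithm is comparatively stable when solving large-scale retrieval-oriented storage relocation problems.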

Efficient retrieval of stored products in an automated storage and retrieval system (AS/RS) is key to just-in-time production in industrial production systems. In a production-oriented AS/RS, the quantity of a specific product to be retrieved is often less than a full pallet load. On completion of each retrieval request, the products remaining on the pallet must be relocated to a rack location. In current practice, a non-empty pallet is typically returned to its originally designated location after the requested items have been retrieved. Dynamically reassigning the storage locations of non-empty pallets after each retrieval is important for improving retrieval efficiency and reducing energy consumption. This study investigates a retrieval-oriented storage relocation problem with the aim of minimizing energy consumption. Based on the acceleration and deceleration characteristics of the stacker crane and the crane's dual operation mode, an energy consumption model of the AS/RS during retrieval is first established. The relocation of the non-empty pallet after each retrieval request is then taken as the operational decision, subject to constraints including rack stability and the maximum storage capacity of each rack location, and an optimization model of retrieval-oriented storage relocation is built with the objective of minimizing the total energy consumption during retrieval. Because the optimization model is NP-hard, this paper proposes an optimization framework based on deep reinforcement learning. Within this framework, the multi-dimensional state is designed from the real-time storage information and the retrieval information, and the action is defined as the choice of relocation rack location.
A Markov decision process model for retrieval-oriented storage relocation is then established, and the dueling double deep Q-network (D3QN) algorithm is employed to obtain the optimal relocation of the non-empty pallet after each retrieval request. In the D3QN algorithm, the multi-dimensional state of the AS/RS is fed into the dueling network to predict the main Q value and the target Q value, and the training mechanism of the double deep Q-network algorithm is used to train both the main network and the target network. Once the network model converges to the optimal value function, the agent selects an optimal rack location for the non-empty pallet after each retrieval request based on the real-time state of the AS/RS, so as to minimize the energy consumed by the stacker crane. Finally, experimental cases of different scales are designed with varying rack shapes and product retrieval frequencies. A set of sensitivity analysis experiments is conducted to determine the hyper-parameters of D3QN, including the learning rate, the training batch size, the number of neurons in the hidden layer and the network update frequency. To evaluate the effectiveness of the proposed algorithm, D3QN, DDQN and DQN are compared extensively across the different cases. The experimental results show that D3QN achieves better solution performance on large-scale retrieval-oriented relocation cases. In addition, the proposed relocation method is compared with the practice of returning the pallet to its original storage location across the different cases. The results show that the proposed relocation method significantly reduces total energy consumption during retrieval. This study treats the sequence of retrieval requests as given rather than as a decision variable.
If retrieval sequencing is integrated with retrieval-oriented storage relocation, further improvements in retrieval efficiency and energy consumption may be achieved. Further investigation is therefore needed on the integrated optimization of retrieval-oriented storage relocation and retrieval sequencing for AS/RS.
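
The abstract names the two ingredients of D3QN (a dueling network and the double deep Q-network training mechanism) without giving formulas. The following is a minimal sketch of how these two pieces are commonly computed, not the authors' implementation. The function names, the mean-subtracted dueling aggregation, the use of negative energy as the reward, and the `feasible` list for excluding rack locations that violate capacity or stability constraints are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Combine V(s) and per-action advantages A(s, a) into Q(s, a).

    Uses the common mean-subtracted aggregation Q = V + (A - mean(A)).
    Each action index stands for one candidate rack location (assumption).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    next_q_main: Sequence[float],
    next_q_target: Sequence[float],
    done: bool,
    feasible: Optional[Sequence[int]] = None,
) -> float:
    """Double DQN target: the main network picks the next action,
    the target network evaluates it.

    `feasible` is a hypothetical list of action indices that satisfy the
    rack constraints; if None, every action is considered.
    """
    if done:
        return reward
    candidates = range(len(next_q_main)) if feasible is None else feasible
    best = max(candidates, key=lambda a: next_q_main[a])
    return reward + gamma * next_q_target[best]


# Example: reward is the negative energy of one relocation move (assumption).
q = dueling_q_values(2.0, [1.0, 0.0, -1.0])
y = double_dqn_target(-5.0, 0.5, [1.0, 3.0, 2.0], [10.0, 20.0, 30.0], done=False)
```

Separating action selection (main network) from action evaluation (target network) is what distinguishes the double DQN target from the plain DQN target, which takes the maximum of the target network's own estimates.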

何在祥;李丽;张云峰;郗琳

College of Engineering and Technology, Southwest University, Chongqing 400715, China

Computer science and automation

automated storage and retrieval system; retrieval-oriented storage relocation; deep reinforcement learning; dueling double deep Q-networks (D3QN)

Journal of Chongqing University of Technology (《重庆理工大学学报》), 2024 (005)

pp. 183-194 (12 pages)

Chongqing Science Fund for Distinguished Young Scholars (CSTB2022NSCQ-JQX0030); Fundamental Research Funds for the Central Universities (SWU-XD-JH202302)

10.3969/j.issn.1674-8425(z).2024.03.020
