Analysis of learning prospects for smart autonomous logistics systems based on value function optimization

Autonomous logistics systems require effective decision-making techniques under uncertainty and in dynamically changing environments. Reinforcement learning is a promising approach that allows such systems to search for optimal action strategies autonomously. Compared with systems based on other methods, they do not need to accumulate additional historical data. With reinforcement learning, the system learns to make decisions by analyzing its own errors, which can be useful in logistics. The variety of reinforcement learning methods makes selecting the most appropriate one for a particular task an important problem. In this work, we analyze and classify reinforcement learning methods according to various criteria in order to identify their advantages, disadvantages, and areas of effective application. Special attention is paid to value-optimization methods: Q-Learning, SARSA, and Deep Q-Network. The advantages and disadvantages of each method are described in the context of logistics problems, and examples of their successful application in logistics are considered. The most promising directions for their application are identified, and recommendations are formulated for selecting a particular method for problems in autonomous logistics systems.
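
As a rough illustration of how two of the value-based methods named above differ, the following minimal sketch contrasts the standard tabular Q-Learning and SARSA update rules. It is not taken from the article: the state names ("dock", "depot"), the action set, the reward, and the values of ALPHA and GAMMA are all hypothetical and chosen only to make the arithmetic easy to follow. The "error" the system learns from corresponds here to the temporal-difference error.

```python
# Hypothetical illustration only: states, actions, rewards, and
# hyperparameters are assumptions, not values from the article.
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed)
GAMMA = 0.9  # discount factor (assumed)


def q_learning_update(q, state, action, reward, next_state, actions):
    """Off-policy update: bootstrap from the best action in next_state."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + GAMMA * best_next - q[(state, action)]
    q[(state, action)] += ALPHA * td_error
    return td_error


def sarsa_update(q, state, action, reward, next_state, next_action):
    """On-policy update: bootstrap from the action actually taken next."""
    td_error = reward + GAMMA * q[(next_state, next_action)] - q[(state, action)]
    q[(state, action)] += ALPHA * td_error
    return td_error


ACTIONS = ["wait", "dispatch"]  # hypothetical action set

# Two separate Q-tables seeded with identical made-up estimates.
q_ql = defaultdict(float)
q_sarsa = defaultdict(float)
for table in (q_ql, q_sarsa):
    table[("depot", "wait")] = 1.0
    table[("depot", "dispatch")] = 3.0

# Same transition for both: dispatch from "dock", reward 2.0, arrive at "depot".
# Q-Learning assumes the best follow-up action; SARSA uses the one actually
# chosen next (here the exploratory, lower-valued "wait").
ql_err = q_learning_update(q_ql, "dock", "dispatch", 2.0, "depot", ACTIONS)
sarsa_err = sarsa_update(q_sarsa, "dock", "dispatch", 2.0, "depot", "wait")

print(f"Q-Learning TD error: {ql_err:.2f}, new Q: {q_ql[('dock', 'dispatch')]:.2f}")
print(f"SARSA TD error: {sarsa_err:.2f}, new Q: {q_sarsa[('dock', 'dispatch')]:.2f}")
```

In this sketch the only difference between the two rules is which next-state value is used as the bootstrap target. Deep Q-Network keeps the Q-Learning target but replaces the table with a neural network approximator, which is not shown here.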

Authors: N. A. Verzun, M. O. Kolbanev, A. R. Salieva

Direction: Informatics, Computer Technologies And Control

Keywords: autonomous logistics systems, reinforcement learning, value function optimization, Q-Learning, SARSA, Deep Q-Network
