Strategy optimization in reinforcement learning algorithms in logistic decision-making systems

This review analyzes and systematizes current research on strategy (policy) optimization in reinforcement learning algorithms used in logistic decision-making systems. It covers scientific publications from the last five years, indexed in leading databases, on the application of reinforcement learning methods in logistics, with particular attention to papers describing Policy Gradient and Proximal Policy Optimization (PPO) algorithms. The review methodology combines comparative analysis, classification of approaches, and evaluation of their effectiveness. The review identifies the main trends in the development of policy optimization methods for logistics systems, along with the key advantages and limitations of the different approaches. Among the approaches reviewed, PPO-based methods are found to demonstrate the highest efficiency in complex dynamic environments, and interest is growing in hybrid approaches that combine reinforcement learning with classical optimization methods. Promising directions for further research include adapting algorithms to specific logistics problems and improving their interpretability. The results can serve as a basis for developing new algorithms and applying them in practice across various sectors of logistics and supply chain management.

Authors: A. R. Salieva, N. A. Verzun, M. O. Kolbanev

Direction: Informatics, Computer Technologies And Control

Keywords: logistic decision-making systems, reinforcement learning, strategy optimization, Policy Gradient methods, Proximal Policy Optimization

