Learning to Solve Vehicle Routing Problem with Soft Time Window via Collaborative Transformer

Authors: Zengjian Yang, Junqing Li, Xiaolong Chen
Conference: ICIC 2024 Posters, Tianjin, China, August 5-8, 2024
Pages: 400-411
Keywords: Vehicle routing problem with soft time windows, Transformer, Deep reinforcement learning.

Abstract

Over the past several years, there has been rapid progress in applying deep reinforcement learning (DRL) techniques to routing problems such as the traveling salesperson problem (TSP) and the vehicle routing problem (VRP). However, the effectiveness of existing deep architectures for the vehicle routing problem with soft time windows (VRPSTW) is compromised by their integration of node and positional information into a single unified representation. In this article, we design a novel Collaborative Transformer (CT) framework, based on a deep reinforcement learning architecture, that learns the node features (e.g., locations and time windows) and the positional features separately to avoid incompatible correlations and thereby improve learning ability. The CT architecture serves as the policy network, and during training we use the Proximal Policy Optimization (PPO) algorithm to update its parameters. In experiments on three datasets with 20, 50, and 100 customers, respectively, our method outperforms existing DRL architectures, demonstrating its effectiveness in solving the VRPSTW.
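
To make the "soft" time window concrete, the following is a minimal sketch of how a route cost with soft time-window penalties is commonly computed. It is not taken from the paper: the function name `route_cost`, the linear penalty form, the coefficients `early_penalty` and `late_penalty`, the unit travel speed, zero service time, and the no-waiting rule are all illustrative assumptions.

```python
import math


def route_cost(route, coords, windows, early_penalty=0.5, late_penalty=1.0):
    """Travel distance plus linear soft time-window penalties for one route.

    Assumptions (illustrative, not from the paper): unit speed, zero
    service time, no waiting on early arrival, linear penalties.

    route   : list of node indices, e.g. [0, 3, 1, 0] with 0 as the depot
    coords  : list of (x, y) coordinates, one per node
    windows : list of (earliest, latest) arrival times, one per node
    """
    time = 0.0
    distance = 0.0
    penalty = 0.0
    for prev, node in zip(route, route[1:]):
        leg = math.dist(coords[prev], coords[node])
        distance += leg
        time += leg  # unit speed: travel time equals distance
        earliest, latest = windows[node]
        # A soft window allows violations but charges for them.
        penalty += early_penalty * max(0.0, earliest - time)
        penalty += late_penalty * max(0.0, time - latest)
    return distance + penalty


# One customer at distance 5 from the depot, window [6, 10]:
# arrival at t=5 is 1 unit early, so the cost is 10 + 0.5 * 1.
coords = [(0.0, 0.0), (3.0, 4.0)]
windows = [(0.0, 100.0), (6.0, 10.0)]
print(route_cost([0, 1, 0], coords, windows))  # 10.5
```

Under these assumptions, a learned policy would be rewarded with the negative of this cost, so that it trades extra travel distance against time-window violations rather than treating the windows as hard constraints.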