Warehouse Robotics Optimization using Reinforcement Learning
Sabyasachi Chakraborty
Abstract
Our project, Warehouse Robotics Optimization using Reinforcement Learning, aims to improve multi-robot route planning and collision avoidance in warehouse environments using Q-learning. Robots travel between job sites and storage locations, avoiding obstacles while operating efficiently. Route planning is driven by Q-learning: robots explore the environment autonomously, receiving rewards for reaching goals and penalties for collisions. To accelerate learning, a curriculum-learning approach divides the task into three stages. Stage 1 trains a single agent on the warehouse map of up to 3,000 locations with no goal rewards, only collision penalties, so that the agent learns the layout. Stage 2 trains robots to return to their stations from random positions, further improving success rates. Stage 3 transfers the knowledge from the previous stages to train robots to reach their final goals. This staged training dramatically accelerates learning, reducing convergence time from 2,500 episodes to just 50, and allows new robots to be integrated seamlessly without retraining from scratch. Collision-handling methods are implemented for both dynamic and static obstacles. In robot-to-robot encounters, robots compare their Manhattan distances to their respective targets, giving priority to the robot with the shorter distance. When encountering human operators or forklifts, robots wait up to 5 seconds before resuming motion. Static obstacles are broadcast to all agents, whereas temporary obstacles trigger replanning if an agent waits more than 5 seconds. This flexible and efficient approach ensures robust multi-agent coordination and improves overall warehouse throughput and system efficiency.
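The two core mechanisms in the abstract, the tabular Q-learning update with goal rewards and collision penalties, and the Manhattan-distance rule for resolving robot-to-robot encounters, can be sketched as follows. This is a minimal illustration only: the learning rate, discount factor, and action set are assumed values, not parameters reported by the paper.

```python
# Sketch of the abstract's two rules. Hyperparameters (alpha, gamma) and the
# 4-action grid model are illustrative assumptions, not values from the paper.

ACTIONS = ["up", "down", "left", "right"]

def manhattan(a, b):
    """Manhattan distance between grid cells a and b."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def resolve_conflict(pos_a, goal_a, pos_b, goal_b):
    """Robot-to-robot encounter: the robot with the shorter Manhattan
    distance to its own goal moves first; the other yields."""
    return "A" if manhattan(pos_a, goal_a) <= manhattan(pos_b, goal_b) else "B"

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Goal rewards are positive r; collisions are penalized with negative r."""
    best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return Q[(state, action)]
```

A collision step with reward -1 from an empty table yields a Q-value of alpha * (-1) = -0.1, and a robot two cells from its goal takes priority over one ten cells away, matching the distance-comparison rule described above.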
Copyright
Copyright © 2024 Sabyasachi Chakraborty. This is an open access article distributed under the Creative Commons Attribution License.