Comparative Study of Reinforcement Learning Algorithms on Traffic Light Control System


  • Partha Ghosh
  • Anirban Dan
  • Abhi Goswami
  • Nilagnik Chakraborty
  • Amit Chakraborty



Traffic Light Control System (TLCS), Reinforcement Learning (RL), Deep Reinforcement Learning (DRL), Deep Q Network (DQN), Deep Deterministic Policy Gradient (DDPG), Dueling Double Deep Q-Network (D3QN)


As times change, there is a clear need for upgraded, efficient, state-of-the-art traffic light control systems. The central question is how well various state-of-the-art algorithms handle complex real-life situations or, more technically, how well the reward function operates and minimizes waiting time. Here we employ three reinforcement learning agents, REINFORCE, D3QN and DDPG, each of which can independently handle large volumes of traffic and minimize signal waiting time; their decision making is stochastic and goes beyond blind guessing. The D3QN agent is a fast learner that uses a purely value-based approach. REINFORCE uses policy-based optimization and performs close to D3QN when trained for a larger number of episodes. DDPG combines value-based and policy-based approaches to reduce waiting time; however, its training time is significantly higher because of its more complex network architecture. Our work lays a foundation for multi-agent and mixed-model decision-making algorithms for traffic light control, which would be a significant step not only toward reducing emissions and preserving the environment but also toward opening new research domains, especially in real-time multimedia processing by intelligent agents.
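
To make the value-based D3QN idea concrete, the following is a minimal sketch of the two standard ingredients behind the name: the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A) and the double-DQN target, in which the online network selects the next action and the target network evaluates it. It is an illustration under stated assumptions, not the implementation used in this work: the function names, the four-phase action set, the discount factor and the use of negative waiting time as the reward are all hypothetical.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) with the mean-advantage baseline."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    next_q_online: List[float],
    next_q_target: List[float],
    done: bool,
) -> float:
    """Bootstrapped target: online net picks the action, target net scores it."""
    if done:
        return reward
    # Action selection uses the online network's estimates.
    best_action = max(range(len(next_q_online)), key=lambda i: next_q_online[i])
    # Action evaluation uses the target network's estimate for that action.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical example: four signal phases, reward = negative waiting time.
    q_online = dueling_q_values(state_value=-10.0, advantages=[0.5, 2.0, -1.0, -1.5])
    q_target = dueling_q_values(state_value=-9.0, advantages=[1.0, 0.0, -0.5, -0.5])
    target = double_dqn_target(
        reward=-12.0, gamma=0.9,
        next_q_online=q_online, next_q_target=q_target, done=False,
    )
    print(q_online, target)
```

Separating action selection from action evaluation is what distinguishes the double-DQN target from the plain DQN target, which takes the maximum over the target network's own estimates.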