Policy Analysis of Adaptive Traffic Signal Control Using Reinforcement Learning
Previous research studies have successfully developed adaptive traffic signal controllers using reinforcement learning; however, few have focused on analyzing what specifically reinforcement learning does differently from other traffic signal control methods. This study proposes and develops two reinforcement learning adaptive traffic signal controllers, analyzes their learned policies, and compares them to a Webster's controller. The asynchronous Q-learning and advantage actor-critic algorithms are used to develop reinforcement learning traffic signal controllers using neural network function approximation with two action spaces. Using an aggregate statistic state representation (i.e., vehicle queue and density), the proposed reinforcement learning traffic signal controllers develop the optimal policy in a dynamic, stochastic traffic microsimulation. Results show that the reinforcement learning controllers increase red and yellow times but ultimately achieve superior performance compared to the Webster's controller, reducing mean queues, stopped time, and travel time. The reinforcement learning controllers exhibit goal-oriented behavior, developing a policy that excludes many phases found in a traditional phase cycle (i.e., protected turning movements) in favor of phases that maximize reward; the Webster's controller, by contrast, is constrained by cyclical phase logic that diminishes its performance.
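To make the approach described in the abstract concrete, below is a minimal illustrative sketch in Python of one-step Q-learning with neural network function approximation for phase selection. Everything here is an assumption for illustration only: the intersection geometry (four approaches), the phase set, the network size, the hyperparameters, and the negative-total-queue reward are placeholders, not the authors' implementation, which used asynchronous Q-learning in a traffic microsimulator.

import random
import numpy as np

# Hypothetical setup: 4-approach intersection, 4 candidate green phases,
# state = (queue, density) per approach, as suggested by the abstract.
N_APPROACHES = 4
N_PHASES = 4
STATE_DIM = 2 * N_APPROACHES
ALPHA, GAMMA, EPSILON = 0.001, 0.99, 0.1  # illustrative hyperparameters

# Single-hidden-layer network as a stand-in for neural function approximation.
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, (STATE_DIM, 64))
W2 = rng.normal(0.0, 0.1, (64, N_PHASES))

def q_values(state):
    # Forward pass: ReLU hidden layer, linear Q-value output per phase.
    h = np.maximum(0.0, state @ W1)
    return h @ W2, h

def select_phase(state):
    # Epsilon-greedy action selection over the phase action space.
    if random.random() < EPSILON:
        return random.randrange(N_PHASES)
    q, _ = q_values(state)
    return int(np.argmax(q))

def update(state, action, reward, next_state):
    # One-step Q-learning target: r + gamma * max_a' Q(s', a'),
    # followed by a manual gradient step on the squared TD error.
    global W1, W2
    q, h = q_values(state)
    q_next, _ = q_values(next_state)
    target = reward + GAMMA * float(np.max(q_next))
    td_error = target - q[action]
    grad_h = td_error * W2[:, action] * (h > 0.0)
    W2[:, action] += ALPHA * td_error * h
    W1 += ALPHA * np.outer(state, grad_h)

# Toy interaction loop; random observations stand in for a simulator step,
# and the reward is the negative total queue (an assumed reward signal).
state = rng.random(STATE_DIM)
for _ in range(100):
    action = select_phase(state)
    next_state = rng.random(STATE_DIM)
    reward = -float(np.sum(next_state[:N_APPROACHES]))
    update(state, action, reward, next_state)
    state = next_state

In the paper's setting, the random observations would be replaced by queue and density measurements from the microsimulation after each phase decision; the advantage actor-critic variant differs by learning a policy and a value baseline rather than Q-values.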
Genders, Wade (author) / Razavi, Saiedeh (author)
Published: 2019-10-22
Type: Article (Journal)
Media: Electronic Resource
Language: Unknown
Similar items:
Reinforcement Learning for True Adaptive Traffic Signal Control
Online Contents | 2003
Intelligent traffic signal controller for heterogeneous traffic using reinforcement learning
DOAJ | 2023
Continuous residual reinforcement learning for traffic signal control optimization
British Library Online Contents | 2018
Improved Deep Reinforcement Learning for Intelligent Traffic Signal Control Using ECA_LSTM Network
DOAJ | 2023
Automated Adaptive Traffic Corridor Control Using Reinforcement Learning: Approach and Case Studies
British Library Online Contents | 2006