Towards adaptive traffic pattern clustering reinforcement learning in traffic signal control
Publication date
2026-05-07
Document type
Conference paper
Author
Klein, Lukas
Redeker, Magnus
Organisational unit
Fraunhofer IOSB, IOSB-INA
Publisher
Universitätsbibliothek der HSU/UniBw H
Book title
Machine learning for cyber physical systems : proceedings of the conference ML4CPS 2026
First page
73
Last page
83
Peer-reviewed
✅
Part of the university bibliography
No
Language
English
Keyword
Traffic signal control
Online adaptive traffic pattern clustering
Cluster-specific reinforcement learning
Time-to-real-world-deployment reduction
Abstract
In traffic signal control (TSC), real-time adaptation to fluctuating traffic patterns (TPs) is crucial for efficiency and safety. Traditional fixed-time and actuated schemes struggle under non-stationary demand, incidents, and the growing heterogeneity of urban traffic. While reinforcement learning (RL) offers powerful controllers, many RL approaches rely on fixed contexts or a single monolithic model, limiting learning efficiency and deployment in large networks. This paper therefore proposes a modular, online framework that (i) clusters traffic patterns online into a small set of clusters, (ii) trains cluster-specific RL agents online using cluster-specific simulation models, (iii) initializes new cluster agents from the most similar existing cluster to accelerate learning, and (iv) merges the most similar clusters when the maximum number of clusters is exceeded, retaining the agent that best covers the traffic space. The similarity metric used to compare TP trajectories and the simulation model used for RL training are applied to the same TP trajectories, ensuring consistent measurement and environment parametrization across clustering, training and real-time inference. The result is an adaptive TP-RL-TSC system that preserves accuracy while reducing computational burden and time to real-world deployment. The approach builds on adaptive two-scale ideas in which macro-evolution guides micro-level control; this philosophy is adapted to online TP clustering and cluster-specific RL.
This paper presents a conceptual TP-RL-TSC framework focusing on architecture, design rationale and workflows; empirical evaluation and deployment studies are planned for future work.
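The clustering workflow in steps (i), (iii) and (iv) of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: traffic patterns are reduced to fixed-length feature vectors, Euclidean distance stands in for the paper's TP similarity metric, the agent is a placeholder dictionary, the names OnlineTPClusterer, Cluster and new_cluster_threshold are hypothetical, and "the agent that best covers the traffic space" is approximated by keeping the agent of the cluster with more assigned patterns. Step (ii), the cluster-specific RL training, is omitted.

```python
import copy
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    centroid: list  # running mean of the TP feature vectors assigned so far
    count: int      # number of TPs assigned to this cluster
    agent: dict     # hypothetical placeholder for RL agent parameters


class OnlineTPClusterer:
    """Illustrative online clustering with a cap on the number of clusters."""

    def __init__(self, max_clusters, new_cluster_threshold):
        self.max_clusters = max_clusters
        self.threshold = new_cluster_threshold  # assumed distance cut-off
        self.clusters = []

    def _nearest(self, tp):
        # Index of the cluster whose centroid is closest to tp (Euclidean).
        return min(
            range(len(self.clusters)),
            key=lambda i: math.dist(self.clusters[i].centroid, tp),
        )

    def observe(self, tp):
        """Assign tp to a cluster and return the index of its nearest cluster."""
        tp = list(tp)
        if self.clusters:
            i = self._nearest(tp)
            c = self.clusters[i]
            if math.dist(c.centroid, tp) <= self.threshold:
                # Step (i): update the running mean of the matching cluster.
                c.count += 1
                c.centroid = [m + (x - m) / c.count for m, x in zip(c.centroid, tp)]
                return i
            # Step (iii): warm-start the new agent from the most similar cluster.
            agent = copy.deepcopy(c.agent)
        else:
            agent = {"weights": [0.0] * len(tp)}  # cold start for the first cluster
        self.clusters.append(Cluster(tp, 1, agent))
        if len(self.clusters) > self.max_clusters:
            self._merge_closest_pair()  # step (iv)
        return self._nearest(tp)

    def _merge_closest_pair(self):
        n = len(self.clusters)
        pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
        i, j = min(
            pairs,
            key=lambda p: math.dist(
                self.clusters[p[0]].centroid, self.clusters[p[1]].centroid
            ),
        )
        a, b = self.clusters[i], self.clusters[j]
        total = a.count + b.count
        centroid = [
            (a.count * x + b.count * y) / total
            for x, y in zip(a.centroid, b.centroid)
        ]
        # Assumed coverage proxy: keep the agent of the larger cluster.
        keeper = a if a.count >= b.count else b
        self.clusters[i] = Cluster(centroid, total, keeper.agent)
        del self.clusters[j]
```

In this sketch, calling observe on each incoming pattern returns the index of the cluster whose agent would control the signal for that pattern. The threshold and the cluster cap trade responsiveness to new patterns against the number of agents that must be trained.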
Description
This contribution is part of the conference proceedings, which are licensed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/)
Version
Published version
Access right on openHSU
Open access
