A Novel PID Design Method via Model-Based Reinforcement Learning Algorithms
Jesawada, Hozefa;Yerudkar, Amol;Singh, Navdeep;Del Vecchio, Carmen
2026-01-01
Abstract
This paper introduces a novel framework that bridges advanced reinforcement learning (RL) with traditional PID control by converting model-based RL policies into interpretable PID gains. By combining inverse reinforcement learning (IRL) with Kullback-Leibler divergence minimization, our method aligns sophisticated control strategies with the simplicity and robustness of PID controllers. In doing so, the proposed approach maintains the transparency and simplicity of PID controllers while incorporating the adaptability, data-driven optimization, and long-horizon planning capabilities of RL. Compatible with both model-based and model-free RL algorithms, the approach has been validated through extensive simulations on benchmark systems and real-world experiments on the Robotarium platform, demonstrating resilience against disturbances, parameter uncertainties, and noise. By blending the strengths of reinforcement learning with the practical familiarity of PID control, the proposed framework offers a data-efficient, scalable, and transparent solution for enhancing PID controller design in complex and dynamic environments.
Note to Practitioners - PID controllers remain widely used in automation and robotics due to their simplicity and reliability, yet tuning their gains for nonlinear or uncertain systems is often time-consuming and application-specific. This work presents a practical, data-driven approach for improving PID performance without altering the familiar controller structure. Instead of manual tuning or hand-crafted cost design, PID gains are learned directly from demonstration data generated by reinforcement learning or expert policies, enabling desirable behaviors such as stabilization, disturbance rejection, and robustness to uncertainty to be transferred automatically. The method requires only trajectory data and integrates easily with existing control pipelines, making it suitable when accurate models are unavailable or rapid retuning is needed. Because the final controller remains a standard PID law, it retains low computational overhead, interpretability, and compatibility with industrial hardware. The approach, therefore, offers a plug-and-play mechanism for upgrading conventional PID control in real-world automation systems.
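To make the idea of learning PID gains from demonstration data concrete, the following is a minimal illustrative sketch and not the algorithm of the paper, whose details are not given in this abstract. It assumes that both the demonstrator and the PID controller are Gaussian policies with the same fixed variance; under that assumption, minimizing the Kullback-Leibler divergence between them over the demonstrated states reduces to a least-squares fit of the PID law to the demonstrated actions. The function names `pid_features` and `fit_pid_gains`, the discretization scheme, and the synthetic trajectory are all hypothetical choices made for illustration.

```python
import numpy as np

def pid_features(error, dt):
    """Stack the P, I and D terms of a discrete PID law as regression features."""
    error = np.asarray(error, dtype=float)
    integral = np.cumsum(error) * dt                     # rectangular integration
    derivative = np.diff(error, prepend=error[0]) / dt   # backward difference, zero at t=0
    return np.column_stack([error, integral, derivative])

def fit_pid_gains(error, expert_action, dt):
    """Least-squares estimate of [Kp, Ki, Kd] from one demonstrated trajectory.

    Assumption (not from the paper): with equal-variance Gaussian policies,
    KL(expert || PID) is proportional to the squared gap between the mean
    actions, so minimizing it is an ordinary least-squares problem.
    """
    phi = pid_features(error, dt)
    target = np.asarray(expert_action, dtype=float)
    gains, *_ = np.linalg.lstsq(phi, target, rcond=None)
    return gains

# Synthetic stand-in for demonstration data: the "expert" here is itself a
# PID law with known gains, so the fit can be checked against ground truth.
dt = 0.01
t = np.arange(0.0, 5.0, dt)
error = np.exp(-t) * np.cos(3.0 * t)
true_gains = np.array([2.0, 0.5, 0.1])
expert_action = pid_features(error, dt) @ true_gains

gains = fit_pid_gains(error, expert_action, dt)
print("Estimated [Kp, Ki, Kd]:", gains)
```

In practice the demonstrated actions would come from an RL or expert policy rather than from a PID law, so the fit would be approximate and the residual would indicate how well a fixed-gain PID structure can reproduce the demonstrated behavior.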


