The rise of reinforcement learning (RL) has guided a new paradigm: unraveling the inter- vention strategies to control systems with unknown dynamics. Model-free RL provides an exhaustive framework to devise therapeutic methods to alter the regulatory dynamics of gene regulatory networks (GRNs). This paper presents an RL-based technique to control GRNs modeled as probabilistic Boolean control networks (PBCNs). In particular, a double deep-Q network (DDQN) approach is proposed to address the sampled-data control (SDC) problem of PBCNs, and optimal state feedback controllers are obtained, rendering the PBCNs stabilized at a given equilibrium point. Our approach is based on options, i.e., the temporal abstractions of control actions in the Markov decision processes (MDPs) framework. First, we define options and hierarchical options and give their properties. Then, we introduce multi-time models to compute the optimal policies leveraging the options framework. Furthermore, we present a DDQN algorithm: i) to concurrently design the feedback controller and the sampling period; ii) wherein the controller intelligently decides the sampled period to update the control actions under the SDC scheme. The pre- sented method is model-free and offers scalability, thereby providing an efficient way to control large-scale PBCNs. Finally, we compare our control policy with state-of-the-art con- trol techniques and validate the presented results.
Sampled-data Control of Probabilistic Boolean Control Networks: A Deep Reinforcement Learning Approach
Yerudkar, Amol;Del Vecchio, Carmen;
2023-01-01
Abstract
The rise of reinforcement learning (RL) has guided a new paradigm: unraveling the inter- vention strategies to control systems with unknown dynamics. Model-free RL provides an exhaustive framework to devise therapeutic methods to alter the regulatory dynamics of gene regulatory networks (GRNs). This paper presents an RL-based technique to control GRNs modeled as probabilistic Boolean control networks (PBCNs). In particular, a double deep-Q network (DDQN) approach is proposed to address the sampled-data control (SDC) problem of PBCNs, and optimal state feedback controllers are obtained, rendering the PBCNs stabilized at a given equilibrium point. Our approach is based on options, i.e., the temporal abstractions of control actions in the Markov decision processes (MDPs) framework. First, we define options and hierarchical options and give their properties. Then, we introduce multi-time models to compute the optimal policies leveraging the options framework. Furthermore, we present a DDQN algorithm: i) to concurrently design the feedback controller and the sampling period; ii) wherein the controller intelligently decides the sampled period to update the control actions under the SDC scheme. The pre- sented method is model-free and offers scalability, thereby providing an efficient way to control large-scale PBCNs. Finally, we compare our control policy with state-of-the-art con- trol techniques and validate the presented results.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.