A data-driven approximate dynamic programming approach based on association rule learning: Spacecraft autonomy as a case study

D'Angelo G.; Tipaldi M.; Glielmo L.
2019-01-01

Abstract

Dynamic programming (DP) and Markov Decision Processes (MDPs) offer powerful tools for formulating, modeling, and solving decision-making problems under uncertainty. In real-world applications, however, the applicability of DP is limited by severe scalability issues, which Approximate Dynamic Programming (ADP) techniques can address. ADP methods rest on the assumption that either a proper estimate of the underlying state transition probability distributions or a simulation mechanism capable of generating samples from such distributions is available. In this paper, we present a data-driven ADP-based approach that offers an alternative when this assumption cannot be guaranteed. In particular, by varying the set-up of the MDP state transition probability matrix, different policies can be calculated through exact DP or ADP methods. Such policies are then processed by an Apriori-based algorithm to find frequent association rules within them. A pruning procedure selects the most suitable association rules, and finally an Association Classifier infers the optimal policy under all possible circumstances. We show a detailed application of the proposed approach to the calculation of a proper mission operations plan for spacecraft with a high level of on-board autonomy.
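To make the pipeline concrete, below is a minimal Python sketch of the four steps the abstract describes. The toy MDP, the random Dirichlet transition set-ups, the support/confidence thresholds, and the restriction to single-antecedent "state -> action" rules are all illustrative assumptions, not the paper's actual model or its full Apriori implementation; the sketch only shows how optimal policies computed under varying transition matrices can be mined for association rules that a classifier then applies.

import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    # Exact DP. P[a, s, s2] is the probability of moving from state s to
    # state s2 under action a; R[s, a] is the immediate reward.
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * np.einsum("asn,n->sa", P, V)   # Bellman backup
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1)                    # greedy policy
        V = V_new

rng = np.random.default_rng(0)
n_states, n_actions = 6, 3                 # illustrative problem size
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

# Step 1: vary the set-up of the state transition probability matrix and
# compute one optimal policy per set-up through exact DP.
policies = []
for _ in range(50):
    P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
    policies.append(value_iteration(P, R))

# Step 2: encode each (state, optimal action) pair of each policy as a
# transaction and count supports, i.e. the first pass of an Apriori-style
# miner, restricted here to single-antecedent rules "state -> action".
transactions = [(f"state={s}", f"action={a}")
                for policy in policies for s, a in enumerate(policy)]
support = {}
for t in transactions:
    for key in (t[0], t[1], t):            # single items and the full pair
        support[key] = support.get(key, 0) + 1

# Step 3: prune rules below (illustrative) support/confidence thresholds,
# keeping the best surviving rule per state.
min_support, min_confidence = 0.05, 0.5
n = len(transactions)
rules = {}
for key, count in support.items():
    if not isinstance(key, tuple):
        continue
    s_item, a_item = key
    confidence = count / support[s_item]   # supp(X u Y) / supp(X)
    if count / n >= min_support and confidence >= min_confidence:
        if s_item not in rules or confidence > rules[s_item][1]:
            rules[s_item] = (a_item, confidence)

# Step 4: the association classifier recommends, for each state, the
# action of its highest-confidence surviving rule.
for s_item, (a_item, confidence) in sorted(rules.items()):
    print(f"{s_item} -> {a_item}  (confidence {confidence:.2f})")

In the paper itself, an Apriori-based algorithm over the full policies and a dedicated pruning procedure take the place of the single-item counts used in this sketch.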
Keywords: Approximate dynamic programming; Apriori classifier; Association classifier; Association rules; Markov decision process; Spacecraft autonomy; Stochastic optimal control


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12070/46189
Citations
  • PMC: N/A
  • Scopus: 19
  • Web of Science: 18