On the convex formulations of robust Markov decision processes

Grand-Clément, Julien; Petrik, Marek

arXiv:2209.10187 (math)

[Submitted on 21 Sep 2022 (v1), last revised 13 Dec 2023 (this version, v2)]

Abstract: Robust Markov decision processes (RMDPs) are used for applications of dynamic optimization in uncertain environments and have been studied extensively. Many of the main properties and algorithms of MDPs, such as value iteration and policy iteration, extend directly to RMDPs. Surprisingly, there is no known analog of the MDP convex optimization formulation for solving RMDPs. This work describes the first convex optimization formulation of RMDPs under the classical sa-rectangularity and s-rectangularity assumptions. By using entropic regularization and an exponential change of variables, we derive a convex formulation whose number of variables and constraints is polynomial in the number of states and actions, but which has large coefficients in the constraints. We further simplify the formulation for RMDPs with polyhedral, ellipsoidal, or entropy-based uncertainty sets, showing that, in these cases, RMDPs can be reformulated as conic programs based on exponential cones, quadratic cones, and non-negative orthants. Our work opens a new research direction for RMDPs and can serve as a first step toward obtaining a tractable convex formulation of RMDPs.
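For readers unfamiliar with how value iteration carries over to RMDPs, the following is a minimal sketch of robust value iteration under sa-rectangularity. It is not the convex formulation derived in the paper. It assumes, purely for illustration, that the uncertainty set for each state-action pair is the convex hull of a finite list of candidate transition distributions, so the adversary's inner minimum is attained at one of the candidates. The function name `robust_value_iteration`, the array layout, and the toy numbers are all hypothetical.

```python
import numpy as np


def robust_value_iteration(rewards, kernels, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Robust value iteration for an sa-rectangular RMDP (illustrative sketch).

    rewards: array of shape (S, A), reward for each state-action pair.
    kernels: array of shape (K, S, A, S); kernels[k, s, a] is the k-th
             candidate next-state distribution for the pair (s, a).
    """
    def robust_q(v):
        # Expected next-state value under every candidate: shape (K, S, A).
        next_values = kernels @ v
        # sa-rectangularity: the adversary picks the worst candidate
        # independently for each (s, a) pair.
        return rewards + gamma * next_values.min(axis=0)

    v = np.zeros(rewards.shape[0])
    for _ in range(max_iter):
        v_new = robust_q(v).max(axis=1)  # the agent maximizes over actions
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    return v, robust_q(v).argmax(axis=1)


# Hypothetical toy instance: 2 states, 2 actions, 2 candidate kernels.
nominal = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
])
perturbed = np.array([
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.6, 0.4], [0.3, 0.7]],
])
rewards = np.array([[1.0, 0.0], [0.0, 2.0]])

# A singleton uncertainty set recovers ordinary value iteration.
v_nominal, _ = robust_value_iteration(rewards, nominal[None])
# Adding a second candidate can only lower the worst-case value.
v_robust, policy = robust_value_iteration(rewards, np.stack([nominal, perturbed]))
```

With a single candidate kernel the routine reduces to standard value iteration, and enlarging the candidate list can only decrease the resulting worst-case value function componentwise.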

Subjects:	Optimization and Control (math.OC); Machine Learning (cs.LG)
Cite as:	arXiv:2209.10187 [math.OC]
	(or arXiv:2209.10187v2 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2209.10187

Submission history

From: Marek Petrik
[v1] Wed, 21 Sep 2022 08:39:02 UTC (5,654 KB)
[v2] Wed, 13 Dec 2023 14:40:03 UTC (10,222 KB)
