Improved Algorithms for Misspecified Linear Markov Decision Processes

Vial, Daniel; Parulekar, Advait; Shakkottai, Sanjay; Srikant, R.

arXiv:2109.05546 (cs)

[Submitted on 12 Sep 2021 (v1), last revised 19 Oct 2021 (this version, v2)]

Abstract: For the misspecified linear Markov decision process (MLMDP) model of Jin et al. [2020], we propose an algorithm with three desirable properties. (P1) Its regret after $K$ episodes scales as $K \max \{ \varepsilon_{\text{mis}}, \varepsilon_{\text{tol}} \}$, where $\varepsilon_{\text{mis}}$ is the degree of misspecification and $\varepsilon_{\text{tol}}$ is a user-specified error tolerance. (P2) Its space and per-episode time complexities remain bounded as $K \rightarrow \infty$. (P3) It does not require $\varepsilon_{\text{mis}}$ as input. To our knowledge, this is the first algorithm satisfying all three properties. For concrete choices of $\varepsilon_{\text{tol}}$, we also improve existing regret bounds (up to log factors) while achieving either (P2) or (P3) (existing algorithms satisfy neither). At a high level, our algorithm generalizes (to MLMDPs) and refines the Sup-Lin-UCB algorithm, which Takemura et al. [2021] recently showed satisfies (P3) for contextual bandits. We also provide an intuitive interpretation of their result, which informs the design of our algorithm.
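As a rough illustration of the scaling in (P1), the sketch below evaluates only the leading term $K \max \{ \varepsilon_{\text{mis}}, \varepsilon_{\text{tol}} \}$. It is not the paper's regret bound: constants, logarithmic factors, and any dependence on other problem parameters are omitted. The function name `regret_scale` and the numeric values are assumptions made for illustration.

```python
def regret_scale(num_episodes: int, eps_mis: float, eps_tol: float) -> float:
    """Leading-order term K * max(eps_mis, eps_tol) from property (P1).

    Illustrative only: constants, log factors, and dependence on other
    problem parameters are deliberately left out.
    """
    if num_episodes < 0 or eps_mis < 0 or eps_tol < 0:
        raise ValueError("all arguments must be non-negative")
    return num_episodes * max(eps_mis, eps_tol)


# Hypothetical values, chosen only to show which term dominates.
K = 10_000
print(regret_scale(K, eps_mis=0.01, eps_tol=0.05))  # tolerance dominates
print(regret_scale(K, eps_mis=0.05, eps_tol=0.01))  # misspecification dominates
```

In this sketch both calls return the same value, which reflects that the term depends only on the larger of the two quantities: choosing $\varepsilon_{\text{tol}}$ below $\varepsilon_{\text{mis}}$ does not reduce it further.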

Comments:	This version adds an intuitive explanation in Section 3
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2109.05546 [cs.LG]
	(or arXiv:2109.05546v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2109.05546
Journal reference:	International Conference on Artificial Intelligence and Statistics (AISTATS) 2022

Submission history

From: Daniel Vial
[v1] Sun, 12 Sep 2021 16:02:32 UTC (30 KB)
[v2] Tue, 19 Oct 2021 14:12:33 UTC (34 KB)
