Primal-Dual Algorithm for Distributed Reinforcement Learning: Distributed GTD

Lee, Donghwan; Yoon, Hyungjin; Hovakimyan, Naira

Mathematics > Optimization and Control

arXiv:1803.08031 (math)

[Submitted on 21 Mar 2018 (v1), last revised 22 Aug 2018 (this version, v2)]

Abstract: The goal of this paper is to study a distributed version of the gradient temporal-difference (GTD) learning algorithm for multi-agent Markov decision processes (MDPs). Temporal-difference (TD) learning is a reinforcement learning (RL) algorithm that learns an infinite-horizon discounted cost function (or value function) for a given fixed policy without knowledge of the model. In the distributed RL case, each agent receives a local reward through local processing. Information exchange over a sparse communication network allows the agents to learn the global value function corresponding to a global reward, which is the sum of the local rewards. In this paper, the problem is converted into a constrained convex optimization problem with a consensus constraint. We then propose a primal-dual distributed GTD algorithm and prove that it converges almost surely to a set of stationary points of the optimization problem.
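
As a rough illustration of the setting described in the abstract, the sketch below runs plain tabular TD(0) policy evaluation on a toy two-state chain. It is not the distributed GTD algorithm, whose update rules the abstract does not give. The chain (each next state drawn uniformly at random), the reward values, the step size, and the names `td0` and `sample_trajectory` are all assumptions made up for this example. It shows that the value function of the global reward is the sum of the value functions of the local rewards, which is the quantity the agents are trying to learn jointly.

```python
import random


def td0(rewards, trajectory, gamma=0.5, alpha=0.01):
    """Tabular TD(0) policy evaluation along a fixed state trajectory."""
    values = [0.0] * len(rewards)
    for s, s_next in zip(trajectory, trajectory[1:]):
        # TD error: one-step bootstrapped target minus the current estimate.
        td_error = rewards[s] + gamma * values[s_next] - values[s]
        values[s] += alpha * td_error
    return values


def sample_trajectory(num_steps, seed=0):
    """Hypothetical two-state chain: every next state is uniform over {0, 1}."""
    rng = random.Random(seed)
    return [rng.randrange(2) for _ in range(num_steps + 1)]


# Hypothetical local rewards, one list per agent, indexed by state.
local_rewards = [[1.0, 0.0], [0.0, 2.0]]
# The global reward is the sum of the local rewards in each state.
global_reward = [sum(r) for r in zip(*local_rewards)]

trajectory = sample_trajectory(100_000)

# Value estimate for the global reward (centralized, for reference only).
v_global = td0(global_reward, trajectory)

# Value estimates each agent would get from its own local reward alone.
v_local = [td0(r, trajectory) for r in local_rewards]
v_sum = [sum(v) for v in zip(*v_local)]

print("global:", v_global)
print("sum of local:", v_sum)
```

Because the TD(0) update is linear in the reward, the two printed estimates agree up to floating-point rounding when they share the same trajectory and step size. In the distributed setting no single agent observes the global reward, which is why the paper introduces a consensus constraint over the communication network instead of summing the rewards centrally as done here.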

Comments:	Submitted to CDC2018
Subjects:	Optimization and Control (math.OC)
Cite as:	arXiv:1803.08031 [math.OC]
	(or arXiv:1803.08031v2 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.1803.08031

Submission history

From: Donghwan Lee
[v1] Wed, 21 Mar 2018 17:46:45 UTC (460 KB)
[v2] Wed, 22 Aug 2018 13:03:18 UTC (573 KB)
