Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration

Chen, Zhaorun; Chen, Binhao; Xie, Shenghan; Gong, Liang; Liu, Chengliang; Zhang, Zhengfeng; Zhang, Junping

Computer Science > Machine Learning

arXiv:2109.13005 (cs)

[Submitted on 27 Sep 2021]

Abstract: In complex, high-dimensional environments, training a reinforcement learning (RL) model from scratch often requires a lengthy and tedious collection of agent-environment interactions. Leveraging expert demonstrations to guide the RL agent can instead boost sample efficiency and improve final convergence. To better integrate expert priors with on-policy RL models, we propose a generic framework for Learning from Demonstration (LfD) based on actor-critic algorithms. Technically, we first employ K-Means clustering to evaluate the similarity of sampled exploration to the demonstration data. We then increase the likelihood of actions in similar frames by modifying the gradient update strategy to leverage the demonstrations. We conduct experiments on 4 standard benchmark environments in MuJoCo and 2 self-designed robotic environments. Results show that, under certain conditions, our algorithm can improve sample efficiency by 20% to 40%. When combined with our framework, on-policy RL algorithms converge faster and obtain better final mean episode rewards, especially in complex robotic contexts where interactions are expensive.
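The two steps named in the abstract (K-Means similarity scoring, then a modified gradient update) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the choice of state-action vectors as "frames", the distance threshold `radius`, the weight `bonus`, and all function names below are assumptions made only for illustration.

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on an (n, d) array; returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids

def demo_like_mask(samples, centroids, radius):
    """True where a sampled frame lies within `radius` of a demonstration centroid.

    ASSUMPTION: the abstract does not specify a similarity rule; a fixed
    Euclidean radius is used here purely as a stand-in.
    """
    dists = np.linalg.norm(samples[:, None, :] - centroids[None, :, :], axis=2)
    return dists.min(axis=1) <= radius

def lfd_policy_loss(log_probs, advantages, mask, bonus=0.5):
    """Policy-gradient surrogate plus a log-likelihood bonus on demo-like frames.

    ASSUMPTION: minimizing the extra term raises log pi(a|s) on masked frames,
    which is one possible reading of "increase the likelihood of actions in
    similar frames"; the paper's actual update may differ.
    """
    pg_term = -np.mean(log_probs * advantages)
    demo_term = -bonus * np.mean(mask.astype(float) * log_probs)
    return pg_term + demo_term
```

In an actual actor-critic loop, `log_probs` would come from the policy network and the loss would be differentiated by an autodiff framework; NumPy is used here only to keep the arithmetic visible.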

Comments:	This paper was accepted at The 3rd International Symposium on Robotics & Intelligent Manufacturing Technology (ISRIMT 2021) and nominated as the "best paper"
Subjects:	Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); Robotics (cs.RO); Systems and Control (eess.SY)
Cite as:	arXiv:2109.13005 [cs.LG]
	(or arXiv:2109.13005v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2109.13005

Submission history

From: Zhaorun Chen
[v1] Mon, 27 Sep 2021 12:42:05 UTC (489 KB)
