Mitigation of Adversarial Policy Imitation via Constrained Randomization of Policy (CRoP)

Piazza, Nancirose; Behzadan, Vahid

arXiv:2109.14678 (cs)

[Submitted on 29 Sep 2021]

Abstract: Deep reinforcement learning (DRL) policies are vulnerable to unauthorized replication attacks, in which an adversary exploits imitation learning to reproduce target policies from observed behavior. In this paper, we propose Constrained Randomization of Policy (CRoP) as a mitigation technique against such attacks. CRoP induces the execution of sub-optimal actions at random under performance loss constraints. We present a parametric analysis of CRoP, address the optimality of CRoP, and establish theoretical bounds on the adversarial budget and the expectation of loss. Furthermore, we report an experimental evaluation of CRoP in Atari environments under adversarial imitation, which demonstrates the efficacy and feasibility of the proposed method against policy replication attacks.
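The idea of executing sub-optimal actions at random under a performance loss constraint can be illustrated with a minimal sketch. This is not the paper's exact formulation: the function name `crop_action`, the deviation probability `delta`, and the value-gap threshold `rho` are hypothetical names chosen here for illustration, and the sketch assumes the policy exposes per-action value estimates (for example, Q-values).

```python
import random
from typing import Optional, Sequence


def crop_action(
    q_values: Sequence[float],
    delta: float,
    rho: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Return an action index, occasionally deviating from the greedy choice.

    Assumptions (not taken from the paper): `delta` is the probability of
    deviating, and `rho` is the largest tolerated gap in estimated value
    between the greedy action and the action actually executed.
    """
    rng = rng or random.Random()
    # Greedy action: the index with the highest estimated value.
    best = max(range(len(q_values)), key=lambda a: q_values[a])
    # Candidate set: non-greedy actions whose estimated value is within rho of the best.
    candidates = [
        a for a in range(len(q_values))
        if a != best and q_values[best] - q_values[a] <= rho
    ]
    # Deviate only with probability delta, and only if a constrained candidate exists.
    if candidates and rng.random() < delta:
        return rng.choice(candidates)
    return best
```

With `delta=0.0` the sketch reduces to the greedy policy. Larger values of `delta` and `rho` make the observed behavior noisier, and so presumably harder to imitate, at the cost of a larger expected loss in return.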

Comments:	5 pages not including references; 7 figures; more figures in supplements
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2109.14678 [cs.LG]
	(or arXiv:2109.14678v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2109.14678

Submission history

From: Nancirose Piazza
[v1] Wed, 29 Sep 2021 19:29:10 UTC (2,989 KB)
