The aim of this project was to explore adversarial attacks and defenses in single-agent as well as multi-agent Reinforcement Learning. In the single-agent domain, we focus on pixel-based attacks in Atari games from the Gym environments. In the multi-agent setting, we concentrate on attacks that train adversarial policies in 1-vs-1 zero-sum continuous-control robotic environments from the MuJoCo simulator. We also studied potential defense procedures to counter such attacks.
A detailed article about the methods and approaches studied during the project can be found here. We have also implemented some of these in this repository.
We also have a blog with articles on the several concepts involved in the project.
- `LearningPhaseAssignments` contains the Reinforcement Learning algorithms implemented during the learning phase of the project. This includes:
  - Tabular SARSA & Q-Learning
  - Deep Q-Networks (DQN)
  - Vanilla Policy Gradients (VPG/REINFORCE)
- `Adversarial-policies` contains a Tensorflow implementation of the attack by training adversarial policies. The implementation in this folder is structured as follows:
  - `agent-zoo`: Contains the pre-trained agent parameters for the environments described in Bansal et al., 2018a. Source
  - `abstraction.py`: A wrapper over the multi-agent environment (a two-player Markov game) so it can be used as a single-agent environment. It embeds the victim into the environment, with the adversarial agent taking actions and receiving observations and reward signals.
  - `policy.py`: Contains the implementation of the MLP and LSTM network policies of the agents.
  - `train.py`: Contains the code for training the adversarial policy using Proximal Policy Optimization (PPO).
  - `show.py`: Contains the testing and video-making code.
  - `finallog.txt`: Output logs from the training procedure.
  - `knd_results.txt`: Attack accuracy (win percentage of the adversary) in the Kick-and-Defend environment.
  - `knd3.zip`: Trained parameters for the adversarial policy in Kick-and-Defend.
  - `videos`: Videos displaying the adversarial attack in Kick-and-Defend.
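The single-agent abstraction described above (embedding a fixed victim into a two-player environment) can be sketched roughly like this; the class and method names are hypothetical and do not reflect the actual interface of `abstraction.py`:

```python
# Sketch: wrap a two-player Markov game as a single-agent environment by
# freezing the victim's policy inside the wrapper (hypothetical interface).

class SingleAgentWrapper:
    def __init__(self, two_player_env, victim_policy):
        self.env = two_player_env      # gym-style reset()/step(), paired obs/rewards
        self.victim = victim_policy    # fixed pre-trained opponent

    def reset(self):
        obs_adv, obs_victim = self.env.reset()
        self._victim_obs = obs_victim
        return obs_adv                 # the adversary only sees its own observation

    def step(self, adv_action):
        # The victim acts with its frozen policy; only the adversary's
        # action comes from the policy being trained.
        victim_action = self.victim(self._victim_obs)
        (obs_adv, obs_victim), (rew_adv, _), done, info = self.env.step(
            (adv_action, victim_action))
        self._victim_obs = obs_victim
        return obs_adv, rew_adv, done, info
```

From the adversary's point of view, the wrapped object behaves like an ordinary single-agent environment, so standard RL algorithms such as PPO can be applied unchanged.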
- `FGSM-on-Images` contains a PyTorch implementation of pixel-based attacks on images, along with output plots and images with varying perturbations.
  - `fast_gradient_sign_method.py`: Contains the implementation of the Fast Gradient Sign Method (FGSM) on the MNIST dataset.
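The core FGSM step, x' = x + ε · sign(∇ₓL(x, y)), can be illustrated with a tiny NumPy sketch on a logistic model with an analytically computed gradient (the repo's implementation uses PyTorch on MNIST; everything below is an illustrative stand-in):

```python
import numpy as np

def fgsm(x, grad_x, eps):
    """FGSM: move each input coordinate eps in the direction that increases the loss."""
    return x + eps * np.sign(grad_x)

# Toy example: logistic model p = sigmoid(w . x), cross-entropy loss, true label y = 1.
# The gradient of the loss with respect to the input is (p - y) * w.
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.2, 0.1, 0.4])
p = 1.0 / (1.0 + np.exp(-(w @ x)))
grad_x = (p - 1.0) * w                  # dL/dx for y = 1
x_adv = fgsm(x, grad_x, eps=0.1)        # -> [0.1, 0.2, 0.3]
```

After the attack, the model's logit `w @ x_adv` is lower than `w @ x`, i.e. the perturbation increased the loss for the true label, as intended.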
- `Adversarial-attacks-on-DNN-policies` contains a PyTorch implementation of the FGSM attack on neural-network policies in the Atari Pong environment.
  - `Adversarial-Attack`: Contains the code, stats, and videos for L1, L2, and Linf norm adversarial attacks on Pong agents in `WhiteBox` as well as `BlackBox` conditions.
  - `Test`: Code, stats, and videos for the Pong agent before the adversarial attack.
  - `Train`: Code and videos for training a Pong agent using PPO.
  - `policy-zoo`: Pre-trained policies used for the attacks.
  - `ppo2_pong.zip`: Trained parameters for the Pong agent trained using PPO.
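The L1, L2, and Linf variants of such attacks differ mainly in how the gradient is turned into a perturbation under a budget ε. A hedged sketch of the three normalizations (illustrative only, not the repo's code):

```python
import numpy as np

def normalize_perturbation(grad, eps, norm):
    """Turn a loss gradient into a perturbation whose given norm equals eps."""
    if norm == "linf":
        return eps * np.sign(grad)                       # eps in every coordinate
    if norm == "l2":
        return eps * grad / (np.linalg.norm(grad) + 1e-12)  # rescale to length eps
    if norm == "l1":
        # Concentrate the whole budget on the largest-magnitude coordinate.
        delta = np.zeros_like(grad)
        i = np.argmax(np.abs(grad))
        delta[i] = eps * np.sign(grad[i])
        return delta
    raise ValueError(f"unknown norm: {norm}")

g = np.array([0.5, -2.0, 1.0])
d_inf = normalize_perturbation(g, 0.1, "linf")   # [0.1, -0.1, 0.1]
d_l2  = normalize_perturbation(g, 0.1, "l2")     # L2 length exactly 0.1
d_l1  = normalize_perturbation(g, 0.1, "l1")     # [0.0, -0.1, 0.0]
```

The Linf version is exactly the FGSM step; the L1 and L2 versions trade breadth of the perturbation for concentration along the most sensitive directions.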
- PyTorch
- Tensorflow (version 1.x)
- Stable-Baselines (version 2.10.1a1)
- MuJoCo 131
- Madhuparna Bhowmik
- Akash Nair
- Saurabh Agarwala
- Videh Raj Nema
- Kinshuk Kashyap
- Manav Singhal
Mentor: Moksh Jain