Simple implementations of RL algorithms.
PPO: A simple PPO implementation in PyTorch for both continuous and discrete action spaces.
SAC: SAC for continuous action spaces, tested on Hopper-v4 and Pendulum-v1.
TD3: TD3 implemented from scratch in PyTorch and tested on the HalfCheetah and Pendulum environments.
DDPG: Deep Deterministic Policy Gradient implementation.
DQN: DQN implementation in PyTorch. I based it on the RL section of the PyTorch documentation, with some small changes and a different environment, and used both ReLU and Fuzzy Tiling Activations (FTA).
Tabular: Implementations of tabular algorithms from "Reinforcement Learning: An Introduction", tested on different gridworlds or gym environments.
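
To give a feel for what the tabular entries cover, here is a minimal sketch of tabular Q-learning using only the Python standard library. It is not the code from this repository. The 1-D chain environment, the function name q_learning_chain, and all hyperparameter values are assumptions chosen for illustration.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    The agent starts in state 0 and receives reward 1.0 on reaching the
    rightmost state, which ends the episode. Actions: 0 = left, 1 = right.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # Q[s][a], initialised to zero for every state-action pair.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap so early random walks terminate
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            # Deterministic transition; moving left from state 0 stays put.
            s_next = max(0, s - 1) if a == 0 else min(goal, s + 1)
            done = s_next == goal
            r = 1.0 if done else 0.0

            # Q-learning update: bootstrap from the greedy value of s_next,
            # with no bootstrapping from the terminal state.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])

            s = s_next
            if done:
                break
    return q


q = q_learning_chain()
# Greedy action in every non-terminal state (0 = left, 1 = right).
greedy = [0 if row[0] > row[1] else 1 for row in q[:-1]]
print(greedy)
```

In this toy setting the learned greedy policy should prefer moving right in every non-terminal state, since that is the only way to reach the reward. The same update rule carries over to a gridworld by replacing the transition logic.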