This repository contains our solutions to the assignment problems of the course "CS420/414 : Reinforcement Learning" offered by Dr. Prabuchandran K J at IIT Dharwad
-
Assignment 2 : Bandit Algorithms
- Implemented epsilon-greedy, variable epsilon-greedy, Softmax, Upper Confidence Bound (UCB) and Thompson sampling algorithms for Bernoulli and Normal reward settings.
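
As a rough illustration of two of the algorithms listed above, here is a minimal sketch of epsilon-greedy and UCB1 on a Bernoulli bandit. It is not the assignment code: the names `run_bandit`, `epsilon_greedy` and `ucb1`, the arm probabilities and the exploration constant are all assumptions chosen for the example.

```python
import math
import random


def run_bandit(probs, steps, select, seed=0):
    """Play a Bernoulli bandit; `select(t, counts, values, rng)` picks an arm."""
    rng = random.Random(seed)
    counts = [0] * len(probs)    # number of pulls per arm
    values = [0.0] * len(probs)  # running mean reward per arm
    for t in range(1, steps + 1):
        arm = select(t, counts, values, rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


def epsilon_greedy(epsilon):
    """Build a selector that explores with probability `epsilon`."""
    def select(t, counts, values, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(values))  # explore uniformly
        return max(range(len(values)), key=values.__getitem__)  # exploit
    return select


def ucb1(t, counts, values, rng):
    """UCB1: pull each arm once, then maximise mean plus confidence bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(values)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


# Hypothetical three-armed problem; the last arm is the best one.
PROBS = [0.2, 0.5, 0.8]
eg_counts, eg_values = run_bandit(PROBS, 5000, epsilon_greedy(0.1))
ucb_counts, ucb_values = run_bandit(PROBS, 5000, ucb1)
```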
-
Assignment 3 : Value Based Methods
- Solved a classical maze problem using policy iteration and value iteration.
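
To make the idea concrete, the following is a small sketch of value iteration on a grid maze. The maze layout, the step cost of -1, the deterministic moves and the helper name `value_iteration` are assumptions for illustration only and may differ from the maze used in the assignment.

```python
def value_iteration(grid, gamma=1.0, tol=1e-8):
    """Value iteration on a deterministic grid maze.

    Assumed encoding: '#' is a wall, 'G' is the terminal goal, anything
    else is a free cell. Every move costs -1; moving into a wall or off
    the grid leaves the agent where it is.
    """
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    V = {
        (r, c): 0.0
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
        if cell != "#"
    }

    def step(state, move):
        nxt = (state[0] + move[0], state[1] + move[1])
        return nxt if nxt in V else state

    def is_goal(state):
        return grid[state[0]][state[1]] == "G"

    while True:
        delta = 0.0
        for s in V:
            if is_goal(s):
                continue  # terminal state keeps value 0
            best = max(-1.0 + gamma * V[step(s, m)] for m in moves)
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place (Gauss-Seidel style) sweep
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {
        s: max(moves, key=lambda m: -1.0 + gamma * V[step(s, m)])
        for s in V
        if not is_goal(s)
    }
    return V, policy


# Hypothetical 3x3 maze with one wall in the centre.
MAZE = ["S..",
        ".#.",
        "..G"]
V, policy = value_iteration(MAZE)
```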
-
Assignment 4 : Sample Based Monte-Carlo and Temporal Difference Methods
- Implemented Every-Visit Monte-Carlo, Q-learning and SARSA agents for the classical maze and Mountain Car environments.
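
The difference between the two temporal difference agents comes down to the bootstrap target. The sketch below shows the standard tabular update rules; the function names, the dictionary-based Q-table and the toy state and action labels are assumptions for the example, not the repository's code.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma, done=False):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    target = r if done else r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma, done=False):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r if done else r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Two toy Q-tables with identical starting values (hypothetical labels).
ACTIONS = ["left", "right"]
q_table = defaultdict(float, {("s1", "left"): 1.0, ("s1", "right"): 2.0})
sarsa_table = defaultdict(float, {("s1", "left"): 1.0, ("s1", "right"): 2.0})

# Same transition for both; SARSA's next action is the non-greedy "left".
q_learning_update(q_table, "s0", "right", -1.0, "s1", ACTIONS,
                  alpha=0.5, gamma=1.0)
sarsa_update(sarsa_table, "s0", "right", -1.0, "s1", "left",
             alpha=0.5, gamma=1.0)
```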
-
Assignment 5 : Temporal Difference Methods with Function Approximation and the REINFORCE Algorithm
- Implemented Q-learning and SARSA with tile coding and radial basis function (RBF) approximation, and REINFORCE with and without a baseline, for the Cart Pole and Mountain Car environments.
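
For readers unfamiliar with tile coding, here is a minimal sketch of a uniform-offset tile coder and the linear action value built on top of it. The function names, the number of tilings and bins, and the offset scheme are assumptions for illustration; the bounds are the usual Mountain Car position and velocity ranges.

```python
def tile_indices(x, lows, highs, n_tilings=4, n_bins=8):
    """Return one active tile index per tiling for the continuous state x.

    Each tiling is a uniform grid shifted by a fraction of one tile width,
    so a tiling needs n_bins + 1 cells per dimension to cover the range.
    """
    dims = len(x)
    tiles_per_tiling = (n_bins + 1) ** dims
    active = []
    for t in range(n_tilings):
        index = 0
        for d in range(dims):
            width = (highs[d] - lows[d]) / n_bins
            offset = width * t / n_tilings
            cell = int((x[d] - lows[d] + offset) / width)
            cell = min(max(cell, 0), n_bins)  # clamp out-of-range inputs
            index = index * (n_bins + 1) + cell
        active.append(t * tiles_per_tiling + index)
    return active


def q_value(weights, active):
    """Linear value estimate: sum of the weights of the active tiles."""
    return sum(weights[i] for i in active)


# Mountain Car state bounds: position in [-1.2, 0.6], velocity in [-0.07, 0.07].
LOWS, HIGHS = (-1.2, -0.07), (0.6, 0.07)
weights = [0.0] * (4 * 9 * 9)  # one weight vector per action in practice
corner = tile_indices((-1.2, -0.07), LOWS, HIGHS)
middle = tile_indices((-0.3, 0.0), LOWS, HIGHS)
```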
-
Mini Project : Policy Gradient Algorithms for Atari games
- Trained Ray RLlib A2C, A3C and PPO agents on the Pong, Breakout and Space Invaders Atari environments, compared their results, and explained each algorithm in the report.
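
As a pointer to what distinguishes PPO from A2C and A3C, the sketch below computes PPO's clipped surrogate objective for a batch of probability ratios and advantages. It is a plain-Python illustration with an assumed function name and clip value; it is not RLlib's implementation and omits the value and entropy terms.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate: min(rho * A, clip(rho, 1-eps, 1+eps) * A)."""
    terms = []
    for rho, adv in zip(ratios, advantages):
        clipped = min(max(rho, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes the incentive to push the ratio
        # outside the clipping range in the direction that helps.
        terms.append(min(rho * adv, clipped * adv))
    return sum(terms) / len(terms)


# A large ratio with positive advantage is capped at 1 + clip_eps.
capped = ppo_clip_objective([1.5], [1.0])
# A small ratio with negative advantage is held at 1 - clip_eps.
held = ppo_clip_objective([0.5], [-1.0])
```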