
ReinforcementLearning

Chapter-wise implementation & analysis of all the algorithms in RL: An Introduction by Richard S. Sutton and Andrew G. Barto.

Chapter 2

The notebook Greedy,e-Greedy,UCB,Gradient.ipynb demonstrates the working of the following algorithms:

  1. Greedy Algorithm
  2. epsilon-Greedy Algorithm
  3. UCB
  4. Gradient Bandit

The notebook also analyses the above algorithms with optimistic initial values. Results show that UCB outperforms the other algorithms on the stationary k-armed bandit problem.
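
For reference, below is a minimal sketch of the epsilon-greedy and UCB selection rules on a stationary k-armed bandit, assuming sample-average value estimates; the function and parameter names are illustrative and not taken from the notebook:

```python
import numpy as np

def run_bandit(k=10, steps=1000, eps=0.1, c=2.0, method="eps-greedy", seed=0):
    """One k-armed bandit run with incremental sample-average estimates."""
    rng = np.random.default_rng(seed)
    q_true = rng.normal(0.0, 1.0, k)   # true action values (stationary)
    Q = np.zeros(k)                    # estimated action values
    N = np.zeros(k)                    # action counts
    rewards = np.zeros(steps)

    for t in range(steps):
        if method == "ucb":
            # UCB: exploration bonus; unvisited arms are tried first
            bonus = c * np.sqrt(np.log(t + 1) / np.maximum(N, 1e-12))
            a = int(np.argmax(np.where(N == 0, np.inf, Q + bonus)))
        elif rng.random() < eps:       # epsilon-greedy: explore
            a = int(rng.integers(k))
        else:                          # greedy w.r.t. current estimates
            a = int(np.argmax(Q))

        r = rng.normal(q_true[a], 1.0)  # reward ~ N(q*(a), 1)
        N[a] += 1
        Q[a] += (r - Q[a]) / N[a]       # incremental sample-average update
        rewards[t] = r
    return rewards

# Compare average reward of the two strategies over a few independent runs
print(np.mean([run_bandit(method="eps-greedy", seed=s) for s in range(20)]))
print(np.mean([run_bandit(method="ucb", seed=s) for s in range(20)]))
```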

Chapter 4

The notebook RL using Dynamic Programming.ipynb demonstrates how to solve finite MDPs. The following algorithms are implemented:

  1. Policy Iteration with two arrays
  2. Policy Iteration using in-place updates
  3. Value Iteration with two arrays
  4. Value Iteration using in-place updates

The results clearly show that Value Iteration with in-place updates converges faster than the other three algorithms.
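
A minimal sketch of in-place Value Iteration is shown below, assuming the finite MDP is given as per-state, per-action lists of (probability, next state) transitions with expected rewards; this representation and the names are illustrative, not the notebook's actual code:

```python
import numpy as np

def value_iteration_inplace(P, R, gamma=0.9, theta=1e-8):
    """
    In-place Value Iteration for a finite MDP.
    P[s][a] -> list of (prob, next_state); R[s][a] -> expected reward.
    Each state is updated immediately, so later states in the same sweep
    already use the new values, which typically speeds up convergence.
    """
    n_states = len(P)
    V = np.zeros(n_states)
    while True:
        delta = 0.0
        for s in range(n_states):
            v_old = V[s]
            # Bellman optimality backup using the partially updated V
            V[s] = max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in range(len(P[s]))
            )
            delta = max(delta, abs(v_old - V[s]))
        if delta < theta:
            break
    # Extract a greedy policy from the converged value function
    policy = [
        max(range(len(P[s])),
            key=lambda a: R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a]))
        for s in range(n_states)
    ]
    return V, policy

# Tiny 2-state, 2-action example (transitions and rewards are made up)
P = [[[(1.0, 0)], [(1.0, 1)]],
     [[(1.0, 0)], [(1.0, 1)]]]
R = [[0.0, 1.0],
     [2.0, 0.0]]
print(value_iteration_inplace(P, R))
```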
