This course was offered by the School of Information Science and Technology in Fall 2021. It does not treat probability and statistics as purely mathematical subjects: the instructor, Prof. Ziyu Shao, also teaches connections between probability and information science.
This project is about the multi-armed bandit, a classical problem in reinforcement learning. It has three main parts:
- Simulation. We simulate the multi-armed bandit with different algorithms under a given set of parameters and report the output, i.e., the performance of each algorithm.
- Analysis. We analyze the impact of each hyperparameter of each algorithm. We also analyze an important issue in the multi-armed bandit problem: the exploration-exploitation trade-off.
- Design. We handle some variants of the classical scenario, including dependency and constraints, and design our own algorithms to address these new problem variants.
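
To make the simulation and analysis parts concrete, here is a minimal sketch of one bandit simulation. It assumes Bernoulli rewards and an epsilon-greedy policy; the algorithm, the function name `simulate_epsilon_greedy`, and all parameter values are illustrative assumptions, not necessarily the ones required by the project outline or used in the notebook.

```python
import numpy as np


def simulate_epsilon_greedy(true_means, epsilon, n_steps, seed=0):
    """Run one epsilon-greedy simulation on a Bernoulli bandit.

    Illustrative sketch only: `true_means`, `epsilon`, and `n_steps`
    are assumed parameters, not the project's actual settings.
    Returns the total reward and the number of pulls per arm.
    """
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    counts = np.zeros(n_arms, dtype=int)   # pulls per arm
    estimates = np.zeros(n_arms)           # running mean reward per arm
    total_reward = 0.0

    for _ in range(n_steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = int(rng.integers(n_arms))
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = int(np.argmax(estimates))

        # Bernoulli reward: 1 with probability true_means[arm], else 0.
        reward = float(rng.random() < true_means[arm])

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return total_reward, counts
```

Calling this function for several values of `epsilon` (for example 0.0, 0.1, and 0.5) with the same `true_means` is one simple way to observe the exploration-exploitation trade-off: a small `epsilon` can lock onto a suboptimal arm, while a large one spends many pulls on arms already known to be poor.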
Refer to the outline for the background and requirements of this project, and to the report for my full project report.
The programming part of this project is implemented in Python. The project ipynb contains all the code; run the simulation to reproduce the results.