Current implementations
- UCB1
- LinUCB
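The repository's own UCB1 implementation may differ in its details; as a minimal sketch of the technique, UCB1 plays each arm once and then always pulls the arm maximizing its empirical mean plus a `sqrt(2 ln t / n)` exploration bonus:

```python
import math
import random

class UCB1:
    """Minimal UCB1 sketch: empirical mean + sqrt(2 ln t / n) bonus."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms      # pulls per arm
        self.values = [0.0] * n_arms    # running mean reward per arm
        self.t = 0                      # total number of decisions

    def select_arm(self):
        self.t += 1
        # Play every arm once before applying the UCB rule.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        return max(
            range(len(self.counts)),
            key=lambda a: self.values[a]
            + math.sqrt(2 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        # Incremental mean update for the pulled arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Tiny demo on two Bernoulli arms (p = 0.9 vs. p = 0.1).
random.seed(0)
agent = UCB1(2)
probs = [0.9, 0.1]
for _ in range(2000):
    arm = agent.select_arm()
    agent.update(arm, 1.0 if random.random() < probs[arm] else 0.0)
```

After a couple of thousand rounds the agent should concentrate its pulls on the better arm while still occasionally sampling the worse one.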
Simulation example:
Change directory to example/scripts and run the script from the command line:
./simulation.sh
Experiments can be customized by changing the arguments and environments (Envs) in simulation.sh, example/envs/envs.py, and example/envs/bandits.py.
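The simulation pairs an agent with an environment and logs rewards over time. The repository's environment classes in example/envs will differ, but the overall loop can be sketched with a hypothetical synthetic linear-reward environment driving a disjoint-model LinUCB agent (all names here are illustrative, not the repo's API):

```python
import numpy as np

class LinUCB:
    """Sketch of disjoint-model LinUCB: one ridge-regression model per arm."""

    def __init__(self, n_arms, d, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(d) for _ in range(n_arms)]      # per-arm d x d Gram matrix
        self.b = [np.zeros(d) for _ in range(n_arms)]    # per-arm reward-weighted sums

    def select_arm(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                            # ridge estimate of arm weights
            # Point estimate plus confidence-width exploration bonus.
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

# Hypothetical experiment loop: contexts are random vectors, rewards are
# linear in the context plus Gaussian noise; arm 0 is strictly better.
rng = np.random.default_rng(0)
d, n_arms, rounds = 3, 2, 1500
true_theta = [np.array([0.8, 0.1, 0.1]), np.array([0.1, 0.1, 0.1])]
agent = LinUCB(n_arms, d, alpha=0.5)
pulls = [0, 0]
for _ in range(rounds):
    x = rng.random(d)
    arm = agent.select_arm(x)
    pulls[arm] += 1
    reward = true_theta[arm] @ x + 0.05 * rng.standard_normal()
    agent.update(arm, x, reward)
```

Swapping the environment's reward model or the agent's `alpha` is the kind of change the shell-script arguments and env files above are meant to control.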
Based on original work by bgalbraith.

References:
- Lihong Li, Wei Chu, John Langford, Robert E. Schapire, "A Contextual-Bandit Approach to Personalized News Article Recommendation"
- Dai Shi, "Exploring Bandit Algorithms for Automatic Content Selection"
- Olivier Chapelle, Lihong Li, "An Empirical Evaluation of Thompson Sampling"
- Junpei Komiyama, Junya Honda, Hiroshi Nakagawa, "Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays"
- J. D. Piette, "The potential impact of intelligent systems for mobile health self-management support: Monte Carlo simulations of text message support for medication adherence"
- "Contextual bandit models for personalized recommendation"
- Caltech CS159: linear-bandit-supplement
- Caltech CS159: LinUCB
- "Contextual Bandits and the Exp4 algorithm"
- "Simple Reinforcement Learning with Tensorflow Part 1.5: Contextual Bandits"
- Thompson Sampling
- Exp3, Exp4