Flappy-Bird_DDQN

I used the DDQN algorithm to train an agent to play Flappy Bird. After 24 hours of training, its average score was 84 (averaged over the last ten steps), and its maximum score was 384. The average score was still increasing when I stopped training. I used a few tweaks to make the algorithm learn faster, such as keeping the background black and using a biased greedy policy to gain reward. I trained it on a low-spec laptop, which is why it took 5 hours to beat the human average and 12 hours to reach an average score above 45. If you have access to a high-end machine, I strongly encourage you to run this algorithm, because it gives you a feel for the hyperparameters: compared with other RL problems, this one shows the effect of a hyperparameter change almost immediately.
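For readers new to these ideas, here is a minimal sketch of the two pieces mentioned above. It is not the code from this repository. It assumes that DDQN here means Double DQN, that actions are encoded as 0 = do nothing and 1 = flap, and that the "biased" policy means exploration flaps less often than a uniform random choice would. The names `biased_epsilon_greedy`, `double_dqn_target`, `flap_prob`, and the default values are all hypothetical.

```python
import random


def biased_epsilon_greedy(q_values, epsilon, flap_prob=0.1, rng=random):
    """Pick an action index (assumed encoding: 0 = do nothing, 1 = flap).

    With probability epsilon the agent explores, but it flaps only with
    probability flap_prob instead of 0.5. Otherwise it acts greedily.
    """
    if rng.random() < epsilon:
        # Biased exploration: flapping half the time would send the bird
        # into the ceiling, so random moves mostly do nothing.
        return 1 if rng.random() < flap_prob else 0
    # Greedy step: the action with the highest estimated Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target for a single transition.

    The online network selects the next action and the target network
    evaluates it, which reduces the overestimation of plain DQN.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

Values such as `epsilon`, `flap_prob`, and `gamma` are the kind of hyperparameters whose effect tends to show up quickly in this game, so they are natural ones to experiment with first.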

The saved model file contains the parameters after 5 hours of training.

Fun video of result after random hours of training
