A deep Q-learning agent that finds a path to the goal in a grid world. This exercise was completed as coursework for course C424 at Imperial College London.
Dependencies: numpy, cv2, torch
To start training, run:
python train_and_test.py
The following techniques were used:
- Deep Q-network (created in PyTorch)
- Models with continuous (agent_radians.py) and discrete (agent.py) actions
- Prioritised experience replay buffer
- Epsilon-greedy policy
- Target network
- Sampling using the cross-entropy method
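
As a rough illustration of two of the techniques listed above, here is a minimal sketch of epsilon-greedy action selection and proportional prioritised sampling. It is not taken from agent.py or agent_radians.py: the function names, the `alpha` exponent and its default value are assumptions made for this example, and it uses only the Python standard library rather than the project's PyTorch code.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy action.

    q_values is a sequence of Q-value estimates, one per discrete action.
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def sampling_probabilities(priorities, alpha=0.6):
    """Proportional prioritisation: P(i) = p_i**alpha / sum_k p_k**alpha.

    alpha = 0 gives uniform sampling; alpha = 1 samples in direct
    proportion to priority. (Hypothetical default, not the project's.)
    """
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(priorities, batch_size, alpha=0.6, rng=random):
    """Draw batch_size buffer indices (with replacement) by priority."""
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)
```

In a prioritised replay buffer the priority of a transition is commonly derived from its absolute TD error plus a small constant, so that transitions the network predicts poorly are replayed more often. Whether this project uses that exact rule is not stated here.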