Previous notebooks and experiments can be found here.
Experiments and models for my master's thesis on learning environment dynamics from observations.
- src
  - data
  - loggers (implementation of the wandb logger)
  - models (implementations of the models)
  - pipelines
    - train (training module)
    - eval (evaluation module)
    - train_eval_all (script for running train and eval on all the configurations)
    - config (configuration of all the runs)
  - scripts (scripts for visualization)
  - utils (generic utilities and modules)
You will need Python >= 3.6
- Install the requirements:
pip install -r requirements.txt
- Execute:
wandb login
to log in to wandb and be able to view the training results.
- Create the following folders in the root directory (see the sketch after this list):
.reports
.models
.results
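Equivalently, a minimal cross-platform sketch using only the Python standard library (folder names taken from the list above):

```python
from pathlib import Path

# Create the working directories expected by the pipelines,
# skipping any that already exist.
for folder in (".reports", ".models", ".results"):
    Path(folder).mkdir(exist_ok=True)
```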
- Run train and eval for all of the configurations set up in src/pipelines/config (all of the models described in the thesis):
python -m src.pipeline.train_eval_all
Observe the results in the generated wandb project.
Visualize live model reconstruction with:
python -m src.scripts.play_model
Set the desired model configuration at the top of the file, as a get_hparams parameter.
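A hypothetical example of what that line might look like; the configuration key below is made up, and the exact get_hparams signature depends on src/pipelines/config:

```python
# Hypothetical: choose which trained configuration to visualize.
# The real configuration names are defined in src/pipelines/config.
hparams = get_hparams("rnn_deconv_pong")
```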
Visualize the Asset Spatial RNN model live with:
python -m src.scripts.animate_asset_model
Visualization of a sweep of 16 train runs
Examples of reconstructed PONG episodes
-
Profile the training pipeline with:
pip install profiling
profiling live-profile -m src.pipeline.train -- --debug
-
General stuff
- Mask out empty (padded) frames after the rollout has finished. See here. (A sketch of the masking follows after this list.)
- Label smoothing. Do I actually want that?
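A minimal sketch of the masking idea, assuming PyTorch and a per-frame binary mask that is 0 for padded frames; the function name and tensor layout are illustrative, not the repo's API:

```python
import torch
import torch.nn.functional as F

def masked_frame_loss(pred, target, mask):
    """MSE over frames, ignoring padded (empty) ones.

    pred, target: (batch, time, channels, height, width)
    mask:         (batch, time), 1 for real frames, 0 for padding
    """
    per_frame = F.mse_loss(pred, target, reduction="none").mean(dim=(2, 3, 4))
    # Zero out padded frames and normalize by the number of real frames.
    return (per_frame * mask).sum() / mask.sum().clamp(min=1)
```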
-
Models
- RNN Deconvolution Baseline
- Learn frame transformations
  - Instead of compressing the state like the RNN does
  - Action + precondition (last few frames) -> transformation vector T
  - Use T to transform the current frame into the future frame
  - Play a rollout of frame transformations; the results in wandb look promising (see the sketch after this list)
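A hedged sketch of the frame-transformation idea, assuming PyTorch; the way T is applied here (FiLM-style scaling and shifting of a small conv net) and every name and layer size are illustrative choices, not the repo's implementation:

```python
import torch
import torch.nn as nn

class FrameTransformModel(nn.Module):
    """Action + precondition frames -> transformation vector T,
    then T conditions a conv net that maps the current frame
    to the predicted next frame."""

    def __init__(self, num_actions, context_frames=2, t_dim=32):
        super().__init__()
        # Encode the precondition (last few frames) into a flat vector.
        self.context_enc = nn.Sequential(
            nn.Conv2d(context_frames, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Action + context -> transformation vector T.
        self.to_t = nn.Linear(32 + num_actions, t_dim)
        # T produces per-channel scale and shift for the frame features.
        self.film = nn.Linear(t_dim, 2 * 16)
        self.conv_in = nn.Conv2d(1, 16, 3, padding=1)
        self.conv_out = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, current_frame, context, action_onehot):
        # current_frame: (B, 1, H, W), context: (B, context_frames, H, W)
        ctx = self.context_enc(context)
        t = self.to_t(torch.cat([ctx, action_onehot], dim=1))
        scale, shift = self.film(t).chunk(2, dim=1)
        h = torch.relu(self.conv_in(current_frame))
        h = h * scale[:, :, None, None] + shift[:, :, None, None]
        return torch.sigmoid(self.conv_out(h))
```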
-
Notes
-
12.06.2020
- Update implementation of RNN Deconv
- Focus on making RNN deconv work on PONG
- WHY RNN Deconv: it is the only model that can model PONG with the current setup of the data pipeline.
  - Frame-transforming models need two frames as context
- TODO:
  - [ ] Train and save a working RNN Deconv model
  - [ ] Write a playing script
  - [ ] Write a script for manipulating the latent RNN state and viewing the result? (see the sketch below)
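For the latent-manipulation item, a hedged sketch of the general idea, assuming the model splits into a recurrent core and a deconvolutional decoder (both the split and the names are assumptions, not the repo's API):

```python
import torch

@torch.no_grad()
def perturb_and_decode(decoder, hidden, dim, delta=1.0):
    """Nudge one dimension of the RNN hidden state and decode both
    versions so the reconstructions can be compared side by side."""
    baseline = decoder(hidden)
    perturbed = hidden.clone()
    perturbed[..., dim] += delta
    return baseline, decoder(perturbed)
```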
-
06.06.2020
- Implement the Pong agent class + action mappings ([3, 3] => 9)
- Make the RNN playable (gym-like interface); see the sketch after this list
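A hedged sketch of both items, assuming each player has 3 actions so the joint action space is the 3 x 3 = 9 Cartesian product, and assuming the trained model exposes init_state/predict-style calls; every name here is an assumption, not the repo's API:

```python
import itertools

# Assumed per-player actions; the real labels depend on the agent class.
PLAYER_ACTIONS = ("NOOP", "UP", "DOWN")
# Two players with 3 actions each -> 9 joint actions ([3, 3] => 9).
JOINT_ACTIONS = list(itertools.product(PLAYER_ACTIONS, PLAYER_ACTIONS))

class PlayableRNN:
    """Minimal gym-like interface around a trained dynamics model."""

    def __init__(self, model, initial_frames):
        self.model = model
        self.initial_frames = initial_frames

    def reset(self):
        # Hypothetical model API: build the recurrent state from seed frames.
        self.state = self.model.init_state(self.initial_frames)
        return self.initial_frames[-1]

    def step(self, joint_action_index):
        action = JOINT_ACTIONS[joint_action_index]
        # Hypothetical model API: predict the next frame and new state.
        frame, self.state = self.model.predict(self.state, action)
        # No learned reward or termination, so return placeholders.
        return frame, 0.0, False, {}
```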
-
04.06.2020
- [BUGFIX] Found a major bug in the RNN models: the predicted frames and true frames were not aligned, so the model was trying to predict the present from the present.
- [BUGFIX] The TimeDistributed (decorator) module was not holding the wrapped module in its state, so the parameters of the wrapped module were not part of the overall model and the model could not be trained. (Took quite some time; see the sketch after this list.)
- [FEATURE] Implemented a generic multiprocessing function spawner and a random agent rollout generator that gets newer rollouts into the training buffer faster. Hopefully this can reduce over-fitting.
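A hedged sketch of the TimeDistributed fix, assuming PyTorch: assigning the wrapped module as an attribute of an nn.Module registers its parameters, whereas stashing it in a plain dict or closure hides them from model.parameters() and the optimizer. The code below is illustrative, not the repo's exact implementation:

```python
import torch.nn as nn

class TimeDistributed(nn.Module):
    """Apply a module independently to every step of a (batch, time, ...) tensor."""

    def __init__(self, module):
        super().__init__()
        # Attribute assignment on an nn.Module registers the wrapped module's
        # parameters, so they are trained with the rest of the model.
        self.module = module

    def forward(self, x):
        batch, time = x.shape[:2]
        # Merge batch and time, apply the module, then split them again.
        y = self.module(x.reshape(batch * time, *x.shape[2:]))
        return y.reshape(batch, time, *y.shape[1:])
```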
-