Previous notebooks and experiments can be found here.
Experiments and models for my master's thesis on learning environment dynamics from observations.
- src
  - data
  - loggers (implementation of the wandb logger)
  - models (implementations of the models)
  - pipelines
    - train (training module)
    - eval (evaluation module)
    - train_eval_all (script for running train and eval on all the configurations)
    - config (configuration of all the runs)
  - scripts (scripts for visualization)
  - utils (generic utilities and modules)
You will need Python >= 3.6
- Install the requirements:
pip install -r requirements.txt
- Execute:
wandb login
to log in to wandb and be able to view the training results.
- Create the following folders in the root directory (see the sketch after this list):
.reports
.models
.results
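Equivalently, a minimal cross-platform sketch using only the Python standard library (folder names taken from the list above):

```python
from pathlib import Path

# Create the working directories expected by the pipelines,
# skipping any that already exist.
for folder in (".reports", ".models", ".results"):
    Path(folder).mkdir(exist_ok=True)
```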
- Run train and eval for all of the configurations set up in src/pipelines/config (all of the models described in the thesis):
python -m src.pipeline.train_eval_all
Observe the results in the generated wandb project.
Visualize live model reconstruction with:
python -m src.scripts.play_model
Set the desired model configuration at the top of the file, as a get_hparams parameter.
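A hypothetical example of what that line might look like; the configuration key below is made up, and the exact get_hparams signature depends on src/pipelines/config:

```python
# Hypothetical: choose which trained configuration to visualize.
# The real configuration names are defined in src/pipelines/config.
hparams = get_hparams("rnn_deconv_pong")
```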
Visualize the Asset Spatial RNN model live with:
python -m src.scripts.animate_asset_model
Visualization of a sweep of 16 train runs
Examples of reconstructed PONG episodes
-
Profile the training pipeline with:
pip install profiling
profiling live-profile -m src.pipeline.train -- --debug
-
General stuff
- Mask out empty (padded) frames after the rollout has finished. See here. (A sketch of the masking follows after this list.)
- Label smoothing. Do I actually want that?
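A minimal sketch of the masking idea, assuming PyTorch and a per-frame binary mask that is 0 for padded frames; the function name and tensor layout are illustrative, not the repo's API:

```python
import torch
import torch.nn.functional as F

def masked_frame_loss(pred, target, mask):
    """MSE over frames, ignoring padded (empty) ones.

    pred, target: (batch, time, channels, height, width)
    mask:         (batch, time), 1 for real frames, 0 for padding
    """
    per_frame = F.mse_loss(pred, target, reduction="none").mean(dim=(2, 3, 4))
    # Zero out padded frames and normalize by the number of real frames.
    return (per_frame * mask).sum() / mask.sum().clamp(min=1)
```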
-
Models
- RNN Deconvolution Baseline
- Learn frame transformations
  - Instead of compressing the state like the RNN does
  - Action + precondition (last few frames) -> transformation vector T
  - Use T to transform the current frame into the future frame
  - Play a rollout of frame transformations; the results in wandb look promising (see the sketch after this list)
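A hedged sketch of the frame-transformation idea, assuming PyTorch; the way T is applied here (FiLM-style scaling and shifting of a small conv net) and every name and layer size are illustrative choices, not the repo's implementation:

```python
import torch
import torch.nn as nn

class FrameTransformModel(nn.Module):
    """Action + precondition frames -> transformation vector T,
    then T conditions a conv net that maps the current frame
    to the predicted next frame."""

    def __init__(self, num_actions, context_frames=2, t_dim=32):
        super().__init__()
        # Encode the precondition (last few frames) into a flat vector.
        self.context_enc = nn.Sequential(
            nn.Conv2d(context_frames, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Action + context -> transformation vector T.
        self.to_t = nn.Linear(32 + num_actions, t_dim)
        # T produces per-channel scale and shift for the frame features.
        self.film = nn.Linear(t_dim, 2 * 16)
        self.conv_in = nn.Conv2d(1, 16, 3, padding=1)
        self.conv_out = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, current_frame, context, action_onehot):
        # current_frame: (B, 1, H, W), context: (B, context_frames, H, W)
        ctx = self.context_enc(context)
        t = self.to_t(torch.cat([ctx, action_onehot], dim=1))
        scale, shift = self.film(t).chunk(2, dim=1)
        h = torch.relu(self.conv_in(current_frame))
        h = h * scale[:, :, None, None] + shift[:, :, None, None]
        return torch.sigmoid(self.conv_out(h))
```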
-
Notes
-
12.06.2020
- Update implementation of RNN Deconv
- Focus on making RNN deconv work on PONG
- WHY RNN Deconv: it is the only model that can model PONG with the current setup of the data pipeline.
  - Frame-transforming models need two frames as context
- TODO:
  - [ ] Train and save a working RNN Deconv model
  - [ ] Write a playing script
  - [ ] Write a script for manipulating the latent RNN state and viewing the result? (see the sketch below)
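For the latent-manipulation item, a hedged sketch of the general idea, assuming the model splits into a recurrent core and a deconvolutional decoder (both the split and the names are assumptions, not the repo's API):

```python
import torch

@torch.no_grad()
def perturb_and_decode(decoder, hidden, dim, delta=1.0):
    """Nudge one dimension of the RNN hidden state and decode both
    versions so the reconstructions can be compared side by side."""
    baseline = decoder(hidden)
    perturbed = hidden.clone()
    perturbed[..., dim] += delta
    return baseline, decoder(perturbed)
```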
-
06.06.2020
- Implement the Pong agent class + action mappings ([3, 3] => 9)
- Make the RNN playable (gym-like interface); see the sketch after this list
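A hedged sketch of both items, assuming each player has 3 actions so the joint action space is the 3 x 3 = 9 Cartesian product, and assuming the trained model exposes init_state/predict-style calls; every name here is an assumption, not the repo's API:

```python
import itertools

# Assumed per-player actions; the real labels depend on the agent class.
PLAYER_ACTIONS = ("NOOP", "UP", "DOWN")
# Two players with 3 actions each -> 9 joint actions ([3, 3] => 9).
JOINT_ACTIONS = list(itertools.product(PLAYER_ACTIONS, PLAYER_ACTIONS))

class PlayableRNN:
    """Minimal gym-like interface around a trained dynamics model."""

    def __init__(self, model, initial_frames):
        self.model = model
        self.initial_frames = initial_frames

    def reset(self):
        # Hypothetical model API: build the recurrent state from seed frames.
        self.state = self.model.init_state(self.initial_frames)
        return self.initial_frames[-1]

    def step(self, joint_action_index):
        action = JOINT_ACTIONS[joint_action_index]
        # Hypothetical model API: predict the next frame and new state.
        frame, self.state = self.model.predict(self.state, action)
        # No learned reward or termination, so return placeholders.
        return frame, 0.0, False, {}
```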
-
04.06.2020
- [BUGFIX] Found a major bug in the RNN models: the predicted frames and true frames were not aligned, so the model was trying to predict the present from the present.
- [BUGFIX] The TimeDistributed (decorator) module was not holding the wrapped module in its state, so the parameters of the wrapped module were not part of the overall model and the model could not be trained. (Took quite some time; see the sketch after this list.)
- [FEATURE] Implemented a generic multiprocessing function spawner and a random agent rollout generator that gets newer rollouts into the training buffer faster. Hopefully this can reduce over-fitting.
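A hedged sketch of the TimeDistributed fix, assuming PyTorch: assigning the wrapped module as an attribute of an nn.Module registers its parameters, whereas stashing it in a plain dict or closure hides them from model.parameters() and the optimizer. The code below is illustrative, not the repo's exact implementation:

```python
import torch.nn as nn

class TimeDistributed(nn.Module):
    """Apply a module independently to every step of a (batch, time, ...) tensor."""

    def __init__(self, module):
        super().__init__()
        # Attribute assignment on an nn.Module registers the wrapped module's
        # parameters, so they are trained with the rest of the model.
        self.module = module

    def forward(self, x):
        batch, time = x.shape[:2]
        # Merge batch and time, apply the module, then split them again.
        y = self.module(x.reshape(batch * time, *x.shape[2:]))
        return y.reshape(batch, time, *y.shape[1:])
```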
-