stable-baselines with JAX & Haiku
Updated Jun 20, 2024 - Python
Berkeley CS 294: Deep Reinforcement Learning
Lunar Lander game from OpenAI Gym, solved using behavioral cloning, DAgger, and POMDP (partially observable Markov decision process) methods
Using DAgger with our MPC as the expert, we distill its knowledge into relatively simple networks while retaining a large fraction of its performance. (Please see the paper for a full description.)
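To illustrate the DAgger loop mentioned above, here is a minimal sketch in plain Python. It is not the code from the repository or the paper. The toy 1-D dynamics, the proportional controller standing in for the MPC expert, the linear learner, and every function name are assumptions made for the example.

```python
import random


def expert_action(state):
    # Hypothetical stand-in for the MPC expert: a proportional controller.
    return -0.5 * state


def step(state, action, rng):
    # Toy 1-D dynamics with a little noise (an assumption for illustration).
    return state + action + rng.gauss(0.0, 0.01)


def fit_linear_policy(dataset):
    # Least-squares fit of action = w * state over the aggregated dataset.
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den if den > 0 else 0.0


def dagger(n_iters=5, horizon=20, seed=0):
    rng = random.Random(seed)
    dataset = []  # aggregated (state, expert action) pairs
    w = 0.0       # learner's single weight
    for i in range(n_iters):
        state = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            # Label every visited state with the expert's action.
            dataset.append((state, expert_action(state)))
            # Iteration 0 follows the expert (behavioral cloning);
            # later iterations follow the current learner.
            action = expert_action(state) if i == 0 else w * state
            state = step(state, action, rng)
        # Refit the learner on everything collected so far.
        w = fit_linear_policy(dataset)
    return w, dataset


w, dataset = dagger()
print(f"learned weight: {w:.3f}, samples: {len(dataset)}")
```

The key design point is that, after the first iteration, states are visited under the learner's own policy but labeled by the expert, so the aggregated dataset covers the states the learner actually reaches.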