The goal of this project is to explore the intrinsically distributed qualities of Elixir for implementing real-world Reinforcement Learning environments.
At this moment, this repository contains ad hoc implementations of environments and interacting agents. Initial abstractions are already established, so higher-level programs such as training procedures can be seamlessly integrated with particular environments, agents, and learning strategies.
Environments in Gyx can be implemented by using the Env behaviour.
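As a rough illustration of what a behaviour-based environment can look like, the sketch below defines a stand-in behaviour and a toy environment that implements it. The names SketchEnv and Countdown, and the callbacks reset/1 and step/2, are assumptions made for this example; they are not the actual callbacks of the Gyx Env behaviour, so check that module for the real contract.

```elixir
defmodule SketchEnv do
  # Hypothetical behaviour standing in for the real Gyx Env behaviour.
  @callback reset(state :: term()) :: term()
  @callback step(state :: term(), action :: term()) ::
              {next_state :: term(), reward :: float(), done :: boolean()}
end

defmodule Countdown do
  # Toy environment: the state is a counter and action 1 decrements it.
  @behaviour SketchEnv

  @impl true
  def reset(_state), do: 3

  @impl true
  def step(state, 1) when state > 0 do
    next = state - 1
    # The episode ends, with reward 1.0, when the counter reaches zero.
    done = next == 0
    reward = if done, do: 1.0, else: 0.0
    {next, reward, done}
  end

  # Any other action (or an exhausted counter) leaves the state unchanged.
  def step(state, _action), do: {state, 0.0, state == 0}
end
```

Declaring @behaviour lets the compiler warn when a required callback is missing, which is what makes environments interchangeable from the point of view of a training procedure.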
A wrapper environment module for calling OpenAI Gym environments can be found in Gyx.Environments.Gym.
NOTE: The Gym library must be installed. You can install it yourself or use the Dockerfile in this repo for development purposes. From this directory, build the image and open a shell in a container:
docker build -t gyx ./
docker run -it gyx bash
With everything set up, run iex -S mix and start playing.
For a Gym environment to be used, a Gyx process must be started and then initialized to a particular environment by calling make/2. First, start a named process:
iex(1)> Gyx.Environments.Gym.start_link [], name: :gym
The named process :gym can now be associated with a particular Gym environment:
iex(2)> Gyx.Environments.Gym.make :gym, "Blackjack-v0"
Environment interactions are performed through step, which returns an experience:
iex(3)> Gyx.Environments.Gym.step :gym, 1
%Gyx.Core.Exp{
action: 1,
done: false,
info: %{gym_info: {:"$erlport.opaque", :python, <<128, 2, 125, 113, 0, 46>>}},
next_state: {20, 7, false},
reward: 0.0,
state: {13, 7, false}
}
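A step call like the one above is typically repeated until the returned experience reports done: true. The following is a minimal sketch of such an episode loop. It is written against a stand-in step function rather than the real Gym process, so it runs without Python; EpisodeSketch, run/3, and the policy function are hypothetical names, and the only assumption about an experience is that it exposes the reward and done fields shown above.

```elixir
defmodule EpisodeSketch do
  # Runs one episode: asks `policy_fun` for an action, applies `step_fun`,
  # and accumulates reward until an experience reports `done: true` or
  # `max_steps` is exhausted. Returns {total_reward, steps_taken}.
  #
  # `step_fun` stands in for a call such as
  # `&Gyx.Environments.Gym.step(:gym, &1)`; it must return a map or struct
  # with at least the `:reward` and `:done` keys.
  def run(step_fun, policy_fun, max_steps \\ 1_000) do
    do_run(step_fun, policy_fun, max_steps, 0.0, 0)
  end

  defp do_run(_step_fun, _policy_fun, 0, total, steps), do: {total, steps}

  defp do_run(step_fun, policy_fun, remaining, total, steps) do
    exp = step_fun.(policy_fun.())
    total = total + exp.reward

    if exp.done do
      {total, steps + 1}
    else
      do_run(step_fun, policy_fun, remaining - 1, total, steps + 1)
    end
  end
end
```

The max_steps bound is a design choice for the sketch: it keeps the loop from running forever against an environment that never reports done.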
Environment process IDs (PIDs) can also be used directly, without registering a name:
iex(4)> alias Gyx.Environments.Gym
iex(5)> {:ok, gym_proc} = Gym.start_link [], []
iex(6)> Gym.make gym_proc, "SpaceInvaders-v0"
Gym-based environments can render the screen with Gyx.Environments.Gym.render, which relies on the internal Python Gym render method. Alternatively, the screen can be rendered directly in the terminal:
iex(7)> Gym.render gym_proc, :terminal, scale: 0.9
Every environment contains action and observation space definitions, which can be used to sample random actions and observations:
iex(8)> action_space = :sys.get_state(gym_proc).action_space
%Gyx.Core.Spaces.Discrete{n: 6, random_algorithm: :explus, seed: {1, 2, 3}}
iex(9)> Gyx.Core.Spaces.sample action_space
{:ok, 4}
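To illustrate what sampling from a discrete space amounts to, here is a self-contained sketch built on Erlang's :rand module. DiscreteSketch is a hypothetical module and not the Gyx.Core.Spaces implementation. It assumes a space of n actions numbered from 0 and an explicit three-integer seed, and it uses the :exsss algorithm purely as an example choice.

```elixir
defmodule DiscreteSketch do
  # Hypothetical stand-in for a discrete space with `n` possible actions.
  defstruct n: 1, seed: {1, 2, 3}

  # Returns {:ok, action, new_rand_state}, with action in 0..n-1.
  # The explicit random state is returned so that a caller could keep
  # drawing from the same sequence with :rand.uniform_s/2.
  def sample(%__MODULE__{n: n, seed: seed}) when is_integer(n) and n > 0 do
    state = :rand.seed_s(:exsss, seed)
    # :rand.uniform_s/2 yields an integer in 1..n, so shift it to 0..n-1.
    {value, new_state} = :rand.uniform_s(n, state)
    {:ok, value - 1, new_state}
  end
end
```

Because the seed is part of the struct, sampling twice from the same struct yields the same action, which can be convenient for reproducible experiments.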