A collection of mobile robot environments and their goal-conditioned reinforcement learning controllers.
Install the package from source with pip:

```bash
git clone https://github.com/ZikangXiong/mobrob
cd mobrob
pip install -e .
```
This project partially relies on mujoco-py; follow the official guide to set it up. mujoco-py depends on glew, mesalib, and glfw3. If you do not have permission to install these dependencies system-wide, you can use conda to circumvent the issue:

```bash
conda install -c conda-forge glew
conda install -c conda-forge mesalib
conda install -c menpo glfw3
```
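After installation, an optional import check (not part of the repository) confirms that both simulator backends are usable; mujoco-py compiles its bindings on first import, so missing glew/glfw libraries will surface here:

```python
# Optional sanity check: both simulator backends should import cleanly.
import mujoco_py  # compiles its Cython bindings on first import
import pybullet

print("mujoco-py:", mujoco_py.__version__)
print("pybullet API version:", pybullet.getAPIVersion())
```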
This repository provides five mobile robot environments:
| Body Type | Description | Simulator | State dim | Action dim | Control type | Video |
|---|---|---|---|---|---|---|
| point | point mass | mujoco-py | 14 | 2 | Continuous Commands | point.mp4 |
| car | car-like kinematics | mujoco-py | 26 | 2 | Continuous Commands | car.mp4 |
| doggo | quadruped dog kinematics | mujoco-py | 58 | 12 | Continuous Commands | doggo.mp4 |
| drone | drone kinematics | pybullet | 12 | 18 | Neural PID | drone.mp4 |
| turtlebot3 | turtlebot3-waffle kinematics | pybullet | 43 | 2 | Neural Prop | turtlebot3.mp4 |
- Continuous Commands: the control policy outputs continuous control commands directly.
- Neural PID: a neural network maps the current state to the desired PID coefficients.
- Neural Prop: a neural network maps the current state to the desired proportional control coefficients.
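To make the last two control types concrete, here is a minimal sketch of the Neural PID idea (illustrative only, not the repository's implementation): a small network predicts per-axis (Kp, Ki, Kd) gains from the state, and a classical PID law turns the tracking error into a command.

```python
import numpy as np
import torch
import torch.nn as nn

class NeuralPID(nn.Module):
    """Illustrative Neural PID: maps the state to per-axis (Kp, Ki, Kd) gains."""

    def __init__(self, state_dim: int, n_axes: int):
        super().__init__()
        self.n_axes = n_axes
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, 3 * n_axes),  # one (Kp, Ki, Kd) triple per axis
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Softplus keeps the predicted gains non-negative.
        return nn.functional.softplus(self.net(state)).view(self.n_axes, 3)

def pid_command(gains: np.ndarray, err: np.ndarray, err_int: np.ndarray,
                err_prev: np.ndarray, dt: float) -> np.ndarray:
    """Classical PID law applied with the network-predicted gains."""
    kp, ki, kd = gains[:, 0], gains[:, 1], gains[:, 2]
    return kp * err + ki * err_int + kd * (err - err_prev) / dt
```

Neural Prop follows the same pattern restricted to the proportional term: the network predicts only Kp, and the command is Kp * err.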
Controllers are trained using Proximal Policy Optimization (PPO).
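Under the hood this amounts to standard stable-baselines3 usage; a minimal, self-contained sketch (on a placeholder gym environment, not a mobrob one, with arbitrary hyperparameters) looks like this:

```python
# Minimal stable-baselines3 PPO training loop on a placeholder environment.
import gym
from stable_baselines3 import PPO

env = gym.make("Pendulum-v1")          # stand-in for a mobrob environment
model = PPO("MlpPolicy", env, learning_rate=3e-4, n_steps=2048, verbose=1)
model.learn(total_timesteps=100_000)   # the real hyperparameters live in data/configs
model.save("data/tmp/ppo_pendulum")    # mirrors where mobrob stores intermediate policies
```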
- Pretrained policies: Available at data/policies.
- Training parameters: Available at data/configs. Refer to the stable-baselines3 PPO documentation for all supported parameters.
- Training: Use the script examples/train.py. For instance, to train the point robot:

```bash
python examples/train.py --env-name point
```
To finetune a trained policy:

```bash
python examples/train.py --env-name point --finetune
```
Training logs and intermediate policies are saved in data/tmp.
- Evaluation: Use the script examples/control.py. For instance, to evaluate the point robot:

```bash
python examples/control.py --env-name point
```
To disable the GUI, for example when running the code on a remote server:

```bash
python examples/control.py --env-name point --no-gui
```
Alternatively, you can use pyvirtualdisplay and store the video, as sketched below.
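A minimal sketch of the headless route (assuming pyvirtualdisplay is installed and an Xvfb binary is available on the host; the choice of recording tool is up to you):

```python
# Run the GUI evaluation inside a virtual X display on a headless server.
import subprocess
from pyvirtualdisplay import Display

with Display(visible=False, size=(1024, 768)):
    # Everything launched in this context renders into the virtual display;
    # a screen recorder (e.g., ffmpeg's x11grab on this display) can store the video.
    subprocess.run(["python", "examples/control.py", "--env-name", "point"], check=True)
```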
For users intending to build their own goal-conditioned environments, the following abstract methods of EnvWrapper (in wrapper.py) should be overridden according to the specific needs of the new robot environment. The methods, along with brief explanations, are given in the table below:
| Function Name | Description |
|---|---|
| _set_goal(self, goal) | Sets the goal position of the robot. Example: [x, y, z] |
| build_env(self) | Constructs the environment, i.e., loads the robot and the world. |
| get_pos(self) | Retrieves the current position of the robot. Example: [x, y, z] |
| set_pos(self, pos) | Sets the position of the robot. Example: [x, y, z] |
| get_obs(self) | Obtains the current observation of the robot. Example: [x, y, z, r, p, y] |
| get_observation_space(self) | Gets the observation space of the robot. Example: Box(58,) |
| get_action_space(self) | Gets the action space of the robot. Example: Box(12,) |
| get_init_space(self) | Gets the initial-position space of the robot. Example: Box(3,) |
| get_goal_space(self) | Gets the goal space of the robot. Example: Box(3,) |
One may refer to the other robot environment wrappers in wrapper.py for more details.
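As an orientation aid, a skeleton of such a wrapper might look as follows. This is a hedged sketch: the method names come from the table above, but the base-class internals, the gym import, and the MyRobot simulator handle are assumptions rather than the repository's code.

```python
import numpy as np
from gym import spaces  # or gymnasium, depending on your setup

class MyRobotEnvWrapper(EnvWrapper):  # EnvWrapper as defined in wrapper.py
    """Sketch of a goal-conditioned wrapper for a hypothetical robot."""

    def build_env(self):
        # Load the robot and the world in your simulator of choice.
        self.robot = MyRobot()        # hypothetical simulator handle
        return self.robot.world

    def _set_goal(self, goal):
        self.goal = np.asarray(goal)  # e.g., [x, y, z]

    def get_pos(self):
        return self.robot.position()  # e.g., [x, y, z]

    def set_pos(self, pos):
        self.robot.teleport(pos)

    def get_obs(self):
        return self.robot.state()     # e.g., [x, y, z, r, p, y]

    def get_observation_space(self):
        return spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32)

    def get_action_space(self):
        return spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def get_init_space(self):
        return spaces.Box(-5.0, 5.0, shape=(3,), dtype=np.float32)

    def get_goal_space(self):
        return spaces.Box(-5.0, 5.0, shape=(3,), dtype=np.float32)
```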
This repository is used as the benchmark environment in the following papers:
```bibtex
@inproceedings{mfnlc,
  author={Xiong, Zikang and Eappen, Joe and Qureshi, Ahmed H. and Jagannathan, Suresh},
  booktitle={2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  title={Model-free Neural Lyapunov Control for Safe Robot Navigation},
  year={2022},
  pages={5572-5579},
  doi={10.1109/IROS47612.2022.9981632}}

@article{dscrl,
  title={Co-learning Planning and Control Policies Using Differentiable Formal Task Constraints},
  author={Xiong, Zikang and Eappen, Joe and Lawson, Daniel and Qureshi, Ahmed H and Jagannathan, Suresh},
  journal={arXiv preprint arXiv:2303.01346},
  year={2023}
}
```