Octo

This repo contains code for training and finetuning Octo generalist robotic policies (GRPs). Octo models are transformer-based diffusion policies, trained on a diverse mix of 800k robot trajectories.

Get Started

Follow the installation instructions below, then load a pretrained Octo model. See the Examples section for guides to zero-shot evaluation and finetuning, and for a minimal inference example.

from octo.model.octo_model import OctoModel
model = OctoModel.load_pretrained("hf://rail-berkeley/octo-base")
print(model.get_pretty_spec())

Out of the box, Octo supports multiple RGB camera inputs, can control various robot arms, and can be instructed via language commands or goal images. Octo uses a modular attention structure in its transformer backbone, allowing it to be effectively finetuned to robot setups with new sensory inputs, action spaces, and morphologies, using only a small target domain dataset and accessible compute budgets.

Installation

conda create -n octo python=3.10
conda activate octo
pip install -e .
pip install -r requirements.txt

For GPU:

pip install --upgrade "jax[cuda11_pip]==0.4.20" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

For TPU:

pip install --upgrade "jax[tpu]==0.4.20" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html

See the JAX GitHub page for more details on installing JAX.
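After installing JAX, it can help to confirm which backend it actually picked up before launching a training job. The snippet below is a minimal sketch, not part of this repo: `describe_jax_backend` is a hypothetical helper name, and it only relies on `jax.devices()` and `jax.default_backend()`.

```python
def describe_jax_backend() -> str:
    """Return a one-line summary of the active JAX backend.

    Hypothetical helper for illustration; it is not part of the Octo codebase.
    """
    try:
        import jax  # imported lazily so the script still runs without JAX
    except ImportError:
        return "jax is not installed in this environment"

    try:
        devices = jax.devices()  # accelerators if JAX found any, otherwise CPU
    except RuntimeError as exc:  # e.g. a misconfigured CUDA or TPU runtime
        return f"jax could not initialise a backend: {exc}"

    return (
        f"backend={jax.default_backend()} "
        f"devices={len(devices)} kind={devices[0].device_kind}"
    )


if __name__ == "__main__":
    print(describe_jax_backend())
```

If the reported backend is `cpu` on a machine with a GPU or TPU, the accelerator-specific JAX wheel is probably not the one that got installed.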

Test the installation by finetuning on the debug dataset:

python scripts/finetune.py --config.pretrained_path=hf://rail-berkeley/octo-small --debug

Checkpoints

Pretrained Octo checkpoints are hosted on Hugging Face (for example, hf://rail-berkeley/octo-base and hf://rail-berkeley/octo-small). At the moment we provide the following model versions:

Model	Inference on 1x NVIDIA 4090	Size
Octo-Base	13 it/sec	93M Params
Octo-Small	17 it/sec	27M Params
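
To relate the throughput figures in the table to a control-loop budget, they can be converted to time per iteration. This is plain arithmetic on the numbers above; it assumes one "it" corresponds to one inference call, which the table does not spell out.

```python
# Throughput figures copied from the table above (1x NVIDIA 4090).
THROUGHPUT_IT_PER_SEC = {"Octo-Base": 13, "Octo-Small": 17}


def ms_per_iteration(it_per_sec: float) -> float:
    """Convert iterations per second into milliseconds per iteration."""
    return 1000.0 / it_per_sec


for name, rate in THROUGHPUT_IT_PER_SEC.items():
    print(f"{name}: {ms_per_iteration(rate):.1f} ms per iteration")
# Octo-Base: 76.9 ms per iteration
# Octo-Small: 58.8 ms per iteration
```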

Examples

We provide simple example scripts that demonstrate how to use and finetune Octo models, and how to use our data loader independently:


Octo Inference	Minimal example for loading and running a pretrained Octo model
Octo Finetuning	Minimal example for finetuning a pretrained Octo model on a small dataset with a new observation and action space
Octo Rollout	Run a rollout of a pretrained Octo policy in a Gym environment
Octo Robot Eval	Evaluate a pretrained Octo model on a real WidowX robot
OpenX Dataloader Intro	Walkthrough of the features of our Open X-Embodiment data loader
OpenX PyTorch Dataloader	Standalone Open X-Embodiment data loader in PyTorch

Octo Pretraining

To reproduce our Octo pretraining on 800k robot trajectories, run:

python scripts/train.py --config scripts/configs/octo_pretrain_config.py:<size> --name=octo --config.dataset_kwargs.oxe_kwargs.data_dir=... --config.dataset_kwargs.oxe_kwargs.data_mix=oxe_magic_soup ...

To download the pretraining dataset from the Open X-Embodiment Dataset, install the rlds_dataset_mod package and run the prepare_open_x.sh script. The total size of the pre-processed dataset is ~1.2TB.
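
Since the pre-processed dataset is large, it may be worth checking free disk space on the target volume before starting the download. The following is a small sketch using only the Python standard library; `has_room` is a hypothetical helper, and the threshold treats ~1.2TB as 1.2e12 bytes, which is an assumption about the unit.

```python
import shutil

# ~1.2 TB as quoted above, interpreted as decimal terabytes (an assumption).
REQUIRED_BYTES = int(1.2e12)


def has_room(path: str, required_bytes: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem containing `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


if __name__ == "__main__":
    target = "."  # replace with the directory you plan to use as data_dir
    status = "enough" if has_room(target) else "not enough"
    print(f"{target}: {status} free space for the pre-processed dataset")
```

The raw download and intermediate files may need additional space beyond the final pre-processed size, so some headroom is advisable.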

Pretraining on a TPUv4-128 pod takes 8 hours for Octo-Small and 14 hours for Octo-Base.

Octo Finetuning

We provide a minimal example for finetuning with a new observation and action space.

We also provide a more advanced finetuning script that allows you to change hyperparameters via a config file and logs finetuning metrics. To run advanced finetuning, use:

python scripts/finetune.py --config.pretrained_path=hf://rail-berkeley/octo-small

We offer three finetuning modes, which differ in the parts of the model that are kept frozen: head_only, head_mlp_only, and full (finetunes the full model). You can also specify the task type to finetune with: image_conditioned, language_conditioned, or multimodal (both). For example, to finetune the full transformer with image inputs only, use: --config=finetune_config.py:full,image_conditioned.
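
The mode and task type are combined into a single `--config` value of the form `<file>:<mode>,<task_type>`. As an illustration, the sketch below builds and validates that value from the options listed above; `finetune_config_flag` is a hypothetical helper and not part of the repo.

```python
# Options as listed in this README.
FINETUNING_MODES = ("head_only", "head_mlp_only", "full")
TASK_TYPES = ("image_conditioned", "language_conditioned", "multimodal")


def finetune_config_flag(
    mode: str, task_type: str, config_file: str = "finetune_config.py"
) -> str:
    """Build the --config flag, rejecting values this README does not list."""
    if mode not in FINETUNING_MODES:
        raise ValueError(f"unknown finetuning mode {mode!r}; expected one of {FINETUNING_MODES}")
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task type {task_type!r}; expected one of {TASK_TYPES}")
    return f"--config={config_file}:{mode},{task_type}"


print(finetune_config_flag("full", "image_conditioned"))
# --config=finetune_config.py:full,image_conditioned
```

Validating the two values up front catches typos before a job is launched.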

Octo Evaluation

Loading and running a trained Octo model is as easy as:

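import jax
from octo.model.octo_model import OctoModel

model = OctoModel.load_pretrained("hf://rail-berkeley/octo-small")
# Specify the task with a language instruction
task = model.create_tasks(texts=["pick up the spoon"])
# `observation` must be supplied by your environment or dataset
action = model.sample_actions(observation, task, rng=jax.random.PRNGKey(0))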

We provide examples for evaluating Octo in a simulated Gym environment as well as on a real WidowX robot.

To evaluate on your own environment, simply wrap it in a Gym interface and follow the instructions in the Eval Env README.
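
For readers unfamiliar with the Gym interface, the sketch below shows the `reset()` / `step()` contract in the five-tuple convention used by Gym 0.26+ and Gymnasium. Everything in it is hypothetical: `ToyReachEnv` is a toy 1-D task, it deliberately avoids importing Gym, and the observation key is made up. A real environment should subclass `gym.Env`, define observation and action spaces, and use the observation format described in the Eval Env README.

```python
import random
from typing import Any, Callable, Dict, Optional, Tuple

Obs = Dict[str, float]


class ToyReachEnv:
    """Hypothetical 1-D reaching task exposing Gym-style reset() and step()."""

    def __init__(self, goal: float = 1.0, max_steps: int = 20):
        self.goal = goal
        self.max_steps = max_steps
        self._pos = 0.0
        self._t = 0

    def reset(
        self, seed: Optional[int] = None, options: Optional[dict] = None
    ) -> Tuple[Obs, Dict[str, Any]]:
        rng = random.Random(seed)
        self._pos = rng.uniform(-0.1, 0.1)  # small random start offset
        self._t = 0
        return {"position": self._pos}, {}

    def step(self, action: float) -> Tuple[Obs, float, bool, bool, Dict[str, Any]]:
        self._pos += max(-0.2, min(0.2, action))  # clip the commanded move
        self._t += 1
        distance = abs(self.goal - self._pos)
        terminated = distance < 0.05  # task solved
        truncated = self._t >= self.max_steps and not terminated  # time limit hit
        return {"position": self._pos}, -distance, terminated, truncated, {}


def rollout(env: ToyReachEnv, policy: Callable[[Obs], float], seed: int = 0) -> Tuple[int, float]:
    """Run one episode and return (number of steps, summed reward)."""
    obs, _ = env.reset(seed=seed)
    steps, total_reward = 0, 0.0
    while True:
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        steps += 1
        total_reward += reward
        if terminated or truncated:
            return steps, total_reward


if __name__ == "__main__":
    # Stand-in policy: move toward the goal. A real rollout would query the model here.
    print(rollout(ToyReachEnv(), lambda obs: 1.0 - obs["position"]))
```

The stand-in policy marks the place where a trained model would be queried for an action at each step.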

Code Structure

Component	File	Description
Hyperparameters	config.py	Defines all hyperparameters for the training run.
Pretraining Loop	train.py	Main pretraining script.
Finetuning Loop	finetune.py	Main finetuning script.
Datasets	dataset.py	Functions for creating single / interleaved datasets + data augmentation.
Tokenizers	tokenizers.py	Tokenizers that encode image / text inputs into tokens.
Octo Model	octo_model.py	Main entry point for interacting with Octo models: loading, saving, and inference.
Model Architecture	octo_module.py	Combines token sequencing, transformer backbone and readout heads.
Visualization	visualization_lib.py	Utilities for offline qualitative & quantitative eval.

Citation

@misc{octo_2023,
    title = {Octo: An Open-Source Generalist Robot Policy},
    author = {{Octo Model Team} and Dibya Ghosh and Homer Walke and Karl Pertsch and Kevin Black and Oier Mees and Sudeep Dasari and Joey Hejna and Charles Xu and Jianlan Luo and Tobias Kreiman and {You Liang} Tan and Dorsa Sadigh and Chelsea Finn and Sergey Levine},
    howpublished = {\url{https://octo-models.github.io}},
    year = {2023},
}
