CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos

ai4ce/CityWalker

TL;DR: CityWalker leverages thousands of hours of online city walking and driving videos to train autonomous agents for robust, generalizable navigation in dynamic urban environments through scalable, data-driven imitation learning.

Xinhao Liu*, Jintong Li*, Yicheng Jiang, Niranjan Sujay, Zhicheng Yang, Juexiao Zhang, John Abanes, Jing Zhang, Chen Feng

Getting Started

Installation

The project should be compatible with the latest PyTorch and CUDA versions. The code has been tested with Python 3.11, PyTorch 2.5.0, and CUDA 12.1. To install the dependencies, run:

conda env create -f environment.yml
conda activate citywalker
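
After activating the environment, a quick sanity check can confirm which Python, PyTorch, and CUDA versions are actually in use. The snippet below is a minimal sketch and not part of the repository: the helper name `check_environment` and the report keys are hypothetical, and the "tested" constants simply restate the versions listed above. It only reports mismatches; other versions may work fine.

```python
import importlib.util
import sys

# Versions the README reports as tested; other versions may also work.
TESTED_PYTHON = (3, 11)
TESTED_TORCH = "2.5.0"
TESTED_CUDA = "12.1"


def check_environment():
    """Return a dict describing the active interpreter and PyTorch install.

    Hypothetical helper for illustration; it is not shipped with CityWalker.
    """
    report = {
        "python": "%d.%d" % sys.version_info[:2],
        "python_matches_tested": sys.version_info[:2] == TESTED_PYTHON,
        "torch": None,
        "torch_matches_tested": False,
        "cuda_build": None,
        "cuda_matches_tested": False,
        "cuda_available": False,
    }
    # Skip the PyTorch checks if the package is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return report
    try:
        import torch
    except ImportError:
        return report
    report["torch"] = str(torch.__version__)
    # Builds are often tagged like "2.5.0+cu121", so compare by prefix.
    report["torch_matches_tested"] = report["torch"].startswith(TESTED_TORCH)
    # torch.version.cuda is None for CPU-only builds.
    report["cuda_build"] = torch.version.cuda
    report["cuda_matches_tested"] = torch.version.cuda == TESTED_CUDA
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` on a GPU machine, the installed PyTorch build may not match the local driver.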

Data Preparation

Please see dataset/README.md for details on how to prepare the dataset.

Training

To train the model, run:

python train.py --config configs/citywalk_2000hr.yaml

Our pretrained model is available under the repository's Releases tab.

Fine-tuning

To fine-tune the model, run:

python fine_tune.py --config configs/citywalk_2000hr.yaml --checkpoint <path_to_checkpoint>

Testing

To test the model, run:

python test.py --config configs/citywalk_2000hr.yaml --checkpoint <path_to_checkpoint>
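
The training, fine-tuning, and testing commands above share the same shape, so they can be driven from a small Python wrapper when scripting experiments. The following is a sketch under stated assumptions: `build_command` and `run_stage` are hypothetical helpers, not part of the repository, and they use only the script names and the `--config` / `--checkpoint` flags shown in this README. Since the README shows no `--checkpoint` flag for `train.py`, the sketch refuses to pass one there.

```python
import subprocess
import sys

# Script names exactly as they appear in the commands above.
SCRIPTS = {
    "train": "train.py",
    "fine_tune": "fine_tune.py",
    "test": "test.py",
}


def build_command(stage, config, checkpoint=None):
    """Build the argv list for one stage (hypothetical helper)."""
    if stage not in SCRIPTS:
        raise ValueError(f"unknown stage: {stage!r}")
    if stage == "train" and checkpoint is not None:
        # The README shows no --checkpoint flag for train.py.
        raise ValueError("train does not take a checkpoint in this sketch")
    if stage != "train" and checkpoint is None:
        raise ValueError(f"{stage} requires a checkpoint path")
    cmd = [sys.executable, SCRIPTS[stage], "--config", config]
    if checkpoint is not None:
        cmd += ["--checkpoint", checkpoint]
    return cmd


def run_stage(stage, config, checkpoint=None):
    """Run one stage from the repository root; raises if it exits non-zero."""
    subprocess.run(build_command(stage, config, checkpoint), check=True)
```

For example, `run_stage("test", "configs/citywalk_2000hr.yaml", "<path_to_checkpoint>")` would issue the same call as the testing command above, with the placeholder replaced by a real checkpoint path.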

Citation

Coming soon

Acknowledgements

The work was supported by NSF grants 2238968, 2121391, 2322242 and 2345139; and in part through the NYU IT High Performance Computing resources, services, and staff expertise. We thank Xingyu Liu and Zixuan Hu for their help in data collection.

We also thank the authors of the following repositories for their open-source implementations: