Deploying a New Model to NDIF
git clone https://github.com/ndif-team/ndif.git && cd ndif && git checkout dev
Download and install Miniconda by following the instructions on the official Miniconda page for Linux.
conda env create -f ../ndif/services/ray_head/environment.yml -n prod
conda activate prod
To ensure that the correct version of NNSight is present, do the following:
pip uninstall nnsight && git clone https://github.com/ndif-team/nnsight.git
cd nnsight && git checkout 0.3
pip install -e .
In order for Ray to work, you need all nodes to have the same version of Ray installed. Here is what we have been using:
pip install "ray[serve]==2.34"
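Because every node must run the same Ray version, it can help to confirm what is actually installed in the active environment before joining the cluster. The following is a minimal sketch, not part of the NDIF tooling: the helper names `version_matches` and `installed_ray_version` are hypothetical, and the check assumes that any 2.34.x patch release counts as a match.

```python
from importlib import metadata

# Version series pinned in this guide.
EXPECTED_RAY = "2.34"


def version_matches(installed: str, expected: str = EXPECTED_RAY) -> bool:
    """Return True if `installed` equals `expected` or is a patch release of it."""
    return installed == expected or installed.startswith(expected + ".")


def installed_ray_version():
    """Return the installed Ray version string, or None if Ray is not installed."""
    try:
        return metadata.version("ray")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_ray_version()
    if found is None:
        print("ray is not installed in this environment")
    elif version_matches(found):
        print(f"ray {found} matches the expected {EXPECTED_RAY} series")
    else:
        print(f"ray {found} does not match the expected {EXPECTED_RAY} series")
```

Running this on each node with the prod environment active gives a quick way to spot a mismatched install.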
Choose a location for your Hugging Face cache (if you don't already have one) and create the directory:
mkdir -p .hf_config
Create a script named env.sh in your working directory with the following content (make sure to modify your environment variables appropriately):
#! /bin/bash
huggingface-cli login --token hf-token
export PYTHONPATH=/path/to/ndif/services/ray_worker
export HF_HOME=/path/to/.hf_config
export RAY_ADDRESS=head-node-ip:6379
export NCCL_IB_DISABLE=1
- Replace hf-token with your actual Hugging Face token.
- Replace /path/to/.hf_config with the actual path to the Hugging Face cache you previously made.
- Replace head-node-ip with the IP address of your Ray head node.
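A leftover placeholder in env.sh is easy to miss, so a small sanity check after sourcing the file may be useful. The sketch below is illustrative only: `check_env` and `REQUIRED_VARS` are hypothetical names, and the checks simply mirror the variables and placeholders listed above.

```python
import os

# Variables exported by the env.sh shown in this guide.
REQUIRED_VARS = ("PYTHONPATH", "HF_HOME", "RAY_ADDRESS", "NCCL_IB_DISABLE")


def check_env(env):
    """Return a list of human-readable problems found in `env` (a mapping)."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    address = env.get("RAY_ADDRESS", "")
    if address:
        host, sep, port = address.rpartition(":")
        if not (sep and host and port.isdigit()):
            problems.append("RAY_ADDRESS should look like host:port")
        elif host == "head-node-ip":
            problems.append("RAY_ADDRESS still contains the placeholder head-node-ip")

    # The example paths in env.sh all start with /path/to/.
    for name in ("PYTHONPATH", "HF_HOME"):
        if env.get(name, "").startswith("/path/to/"):
            problems.append(f"{name} still contains the placeholder /path/to/")
    return problems


if __name__ == "__main__":
    issues = check_env(os.environ)
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("environment looks complete")
```

Run it in the same shell where env.sh was sourced so that it sees the exported values.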
Source the environment variables from env.sh:
source env.sh
The easiest way to download the model weights is to create a Python script which uses NNSight to load a model:
import nnsight
model = nnsight.LanguageModel('{model-checkpoint}', dispatch=True)
with model.trace('ayy') as tracer:
    out = tracer.output.save()
Save the script above as download.py and run python3 download.py. You can stop the script once the model weights are downloaded. Make sure to replace {model-checkpoint} with the actual Hugging Face checkpoint.
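To judge whether the weights have landed in the cache before stopping the script, one option is to look for a snapshot directory under HF_HOME. This is a rough sketch that assumes the standard Hugging Face Hub cache layout (`hub/models--<org>--<name>/snapshots/<revision>`) and that no other cache variable overrides HF_HOME; `cached_snapshots` is a hypothetical helper, and the presence of a snapshot directory does not prove that every file finished downloading.

```python
import os
from pathlib import Path


def cached_snapshots(hf_home, checkpoint):
    """Return the snapshot directories cached for `checkpoint` under `hf_home`.

    Assumes the default hub layout: <hf_home>/hub/models--<org>--<name>/snapshots/.
    """
    repo_dir = Path(hf_home) / "hub" / ("models--" + checkpoint.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return []
    return sorted(p for p in snapshots.iterdir() if p.is_dir())


if __name__ == "__main__":
    hf_home = os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
    checkpoint = "{model-checkpoint}"  # replace with the actual checkpoint name
    found = cached_snapshots(hf_home, checkpoint)
    if found:
        for path in found:
            print("cached snapshot:", path)
    else:
        print("no cached snapshots found for", checkpoint)
```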
Create a script named start.sh in your working directory with the following content:
#!/bin/bash
HOSTNAME=$(hostname)
source env.sh
resources=$(python -m src.ray.resources --name "$HOSTNAME")
ray start --resources "$resources" --address "$RAY_ADDRESS" --block
Run start.sh inside a tmux session to start the model deployment. Using tmux ensures that the deployment keeps running in the background even if your terminal session disconnects.
tmux
conda activate prod
bash start.sh
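If the node does not seem to join the cluster, a first thing to rule out is basic network reachability of the head node. The sketch below only tests whether a TCP connection to the address in RAY_ADDRESS can be opened; it says nothing about Ray itself being healthy. The names `split_address` and `can_connect` are hypothetical helpers, not part of Ray or NDIF.

```python
import os
import socket


def split_address(address):
    """Split a 'host:port' string into (host, port), raising ValueError if malformed."""
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    return host, int(port)


def can_connect(address, timeout=3.0):
    """Return True if a TCP connection to `address` can be opened within `timeout` seconds."""
    host, port = split_address(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    address = os.environ.get("RAY_ADDRESS", "")
    if not address:
        print("RAY_ADDRESS is not set; source env.sh first")
    elif can_connect(address):
        print(f"{address} accepts TCP connections")
    else:
        print(f"could not connect to {address}")
```

A failed connection usually points at a wrong IP, a firewall rule, or a head node that is not running, all of which are worth checking before looking at Ray's own logs.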