This repo is deprecated. It contains the old Node.js-based Roboflow Inference Server powered by roboflow.js and tfjs-node.
The new Roboflow Inference Server is open-source, faster, supports more models, works on more devices, has more features, is under active development, and is better in every way.
You should use that instead.
# Old README Content
# Roboflow Edge Inference Server

The Roboflow Edge Inference Server is an on-device implementation of our hosted object detection inference API. It lets you run your custom-trained Roboflow Train models on-device, which means you can run them in situations where bandwidth is limited or production images cannot be processed by a third party.
You pull and run our `roboflow/inference-server` Docker container and the Inference Server will become available on port 9001.
Your model is downloaded the first time you invoke it and inference is done on your device (with hardware acceleration where applicable) via an HTTP interface; your images and model predictions never leave the device.
We currently support the NVIDIA Jetson line of devices (including the Jetson Nano 2GB, Jetson Nano 4GB, and Jetson Xavier NX). We recommend running the latest version of NVIDIA JetPack (4.5.1).
Support for CPU inference and arbitrary CUDA GPUs is a work in progress and is coming soon. Reach out if you would like early access.
For most use-cases, the Hosted Inference API is preferable. It requires no setup or maintenance, automatically scales up and down to handle any amount of load (even Hacker News and Reddit front-page traffic are no match for it), and in almost all cases has a lower total cost.
There are two primary use-cases where the on-device Inference Server is needed instead:
- When bandwidth is constrained or an Internet connection is unreliable (e.g. for autonomous vehicles).
- When production images cannot be processed by a third party (e.g. for privacy or security reasons).
You will need:
- A custom model trained with Roboflow Train,
- A Roboflow Pro account,
- A supported device with a network connection (~8MB of data will be used to download your model weights).
Currently, the server downloads weights over the network each time it starts up; this means it cannot yet be used in fully-offline situations. We are working on supporting offline and air-gapped mode soon. Reach out if you would like early access.
Pull down the `inference-server` Docker container built for your device; for NVIDIA Jetsons, this is:
```
sudo docker pull roboflow/inference-server:jetson
```
Then run the Docker container with your GPU and network interface:
```
sudo docker run --net=host --gpus all roboflow/inference-server:jetson
```
After `docker run` is invoked, the server will be running on port 9001. You can get predictions from it using the same code as with our Hosted API (replacing references to infer.roboflow.com with localhost:9001 or your Jetson's local IP address).
```
base64 YOUR_IMAGE.jpg | curl -d @- \
"http://localhost:9001/xx-your-model--1?access_token=YOUR_KEY"
```
To read more about the Roboflow Inference Server, performance expectations, and speed optimization tips, read the full documentation. And for code snippets in your preferred language, see the Roboflow Infer API documentation.
Roboflow is the easiest way to turn your images into actionable information. We provide all the tools you need to build computer vision into your applications, from annotation all the way to deployment.
Get started with a free account and you'll have a working model tailored to your specific use-case in an afternoon.