Commit `83172e9` (1 parent: `fb514bb`): Adding files to deploy CodeGen application on AMD GPU (#1130)

Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Showing 4 changed files with 410 additions and 0 deletions.
# Build and deploy CodeGen Application on AMD GPU (ROCm)

## Build images

### Build the LLM Docker Image

```bash
### Clone the repo
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps

### Build the Docker image
docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```

### Build the MegaService Docker Image

```bash
### Clone the repo
git clone https://github.com/opea-project/GenAIExamples
cd GenAIExamples/CodeGen

### Build the Docker image
docker build -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```

### Build the UI Docker Image

```bash
cd GenAIExamples/CodeGen/ui
### Build the UI Docker image
docker build -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .

### Build the React UI Docker image (the React UI supports file uploads)
docker build --no-cache -t opea/codegen-react-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react .
```

The React UI is recommended because it supports file downloads. Which UI is used is configured in the Docker Compose file.
## Deploy CodeGen Application

### Features of Docker Compose for AMD GPUs

1. GPU devices are forwarded to the TGI service container with the following settings:

```yaml
shm_size: 1g
devices:
  - /dev/kfd:/dev/kfd
  - /dev/dri/:/dev/dri/
cap_add:
  - SYS_PTRACE
group_add:
  - video
security_opt:
  - seccomp:unconfined
```

With this configuration, all GPUs are passed through to the container. To pass through only a specific GPU, use its device names `cardN` and `renderDN`. For example:

```yaml
shm_size: 1g
devices:
  - /dev/kfd:/dev/kfd
  - /dev/dri/card0:/dev/dri/card0
  - /dev/dri/renderD128:/dev/dri/renderD128
cap_add:
  - SYS_PTRACE
group_add:
  - video
security_opt:
  - seccomp:unconfined
```

To find out which `cardN` and `renderDN` device IDs belong to the same GPU, use a GPU driver utility such as `rocm-smi`.
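On most systems running the amdgpu driver, the names follow a predictable pattern: `card0` typically pairs with `renderD128`, `card1` with `renderD129`, and so on. The sketch below (a hypothetical helper, not part of the repository) generates the `devices:` entries for a single GPU from its index under that assumption; verify the actual pairing with `rocm-smi` before relying on it.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: print compose `devices:`
# entries for GPU number N, assuming the common amdgpu numbering where
# cardN pairs with renderD(128+N). Verify with rocm-smi before relying on it.
gpu_devices() {
  local n="$1"
  echo "- /dev/kfd:/dev/kfd"
  echo "- /dev/dri/card${n}:/dev/dri/card${n}"
  echo "- /dev/dri/renderD$((128 + n)):/dev/dri/renderD$((128 + n))"
}

gpu_devices 0
```

Running `gpu_devices 0` prints the three mappings for the first GPU, ready to paste under the `devices:` key.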
### Go to the directory with the Docker Compose file

```bash
cd GenAIExamples/CodeGen/docker_compose/amd/gpu/rocm
```
### Set the environment variables

In the file `GenAIExamples/CodeGen/docker_compose/amd/gpu/rocm/set_env.sh`, set the required values. The meaning of each variable is described in the comment above the corresponding export command.

```bash
chmod +x set_env.sh
. set_env.sh
```
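Before starting the stack, it can be worth verifying that sourcing `set_env.sh` actually exported everything the compose file consumes. A minimal pre-flight sketch (hypothetical, not part of the repository):

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check, not part of the repository: report any
# variable used by the compose file that set_env.sh failed to export.
check_env() {
  local v missing=0
  for v in HOST_IP CODEGEN_TGI_SERVICE_PORT CODEGEN_LLM_SERVICE_PORT \
           CODEGEN_LLM_MODEL_ID CODEGEN_HUGGINGFACEHUB_API_TOKEN \
           CODEGEN_BACKEND_SERVICE_PORT CODEGEN_BACKEND_SERVICE_URL \
           CODEGEN_UI_SERVICE_PORT; do
    if [ -z "${!v}" ]; then
      echo "Missing required variable: $v"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage would be along the lines of `. set_env.sh && check_env && docker compose up -d`, so the stack only starts when the environment is complete.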
### Run services

```bash
docker compose up -d
```
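`docker compose up -d` returns before the model finishes loading, so the first requests may fail. A small polling helper (a sketch, not part of the repository, assuming `curl` is available; TGI exposes a `/health` endpoint) can block until the service is ready:

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: poll a URL until it
# responds successfully or the retry budget is exhausted.
wait_for() {
  local url="$1" tries="${2:-60}" i
  for ((i = 0; i < tries; i++)); do
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    sleep 2
  done
  echo "Timed out waiting for $url" >&2
  return 1
}

# Example: block until the TGI service is ready to serve requests.
# wait_for "http://${HOST_IP}:${CODEGEN_TGI_SERVICE_PORT}/health"
```

The retry count and sleep interval are arbitrary defaults; for a 7B model on a ROCm GPU, the first startup can take several minutes while weights download and load.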
# Validate the MicroServices and MegaService

## Validate the TGI service

```bash
curl http://${HOST_IP}:${CODEGEN_TGI_SERVICE_PORT}/generate \
  -X POST \
  -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \
  -H 'Content-Type: application/json'
```
## Validate the LLM service

```bash
curl http://${HOST_IP}:${CODEGEN_LLM_SERVICE_PORT}/v1/chat/completions \
  -X POST \
  -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
  -H 'Content-Type: application/json'
```
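Because `"streaming":true` is set, the LLM service answers as a stream of Server-Sent Events, one `data:` line per chunk. The exact payload shape depends on the service version, so the sketch below only illustrates the idea of stripping the SSE framing from a captured response; the sample text is made up, not real output.

```bash
#!/usr/bin/env bash
# Illustrative only: strip SSE framing from a captured streaming response.
# The sample below is invented; real chunks may be shaped differently.
sample='data: {"text": "def"}
data: {"text": " add"}
data: [DONE]'

echo "$sample" | sed -n 's/^data: {"text": "\(.*\)"}$/\1/p' | tr -d '\n'
```

For the invented sample above, this prints the concatenated chunk texts and drops the `[DONE]` sentinel.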
## Validate the MegaService

```bash
curl http://${HOST_IP}:${CODEGEN_BACKEND_SERVICE_PORT}/v1/codegen -H "Content-Type: application/json" -d '{
  "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."
}'
```
The Docker Compose file added in `GenAIExamples/CodeGen/docker_compose/amd/gpu/rocm`:
```yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  codegen-tgi-service:
    image: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
    container_name: codegen-tgi-service
    ports:
      - "${CODEGEN_TGI_SERVICE_PORT:-8028}:80"
    volumes:
      - "/var/lib/GenAI/data:/data"
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      HUGGING_FACE_HUB_TOKEN: ${CODEGEN_HUGGINGFACEHUB_API_TOKEN}
      HUGGINGFACEHUB_API_TOKEN: ${CODEGEN_HUGGINGFACEHUB_API_TOKEN}
    shm_size: 1g
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri/:/dev/dri/
    cap_add:
      - SYS_PTRACE
    group_add:
      - video
    security_opt:
      - seccomp:unconfined
    ipc: host
    command: --model-id ${CODEGEN_LLM_MODEL_ID} --max-input-length 1024 --max-total-tokens 2048
  codegen-llm-server:
    image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
    container_name: codegen-llm-server
    depends_on:
      - codegen-tgi-service
    ports:
      - "${CODEGEN_LLM_SERVICE_PORT:-9000}:9000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: "http://codegen-tgi-service"
      HUGGINGFACEHUB_API_TOKEN: ${CODEGEN_HUGGINGFACEHUB_API_TOKEN}
    restart: unless-stopped
  codegen-backend-server:
    image: ${REGISTRY:-opea}/codegen:${TAG:-latest}
    container_name: codegen-backend-server
    depends_on:
      - codegen-llm-server
    ports:
      - "${CODEGEN_BACKEND_SERVICE_PORT:-7778}:7778"
    environment:
      no_proxy: ${no_proxy}
      https_proxy: ${https_proxy}
      http_proxy: ${http_proxy}
      MEGA_SERVICE_HOST_IP: ${CODEGEN_MEGA_SERVICE_HOST_IP}
      LLM_SERVICE_HOST_IP: "codegen-llm-server"
    ipc: host
    restart: always
  codegen-ui-server:
    image: ${REGISTRY:-opea}/codegen-ui:${TAG:-latest}
    container_name: codegen-ui-server
    depends_on:
      - codegen-backend-server
    ports:
      - "${CODEGEN_UI_SERVICE_PORT:-5173}:5173"
    environment:
      no_proxy: ${no_proxy}
      https_proxy: ${https_proxy}
      http_proxy: ${http_proxy}
      BASIC_URL: ${CODEGEN_BACKEND_SERVICE_URL}
      BACKEND_SERVICE_ENDPOINT: ${CODEGEN_BACKEND_SERVICE_URL}
    ipc: host
    restart: always

networks:
  default:
    driver: bridge
```
The environment script added as `GenAIExamples/CodeGen/docker_compose/amd/gpu/rocm/set_env.sh`:
```bash
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

### The IP address or domain name of the server on which the application is running
export HOST_IP=direct-supercomputer1.powerml.co

### The port of the TGI service. On this port, the TGI service will accept connections
export CODEGEN_TGI_SERVICE_PORT=8028

### A token for accessing repositories with models (replace the placeholder with your own Hugging Face token)
export CODEGEN_HUGGINGFACEHUB_API_TOKEN=your_huggingfacehub_api_token

### Model ID
export CODEGEN_LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"

### The port of the LLM service. On this port, the LLM service will accept connections
export CODEGEN_LLM_SERVICE_PORT=9000

### The endpoint of the TGI service to which requests to this service will be sent (formed from previously set variables)
export CODEGEN_TGI_LLM_ENDPOINT="http://${HOST_IP}:${CODEGEN_TGI_SERVICE_PORT}"

### The IP address or domain name of the server for the CodeGen MegaService
export CODEGEN_MEGA_SERVICE_HOST_IP=${HOST_IP}

### The port for the CodeGen backend service
export CODEGEN_BACKEND_SERVICE_PORT=18150

### The URL of the CodeGen backend service, used by the frontend service
export CODEGEN_BACKEND_SERVICE_URL="http://${HOST_IP}:${CODEGEN_BACKEND_SERVICE_PORT}/v1/codegen"

### The IP address or domain name of the server for the LLM service
export CODEGEN_LLM_SERVICE_HOST_IP=${HOST_IP}

### The CodeGen service UI port
export CODEGEN_UI_SERVICE_PORT=18151
```