Make command optional
Signed-off-by: kerthcet <kerthcet@gmail.com>
kerthcet committed Oct 9, 2024
1 parent a507bdf commit 0e8402b
Showing 4 changed files with 8 additions and 8 deletions.
3 changes: 2 additions & 1 deletion api/inference/v1alpha1/backendruntime_types.go
@@ -35,7 +35,8 @@ type BackendRuntimeArg struct {
 // BackendRuntimeSpec defines the desired state of BackendRuntime
 type BackendRuntimeSpec struct {
 	// Commands represents the default command of the backendRuntime.
-	Commands []string `json:"commands"`
+	// +optional
+	Commands []string `json:"commands,omitempty"`
 	// Image represents the default image registry of the backendRuntime.
 	// It will work together with version to make up a real image.
 	Image string `json:"image"`
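The Go half of the change: Commands is now optional, so anything that consumes a BackendRuntimeSpec must tolerate a nil or empty slice. Below is a minimal, self-contained sketch of one way to handle that; the containerCommand helper and the image/command values are illustrative, not part of this commit. Returning nil lets the container fall back to the image's own ENTRYPOINT.

package main

import "fmt"

// Trimmed copy of the changed type, for illustration only.
type BackendRuntimeSpec struct {
	// +optional
	Commands []string `json:"commands,omitempty"`
	Image    string   `json:"image"`
}

// containerCommand is a hypothetical helper: it returns the configured
// command, or nil when none is set so the image ENTRYPOINT applies.
func containerCommand(spec BackendRuntimeSpec) []string {
	if len(spec.Commands) == 0 {
		return nil
	}
	return spec.Commands
}

func main() {
	explicit := BackendRuntimeSpec{
		Commands: []string{"python", "-m", "sglang.launch_server"},
		Image:    "lmsysorg/sglang",
	}
	defaulted := BackendRuntimeSpec{Image: "lmsysorg/sglang"}

	fmt.Println(containerCommand(explicit))  // [python -m sglang.launch_server]
	fmt.Println(containerCommand(defaulted)) // [] (nil: image entrypoint is used)
}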
1 change: 0 additions & 1 deletion config/crd/bases/inference.llmaz.io_backendruntimes.yaml
@@ -231,7 +231,6 @@ spec:
     It will be appended to the image as a tag.
   type: string
 required:
- - commands
  - image
  - resources
  - version
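Dropping commands from the CRD's required list is the server-side half of the change: the API server now admits BackendRuntime manifests that omit the field entirely. The client-side half is the new omitempty tag, which keeps an unset slice out of serialized objects. A small standard-library sketch of that serialization behavior (the image value is made up):

package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed copy of the changed type, for illustration only.
type BackendRuntimeSpec struct {
	// +optional
	Commands []string `json:"commands,omitempty"`
	Image    string   `json:"image"`
}

func main() {
	// With the old tag `json:"commands"`, a nil slice marshaled as
	// "commands":null; with omitempty it is dropped from the output.
	out, err := json.Marshal(BackendRuntimeSpec{Image: "lmsysorg/sglang"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"image":"lmsysorg/sglang"}
}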
4 changes: 2 additions & 2 deletions docs/examples/sglang/playground.yaml
@@ -1,10 +1,10 @@
 apiVersion: inference.llmaz.io/v1alpha1
 kind: Playground
 metadata:
-  name: qwen2-05b
+  name: qwen2-0--5b
 spec:
   replicas: 1
   modelClaim:
-    modelName: qwen2-05b
+    modelName: qwen2-0--5b
   backendRuntimeConfig:
     name: sglang
8 changes: 4 additions & 4 deletions docs/support-backends.md
@@ -1,13 +1,13 @@
 # All Kinds of Supported Inference Backends
 
-## vLLM
+## llama.cpp
 
-[vLLM](https://github.com/vllm-project/vllm) is a high-throughput and memory-efficient inference and serving engine for LLMs.
+[llama.cpp](https://github.com/ggerganov/llama.cpp) enables LLM inference with minimal setup and state-of-the-art performance on a wide variety of hardware - locally and in the cloud.
 
 ## SGLang
 
 [SGLang](https://github.com/sgl-project/sglang) is yet another fast serving framework for large language models and vision language models.
 
-## llama.cpp
+## vLLM
 
-[llama.cpp](https://github.com/ggerganov/llama.cpp) enables LLM inference with minimal setup and state-of-the-art performance on a wide variety of hardware - locally and in the cloud.
+[vLLM](https://github.com/vllm-project/vllm) is a high-throughput and memory-efficient inference and serving engine for LLMs.