
feat:update model loader #178

Merged
merged 1 commit into from
Oct 23, 2024

Conversation

qinguoyi

@qinguoyi qinguoyi commented Sep 28, 2024

What this PR does / why we need it

#163 (comment)

Which issue(s) this PR fixes

None

Special notes for your reviewer

This PR mainly includes:

  1. Add allow_patterns and ignore_patterns, adapted to work with filename.
  2. Clean up the Python code; move hard-coded values into a constants file.
  3. When downloading models, use only snapshot_download; with allow_patterns we can download one or more files.
  4. Chunked model weights vs. complete weights are not handled here; that will come in another PR.
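For context, huggingface_hub's snapshot_download accepts allow_patterns and ignore_patterns lists. A minimal sketch of how the loader might assemble its arguments; build_download_kwargs is a hypothetical helper for illustration, not code from this PR:

```python
from typing import Dict, List, Optional

def build_download_kwargs(
    model_id: str,
    revision: Optional[str] = None,
    filename: Optional[str] = None,
    allow_patterns: Optional[List[str]] = None,
    ignore_patterns: Optional[List[str]] = None,
) -> Dict:
    # filename and the pattern fields are mutually exclusive; when filename
    # is set, it simply becomes the single allow pattern.
    if filename:
        if allow_patterns or ignore_patterns:
            raise ValueError("filename and patterns are mutually exclusive")
        allow_patterns = [filename]
    kwargs: Dict = {"repo_id": model_id}
    if revision:
        kwargs["revision"] = revision
    if allow_patterns:
        kwargs["allow_patterns"] = allow_patterns
    if ignore_patterns:
        kwargs["ignore_patterns"] = ignore_patterns
    return kwargs

# A single snapshot_download call then covers one file or many:
# from huggingface_hub import snapshot_download
# snapshot_download(**build_download_kwargs(
#     "Qwen/Qwen2-0.5B-Instruct-GGUF", allow_patterns=["*.gguf"]))
```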

Does this PR introduce a user-facing change?

Support specifying allow/ignore patterns for multi-file model downloads

@InftyAI-Agent InftyAI-Agent added needs-triage Indicates an issue or PR lacks a label and requires one. needs-priority Indicates a PR lacks a label and requires one. do-not-merge/needs-kind Indicates a PR lacks a label and requires one. labels Sep 28, 2024
@kerthcet

Will review this ASAP. Thanks for the work.

@kerthcet

/kind feature

@InftyAI-Agent InftyAI-Agent added feature Categorizes issue or PR as related to a new feature. and removed do-not-merge/needs-kind Indicates a PR lacks a label and requires one. labels Sep 29, 2024

@kerthcet kerthcet left a comment


We may also need an e2e test to make sure the pattern works as expected, but it can be a follow-up until I publish an official image tag for model-builder.

@@ -49,6 +49,12 @@ type ModelHub struct {
// +kubebuilder:default=main
// +optional
Revision *string `json:"revision,omitempty"`
// AllowPatterns refers to only files matching at least one pattern are downloaded.
// +optional
AllowPatterns *string `json:"allowPatterns,omitempty"`


Let's make it a slice, because huggingface_hub accepts a list.

AllowPatterns *string `json:"allowPatterns,omitempty"`
// IgnorePatterns refers to files matching any of the patterns are not downloaded.
// +optional
IgnorePatterns *string `json:"ignorePatterns,omitempty"`


ditto
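Following the slice suggestion, the fields would look roughly like this. A hedged sketch only: field names follow the diff above, the kubebuilder markers are omitted, and marshalModelHub is an illustrative helper showing how unset fields drop out of the serialized spec:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sketch of the ModelHub pattern fields as slices, per the review suggestion.
type ModelHub struct {
	Revision       *string  `json:"revision,omitempty"`
	AllowPatterns  []string `json:"allowPatterns,omitempty"`
	IgnorePatterns []string `json:"ignorePatterns,omitempty"`
}

// marshalModelHub serializes the spec; nil fields are dropped by omitempty.
func marshalModelHub(h ModelHub) string {
	b, _ := json.Marshal(h)
	return string(b)
}

func main() {
	fmt.Println(marshalModelHub(ModelHub{AllowPatterns: []string{"*.gguf", "*.json"}}))
	// {"allowPatterns":["*.gguf","*.json"]}
}
```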

llmaz/model_loader/constant.py (thread resolved)
revision=revision,
).add_done_callback(handle_completion)
)
if filename:


Let's keep it simpler here: if filename is set, allow_patterns will be just the filename. Actually, the OP isn't quite accurate, since we may have patterns like *.json.

And we should add validation in the webhook: once filename is set, both patterns should be nil.
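The proposed webhook rule can be sketched in plain Go. validateModelHub is a hypothetical helper for illustration; the real webhook would return a field.ErrorList rather than strings:

```go
package main

import "fmt"

// validateModelHub enforces the proposed rule: once filename is set,
// allowPatterns and ignorePatterns must be left unset.
func validateModelHub(filename *string, allowPatterns, ignorePatterns []string) []string {
	var errs []string
	if filename == nil {
		return errs
	}
	if len(allowPatterns) != 0 {
		errs = append(errs, "once filename is set, allowPatterns must be unset")
	}
	if len(ignorePatterns) != 0 {
		errs = append(errs, "once filename is set, ignorePatterns must be unset")
	}
	return errs
}

func main() {
	f := "qwen2-0_5b-instruct-q5_k_m.gguf"
	// filename plus a pattern is rejected; patterns alone are fine.
	fmt.Println(len(validateModelHub(&f, []string{"*.gguf"}, nil)))
	fmt.Println(len(validateModelHub(nil, []string{"*"}, []string{"*.gguf"})))
}
```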

revision=revision,
).add_done_callback(handle_completion)
)
if filename:


The same here.

coreapi "github.com/inftyai/llmaz/api/core/v1alpha1"
"github.com/inftyai/llmaz/pkg/util"
corev1 "k8s.io/api/core/v1"
"strings"


Let's make sure the imports are grouped like this:

import (
    // the base libs Go provides
    "fmt"
    "strings"

    // third-party libs
    corev1 "k8s.io/api/core/v1"

    // project libs
    coreapi "github.com/inftyai/llmaz/api/core/v1alpha1"
)

@@ -59,12 +60,28 @@ type ModelSourceProvider interface {

func NewModelSourceProvider(model *coreapi.OpenModel) ModelSourceProvider {
if model.Spec.Source.ModelHub != nil {

// fileName should not in ignorePatterns
if model.Spec.Source.ModelHub.Filename != nil && model.Spec.Source.ModelHub.IgnorePatterns != nil {


Can we verify this in the webhook? As I mentioned, once filename is set, the other patterns should not be configured anyway; they would be useless.

}
if p.modelIgnorePatterns != nil {
initContainer.Env = append(initContainer.Env,
corev1.EnvVar{Name: "MODEL_IGNORE_PATTERNS", Value: *p.modelIgnorePatterns},


Isn't this a string value? We want Optional[List[str]] on the Python side; will this work?

I just want to highlight that we hardcoded the loader image name in pkg/defaults.go, so only part of what we changed on the Python side will be tested:

  • what we have tested: that the refactors on the controller side are correct
  • what we haven't tested: the newly added patterns

So you may have to do some extra steps to make sure the code is working:

  • make loader-image-load -> to build the right loader image
  • replace the image name in pkg/defaults.go
  • run the e2e tests (you may need to run kind load image to load the image, since you didn't push to the remote registry)

I'm sorry for the tedious steps, but I haven't found a better way right now.

@qinguoyi

We may also need an e2e test to make sure the pattern works as expected, but it can be a follow-up until I publish an official image tag for model-builder.

Thanks for your review. Please hold this; I will retest the complete process, probably after National Day of the People's Republic of China.

@kerthcet

Kind ping @qinguoyi: we'd like to publish a new release and would be happy to include this feature. Are you still working on this?

@qinguoyi

Kind ping @qinguoyi: we'd like to publish a new release and would be happy to include this feature. Are you still working on this?

Yes, I will finish this work by Oct 20.

@qinguoyi qinguoyi force-pushed the feat-update-modelloader branch 2 times, most recently from ff79237 to 10bee4c Compare October 18, 2024 08:38
@qinguoyi

qinguoyi commented Oct 18, 2024

Hi, I have finished this work. PTAL. @kerthcet

------- What changes: ---------

  1. The allowPatterns and ignorePatterns fields are changed to slices.
     The pattern fields configured in YAML are joined into a single string with ',' and put into an environment variable.
     After the model-loader reads it, the value is split on ',' to recover the list.

  2. Add validation in the webhook: once the filename field is set, the allowPatterns and ignorePatterns fields must be nil.

  3. If the filename is set, in the model-loader the allow_patterns will be the filename.

  4. Add some unit tests for the pattern fields.
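The comma round-trip described in item 1 can be sketched as follows; ENV_SEPARATOR and the helper names are illustrative, not the PR's actual identifiers. Joining on ',' is safe here because glob patterns do not normally contain commas:

```python
from typing import List, Optional

ENV_SEPARATOR = ","  # assumption: the PR joins patterns with ','

def patterns_to_env(patterns: Optional[List[str]]) -> Optional[str]:
    # Controller side: join the YAML slice into one env var value.
    if not patterns:
        return None
    return ENV_SEPARATOR.join(patterns)

def env_to_patterns(value: Optional[str]) -> Optional[List[str]]:
    # Loader side: split the env var back into the Optional[List[str]]
    # that huggingface_hub expects.
    if not value:
        return None
    return value.split(ENV_SEPARATOR)
```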

------- What tests: -----------

llama.cpp inference with a Hugging Face model:

  1. Specifying filename: this scenario works as expected.
     (screenshot)
  2. llama.cpp inference needs the file path rather than the file directory, so test by specifying allowPatterns and ignorePatterns and checking which files are downloaded; this scenario works as expected.

apiVersion: llmaz.io/v1alpha1
kind: OpenModel
metadata:
  name: qwen2-0--5b-gguf
spec:
  familyName: qwen2
  source:
    modelHub:
      modelID: Qwen/Qwen2-0.5B-Instruct-GGUF
      allowPatterns:
        - "*"
      ignorePatterns:
        - "*.gguf"

     (screenshot)
  3. Specifying filename together with allowPatterns or ignorePatterns is rejected; this scenario works as expected.
     (screenshot: specify-filename-allowpattern)

@qinguoyi qinguoyi force-pushed the feat-update-modelloader branch 4 times, most recently from b4acd87 to 380f7f7 Compare October 20, 2024 14:51
@kerthcet

I will take a look today.

@kerthcet

On it now; sorry for the late response, I was busy with other things yesterday.


@kerthcet kerthcet left a comment


Thanks @qinguoyi this is awesome!

I just left some nits; the rest looks great to me. Once we merge this PR, I'll update the official model-loader image.

if model.Spec.Source.ModelHub.AllowPatterns != nil && len(model.Spec.Source.ModelHub.AllowPatterns) != 0 {
allErrs = append(allErrs, field.Invalid(sourcePath.Child("modelHub.allowPatterns"), model.Spec.Source.ModelHub.AllowPatterns, "Once Filename is set, allowPatterns should be nil"))
}
if model.Spec.Source.ModelHub.IgnorePatterns != nil && len(model.Spec.Source.ModelHub.IgnorePatterns) != 0 {


ditto

},
failed: false,
}),
ginkgo.Entry("set filename and allowPatterns when modelHub is Huggingface", &testValidatingCase{


Let's add one more case where both allowPatterns and ignorePatterns are set.

@@ -49,6 +49,12 @@ type ModelHub struct {
// +kubebuilder:default=main
// +optional
Revision *string `json:"revision,omitempty"`
// AllowPatterns refers to only files matching at least one pattern are downloaded.


Nit:

Suggested change
// AllowPatterns refers to only files matching at least one pattern are downloaded.
// AllowPatterns refers to files matched with at least one pattern will be downloaded.

// AllowPatterns refers to only files matching at least one pattern are downloaded.
// +optional
AllowPatterns []string `json:"allowPatterns,omitempty"`
// IgnorePatterns refers to files matching any of the patterns are not downloaded.


Suggested change
// IgnorePatterns refers to files matching any of the patterns are not downloaded.
// IgnorePatterns refers to files matched with any of the patterns will not be downloaded.

@@ -118,5 +118,15 @@ func (w *OpenModelWebhook) generateValidate(obj runtime.Object) field.ErrorList
allErrs = append(allErrs, field.Invalid(sourcePath.Child("modelHub.filename"), *model.Spec.Source.ModelHub.Filename, "Filename can only set once modeHub is Huggingface"))
}
}

if model.Spec.Source.ModelHub != nil && model.Spec.Source.ModelHub.Filename != nil {
if model.Spec.Source.ModelHub.AllowPatterns != nil && len(model.Spec.Source.ModelHub.AllowPatterns) != 0 {


Suggested change
if model.Spec.Source.ModelHub.AllowPatterns != nil && len(model.Spec.Source.ModelHub.AllowPatterns) != 0 {
if model.Spec.Source.ModelHub.AllowPatterns != nil {

@@ -49,6 +49,12 @@ type ModelHub struct {
// +kubebuilder:default=main
// +optional
Revision *string `json:"revision,omitempty"`


Let's update the comment of Filename as well,

	// Filename refers to a specified model file rather than the whole repo.
	// This is helpful to download a specified GGUF model rather than downloading
	// the whole repo which includes all kinds of quantized models.
	// TODO: this is only supported with Huggingface, add support for ModelScope
	// in the near future.
        // Note: once filename is set, allowPatterns and ignorePatterns should be left unset.

@kerthcet

I just tested with the model yaml:

apiVersion: llmaz.io/v1alpha1
kind: OpenModel
metadata:
  name: qwen2-0--5b-gguf
spec:
  familyName: qwen2
  source:
    modelHub:
      modelID: Qwen/Qwen2-0.5B-Instruct-GGUF
      # filename: qwen2-0_5b-instruct-q5_k_m.gguf
      allowPatterns:
      - qwen2-0_5b-instruct-q5_k_m.gguf

It failed just because we didn't specify the model file name; do you see the same? I'm OK with that, because we suggest people use filename rather than patterns when only one file is needed.

The error looks like:

llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/workspace/models/models--Qwen--Qwen2-0.5B-Instruct-GGUF'
INFO [                    main] HTTP server is listening | tid="139916821146176" timestamp=1729659912 n_threads_http="7" port="8080" hostname="0.0.0.0"
INFO [                    main] loading model | tid="139916821146176" timestamp=1729659912 n_threads_http="7" port="8080" hostname="0.0.0.0"
 ERR [              load_model] unable to load model | tid="139916821146176" timestamp=1729659912 model="/workspace/models/models--Qwen--Qwen2-0.5B-Instruct-GGUF"
 ERR [                    main] exiting due to model loading error | tid="139916821146176" timestamp=1729659912

@qinguoyi

> I just tested with the model yaml: … It failed just because we didn't specify the model file name; do you see the same?

Yes, I see the same.

To run inference with llama.cpp, you need to specify the file path, not the directory path, so specifying the directory path fails.

Also, as we can see, when filename is not set, the model path will be the directory. I noticed this during development, but I have no good idea how to handle it.

What do you think?

// Example 1:
// - modelID: facebook/opt-125m
// modelPath: /workspace/models/models--facebook--opt-125m
//
// Example 2:
// - modelID: Qwen/Qwen2-0.5B-Instruct-GGUF
// fileName: qwen2-0_5b-instruct-q5_k_m.gguf
// modelPath: /workspace/models/qwen2-0_5b-instruct-q5_k_m.gguf
func (p *ModelHubProvider) ModelPath() string {
if p.fileName != nil {
return CONTAINER_MODEL_PATH + *p.fileName
}
return CONTAINER_MODEL_PATH + "models--" + strings.ReplaceAll(p.modelID, "/", "--")
}

@kerthcet

Let's keep it simple at first; we can evolve in the future. The idea is:

  • If people want to deploy with one file, use filename and leave the patterns empty; we'll combine the repoName + fileName
  • If people want to use the patterns, let's load the whole repo in the runtime

Just as we do today.

@kerthcet

Some kind tips: let's address review comments with new commits, so the reviewers can find the appended changes and understand what's going on; we can squash the commits at the last minute.

@qinguoyi

Some kind tips: let's address review comments with new commits, so the reviewers can find the appended changes and understand what's going on; we can squash the commits at the last minute.

Thanks for your kind tips; I have reverted the squashed commits. PTAL once more.

@qinguoyi

qinguoyi commented Oct 23, 2024

Let's keep it simple at first; we can evolve in the future. The idea is:

  • If people want to deploy with one file, use filename and leave the patterns empty; we'll combine the repoName + fileName
  • If people want to use the patterns, let's load the whole repo in the runtime

Just as we do today.

So, the current code does not need to be changed.

In addition, we need to give user-friendly tips in the documentation on using the pattern fields.

@@ -147,6 +147,12 @@ var _ = ginkgo.Describe("model default and validation", func() {
},
failed: true,
}),
ginkgo.Entry("set filename, allowPatterns and ignorePatterns when modelHub is Huggingface", &testValidatingCase{


What I mean is adding a case with a model repo where allowPatterns and ignorePatterns are both set but filename is empty, to make sure we will not fail.

@kerthcet

Squash as well; I'm LGTM on the other parts.

@kerthcet

/lgtm
/approve

Great work!

@InftyAI-Agent InftyAI-Agent added lgtm Looks good to me, indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 23, 2024
@InftyAI-Agent InftyAI-Agent merged commit bd8e398 into InftyAI:main Oct 23, 2024
18 of 19 checks passed