Update Seldon ROCKs to 1.17.1 version for CKF release 1.8 #37

Closed
orfeas-k opened this issue Aug 25, 2023 · 14 comments
Labels
enhancement New feature or request Kubeflow 1.8 This issue affects the Charmed Kubeflow 1.8 release

Comments

@orfeas-k
Contributor

This issue tracks the update of the Seldon ROCKs to Seldon 1.17 for CKF release 1.8. We are following our internal Kubeflow ROCK Images Best Practices, which has a section on upgrading ROCK images.

The changes introduced by this process should match what upstream has in release 1.17.1.

@orfeas-k
Contributor Author

orfeas-k commented Aug 25, 2023

mlserver-*

Regarding the MLServer ROCKs: release 1.17.0 bumped MLServer to 1.3.5. To work out the changes needed in the ROCKs, we compared the Dockerfile of each ROCK's previous version (1.2.4 for huggingface, 1.2.0 for the rest) with the new one. For this, I ran:

diff Dockerfile1.2.x Dockerfile1.3.5 >> changes.diff
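The same comparison can be sketched in Python with the standard-library `difflib`, which may be handy when the diff needs to be produced from a script. This is only an illustration: the two Dockerfile fragments below are hypothetical placeholders, not the real upstream files, and the default file names simply mirror the command above.

```python
import difflib


def dockerfile_diff(old_text, new_text,
                    old_name="Dockerfile1.2.x", new_name="Dockerfile1.3.5"):
    """Return a unified diff between two Dockerfile contents as one string."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines,
                             fromfile=old_name, tofile=new_name)
    )


# Hypothetical fragments, for illustration only (not the upstream Dockerfiles).
old = "FROM python:3.8-slim\nRUN pip install mlserver==1.2.0\n"
new = "FROM python:3.8-slim\nRUN pip install mlserver==1.3.5\n"

print(dockerfile_diff(old, new))
```

Lines prefixed with `-` exist only in the old file and lines prefixed with `+` only in the new one, which is the set of changes to carry over into rockcraft.yaml.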

@orfeas-k
Contributor Author

orfeas-k commented Aug 25, 2023

tensorflow-serving

Regarding the tensorflow-serving ROCK: release 1.17.1 still uses imageVersion 2.1.0, so we don't need to introduce any changes to this ROCK.

@orfeas-k
Contributor Author

orfeas-k commented Sep 8, 2023

sklearnserver

Regarding the sklearn-server ROCK: while investigating with @i-chvets, we realised that the current rockcraft.yaml file is based on the upstream Dockerfile.conda and doesn't take the Dockerfile into account. Looking at the upstream Makefile, though, we see that both are used to build the image (with Dockerfile.conda as a BASE_IMAGE). For this update, we will update the ROCK based on the current implementation and file an issue to implement the other Dockerfile in our ROCK.

EDIT: This has been resolved and the rockcraft.yaml updated. The implementation is documented in #47.

@orfeas-k orfeas-k added the Kubeflow 1.8 This issue affects the Charmed Kubeflow 1.8 release label Sep 8, 2023
@orfeas-k orfeas-k added the enhancement New feature or request label Sep 8, 2023
@i-chvets
Contributor

This also needs to be addressed; it is not really ROCK work, but it is related to 1.8: canonical/seldon-core-operator#200

@orfeas-k
Contributor Author

orfeas-k commented Sep 11, 2023

tox.ini files

Regarding the tox environments, we updated the pytest commands to match the updates in the seldon-core-operator tox.ini file, which means that we used:

  • -e seldon-servers-integration for the servers. We excluded tensorflow-serving, since at the moment it won't build in the GH runner (issue ), and MLServer, which is a Seldon ROCK but is not used in Seldon.
  • -e charm-integration for the seldon-core-operator ROCK

@orfeas-k
Contributor Author

orfeas-k commented Sep 12, 2023

Tensorflow-serving

We noticed that this could be why, when we introduced this ROCK, the first tensorflow-serving version that worked was 2.13.0 (going up from 2.1.0).

@i-chvets
Contributor

i-chvets commented Sep 13, 2023

Seldon-core-operator

@orfeas-k Here are some of my findings.
After some debugging, it looks like the Seldon Core Operator tests do not pass with the upstream container image docker.io/seldonio/seldon-core-operator:1.17.1 either. The symptoms are the same as when testing with our ROCK.
There were some interesting errors in the seldon-core container when reconciling services. An extra label was added and the Annotations were different (see the lines with a + sign in the log below):

2023-09-12T23:01:22.886Z [seldon-core] {"level":"info","ts":1694559682.8864827,"logger":"controllers.SeldonDeployment","msg":
"Difference in SVCs:
    &v1.Service{
        TypeMeta:
        {
            Kind: "Service",
            APIVersion: "v1"
        },
        ObjectMeta: v1.ObjectMeta{
            ... // 8 identical fields
            DeletionTimestamp: nil,
            DeletionGracePeriodSeconds: nil,
            Labels: map[string]string{
                + "app.juju.is/created-by": "seldon-controller-manager",
                "app.kubernetes.io/managed-by": "seldon-core",
                "seldon-app": "seldon-model-example",
                "seldon-deployment-id": "seldon-model",
            },
            - Annotations: map[string]string{},
            + Annotations: nil,
            OwnerReferences:
            {{APIVersion: "machinelearning.seldon.io/v1", Kind: "SeldonDeployment", Name: "seldon-model", UID: "f7969f87-53c3-48da-8e2a-a7589c8e32fd", ...}},
            Finalizers: nil,
            ... // 2 identical fields
        },
        Spec:   {Ports: {{Name: "http", Protocol: "TCP", Port: 8000, TargetPort: {IntVal: 8000}, ...}, {Name: "grpc", Protocol: "TCP", Port: 5001, TargetPort: {IntVal: 5001}, ...}}, Selector: {"seldon-app": "seldon-model-example"}, ClusterIP: "10.152.183.54", ClusterIPs: {"10.152.183.54"}, ...},
        Status: {},
    }",
"SeldonDeployment":"test-charm-g1f1/seldon-model"}

These are the errors seen on SeldonDeployment, while tests are running (all pods are up):

2023-09-12T23:01:22.903Z [seldon-core] {"level":"info","ts":1694559682.9033585,"logger":"controllers.SeldonDeployment","msg":"Inference status","SeldonDeployment":"test-charm-g1f1/seldon-model","status":{"state":"Creating","address":{"url":"http://seldon-model-example.test-charm-g1f1.svc.cluster.local:8000/api/v1.0/predictions"},"conditions":
    [
        {
            "type":"HpasReady",
            "status":"True",
            "lastTransitionTime":"2023-09-12T23:00:34Z",
            "reason":"No HPAs defined"
        },
        {
            "type":"KedaReady",
            "status":"True",
            "lastTransitionTime":"2023-09-12T23:00:34Z",
            "reason":"No KEDA resources defined"
        },
        {
            "type":"PdbsReady",
            "status":"True",
            "lastTransitionTime":"2023-09-12T23:00:34Z",
            "reason":"No PDBs defined"
        },
        {
            "type":"Ready",
            "status":"False",
            "lastTransitionTime":"2023-09-12T23:00:34Z",
            "reason":"Not all services created"
        },
        {
            "type":"ServicesReady",
            "status":"False",
            "lastTransitionTime":"2023-09-12T23:00:34Z",
            "reason":"Not all services created"
        }
    ]
}}

More investigation is required into why version 1.17.1 fails to complete deployment of SeldonDeployments.
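When reading status dumps like the one above, it can help to print only the conditions that are not satisfied. The following is a minimal sketch, assuming the `status` object has already been extracted from the log line; the helper name `failing_conditions` is hypothetical and the condition data is copied from the log entry above with timestamps dropped.

```python
import json


def failing_conditions(status):
    """Return (type, reason) for every condition whose status is not "True"."""
    return [
        (cond["type"], cond.get("reason", ""))
        for cond in status.get("conditions", [])
        if cond.get("status") != "True"
    ]


# Conditions copied from the "Inference status" entry above (timestamps dropped).
status = json.loads("""
{
  "state": "Creating",
  "conditions": [
    {"type": "HpasReady", "status": "True", "reason": "No HPAs defined"},
    {"type": "KedaReady", "status": "True", "reason": "No KEDA resources defined"},
    {"type": "PdbsReady", "status": "True", "reason": "No PDBs defined"},
    {"type": "Ready", "status": "False", "reason": "Not all services created"},
    {"type": "ServicesReady", "status": "False", "reason": "Not all services created"}
  ]
}
""")

for cond_type, reason in failing_conditions(status):
    print(f"{cond_type}: {reason}")
```

For this entry, only `Ready` and `ServicesReady` are reported, both with the reason "Not all services created".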

@orfeas-k
Contributor Author

orfeas-k commented Sep 13, 2023

seldon-core-operator

Great job @i-chvets. These findings shed a bit of light on the situation. Note also that the only change introduced in this update is the source-tag; everything else remained the same. Also, tests using the ROCK without the source-tag change succeed.

Regarding logs, we've also seen this in the conditions of the `seldondeployment` resource:

  - lastTransitionTime: "2023-09-12T22:47:38Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Ready
  - lastTransitionTime: "2023-09-12T22:45:56Z"
    reason: Not all services created
    status: "False"
    type: ServicesReady

which is similar to, but not exactly the same as, the above.
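To make the difference between the two sets of conditions explicit, they can be compared by type. This is a rough sketch with hypothetical helper names; the data holds only the failing conditions, copied from the operator log earlier in the thread and from the resource conditions in this comment.

```python
def reasons_by_type(conditions):
    """Map each condition type to its reason."""
    return {cond["type"]: cond.get("reason") for cond in conditions}


def differing_reasons(first, second):
    """Return {type: (reason_in_first, reason_in_second)} where the reasons differ."""
    a, b = reasons_by_type(first), reasons_by_type(second)
    return {t: (a[t], b[t]) for t in sorted(a.keys() & b.keys()) if a[t] != b[t]}


# Failing conditions reported in the operator's "Inference status" log entry.
operator_log = [
    {"type": "Ready", "status": "False", "reason": "Not all services created"},
    {"type": "ServicesReady", "status": "False", "reason": "Not all services created"},
]

# Conditions seen on the seldondeployment resource in this comment.
resource = [
    {"type": "Ready", "status": "False", "reason": "MinimumReplicasUnavailable"},
    {"type": "ServicesReady", "status": "False", "reason": "Not all services created"},
]

print(differing_reasons(operator_log, resource))
```

Only the `Ready` condition differs: the log entry reports "Not all services created", while the resource reports `MinimumReplicasUnavailable`.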

@orfeas-k
Contributor Author

orfeas-k commented Sep 13, 2023

EDIT: This no longer holds for the final PR that updates the sklearnserver ROCK.

sklearnserver

It looks like this is also the case for the seldon-servers-integration tests using the updated sklearnserver ROCK. test_seldon_predictor_server failed with these parameters: [MLFLOW_SERVER-mlflowserver.yaml-api/v1.0/predictions-request_data4-response_test_data4]. Looking at the deployment mlflow-default-0-classifier, I see the same conditions here as well:

status:
  conditions:
  - lastTransitionTime: "2023-09-13T10:14:20Z"
    lastUpdateTime: "2023-09-13T10:14:20Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2023-09-13T10:14:20Z"
    lastUpdateTime: "2023-09-13T10:14:20Z"
    message: ReplicaSet "mlflow-default-0-classifier-665d987847" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing

After this, all subsequent runs of the test (with different parameters) fail.

@orfeas-k
Contributor Author

orfeas-k commented Sep 13, 2023

mlserver-sklearn

When running the seldon-servers-integration tests on the updated mlserver-sklearn ROCK, the run test_seldon_predictor_server[SKLEARN_SERVER-sklearn-v2.yaml-v2/models/classifier/infer-request_data1-response_test_data1] failed, with the seldondeployment going to Failed:

INFO     test_seldon_servers:utils.py:29 seldondeployment/sklearn status == Creating (waiting for 'Available')
INFO     httpx:_client.py:1013 HTTP Request: GET https://172.31.31.120:16443/apis/machinelearning.seldon.io/v1/namespaces/test-seldon-servers-48ip/seldondeployments/sklearn "HTTP/1.1 200 OK"
INFO     test_seldon_servers:utils.py:25 Deployment of fseldondeployment/sklearn failed, status = Failed

It looks like the corresponding seldondeployment pod (sklearn-default-0-classifier-5b85bb86b5-f5nmm) fails to start because its health checks fail, showing these logs:

ubuntu@ip-172-31-31-120:~$ kubectl -n test-seldon-servers-ooek logs sklearn-default-0-classifier-5b85bb86b5-f5nmm --all-containers
  {"level":"error","ts":1694607015.8338234,"logger":"SeldonRestApi","msg":"Ready check failed","error":"dial tcp [::1]:9000: connect: connection refused","stacktrace":"net/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/seldonio/seldon-core/executor/api/rest.handleCORSRequests.func1\n\t/workspace/api/rest/middlewares.go:64\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/gorilla/mux.CORSMethodMiddleware.func1.1\n\t/go/pkg/mod/github.com/gorilla/mux@v1.8.0/middleware.go:51\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/seldonio/seldon-core/executor/api/rest.xssMiddleware.func1\n\t/workspace/api/rest/middlewares.go:87\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/seldonio/seldon-core/executor/api/rest.(*CloudeventHeaderMiddleware).Middleware.func1\n\t/workspace/api/rest/middlewares.go:47\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/seldonio/seldon-core/executor/api/rest.puidHeader.func1\n\t/workspace/api/rest/middlewares.go:79\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/mux@v1.8.0/mux.go:210\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2879\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1930"}

Describing the same pod, we get these events:

ubuntu@ip-172-31-31-120:~$ k -n test-seldon-servers-ooek describe pod sklearn-default-0-classifier-5b85bb86b5-f5nmm 

Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  3m52s                  default-scheduler  Successfully assigned test-seldon-servers-ooek/sklearn-default-0-classifier-5b85bb86b5-f5nmm to ip-172-31-31-120
  Normal   Pulled     3m52s                  kubelet            Container image "seldonio/rclone-storage-initializer:1.14.1" already present on machine
  Normal   Created    3m52s                  kubelet            Created container classifier-model-initializer
  Normal   Started    3m52s                  kubelet            Started container classifier-model-initializer
  Normal   Pulled     3m51s                  kubelet            Container image "mlserver-sklearn_1.3.5_20.04_1_amd64:1.3.5_20.04_1" already present on machine
  Normal   Created    3m51s                  kubelet            Created container classifier
  Normal   Started    3m51s                  kubelet            Started container classifier
  Normal   Pulled     3m51s                  kubelet            Container image "docker.io/seldonio/seldon-core-executor:1.14.0" already present on machine
  Normal   Created    3m51s                  kubelet            Created container seldon-container-engine
  Normal   Started    3m51s                  kubelet            Started container seldon-container-engine
  Warning  Unhealthy  2m53s (x8 over 3m28s)  kubelet            Readiness probe failed: Get "http://10.1.63.206:9000/v2/health/ready": dial tcp 10.1.63.206:9000: connect: connection refused
  Warning  Unhealthy  2m53s (x8 over 3m28s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 503

As expected, in pod's conditions we see:

status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-09-13T12:09:52Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-09-13T12:09:50Z"
    message: 'containers with unready status: [classifier seldon-container-engine]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-09-13T12:09:50Z"
    message: 'containers with unready status: [classifier seldon-container-engine]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-09-13T12:09:50Z"
    status: "True"
    type: PodScheduled

and in the status of the sklearn seldondeployment:

status:
  address:
    url: http://sklearn-default.test-seldon-servers-ooek.svc.cluster.local:8000/v2/models/classifier/infer
  conditions:
  - lastTransitionTime: "2023-09-13T12:09:50Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: DeploymentsReady
  - lastTransitionTime: "2023-09-13T12:09:50Z"
    reason: No HPAs defined
    status: "True"
    type: HpasReady
  - lastTransitionTime: "2023-09-13T12:09:50Z"
    reason: No KEDA resources defined
    status: "True"
    type: KedaReady
  - lastTransitionTime: "2023-09-13T12:09:50Z"
    reason: No PDBs defined
    status: "True"
    type: PdbsReady
  - lastTransitionTime: "2023-09-13T12:09:50Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Ready
  - lastTransitionTime: "2023-09-13T12:09:50Z"
    reason: Not all services created
    status: "False"
    type: ServicesReady

Subsequent runs also failed (e.g. the run test_seldon_predictor_server[MLFLOW_SERVER-mlflowserver.yaml-api/v1.0/predictions-request_data4-response_test_data4] presented the same errors as in the comment above), with this additional event:

Events:
  Type     Reason     Age                     From     Message
  ----     ------     ----                    ----     -------
  Warning  Unhealthy  3m50s (x49 over 8m20s)  kubelet  Readiness probe failed: dial tcp 10.1.63.209:9000: connect: connection refused

and these logs:

{"level":"error","ts":1694608626.6623445,"logger":"SeldonRestApi","msg":"Ready check failed","error":"dial tcp [::1]:9000: connect: connection refused","stacktrace":"net/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/seldonio/seldon-core/executor/api/rest.handleCORSRequests.func1\n\t/workspace/api/rest/middlewares.go:64\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/gorilla/mux.CORSMethodMiddleware.func1.1\n\t/go/pkg/mod/github.com/gorilla/mux@v1.8.0/middleware.go:51\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/seldonio/seldon-core/executor/api/rest.xssMiddleware.func1\n\t/workspace/api/rest/middlewares.go:87\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/seldonio/seldon-core/executor/api/rest.(*CloudeventHeaderMiddleware).Middleware.func1\n\t/workspace/api/rest/middlewares.go:47\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/seldonio/seldon-core/executor/api/rest.puidHeader.func1\n\t/workspace/api/rest/middlewares.go:79\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2047\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/mux@v1.8.0/mux.go:210\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2879\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1930"}

@orfeas-k orfeas-k changed the title Update Seldon ROCKs to 1.17.1 version for 1.8 release Update Seldon ROCKs to 1.17.1 version for CKF release 1.8 Sep 21, 2023
@orfeas-k
Contributor Author

Regarding the issue above: while updating the charm's manifests, I've hit the same issues when running the tests. These are the logs I see in seldon-core-operator:

2023-09-29T11:23:28.646Z [seldon-core] {"level":"info","ts":1695986608.64613,"logger":"controllers.SeldonDeployment","msg":"Difference in SVCs:   &v1.Service{\n  \tTypeMeta: {Kind: \"Service\", APIVersion: \"v1\"},\n  \tObjectMeta: v1.ObjectMeta{\n  \t\t... // 8 identical fields\n  \t\tDeletionTimestamp:          nil,\n  \t\tDeletionGracePeriodSeconds: nil,\n  \t\tLabels: map[string]string{\n+ \t\t\t\"app.juju.is/created-by\":       \"seldon-controller-manager\",\n  \t\t\t\"app.kubernetes.io/managed-by\": \"seldon-core\",\n  \t\t\t\"seldon-app\":                   \"seldon-model-1-example\",\n  \t\t\t... // 3 identical entries\n  \t\t},\n- \t\tAnnotations:     map[string]string{},\n+ \t\tAnnotations:     nil,\n  \t\tOwnerReferences: {{APIVersion: \"machinelearning.seldon.io/v1\", Kind: \"SeldonDeployment\", Name: \"seldon-model-1\", UID: \"51137348-039c-41bd-948c-5586320d0e7c\", ...}},\n  \t\tFinalizers:      nil,\n  \t\t... // 2 identical fields\n  \t},\n  \tSpec:   {Ports: {{Name: \"http\", Protocol: \"TCP\", Port: 9000, TargetPort: {IntVal: 9000}, ...}, {Name: \"grpc\", Protocol: \"TCP\", Port: 9500, TargetPort: {IntVal: 9500}, ...}}, Selector: {\"seldon-app-svc-classifier\": \"seldon-model-1-example-classifier\"}, ClusterIP: \"10.152.183.67\", ClusterIPs: {\"10.152.183.67\"}, ...},\n  \tStatus: {},\n  }\n","SeldonDeployment":"test-charm-jxa1/seldon-model-1"}
2023-09-29T11:23:28.646Z [seldon-core] {"level":"info","ts":1695986608.6461751,"logger":"controllers.SeldonDeployment","msg":"Updating Service","SeldonDeployment":"test-charm-jxa1/seldon-model-1","all":false,"namespace":"test-charm-jxa1","name":"seldon-model-1-example"}
2023-09-29T11:23:28.694Z [seldon-core] {"level":"info","ts":1695986608.6946557,"logger":"controllers.SeldonDeployment","msg":"Difference in SVCs:   &v1.Service{\n  \tTypeMeta: {Kind: \"Service\", APIVersion: \"v1\"},\n  \tObjectMeta: v1.ObjectMeta{\n  \t\t... // 8 identical fields\n  \t\tDeletionTimestamp:          nil,\n  \t\tDeletionGracePeriodSeconds: nil,\n  \t\tLabels: map[string]string{\n+ \t\t\t\"app.juju.is/created-by\":       \"seldon-controller-manager\",\n  \t\t\t\"app.kubernetes.io/managed-by\": \"seldon-core\",\n  \t\t\t\"seldon-app\":                   \"seldon-model-1-example\",\n  \t\t\t\"seldon-deployment-id\":         \"seldon-model-1\",\n  \t\t},\n- \t\tAnnotations:     map[string]string{},\n+ \t\tAnnotations:     nil,\n  \t\tOwnerReferences: {{APIVersion: \"machinelearning.seldon.io/v1\", Kind: \"SeldonDeployment\", Name: \"seldon-model-1\", UID: \"51137348-039c-41bd-948c-5586320d0e7c\", ...}},\n  \t\tFinalizers:      nil,\n  \t\t... // 2 identical fields\n  \t},\n  \tSpec:   {Ports: {{Name: \"http\", Protocol: \"TCP\", Port: 8000, TargetPort: {IntVal: 8000}, ...}, {Name: \"grpc\", Protocol: \"TCP\", Port: 5001, TargetPort: {IntVal: 5001}, ...}}, Selector: {\"seldon-app\": \"seldon-model-1-example\"}, ClusterIP: \"10.152.183.7\", ClusterIPs: {\"10.152.183.7\"}, ...},\n  \tStatus: {},\n  }\n","SeldonDeployment":"test-charm-jxa1/seldon-model-1"}
2023-09-29T11:23:28.694Z [seldon-core] {"level":"info","ts":1695986608.6947172,"logger":"controllers.SeldonDeployment","msg":"Scheme","SeldonDeployment":"test-charm-jxa1/seldon-model-1","r.scheme":{}}
2023-09-29T11:23:28.694Z [seldon-core] {"level":"info","ts":1695986608.6947296,"logger":"controllers.SeldonDeployment","msg":"createDeployments","SeldonDeployment":"test-charm-jxa1/seldon-model-1","deploy":{"namespace":"test-charm-jxa1","name":"seldon-model-1-example-0-classifier"}}
2023-09-29T11:23:28.697Z [seldon-core] {"level":"info","ts":1695986608.6978538,"logger":"controllers.SeldonDeployment","msg":"Updating Deployment","SeldonDeployment":"test-charm-jxa1/seldon-model-1","namespace":"test-charm-jxa1","name":"seldon-model-1-example-0-classifier"}
2023-09-29T11:23:28.707Z [seldon-core] {"level":"info","ts":1695986608.7078805,"logger":"controllers.SeldonDeployment","msg":"The deployments are the same - api server defaults ignored","SeldonDeployment":"test-charm-jxa1/seldon-model-1"}
2023-09-29T11:23:28.707Z [seldon-core] {"level":"info","ts":1695986608.7078938,"logger":"controllers.SeldonDeployment","msg":"Found identical deployment","SeldonDeployment":"test-charm-jxa1/seldon-model-1","namespace":"test-charm-jxa1","name":"seldon-model-1-example-0-classifier","status":{"observedGeneration":7,"replicas":1,"updatedReplicas":1,"unavailableReplicas":1,"conditions":[{"type":"Available","status":"False","lastUpdateTime":"2023-09-29T11:09:25Z","lastTransitionTime":"2023-09-29T11:09:25Z","reason":"MinimumReplicasUnavailable","message":"Deployment does not have minimum availability."},{"type":"Progressing","status":"True","lastUpdateTime":"2023-09-29T11:09:25Z","lastTransitionTime":"2023-09-29T11:09:25Z","reason":"ReplicaSetUpdated","message":"ReplicaSet \"seldon-model-1-example-0-classifier-578bdbbf7d\" is progressing."}]}}
2023-09-29T11:23:28.707Z [seldon-core] {"level":"info","ts":1695986608.707915,"logger":"controllers.SeldonDeployment","msg":"Deployment status","SeldonDeployment":"test-charm-jxa1/seldon-model-1","name":"seldon-model-1-example-0-classifier","status":{"observedGeneration":7,"replicas":1,"updatedReplicas":1,"unavailableReplicas":1,"conditions":[{"type":"Available","status":"False","lastUpdateTime":"2023-09-29T11:09:25Z","lastTransitionTime":"2023-09-29T11:09:25Z","reason":"MinimumReplicasUnavailable","message":"Deployment does not have minimum availability."},{"type":"Progressing","status":"True","lastUpdateTime":"2023-09-29T11:09:25Z","lastTransitionTime":"2023-09-29T11:09:25Z","reason":"ReplicaSetUpdated","message":"ReplicaSet \"seldon-model-1-example-0-classifier-578bdbbf7d\" is progressing."}]}}
2023-09-29T11:23:28.707Z [seldon-core] {"level":"info","ts":1695986608.707929,"logger":"controllers.SeldonDeployment","msg":"Updating availableCondition for deployment","SeldonDeployment":"test-charm-jxa1/seldon-model-1","name":"seldon-model-1-example-0-classifier","availableCondition":{"type":"Available","status":"False","lastTransitionTime":"2023-09-29T11:09:25Z","reason":"MinimumReplicasUnavailable","message":"Deployment does not have minimum availability."}}
2023-09-29T11:23:28.707Z [seldon-core] {"level":"info","ts":1695986608.7079446,"logger":"controllers.SeldonDeployment","msg":"Inference status","SeldonDeployment":"test-charm-jxa1/seldon-model-1","status":{"state":"Creating","deploymentStatus":{"seldon-model-1-example-0-classifier":{"replicas":1}},"replicas":1,"address":{"url":"http://seldon-model-1-example.test-charm-jxa1.svc.cluster.local:8000/api/v1.0/predictions"},"conditions":[{"type":"DeploymentsReady","status":"False","lastTransitionTime":"2023-09-29T11:09:25Z","reason":"MinimumReplicasUnavailable","message":"Deployment does not have minimum availability."},{"type":"HpasReady","status":"True","lastTransitionTime":"2023-09-29T11:09:24Z","reason":"No HPAs defined"},{"type":"KedaReady","status":"True","lastTransitionTime":"2023-09-29T11:09:24Z","reason":"No KEDA resources defined"},{"type":"PdbsReady","status":"True","lastTransitionTime":"2023-09-29T11:09:25Z","reason":"No PDBs defined"},{"type":"Ready","status":"False","lastTransitionTime":"2023-09-29T11:23:28Z","reason":"MinimumReplicasUnavailable","message":"Deployment does not have minimum availability."},{"type":"ServicesReady","status":"False","lastTransitionTime":"2023-09-29T11:09:24Z","reason":"Not all services created"}]}}
2023-09-29T11:23:28.708Z [seldon-core] {"level":"info","ts":1695986608.708531,"logger":"controllers.SeldonDeployment","msg":"Reconcile called","SeldonDeployment":"test-charm-jxa1/seldon-model-1"}
2023-09-29T11:23:28.708Z [seldon-core] {"level":"info","ts":1695986608.7085607,"logger":"seldondeployment","msg":"Defaulting Seldon Deployment called","name":"seldon-model-1"}
2023-09-29T11:23:28.708Z [seldon-core] {"level":"info","ts":1695986608.7085721,"logger":"controllers.SeldonDeployment","msg":"pSvcName","SeldonDeployment":"test-charm-jxa1/seldon-model-1","val":"seldon-model-1-example"}
2023-09-29T11:23:28.708Z [seldon-core] {"level":"info","ts":1695986608.708683,"logger":"controllers.SeldonDeployment","msg":"Updating Service","SeldonDeployment":"test-charm-jxa1/seldon-model-1","all":false,"namespace":"test-charm-jxa1","name":"seldon-model-1-example-classifier"}
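The "Difference in SVCs" entries in logs like these are hard to read because the diff is embedded as an escaped string in the JSON payload. A minimal sketch for pulling out only the changed lines follows. It assumes each log line has the shape `<timestamp> [seldon-core] <json>`; the helper name `svc_diff_changes` is hypothetical, and the sample is a shortened version of one of the entries above.

```python
import json


def svc_diff_changes(log_line):
    """Return the added/removed lines of a "Difference in SVCs" log entry."""
    # Everything from the first "{" onwards is the JSON payload.
    payload = json.loads(log_line[log_line.index("{"):])
    msg = payload.get("msg", "")
    if not msg.startswith("Difference in SVCs"):
        return []
    changes = []
    for line in msg.splitlines():
        if line.startswith(("+", "-")):
            # Collapse the tab/space padding used to align the diff.
            changes.append(" ".join(line.split()))
    return changes


# Shortened from one of the entries above; most unchanged fields are dropped.
sample = (
    r'2023-09-29T11:23:28.694Z [seldon-core] {"level":"info",'
    r'"logger":"controllers.SeldonDeployment",'
    r'"msg":"Difference in SVCs:   &v1.Service{\n  \t\tLabels: map[string]string{\n'
    r'+ \t\t\t\"app.juju.is/created-by\":       \"seldon-controller-manager\",\n'
    r'  \t\t\t\"seldon-app\":                   \"seldon-model-1-example\",\n'
    r'  \t\t},\n'
    r'- \t\tAnnotations:     map[string]string{},\n'
    r'+ \t\tAnnotations:     nil,\n'
    r'  }\n",'
    r'"SeldonDeployment":"test-charm-jxa1/seldon-model-1"}'
)

for change in svc_diff_changes(sample):
    print(change)
```

On the sample this leaves just the added `app.juju.is/created-by` label and the `Annotations` change from an empty map to `nil`, which are the two differences reported in every reconcile loop in this thread.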

orfeas-k added a commit that referenced this issue Jan 9, 2024
- Skip doing rockcraft.yaml updates since release 1.17.1 still uses `imageVersion` 2.1.0
- Update the `version` as per canonical/bundle-kubeflow/#747
- update `base` since using `:` is deprecated
- refactor `tox.ini` according to canonical/oidc-authservice-rock#14 and canonical/bundle-kubeflow#763
- update `test_rock.py` according to latest changes in chisme canonical/charmed-kubeflow-chisme#81

Refs #37
orfeas-k added a commit that referenced this issue Jan 9, 2024
…66)

- Update ROCK according to upstream changes
- Introduce parts that we missed in the ROCK
- use ubuntu 20.04 as base due to #39
- refactor tox.ini according to canonical/oidc-authservice-rock#14 and canonical/bundle-kubeflow#763
- update `test_rock.py` according to latest changes in chisme canonical/charmed-kubeflow-chisme#81

Details for changes in #37.
Closes #54
@orfeas-k
Contributor Author

MLServer-* command

We will omit the source /hack/activate-env.sh part from the command. It is not needed, since the corresponding files do not exist in the image, and we see the following logs (the same thing happens in the upstream image too):

Environment tarball not found at '/mnt/models/environment.tar.gz'
Environment not found at './envs/environment'


Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5186.

This message was autogenerated

orfeas-k added a commit that referenced this issue Jan 11, 2024
- pin starlette due to #80
- remove redundant `cp` part.
- remove `source` part from command according to #37
- tox.ini: unpin chisme
orfeas-k added a commit that referenced this issue Jan 11, 2024
Update ROCK according to upstream changes plus:
- introduce parts that we missed in the ROCK
- use ubuntu 20.04 as base due to #39
- refactor tox.ini according to canonical/oidc-authservice-rock#14 and canonical/bundle-kubeflow#763
- update `test_rock.py` according to latest changes in chisme canonical/charmed-kubeflow-chisme#81
- pins starlette due to #80

Ref #37
Closes #53
orfeas-k added a commit that referenced this issue Jan 11, 2024
Update ROCK according to upstream changes plus:
- introduce parts that we missed in the ROCK
- use ubuntu 20.04 as base due to #39
- refactor tox.ini according to canonical/oidc-authservice-rock#14 and canonical/bundle-kubeflow#763
- update `test_rock.py` according to latest changes in chisme canonical/charmed-kubeflow-chisme#81
- pins starlette due to #80

Ref #37
Closes #51
orfeas-k added a commit that referenced this issue Jan 11, 2024
Update ROCK according to upstream changes plus:

- introduce parts that we missed in the ROCK
- use ubuntu 20.04 as base due to #39
- refactor tox.ini according to canonical/oidc-authservice-rock#14 and canonical/bundle-kubeflow#763
- update `test_rock.py` according to latest changes in chisme canonical/charmed-kubeflow-chisme#81
- pins starlette due to #80

Ref #37
Closes #52
@orfeas-k
Copy link
Contributor Author

All ROCKs have been updated in linked PRs.
