Error getting active endpoint: revision.serving.knative.dev "" not found when doing Serverless InferenceServing #168

Closed
DnPlas opened this issue Nov 30, 2023 · 3 comments
Labels
bug Something isn't working

Comments

Contributor

DnPlas commented Nov 30, 2023

Bug Description

I created a serverless KServe InferenceService and tried to perform inference, but instead of a prediction I get:

Error getting active endpoint: revision.serving.knative.dev "" not found
* Connection #0 to host 10.152.183.40 left intact

To Reproduce

  1. juju deploy kubeflow --channel 1.8/stable --trust
  2. Apply the following InferenceService in a Profile namespace or in default (the result is the same)
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris-ckf-1-8"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
  3. Wait for the isvc to be ready
$ kubectl get isvc -ndaniela
NAME                   URL                                                       READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION                    AGE
sklearn-iris-ckf-1-8   http://sklearn-iris-ckf-1-8.daniela.10.64.140.43.nip.io   True           100                              sklearn-iris-ckf-1-8-predictor-00001   21m
  4. Perform inference on the ClusterIP and on the isvc URL

The `iris-input.json` file comes from here

# Using the ClusterIP
ubuntu@charm-dev-jammy:~$ kubectl get svc -ndaniela | grep sklearn
sklearn-iris-ckf-1-8-predictor-00001           ClusterIP      10.152.183.40    <none>   
ubuntu@charm-dev-jammy:~$ curl -v -H "Content-Type: application/json" 10.152.183.40/v1/models/sklearn-iris:predict -d @./iris-input.json
*   Trying 10.152.183.40:80...
* Connected to 10.152.183.40 (10.152.183.40) port 80 (#0)
> POST /v1/models/sklearn-iris:predict HTTP/1.1
> Host: 10.152.183.40
> User-Agent: curl/7.81.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 76
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< Content-Type: text/plain; charset=utf-8
< X-Content-Type-Options: nosniff
< Date: Thu, 30 Nov 2023 13:04:59 GMT
< Content-Length: 73
< 
Error getting active endpoint: revision.serving.knative.dev "" not found
* Connection #0 to host 10.152.183.40 left intact

# Using the isvc URL
ubuntu@charm-dev-jammy:~$ kubectl get isvc -ndaniela
NAME                   URL                                                       READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION                    AGE
sklearn-iris-ckf-1-8   http://sklearn-iris-ckf-1-8.daniela.10.64.140.43.nip.io   True           100                              sklearn-iris-ckf-1-8-predictor-00001   24m

ubuntu@charm-dev-jammy:~$ curl -v -H "Content-Type: application/json" http://sklearn-iris-ckf-1-8.daniela.10.64.140.43.nip.io/v1/models/sklearn-iris:predict -d @./iris-input.json
*   Trying 10.64.140.43:80...
* Connected to sklearn-iris-ckf-1-8.daniela.10.64.140.43.nip.io (10.64.140.43) port 80 (#0)
> POST /v1/models/sklearn-iris:predict HTTP/1.1
> Host: sklearn-iris-ckf-1-8.daniela.10.64.140.43.nip.io
> User-Agent: curl/7.81.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 76
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 302 Found
< location: http://10.64.140.43.nip.io/dex/auth?client_id=authservice-oidc&redirect_uri=%2Fauthservice%2Foidc%2Fcallback&response_type=code&scope=openid+profile+email+groups&state=MTcwMTM0OTU4NnxOd3dBTkZkVVdrWk9UVnBIV1VGRVExSk1Uak5ZV1RjeVRsazBOMVZHVjAxUU4wOUJUVm95UmtSTlJUTkdRVGRVV2taTlZVUklOVUU9fMEGXL81whT_EM5vSRMamOezFLz9hr5HMxsb-7Az2PlD
< set-cookie: oidc_state_csrf=MTcwMTM0OTU4NnxOd3dBTkZkVVdrWk9UVnBIV1VGRVExSk1Uak5ZV1RjeVRsazBOMVZHVjAxUU4wOUJUVm95UmtSTlJUTkdRVGRVV2taTlZVUklOVUU9fMEGXL81whT_EM5vSRMamOezFLz9hr5HMxsb-7Az2PlD; Path=/; Expires=Thu, 21 May 2054 13:59:55 GMT; Max-Age=1200000000000
< date: Thu, 30 Nov 2023 13:06:26 GMT
< x-envoy-upstream-service-time: 4
< server: istio-envoy
< content-length: 0
< 
* Connection #0 to host sklearn-iris-ckf-1-8.daniela.10.64.140.43.nip.io left intact
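The two responses above fail in different ways, and it can help to tell them apart when triaging. The following is a minimal Python sketch, not part of the original report: the function name `diagnose` and its labels are hypothetical, and the interpretation (a redirect to `/dex/auth` means the ingress path wants authentication, while the 404 body comes from the Knative activator) is an assumption based only on the output shown here.

```python
from urllib.parse import urlparse


def diagnose(status: int, headers: dict, body: str = "") -> str:
    """Classify a response like the two shown in this issue (labels are hypothetical)."""
    # Header names are matched case-insensitively, since servers differ in casing.
    lowered = {k.lower(): v for k, v in headers.items()}
    location = lowered.get("location", "")

    # Redirect to the Dex login endpoint: the request was not authenticated.
    if status in (301, 302, 303, 307) and urlparse(location).path.startswith("/dex/auth"):
        return "auth-redirect"

    # 404 whose body carries the activator's error text.
    if status == 404 and "Error getting active endpoint" in body:
        return "activator-no-revision"

    return "other"
```

Feeding it the status, headers, and body from each `curl -v` run separates the authentication redirect from the activator error.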

Environment

  • Microk8s 1.25-strict/stable
  • juju 3.1
  • Charmed Kubeflow 1.8/stable

Relevant Log Output

Checking the knative-serving activator pod logs:

{"severity":"ERROR","timestamp":"2023-11-30T13:09:55.657583568Z","logger":"activator","caller":"handler/context_handler.go:74","message":"Error while getting revision","commit":"500756c","knative.dev/controller":"activator","knative.dev/pod":"activator-6dd4b8cc79-zb6lv","knative.dev/key":"/","error":"revision.serving.knative.dev \"\" not found","stacktrace":"knative.dev/serving/pkg/activator/handler.(*contextHandler).ServeHTTP\n\tknative.dev/serving/pkg/activator/handler/context_handler.go:74\nknative.dev/serving/pkg/activator/handler.(*ProbeHandler).ServeHTTP\n\tknative.dev/serving/pkg/activator/handler/probe_handler.go:46\nknative.dev/networking/pkg/http/probe.(*handler).ServeHTTP\n\tknative.dev/networking@v0.0.0-20230419144338-e5d04e805e50/pkg/http/probe/handler.go:39\nknative.dev/serving/pkg/activator/handler.(*HealthHandler).ServeHTTP\n\tknative.dev/serving/pkg/activator/handler/healthz_handler.go:44\ngolang.org/x/net/http2/h2c.h2cHandler.ServeHTTP\n\tgolang.org/x/net@v0.7.0/http2/h2c/h2c.go:125\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2936\nnet/http.(*conn).serve\n\tnet/http/server.go:1995"}
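The relevant fields of that log entry can be pulled out with a short Python sketch. `LOG_LINE` is a trimmed copy of the entry above (the stacktrace and several fields are omitted), and `revision_key` is a hypothetical helper. It assumes `knative.dev/key` follows the usual `namespace/name` form; under that assumption a key of `/` means both parts are empty, which would suggest the activator received no revision identity with the request.

```python
import json

# Trimmed copy of the activator log entry quoted in this issue.
LOG_LINE = r'{"severity":"ERROR","logger":"activator","message":"Error while getting revision","knative.dev/key":"/","error":"revision.serving.knative.dev \"\" not found"}'


def revision_key(log_line: str) -> tuple[str, str]:
    """Split the logged key into (namespace, name), assuming 'namespace/name'."""
    entry = json.loads(log_line)
    namespace, _, name = entry["knative.dev/key"].partition("/")
    return namespace, name


if __name__ == "__main__":
    print(revision_key(LOG_LINE))
```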

Additional Context

  • The isvc's readiness is True
  • The ksvc's readiness is True
  • I can see the isvc Pod present in the namespace, but no ksvc one
@DnPlas DnPlas added the bug Something isn't working label Nov 30, 2023
@kimwnasptd
Contributor

The issue describes hitting the ISVC's SVC directly, but that is not the way to reach an ISVC. We should always use the high-level SVC, which is an ExternalName one, that handles the redirection via VirtualServices and the IGWs.

@DnPlas I propose that we close this issue, since it doesn't follow the standard way of reaching KServe ISVCs. Let's make sure we document how users should reach KServe in canonical/kserve-operators#205, and in a new issue explaining internal traffic.

Then, if we see the same message in the future when hitting the ExternalName Service, we can re-open.
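As a rough illustration of addressing the high-level Service instead of the per-revision ClusterIP, here is a minimal Python sketch that only builds the request and does not send it. Several things are assumptions not confirmed in this thread: that the in-cluster host has the form `<isvc-name>.<namespace>.svc.cluster.local`, that the model name in the V1 path defaults to the InferenceService name, and the helper name `build_predict_request` itself. The payload passed in is a placeholder, not the contents of `iris-input.json`.

```python
import json
import urllib.request


def build_predict_request(isvc_name, namespace, payload,
                          model_name=None, cluster_domain="cluster.local"):
    """Build (but do not send) a V1 predict request for the high-level Service."""
    # Assumption: the model is served under the InferenceService name.
    model = model_name or isvc_name
    # Assumption: in-cluster host of the high-level Service.
    host = f"{isvc_name}.{namespace}.svc.{cluster_domain}"
    url = f"http://{host}/v1/models/{model}:predict"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_predict_request("sklearn-iris-ckf-1-8", "daniela", {"instances": []})
    print(req.get_method(), req.full_url)
```

From a Pod inside the cluster, the request could then be sent with `urllib.request.urlopen(req)`. Whether that succeeds depends on the mesh and authorization setup, which this sketch does not cover.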


Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5318.

This message was autogenerated

Contributor Author

DnPlas commented Feb 12, 2024

@kimwnasptd totally agree, I was going to add a comment similar to what you posted. Closing because of this.

@DnPlas DnPlas closed this as completed Feb 12, 2024