Controller not sending manifests to resource dispatcher #262

Open
misohu opened this issue Aug 9, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@misohu
Member

misohu commented Aug 9, 2024

Bug Description

After deploying CKF 1.9/stable together with resource-dispatcher 2.0/stable, the expected manifests for secrets and service-accounts are missing from the relation data between kserve-controller and resource-dispatcher. The same relations work between mlflow and resource-dispatcher.

To Reproduce

  1. Deploy CKF 1.9/stable.
  2. Deploy resource-dispatcher 2.0/stable.
  3. Relate kserve-controller to resource-dispatcher:
     juju relate kserve-controller:secrets resource-dispatcher:secrets
     juju relate kserve-controller:service-accounts resource-dispatcher:service-accounts
  4. Check the relation data on the resource-dispatcher unit:
     juju show-unit resource-dispatcher/0

Environment

microk8s 1.29.5
juju 3.4
ckf 1.9/stable
resource-dispatcher 2.0/stable

Relevant Log Output

resource-dispatcher/0:
  opened-ports: []
  charm: ch:amd64/focal/resource-dispatcher-182
  leader: true
  life: alive
  relation-info:
  - relation-id: 63
    endpoint: pod-defaults
    related-endpoint: pod-defaults
    application-data:
      kubernetes_manifests: '[{"apiVersion": "kubeflow.org/v1alpha1", "kind": "PodDefault",
        "metadata": {"name": "mlflow-server-access-minio"}, "spec": {"desc": "Allow
        access to Minio", "selector": {"matchLabels": {"access-minio": "true"}}, "env":
        [{"name": "AWS_ACCESS_KEY_ID", "valueFrom": {"secretKeyRef": {"name": "mlflow-server-minio-artifact",
        "key": "AWS_ACCESS_KEY_ID", "optional": false}}}, {"name": "AWS_SECRET_ACCESS_KEY",
        "valueFrom": {"secretKeyRef": {"name": "mlflow-server-minio-artifact", "key":
        "AWS_SECRET_ACCESS_KEY", "optional": false}}}, {"name": "MINIO_ENDPOINT_URL",
        "value": "http://mlflow-minio.kubeflow:9000"}]}}, {"apiVersion": "kubeflow.org/v1alpha1",
        "kind": "PodDefault", "metadata": {"name": "mlflow-server-minio"}, "spec":
        {"desc": "Allow access to MLFlow", "env": [{"name": "MLFLOW_S3_ENDPOINT_URL",
        "value": "http://mlflow-minio.kubeflow:9000"}, {"name": "MLFLOW_TRACKING_URI",
        "value": "http://mlflow-server.kubeflow.svc.cluster.local:5000"}], "selector":
        {"matchLabels": {"mlflow-server-minio": "true"}}}}]'
    related-units:
      mlflow-server/0:
        in-scope: true
        data:
          egress-subnets: 10.152.183.141/32
          ingress-address: 10.152.183.141
          private-address: 10.152.183.141
  - relation-id: 62
    endpoint: secrets
    related-endpoint: secrets
    application-data:
      kubernetes_manifests: '[{"apiVersion": "v1", "kind": "Secret", "metadata": {"name":
        "mlflow-server-minio-artifact"}, "stringData": {"AWS_ACCESS_KEY_ID": "minio",
        "AWS_SECRET_ACCESS_KEY": "5ZGBN5Y5ZUF1XN2ZZLJJZOLV4GIB30"}}, {"apiVersion":
        "v1", "kind": "Secret", "metadata": {"name": "mlflow-server-seldon-rclone-secret"},
        "stringData": {"RCLONE_CONFIG_S3_TYPE": "s3", "RCLONE_CONFIG_S3_PROVIDER":
        "minio", "RCLONE_CONFIG_S3_ENV_AUTH": "false", "RCLONE_CONFIG_S3_ACCESS_KEY_ID":
        "minio", "RCLONE_CONFIG_S3_SECRET_ACCESS_KEY": "5ZGBN5Y5ZUF1XN2ZZLJJZOLV4GIB30",
        "RCLONE_CONFIG_S3_ENDPOINT": "http://mlflow-minio.kubeflow:9000"}}]'
    related-units:
      mlflow-server/0:
        in-scope: true
        data:
          egress-subnets: 10.152.183.141/32
          ingress-address: 10.152.183.141
          private-address: 10.152.183.141
  - relation-id: 67
    endpoint: secrets
    related-endpoint: secrets
    application-data: {}
    related-units:
      kserve-controller/0:
        in-scope: true
        data:
          egress-subnets: 10.152.183.219/32
          ingress-address: 10.152.183.219
          private-address: 10.152.183.219
  - relation-id: 66
    endpoint: service-accounts
    related-endpoint: service-accounts
    application-data: {}
    related-units:
      kserve-controller/0:
        in-scope: true
        data:
          egress-subnets: 10.152.183.219/32
          ingress-address: 10.152.183.219
          private-address: 10.152.183.219
  provider-id: resource-dispatcher-0
  address: 10.1.100.230
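
In the output above, relations 67 and 66 (both with kserve-controller) have `application-data: {}`, while relation 62 (with mlflow-server) carries `kubernetes_manifests`. A minimal sketch of how this check could be automated, assuming the unit data has been loaded into a Python dict with the same shape as the output above (for example by parsing the command's JSON output, if it mirrors this YAML). The helper name `empty_relations` and the trimmed `sample` are illustrative, not part of any charm:

```python
def empty_relations(show_unit_output, unit):
    """Return (relation-id, endpoint, related units) for every relation
    of `unit` whose application-data is missing or empty."""
    info = show_unit_output.get(unit, {}).get("relation-info", [])
    result = []
    for rel in info:
        if not rel.get("application-data"):
            related = sorted(rel.get("related-units", {}))
            result.append((rel["relation-id"], rel["endpoint"], related))
    return result


# Trimmed stand-in for the unit data shown above; the manifest string is
# shortened here and is only a placeholder for the real value.
sample = {
    "resource-dispatcher/0": {
        "relation-info": [
            {
                "relation-id": 62,
                "endpoint": "secrets",
                "application-data": {"kubernetes_manifests": "[...]"},
                "related-units": {"mlflow-server/0": {"in-scope": True}},
            },
            {
                "relation-id": 67,
                "endpoint": "secrets",
                "application-data": {},
                "related-units": {"kserve-controller/0": {"in-scope": True}},
            },
            {
                "relation-id": 66,
                "endpoint": "service-accounts",
                "application-data": {},
                "related-units": {"kserve-controller/0": {"in-scope": True}},
            },
        ]
    }
}

for rel_id, endpoint, units in empty_relations(sample, "resource-dispatcher/0"):
    print(f"relation {rel_id} ({endpoint}) from {units} has no application data")
```

Run against the sample, this reports relations 67 and 66, matching the two kserve-controller relations that arrived empty.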

Additional Context

No response

misohu added the bug label Aug 9, 2024

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6114.

This message was autogenerated

@misohu
Member Author

misohu commented Aug 13, 2024

I reran the kserve-controller integration tests locally, which include a test of the integration with resource-dispatcher, and the relation data is clearly populated with manifests. I also tried resource-dispatcher 2.0/stable, deployed kserve from the 1.9 bundle on my local machine, and tried changing minio (since it is the one sending credentials). None of these reproduced the problem.

It looks like the problem occurs only when the bundle is deployed as a whole.

@misohu
Member Author

misohu commented Aug 15, 2024

The problem was a missing relation between mlflow-minio and kserve-controller. This relation was also missing from the docs, which is why we missed it.
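
A gap like this could be caught with a small check before deployment. The following is a minimal sketch, assuming the relations are available as pairs of `application:endpoint` strings (the shape used in a Juju bundle's `relations` list). `has_relation` is a hypothetical helper, and the sample pairs are just the two relations from the reproduction steps, not the full bundle:

```python
def has_relation(relations, app_a, app_b):
    """Return True if any relation pair links app_a and app_b, in either order."""
    for pair in relations:
        # Keep only the application name from each "application:endpoint" string.
        apps = {endpoint.split(":", 1)[0] for endpoint in pair}
        if apps == {app_a, app_b}:
            return True
    return False


# Only the relations named in this issue; a real bundle lists many more.
relations = [
    ["kserve-controller:secrets", "resource-dispatcher:secrets"],
    ["kserve-controller:service-accounts", "resource-dispatcher:service-accounts"],
]

print(has_relation(relations, "kserve-controller", "resource-dispatcher"))  # True
print(has_relation(relations, "mlflow-minio", "kserve-controller"))         # False
```

The second call returns False because no pair in the sample links mlflow-minio to kserve-controller, which is the situation described in this comment.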
