kserve-operator goes to error with "Missing gateway-info relation" #279

Open
kimwnasptd opened this issue Nov 28, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@kimwnasptd
Contributor

Bug Description

When I tried to delete my whole model and all of its charms, the kserve-operator charm was left behind with the following status:

Unit                  Workload  Agent  Address     Ports  Message
kserve-controller/0*  error     idle   10.1.56.12         hook failed: "remove"

After inspecting the logs, the root cause appears to be that the istio-pilot charm, which provides the gateway-info endpoint, is removed first.

The charm should be able to handle the GatewayRelationMissingError raised by the library during cleanup, so that the remove hook can complete even when the gateway-info relation is already gone.
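The handling described above could be sketched roughly as follows. This is a self-contained illustration, not the charm's actual code: the two exception classes are simplified stand-ins for the real ones named in the traceback below, and `SketchCharm`, its `cleaned_up` list, and the `gateway_name` value are hypothetical. It only shows the idea of tolerating the missing relation inside the remove handler instead of letting the hook fail.

```python
import logging

logger = logging.getLogger(__name__)


class GatewayRelationMissingError(Exception):
    """Stand-in for the library's GatewayRelationMissingError."""


class ErrorWithStatus(Exception):
    """Simplified stand-in for charmed_kubeflow_chisme's ErrorWithStatus."""

    def __init__(self, msg, status_type=None):
        super().__init__(msg)
        self.msg = msg
        self.status_type = status_type


class SketchCharm:
    """Hypothetical, stripped-down model of the charm's remove path."""

    def __init__(self, gateway_related):
        self.gateway_related = gateway_related
        self.cleaned_up = []

    def _ingress_gateway_info(self):
        # Models get_relation_data() failing once istio-pilot is gone.
        if not self.gateway_related:
            raise GatewayRelationMissingError("Missing gateway-info relation.")
        return {"gateway_name": "example-gateway"}  # illustrative value

    def _generate_gateways_context(self):
        # Same shape as the traceback: the library error becomes ErrorWithStatus.
        try:
            return self._ingress_gateway_info()
        except GatewayRelationMissingError as err:
            raise ErrorWithStatus(
                "Please relate to istio-pilot:gateway-info", "blocked"
            ) from err

    def _on_remove(self, _event=None):
        try:
            context = self._generate_gateways_context()
        except ErrorWithStatus as err:
            # During removal a missing relation is expected, so log and carry on.
            logger.warning("Skipping gateway-dependent cleanup: %s", err.msg)
            context = None
        if context is not None:
            self.cleaned_up.append("gateway-dependent resources")
        self.cleaned_up.append("remaining resources")
```

Whether skipping the gateway-dependent step is safe in the real charm (for example, whether it could leave resources behind) is an assumption that would need checking against the actual manifests.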

To Reproduce

  1. Deploy Kubeflow
  2. First delete istio-pilot
  3. Try to delete kserve-operator

Environment

CKF 1.9

Relevant Log Output

unit-kserve-controller-0: 12:07:26 WARNING unit.kserve-controller/0.juju-log 2 containers are present in metadata.yaml and refresh_event was not specified. Defaulting to update_status. Metrics IP may not be set in a timely fashion.
unit-kserve-controller-0: 12:07:26 INFO unit.kserve-controller/0.juju-log Rendering manifests
unit-kserve-controller-0: 12:07:30 ERROR unit.kserve-controller/0.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 565, in _generate_gateways_context
    ingress_gateway_info = self._ingress_gateway_info
  File "./src/charm.py", line 286, in _ingress_gateway_info
    return self._ingress_gateway_requirer.get_relation_data()
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/lib/charms/istio_pilot/v0/istio_gateway_info.py", line 206, in get_relation_data
    self._relation_preflight_checks(relation=relation)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/lib/charms/istio_pilot/v0/istio_gateway_info.py", line 178, in _relation_preflight_checks
    raise GatewayRelationMissingError()
charms.istio_pilot.v0.istio_gateway_info.GatewayRelationMissingError: Missing gateway-info relation.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./src/charm.py", line 686, in <module>
    main(KServeControllerCharm)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 548, in main
    manager.run()
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 527, in run
    self._emit()
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 516, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/main.py", line 147, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 348, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 860, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-kserve-controller-0/charm/venv/ops/framework.py", line 950, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 525, in _on_remove
    cm_resources_manifests = self.cm_resource_handler.render_manifests()
  File "./src/charm.py", line 242, in cm_resource_handler
    context={**self._inference_service_context, **self.images_context},
  File "./src/charm.py", line 219, in _inference_service_context
    gateways_context = self._generate_gateways_context()
  File "./src/charm.py", line 567, in _generate_gateways_context
    raise ErrorWithStatus("Please relate to istio-pilot:gateway-info", BlockedStatus)
charmed_kubeflow_chisme.exceptions._with_status.ErrorWithStatus: Please relate to istio-pilot:gateway-info
unit-kserve-controller-0: 12:07:30 ERROR juju.worker.uniter.operation hook "remove" (via hook dispatching script: dispatch) failed: exit status 1

Additional Context

No response

kimwnasptd added the bug label on Nov 28, 2024

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6611.

This message was autogenerated
