Fix: Refactor ServiceMonitor template to avoid duplicates across releases #742
Description:
This pull request refactors the ServiceMonitor template in the APISIX Helm Chart to ensure that only one ServiceMonitor is created per namespace, even if there are multiple releases of the Chart.
Problem:
Currently, each release of the APISIX Helm Chart creates its own ServiceMonitor. This can lead to duplicate ServiceMonitors and unnecessary overhead in Prometheus when there are multiple releases of the Chart in the same namespace.
Solution:
This change uses Helm's hook mechanism to ensure that only one ServiceMonitor is created per namespace, regardless of the number of releases.
Changes:
Removed release-specific labels and selectors from the ServiceMonitor template. The ServiceMonitor now selects all services with the labels app.kubernetes.io/name: {{ include "apisix.name" . }} and app.kubernetes.io/service: apisix-gateway, regardless of the release.
Added pre-install and post-install hook annotations to the ServiceMonitor. If a ServiceMonitor named apisix-service-monitor already exists when a release is installed, it is deleted before the new one is created, so there is only ever one ServiceMonitor in the namespace.
Set the hook deletion policy to before-hook-creation and hook-succeeded. This means that the hook resource is deleted before a new one is created (if one already exists), and it's also deleted after the ServiceMonitor is successfully created.
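The changes above could look roughly like the following template. This is a minimal sketch, not the actual diff: the name, selector labels, hook types, and delete policies come from the description, while the namespace expression, endpoint port name, and metrics path are assumptions added to make the manifest complete.

```yaml
# Sketch of a release-independent ServiceMonitor template (Helm).
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  # Fixed name from the description, so every release targets the same object.
  name: apisix-service-monitor
  # Assumption: the ServiceMonitor lives in the release namespace.
  namespace: {{ .Release.Namespace }}
  annotations:
    # Hook types and delete policies as listed in the description.
    "helm.sh/hook": pre-install,post-install
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  selector:
    matchLabels:
      # No release-specific label here, so Services from all releases match.
      app.kubernetes.io/name: {{ include "apisix.name" . }}
      app.kubernetes.io/service: apisix-gateway
  endpoints:
    # Assumption: port name and path are placeholders for the chart's real values.
    - port: prometheus
      path: /apisix/prometheus/metrics
```

Helm applies a hook delete policy to the annotated resource itself, so it is worth verifying on a test cluster that hook-succeeded leaves the ServiceMonitor in place after installation.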
Implications:
All releases of the APISIX Helm Chart in the same namespace will share the same ServiceMonitor. This is suitable when all releases can share the same ServiceMonitor configuration.
If a release is uninstalled, it won't delete the ServiceMonitor, as it may still be used by other releases.
If the ServiceMonitor is manually deleted, upgrading a release will recreate it, but other releases might need a manual upgrade to reconnect to the new ServiceMonitor.
This approach works well when multiple releases can share the same ServiceMonitor configuration. For more complex scenarios, other strategies may be worth considering, such as using relabeling in the Prometheus configuration to handle different ServiceMonitors.
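As one illustration of relabeling, a shared ServiceMonitor could copy each Service's release label onto the scraped metrics so that releases stay distinguishable in Prometheus. This is a hedged sketch: it assumes the Services carry the common app.kubernetes.io/instance label set to the release name, and the target label name release, the port name, and the path are hypothetical.

```yaml
# Sketch: one shared ServiceMonitor that tags metrics with the originating release.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: apisix-service-monitor
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "apisix.name" . }}
      app.kubernetes.io/service: apisix-gateway
  endpoints:
    # Assumption: port name and path are placeholders.
    - port: prometheus
      path: /apisix/prometheus/metrics
      relabelings:
        # Prometheus exposes Service labels as __meta_kubernetes_service_label_<name>,
        # with characters such as "." and "/" replaced by "_".
        - sourceLabels: [__meta_kubernetes_service_label_app_kubernetes_io_instance]
          targetLabel: release  # hypothetical label name
```

With a label like this in place, dashboards and alerts can filter per release even though all releases are scraped through a single ServiceMonitor.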