Metric "kube_node_role" missing (AKS) #2474

Open
R-Studio opened this issue Aug 14, 2024 · 4 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

R-Studio commented Aug 14, 2024

What happened:

  • We installed the kube-prometheus-stack Helm chart 61.8.0 on our AKS cluster (1.29.6) with kube-state-metrics enabled.
  • The installation appears to work, but the metric kube_node_role has no samples because the nodes carry no node-role.kubernetes.io/* label.
  • In AKS, node-role.kubernetes.io/* is deprecated and the recommended substitute is kubernetes.azure.com/role=* (more in the docs).
  • Unfortunately, we cannot add these labels ourselves: the following prefixes are reserved by AKS and can't be used for any node (docs):
    • kubernetes.azure.com/
    • kubernetes.io/

What you expected to happen:

  • The metric kube_node_role is available.

How to reproduce it (as minimally and precisely as possible):

  • Install the kube-prometheus-stack Helm chart 61.8.0 on an AKS cluster (1.29.6) with no special values.
  • Create a port-forward to the kube-state-metrics pod and open http://localhost:8080/metrics.
  • Search for the metric kube_node_role. Only its HELP and TYPE lines are present, with no samples:
    ...
    # HELP kube_node_role The role of a cluster node.
    # TYPE kube_node_role gauge
    ...
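
The same check can be scripted. The following is a minimal Python sketch, not part of kube-state-metrics: `samples_for` is a hypothetical helper that takes the text returned by the /metrics endpoint and returns the sample lines for one metric, ignoring the `# HELP` and `# TYPE` comments. The `scraped` string reuses the two lines shown above.

```python
def samples_for(metric_name: str, exposition: str) -> list[str]:
    """Return the sample lines for metric_name, skipping # HELP / # TYPE comments."""
    samples = []
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # A sample line starts with the metric name, followed by "{" (labels)
        # or by a space (value) when the sample has no labels.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name == metric_name:
            samples.append(line)
    return samples


# The two lines from the scrape above: metadata only, no samples.
scraped = """\
# HELP kube_node_role The role of a cluster node.
# TYPE kube_node_role gauge
"""

print(samples_for("kube_node_role", scraped))  # prints []
```

An empty list means the metric family is registered but no node produced a sample for it.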
    

Anything else we need to know?:

  • It looks like we cannot tell kube-state-metrics to use the label kubernetes.azure.com/role=* instead of node-role.kubernetes.io/*, because the prefix is hardcoded.
  • The metric kube_node_role is very important for us, because many of our recording and alerting rules use it.
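
To illustrate why the series ends up empty, here is a minimal Python sketch of prefix-based role detection. It is not the kube-state-metrics implementation (which is written in Go); it only assumes, as described above, that roles are read from label keys starting with node-role.kubernetes.io/. The function name `node_roles` and both example label sets are hypothetical.

```python
ROLE_PREFIX = "node-role.kubernetes.io/"  # the prefix named in this report


def node_roles(labels: dict[str, str]) -> list[str]:
    """Return the roles implied by label keys that start with ROLE_PREFIX."""
    return sorted(
        key[len(ROLE_PREFIX):] for key in labels if key.startswith(ROLE_PREFIX)
    )


# Hypothetical label sets, for illustration only.
aks_node = {"kubernetes.azure.com/role": "agent", "kubernetes.io/os": "linux"}
labelled_node = {"node-role.kubernetes.io/worker": "true"}

print(node_roles(aks_node))       # prints []
print(node_roles(labelled_node))  # prints ['worker']
```

Under that assumption, a node that only carries kubernetes.azure.com/role yields no role, so no kube_node_role sample would be emitted for it.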

Environment:

  • kube-state-metrics version: v2.13.0
  • Kubernetes version (use kubectl version): 1.29.6
  • Cloud provider or hardware configuration: AKS
@R-Studio R-Studio added the kind/bug Categorizes issue or PR as related to a bug. label Aug 14, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 14, 2024
@R-Studio
Author

Workaround / Solution

It looks like setting node-role.kubernetes.io/agent* is not allowed, but any other role can be set.
For example, we set the following labels on our two node pools:

  • Control-Plane Nodes = node-role.kubernetes.io/control-plane: 'true'
  • Worker Nodes = node-role.kubernetes.io/worker: 'true'

@R-Studio
Author

Workaround not working!

Unfortunately, our workaround does not work, because we cannot set these labels on the AKS node pools. Here is the answer from Microsoft support:
The error message you're seeing indicates that you're trying to set a label on a Kubernetes node pool using a key that has a prefix reserved for Kubernetes' internal use. Labels with prefixes like kubernetes.io/ are reserved for use by Kubernetes itself and should not be used for custom labels.
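
As a rough illustration of that rule, the Python sketch below checks whether a label key falls under one of the two reserved prefixes listed earlier in this issue. It is not the validation AKS actually runs: `uses_reserved_prefix` is a hypothetical helper, and treating subdomains such as node-role.kubernetes.io as reserved is an assumption made to match the rejection we saw.

```python
# Domains taken from the reserved prefixes listed in this issue.
RESERVED_DOMAINS = ("kubernetes.io", "kubernetes.azure.com")


def uses_reserved_prefix(label_key: str) -> bool:
    """Return True if the key's prefix is a reserved domain or a subdomain of one."""
    if "/" not in label_key:
        return False  # keys without a prefix are plain user labels
    domain = label_key.split("/", 1)[0]
    # Assumption: subdomains (e.g. node-role.kubernetes.io) count as reserved too.
    return any(domain == d or domain.endswith("." + d) for d in RESERVED_DOMAINS)


print(uses_reserved_prefix("node-role.kubernetes.io/worker"))  # prints True
print(uses_reserved_prefix("example.com/role"))                # prints False
```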

@R-Studio R-Studio reopened this Sep 11, 2024
@dgrisonnet
Member

/triage accepted
/assign

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 19, 2024

abossard commented Nov 6, 2024

I opened a related issue in the AKS project: Azure/AKS#4628
