Mismatched datadir volumeMount causing zookeeper dataloss after pod recreation #2812

Open
1 of 3 tasks
Kiritow opened this issue Nov 11, 2024 · 0 comments
Labels
kind/k8s Related to the Kubernetes application needs-triage This wasn't investigate by the repo's owners yet

Kiritow commented Nov 11, 2024

Category:

Kubernetes apps

Type:

  • Bug
  • Feature Request
  • Process

We are using the GKE click-to-deploy feature to deploy Kafka in our cluster, but after the underlying nodes are replaced, Kafka fails to start properly and throws the following error:

ERROR Exiting Kafka due to fatal exception during startup. (kafka.Kafka$)
java.lang.RuntimeException: Invalid cluster.id in: /kafka/logs/meta.properties. Expected <redacted>, but read <redacted>

After our investigation, we found that there might be an issue with the Zookeeper configuration in the chart. The datadir volumeMount configuration for Zookeeper is inconsistent with the value of the ZK_DATA_DIR environment variable, causing data loss in Zookeeper after pod migration.

In k8s/kafka/chart/kafka/templates/zk-statefulset.yaml, ZK_DATA_DIR is set to /data but volumeMounts is configured as:

volumeMounts:
- name: config
  mountPath: /config-scripts
- name: datadir
  mountPath: /opt/zookeeper

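When we checked the /opt/zookeeper folder inside the pod, we found it was empty. This suggests that Zookeeper writes its data to /data, which is not backed by the datadir volume, so the data does not survive pod recreation.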
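
A possible fix, sketched under the assumption that ZK_DATA_DIR stays at /data and that the datadir volume is the persistent one, would be to mount datadir at the path Zookeeper actually writes to. This is only an illustration of the idea, not a tested patch against the chart:

```yaml
# Sketch only: assumes ZK_DATA_DIR remains /data and that "datadir"
# is the persistent volume meant to hold Zookeeper's data.
volumeMounts:
- name: config
  mountPath: /config-scripts
- name: datadir
  mountPath: /data   # changed from /opt/zookeeper to match ZK_DATA_DIR
```

Alternatively, ZK_DATA_DIR could be changed to point at a directory under the existing mount. Either way, the two values need to agree so that the data lands on the persistent volume.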

Kiritow added the kind/k8s and needs-triage labels on Nov 11, 2024