[Feature] Readiness Probe in multi-node etcd #7
Comments
I'm open to change if we find a new way to implement the readinessProbe for etcd. |
Thanks for the summary @ishan16696. I have two comments or suggestions in general:
|
I didn’t get how |
added One more method: |
My suggestion is simply to hide the readiness checks (whatever they are) behind an endpoint, because that keeps us flexible about the implementation behind it, now and later, without changing the pod spec. Please let's not write something like That endpoint can be called |
Yes, that's why I updated the description of the issue with one more approach of
Makes sense. |
Discussed with @ishan16696, @shreyas-s-rao, @aaronfern the following plan to proceed:
|
As discussed in our internal sync, we currently have many unknowns, since we don't know what edge cases might arise in production w.r.t cc @timuthy @abdasgupta @aaronfern /on-hold |
This issue needs to be revisited after the etcd version upgrade (gardener/etcd-druid#445). |
Feature (What you would like to be added):
Currently, the readinessProbe of etcd is set to the `/healthz` endpoint of the HTTP server running in the backup sidecar. This behaviour needs to be updated or improved: the readinessProbe of a `clustered-etcd` should depend on whether an `etcd-leader` is present, and only then should the member serve incoming write requests.
Motivation (Why is this needed?):
Approach/Hint to the implement solution (optional):
Approaches:
1. ETCDCTL_API=3 etcdctl endpoint health --endpoints=${ENDPOINTS} --command-timeout=Xs
The `etcdctl endpoint health` command performs a GET on the "health" key (source).
Advantages of this method (`etcdctl endpoint health`):
`etcd-members` as `NotReady`, and they won't be able to serve write or read requests.
Disadvantages of this method (`etcdctl endpoint health`):
`/healthz` of HTTP server, because when the Owner check fails it fails the readinessProbe of etcd by setting the HTTP status to 503. However, this Owner check in the multi-node scenario is already being discussed here.
2. `/health` endpoint of etcd.
The `/health` endpoint returns `false` if one of the following conditions is met (source):
Advantages and Disadvantages of Method 2 (`/health` endpoint):
3. Use the `/healthz` endpoint of the HTTP server running in the backup sidecar, modified so that whenever a `backup-restore leader` is elected it sets the `HTTP server status to 200` for itself as well as for all `backup-restore followers`, and sets the `HTTP server status to 503` when there is no etcd-leader present.
Advantages of this method (`/healthz`):
`snapshotter` of backup sidecar and `readinessProbe` of etcd, backup sidecar will be able to control when to let traffic come in for etcd.
Disadvantages of this method (`/healthz`):
Future Scope:
`readinessProbe` from backup-sidecar and switch to gRPC instead of sending REST requests.
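To make the `/healthz` approach more concrete, here is a minimal Go sketch of an HTTP handler that reports 200 while an etcd leader is known to exist and 503 otherwise. It is only an illustration under stated assumptions: the `readiness` type, `SetLeaderPresent`, `statusFor`, and the port in the commented-out server lines are hypothetical names that do not come from etcd-backup-restore, and how leader presence is actually detected is left open.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// readiness holds the last known answer to "does the etcd cluster have a leader?".
// Hypothetical type for illustration; not taken from etcd-backup-restore.
type readiness struct {
	leaderPresent atomic.Bool // zero value is false: not ready until a leader is seen
}

// statusFor maps leader presence to the HTTP status the readinessProbe would see.
func statusFor(leaderPresent bool) int {
	if leaderPresent {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

// SetLeaderPresent would be called by whichever component observes leadership
// (assumption: for example the backup-restore leader election logic).
func (r *readiness) SetLeaderPresent(present bool) {
	r.leaderPresent.Store(present)
}

// ServeHTTP answers the probe with 200 or 503 and no body.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(statusFor(r.leaderPresent.Load()))
}

func main() {
	r := &readiness{}
	fmt.Println(statusFor(r.leaderPresent.Load())) // prints 503: no leader observed yet

	r.SetLeaderPresent(true)
	fmt.Println(statusFor(r.leaderPresent.Load())) // prints 200: leader observed

	// To actually serve the probe (path from the issue, port is an assumption):
	//   http.Handle("/healthz", r)
	//   http.ListenAndServe(":8080", nil)
}
```

Because the pod spec would only ever point at `/healthz`, the logic that calls `SetLeaderPresent` could later be swapped (for example to a gRPC-based check) without touching the probe definition.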