-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decouple backup ready condition for snapshot leases and remove support for deprecated fields in etcd status #820
base: master
Are you sure you want to change the base?
Conversation
715260b
to
d0f3163
Compare
/retest |
1 similar comment
/retest |
/assign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove Cluster Size
and Backup Ready
.
etcd-druid/api/v1alpha1/etcd.go
Line 53 in d0f3163
// +kubebuilder:printcolumn:name="Cluster Size",type=integer,JSONPath=`.spec.replicas`,priority=1 |
etcd-druid/api/v1alpha1/etcd.go
Line 51 in d0f3163
// +kubebuilder:printcolumn:name="Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="BackupReady")].status` |
Add FullSnapshotBackupReady
and DeltaSnapshotBackupReady
as part of print columns
@seshachalam-yv Nice find, thanks. Done |
c696b9c
to
4e2d378
Compare
4e2d378
to
98b46de
Compare
…cated fields in etcd.status
9036d63
to
b0b9725
Compare
/test pull-etcd-druid-e2e-kind |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to pair program this. Current code needs a LOT of changes.
// +kubebuilder:printcolumn:name="Current Replicas",type=integer,JSONPath=`.status.currentReplicas`,priority=1 | ||
// +kubebuilder:printcolumn:name="Ready Replicas",type=integer,JSONPath=`.status.readyReplicas`,priority=1 | ||
// +kubebuilder:printcolumn:name="Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="BackupReady")].status` | ||
// +kubebuilder:printcolumn:name="Full Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="FullSnapshotBackupReady")].status` | ||
// +kubebuilder:printcolumn:name="Delta Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="DeltaSnapshotBackupReady")].status` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call them Delta Backup Ready
or Delta Snapshot Backup Ready
? Similarly for full snapshot.
@@ -79,6 +79,8 @@ Status fields related to the etcd cluster itself, such as `Members`, `PeerUrlTLS | |||
- `AllMembersReady`: indicates readiness of all members of the etcd cluster. | |||
- `Ready`: indicates overall readiness of the etcd cluster in serving traffic. | |||
- `BackupReady`: indicates health of the etcd backups, i.e., whether etcd backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. | |||
- `FullSnapshotBackupReady`: indicates health of the full snapshot backups, i.e whether full snapshot backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in which situation will this be set to false
? If one full snapshot is missed or more than one missed?
@@ -79,6 +79,8 @@ Status fields related to the etcd cluster itself, such as `Members`, `PeerUrlTLS | |||
- `AllMembersReady`: indicates readiness of all members of the etcd cluster. | |||
- `Ready`: indicates overall readiness of the etcd cluster in serving traffic. | |||
- `BackupReady`: indicates health of the etcd backups, i.e., whether etcd backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. | |||
- `FullSnapshotBackupReady`: indicates health of the full snapshot backups, i.e whether full snapshot backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. | |||
- `DeltaSnapshotBackupReady`: indicates health of the delta snapshot backups, i.e whether delta snapshot backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the criteria for this value to be false?
@@ -122,7 +122,7 @@ kubectl apply -f config/samples/druid_v1alpha1_etcd.yaml | |||
|
|||
### Verify the Etcd cluster | |||
|
|||
To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the cluster size, readiness status of its members, and various other attributes. | |||
To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full and Delta backup status and various other attributes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full and Delta backup status and various other attributes. | |
To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full Snapshot and Delta Snapshot backup status and various other attributes. |
} | ||
|
||
var FullSnapshotBackupReadyCheckResult, DeltaSnapshotBackupReadyCheckResult Result = nil, nil | ||
for _, result := range b.results { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by using the loop variable result
you are shadowing the variable result
defined at the start of Check
method. Please avoid shadowing.
) | ||
fullSnapErr = a.cl.Get(ctx, types.NamespacedName{Name: getFullSnapLeaseName(&etcd), Namespace: etcd.ObjectMeta.Namespace}, fullSnapLease) | ||
incrSnapErr = a.cl.Get(ctx, types.NamespacedName{Name: getDeltaSnapLeaseName(&etcd), Namespace: etcd.ObjectMeta.Namespace}, deltaSnapLease) | ||
fullSnapErr = f.cl.Get(ctx, types.NamespacedName{Name: getFullSnapLeaseName(&etcd), Namespace: etcd.ObjectMeta.Namespace}, fullSnapLease) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there is an error getting the full snapshot lease then is the error being ignored immediately but handled much later? If there is an error you should exit soon with error instead of continuing with the computations.
} | ||
} | ||
|
||
func (d *deltaSnapshotBackupReadyCheck) Check(ctx context.Context, etcd druidv1alpha1.Etcd) Result { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a LOT of code repetition and this code needs to be rewritten.
@@ -36,7 +36,8 @@ var ( | |||
ConditionChecks = []ConditionCheckFn{ | |||
condition.AllMembersReadyCheck, | |||
condition.ReadyCheck, | |||
condition.BackupReadyCheck, | |||
condition.FullSnapshotBackupReadyCheck, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have removed BackupReadyCheck
here but you have also changed and kept the implementation of BackupReady
in check_backup_ready.go
. Any reason for that?
@@ -90,10 +91,12 @@ func (c *Checker) executeConditionChecks(ctx context.Context, etcd *druidv1alpha | |||
wg.Wait() | |||
})() | |||
|
|||
results := make([]condition.Result, 0, len(ConditionChecks)) | |||
results := make([]condition.Result, 0, len(ConditionChecks)+1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this conditions checks + 1?
for r := range resultCh { | ||
results = append(results, r) | ||
} | ||
backupReadyChecker := condition.BackupReadyCheck(c.cl, results) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i really do not understand this. You removed it from ConditionChecks
but introduce this here? why?
@anveshreddy18: The following tests failed, say
Full PR test history. Your PR dashboard. Command help for this repository. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
How to categorize this PR?
/area usability monitoring
/kind enhancement cleanup
What this PR does / why we need it:
This PR splits the existing
BackupReady
condition which talks about both full and delta snapshot health as a single condition into two new conditionsFullSnapshotBackupReady
andDeltaSnapshotBackupReady
, one for each type of snapshot i.eFull
andDelta
respectively. This helps in better understanding of status of each backup type thus helping debug issues and analyze more effectively. The existingBackupReady
condition will still be present, which aggregates both the individual snapshot conditions into one.This PR also removes the support for few fields in etcd status which are in deprecated state for enough time. The fields that are removed are
ServiceName
,ClusterSize
,UpdatedReplicas
andLabelSelector
. These fields were essentially taken fromstatefulset
and not necessarily needed to be present underetcd.status
hence they were deprecated and this PR stops the support.This PR partially fixes this issue for streamlining etcd status
Which issue(s) this PR fixes:
Special notes for your reviewer:
Release note: