Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple backup ready condition for snapshot leases and remove support for deprecated fields in etcd status #820

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 7 additions & 19 deletions api/v1alpha1/etcd.go
Original file line number Diff line number Diff line change
Expand Up @@ -48,11 +48,12 @@ const (
// +kubebuilder:printcolumn:name="Ready",type=string,JSONPath=`.status.ready`
// +kubebuilder:printcolumn:name="Quorate",type=string,JSONPath=`.status.conditions[?(@.type=="Ready")].status`
// +kubebuilder:printcolumn:name="All Members Ready",type=string,JSONPath=`.status.conditions[?(@.type=="AllMembersReady")].status`
// +kubebuilder:printcolumn:name="Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="BackupReady")].status`
// +kubebuilder:printcolumn:name="Age",type=date,JSONPath=`.metadata.creationTimestamp`
// +kubebuilder:printcolumn:name="Cluster Size",type=integer,JSONPath=`.spec.replicas`,priority=1
// +kubebuilder:printcolumn:name="Current Replicas",type=integer,JSONPath=`.status.currentReplicas`,priority=1
// +kubebuilder:printcolumn:name="Ready Replicas",type=integer,JSONPath=`.status.readyReplicas`,priority=1
// +kubebuilder:printcolumn:name="Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="BackupReady")].status`
// +kubebuilder:printcolumn:name="Full Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="FullSnapshotBackupReady")].status`
// +kubebuilder:printcolumn:name="Delta Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="DeltaSnapshotBackupReady")].status`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call them Delta Backup Ready or Delta Snapshot Backup Ready? Similarly for full snapshot.


// Etcd is the Schema for the etcds API
type Etcd struct {
Expand Down Expand Up @@ -320,6 +321,10 @@ const (
ConditionTypeAllMembersReady ConditionType = "AllMembersReady"
// ConditionTypeBackupReady is a constant for a condition type indicating that the etcd backup is ready.
ConditionTypeBackupReady ConditionType = "BackupReady"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you still want BackupReady condition? I thought this would be deprecated and removed in a later version of druid. But i do not see any deprecation notice. What is the plan here?

// ConditionTypeFullSnapshotBackupReady is a constant for a condition type indicating that the full snapshot backup is ready.
ConditionTypeFullSnapshotBackupReady ConditionType = "FullSnapshotBackupReady"
// ConditionTypeDeltaSnapshotBackupReady is a constant for a condition type indicating that the delta snapshot backup is ready.
ConditionTypeDeltaSnapshotBackupReady ConditionType = "DeltaSnapshotBackupReady"
// ConditionTypeDataVolumesReady is a constant for a condition type indicating that the etcd data volumes are ready.
ConditionTypeDataVolumesReady ConditionType = "DataVolumesReady"
)
Expand Down Expand Up @@ -374,10 +379,6 @@ type EtcdStatus struct {
// Conditions represents the latest available observations of an etcd's current state.
// +optional
Conditions []Condition `json:"conditions,omitempty"`
// ServiceName is the name of the etcd service.
// Deprecated: this field will be removed in the future.
// +optional
ServiceName *string `json:"serviceName,omitempty"`
// LastError represents the last occurred error.
// Deprecated: Use LastErrors instead.
// +optional
Expand All @@ -388,10 +389,6 @@ type EtcdStatus struct {
// LastOperation indicates the last operation performed on this resource.
// +optional
LastOperation *LastOperation `json:"lastOperation,omitempty"`
// Cluster size is the current size of the etcd cluster.
// Deprecated: this field will not be populated with any value and will be removed in the future.
// +optional
ClusterSize *int32 `json:"clusterSize,omitempty"`
// CurrentReplicas is the current replica count for the etcd cluster.
// +optional
CurrentReplicas int32 `json:"currentReplicas,omitempty"`
Expand All @@ -404,15 +401,6 @@ type EtcdStatus struct {
// Ready is `true` if all etcd replicas are ready.
// +optional
Ready *bool `json:"ready,omitempty"`
// UpdatedReplicas is the count of updated replicas in the etcd cluster.
// Deprecated: this field will be removed in the future.
// +optional
UpdatedReplicas int32 `json:"updatedReplicas,omitempty"`
// LabelSelector is a label query over pods that should match the replica count.
// It must match the pod template's labels.
// Deprecated: this field will be removed in the future.
// +optional
LabelSelector *metav1.LabelSelector `json:"labelSelector,omitempty"`
// Members represents the members of the etcd cluster
// +optional
Members []EtcdMemberStatus `json:"members,omitempty"`
Expand Down
15 changes: 0 additions & 15 deletions api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -24,16 +24,9 @@ spec:
- jsonPath: .status.conditions[?(@.type=="AllMembersReady")].status
name: All Members Ready
type: string
- jsonPath: .status.conditions[?(@.type=="BackupReady")].status
name: Backup Ready
type: string
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
- jsonPath: .spec.replicas
name: Cluster Size
priority: 1
type: integer
- jsonPath: .status.currentReplicas
name: Current Replicas
priority: 1
Expand All @@ -42,6 +35,15 @@ spec:
name: Ready Replicas
priority: 1
type: integer
- jsonPath: .status.conditions[?(@.type=="BackupReady")].status
name: Backup Ready
type: string
- jsonPath: .status.conditions[?(@.type=="FullSnapshotBackupReady")].status
name: Full Backup Ready
type: string
- jsonPath: .status.conditions[?(@.type=="DeltaSnapshotBackupReady")].status
name: Delta Backup Ready
type: string
name: v1alpha1
schema:
openAPIV3Schema:
Expand Down Expand Up @@ -1785,12 +1787,6 @@ spec:
status:
description: EtcdStatus defines the observed state of Etcd.
properties:
clusterSize:
description: |-
Cluster size is the current size of the etcd cluster.
Deprecated: this field will not be populated with any value and will be removed in the future.
format: int32
type: integer
conditions:
description: Conditions represents the latest available observations
of an etcd's current state.
Expand Down Expand Up @@ -1848,53 +1844,6 @@ spec:
description: Name of the referent
type: string
type: object
labelSelector:
description: |-
LabelSelector is a label query over pods that should match the replica count.
It must match the pod template's labels.
Deprecated: this field will be removed in the future.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: |-
A label selector requirement is a selector that contains values, a key, and an operator that
relates the key and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: |-
operator represents a key's relationship to a set of values.
Valid operators are In, NotIn, Exists and DoesNotExist.
type: string
values:
description: |-
values is an array of string values. If the operator is In or NotIn,
the values array must be non-empty. If the operator is Exists or DoesNotExist,
the values array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
additionalProperties:
type: string
description: |-
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels
map is equivalent to an element of matchExpressions, whose key field is "key", the
operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
type: object
x-kubernetes-map-type: atomic
lastError:
description: |-
LastError represents the last occurred error.
Expand Down Expand Up @@ -2014,17 +1963,6 @@ spec:
description: Replicas is the replica count of the etcd cluster.
format: int32
type: integer
serviceName:
description: |-
ServiceName is the name of the etcd service.
Deprecated: this field will be removed in the future.
type: string
updatedReplicas:
description: |-
UpdatedReplicas is the count of updated replicas in the etcd cluster.
Deprecated: this field will be removed in the future.
format: int32
type: integer
type: object
type: object
served: true
Expand Down
80 changes: 9 additions & 71 deletions config/crd/bases/crd-druid.gardener.cloud_etcds.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -24,16 +24,9 @@ spec:
- jsonPath: .status.conditions[?(@.type=="AllMembersReady")].status
name: All Members Ready
type: string
- jsonPath: .status.conditions[?(@.type=="BackupReady")].status
name: Backup Ready
type: string
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
- jsonPath: .spec.replicas
name: Cluster Size
priority: 1
type: integer
- jsonPath: .status.currentReplicas
name: Current Replicas
priority: 1
Expand All @@ -42,6 +35,15 @@ spec:
name: Ready Replicas
priority: 1
type: integer
- jsonPath: .status.conditions[?(@.type=="BackupReady")].status
name: Backup Ready
type: string
- jsonPath: .status.conditions[?(@.type=="FullSnapshotBackupReady")].status
name: Full Backup Ready
type: string
- jsonPath: .status.conditions[?(@.type=="DeltaSnapshotBackupReady")].status
name: Delta Backup Ready
type: string
name: v1alpha1
schema:
openAPIV3Schema:
Expand Down Expand Up @@ -1785,12 +1787,6 @@ spec:
status:
description: EtcdStatus defines the observed state of Etcd.
properties:
clusterSize:
description: |-
Cluster size is the current size of the etcd cluster.
Deprecated: this field will not be populated with any value and will be removed in the future.
format: int32
type: integer
conditions:
description: Conditions represents the latest available observations
of an etcd's current state.
Expand Down Expand Up @@ -1848,53 +1844,6 @@ spec:
description: Name of the referent
type: string
type: object
labelSelector:
description: |-
LabelSelector is a label query over pods that should match the replica count.
It must match the pod template's labels.
Deprecated: this field will be removed in the future.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: |-
A label selector requirement is a selector that contains values, a key, and an operator that
relates the key and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: |-
operator represents a key's relationship to a set of values.
Valid operators are In, NotIn, Exists and DoesNotExist.
type: string
values:
description: |-
values is an array of string values. If the operator is In or NotIn,
the values array must be non-empty. If the operator is Exists or DoesNotExist,
the values array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
additionalProperties:
type: string
description: |-
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels
map is equivalent to an element of matchExpressions, whose key field is "key", the
operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
type: object
x-kubernetes-map-type: atomic
lastError:
description: |-
LastError represents the last occurred error.
Expand Down Expand Up @@ -2014,17 +1963,6 @@ spec:
description: Replicas is the replica count of the etcd cluster.
format: int32
type: integer
serviceName:
description: |-
ServiceName is the name of the etcd service.
Deprecated: this field will be removed in the future.
type: string
updatedReplicas:
description: |-
UpdatedReplicas is the count of updated replicas in the etcd cluster.
Deprecated: this field will be removed in the future.
format: int32
type: integer
type: object
type: object
served: true
Expand Down
2 changes: 2 additions & 0 deletions docs/concepts/controllers.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,8 @@ Status fields related to the etcd cluster itself, such as `Members`, `PeerUrlTLS
- `AllMembersReady`: indicates readiness of all members of the etcd cluster.
- `Ready`: indicates overall readiness of the etcd cluster in serving traffic.
- `BackupReady`: indicates health of the etcd backups, i.e., whether etcd backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster.
- `FullSnapshotBackupReady`: indicates health of the full snapshot backups, i.e whether full snapshot backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in which situation will this be set to false? If one full snapshot is missed or more than one missed?

- `DeltaSnapshotBackupReady`: indicates health of the delta snapshot backups, i.e whether delta snapshot backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the criteria for this value to be false?

- `DataVolumesReady`: indicates health of the persistent volumes containing the etcd data.

## Compaction Controller
Expand Down
2 changes: 1 addition & 1 deletion docs/development/getting-started-locally.md
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,7 @@ kubectl apply -f config/samples/druid_v1alpha1_etcd.yaml

### Verify the Etcd cluster

To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the cluster size, readiness status of its members, and various other attributes.
To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full and Delta backup status and various other attributes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full and Delta backup status and various other attributes.
To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full Snapshot and Delta Snapshot backup status and various other attributes.


```sh
kubectl get etcd -o=wide
Expand Down
14 changes: 1 addition & 13 deletions docs/proposals/01-multi-node-etcd-clusters.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,6 @@ This document proposes an approach (along with some alternatives) to support pro
- [Member name as the key](#member-name-as-the-key)
- [Member Leases](#member-leases)
- [Conditions](#conditions)
- [ClusterSize](#clustersize)
- [Alternative](#alternative-4)
- [Decision table for etcd-druid based on the status](#decision-table-for-etcd-druid-based-on-the-status)
- [1. Pink of health](#1-pink-of-health)
Expand Down Expand Up @@ -563,8 +562,6 @@ status:
lastTransitionTime: "2020-11-10T12:48:01Z"
reason: FullBackupSucceeded # FullBackupSucceeded|IncrementalBackupSucceeded|FullBackupFailed|IncrementalBackupFailed
...
clusterSize: 3
...
replicas: 3
...
members:
Expand Down Expand Up @@ -647,15 +644,6 @@ The `BackupReady` condition will be maintained by the leading `etcd-backup-resto

More condition types could be introduced in the future if specific purposes arise.

### ClusterSize

The `clusterSize` field contains the current size of the ETCD cluster. It will be actively kept up-to-date by `etcd-druid` in all scenarios.

- Before [bootstrapping](#bootstrapping) the ETCD cluster (during cluster creation or later bootstrapping because of [quorum failure](#recommended-action-9)), `etcd-druid` will clear the `status.members` array and set `status.clusterSize` to be equal to `spec.replicas`.
- While the ETCD cluster is quorate, `etcd-druid` will actively set `status.clusterSize` to be equal to length of the `status.members` whenever the length of the array changes (say, due to scaling of the ETCD cluster).

Given that `clusterSize` reliably represents the size of the ETCD cluster, it can be used to calculate the `Ready` [condition](#conditions).

### Alternative

The alternative is for `etcd-druid` to maintain the status in the `Etcd` status sub-resource.
Expand Down Expand Up @@ -1255,7 +1243,7 @@ The life-cycle of these work-flows is shown below.

- Take [backups](#backup) (full and incremental) at configured regular intervals
- [Defragment](#defragmentation) all the members sequentially at configured regular intervals
- Cleanup superflous members from the ETCD cluster for which there is no corresponding pod (the [ordinal](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#ordinal-index) in the pod name is greater than the [cluster size](#clustersize)) at regular intervals (or whenever the `Etcd` resource [status](#status) changes by watching it)
- Cleanup superflous members from the ETCD cluster for which there is no corresponding pod (the [ordinal](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#ordinal-index) in the pod name is greater than the cluster size) at regular intervals (or whenever the `Etcd` resource [status](#status) changes by watching it)
- The cleanup of [superfluous entries in `status.members` array](#13-superfluous-member-entries-in-etcd-status) is already covered [here](#recommended-action-12)

## High Availability
Expand Down
Loading