Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create member, delta/full-snapshot Leases #262

Merged
merged 6 commits into from
Dec 22, 2021

Conversation

timuthy
Copy link
Member

@timuthy timuthy commented Nov 25, 2021

How to categorize this PR?

/area control-plane
/kind enhancement

What this PR does / why we need it:
This PR adds member, full snapshot and delta snapshot Leases which are maintained by etcd-backup-restore.

Which issue(s) this PR fixes:
Fixes part of #158
Fixes part of #278

Special notes for your reviewer:
A "component" concept was introduced in this PR which in the long run can replace the chart that we use and instead build the K8s resources within the code to achieve a better test-ability and more control how the client is used to reconcile the objects (e.g. used verb UPDATE vs PATCH) or how merging objects will be implemented. The same approach was taken in gardener/gardener.

Please see this comment for more information.

Release note:

Etcd-Druid now creates member `Lease` objects which enables the heartbeat functionality for etcd members. Along the way a new flag `--etcd-member-unknown-threshold` was introduced. It determines the duration after which a etcd member's state is considered `unknown` when the member `Lease` is not renewed.

@timuthy timuthy requested a review from a team as a code owner November 25, 2021 07:22
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Nov 25, 2021
@gardener-robot gardener-robot added area/control-plane Control plane related kind/enhancement Enhancement, improvement, extension needs/review Needs review size/xl Size of pull request is huge (see gardener-robot robot/bots/size.py) needs/second-opinion Needs second review by someone else labels Nov 25, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Nov 25, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Nov 25, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Nov 25, 2021
@timuthy
Copy link
Member Author

timuthy commented Nov 25, 2021

/invite @stoyanr

@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Nov 25, 2021
@gardener-robot
Copy link

@stoyanr You have pull request review open invite, please check

@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Nov 30, 2021
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Nov 30, 2021
Copy link
Contributor

@abdasgupta abdasgupta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Tim for a well-written PR. I just have few minor suggestions.

@@ -201,6 +201,9 @@ type EtcdConfig struct {
// EtcdDefragTimeout defines the timeout duration for etcd defrag call
// +optional
EtcdDefragTimeout *metav1.Duration `json:"etcdDefragTimeout,omitempty"`
// HeartbeatDuration defines the time members send a heartbeat. The default value is 10s.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// HeartbeatDuration defines the time members send a heartbeat. The default value is 10s.
// HeartbeatDuration defines the timeout for members to send heartbeats. The default value is 10s.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why timeout? It's actually the duration after which a member renews the member lease.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Then, instead of time we should use duration

@@ -293,6 +293,10 @@ spec:
description: EtcdDefragTimeout defines the timeout duration for
etcd defrag call
type: string
heartbeatDuration:
description: HeartbeatDuration defines the time members send a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above suggestion regarding this

charts/etcd/templates/etcd-statefulset.yaml Show resolved Hide resolved
"k8s.io/apimachinery/pkg/types"
)

type Values struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of struct, should we make Values as map[string]interface{}? If we make it as struct, we need to pass separate structs to the getMapFromEtcd function for separate components in future. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm yes, got your point. However, getMapFromEtcd will only be there as long as we have not entirely transitioned all manifests to components. For the transitioning phase it should be fine to do so, wdyt?

The awkward point about map[string]interface{} is the missing type safety for values and that you need to create constants for the keys, so that other components can use them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. Sadly, until getMapFromEtcd is removed completely, parameter list of getMapFromEtcd for each component type

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another componentetcd.Values struct which aggregates the different sub values. I hope that's fine.

@gardener-robot gardener-robot added the needs/changes Needs (more) changes label Dec 7, 2021
@timuthy timuthy mentioned this pull request Dec 7, 2021
@timuthy
Copy link
Member Author

timuthy commented Dec 8, 2021

Thanks a lot for your review @abdasgupta. Please see my comments.

Copy link
Contributor

@aaronfern aaronfern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @timuthy for the PR!
I have one suggestion and one minor nitpick

pkg/component/etcd/lease/values_helper.go Outdated Show resolved Hide resolved
pkg/component/etcd/lease/lease_test.go Outdated Show resolved Hide resolved
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Dec 17, 2021
@gardener-robot gardener-robot removed needs/review Needs review needs/second-opinion Needs second review by someone else labels Dec 17, 2021
@timuthy
Copy link
Member Author

timuthy commented Dec 20, 2021

/ping @abdasgupta

@gardener-robot
Copy link

@abdasgupta

Message

/ping @abdasgupta

@gardener-robot
Copy link

@timuthy You need rebase this pull request with latest master branch. Please check.

@gardener-robot gardener-robot added needs/rebase Needs git rebase needs/second-opinion Needs second review by someone else and removed reviewed/lgtm Has approval for merging labels Dec 21, 2021
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Dec 21, 2021
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Dec 21, 2021
@timuthy
Copy link
Member Author

timuthy commented Dec 21, 2021

Thanks a lot for your review @abdasgupta. I addressed your comments in 735d69c, PTAL.

@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Dec 21, 2021
@abdasgupta
Copy link
Contributor

/lgtm

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/rebase Needs git rebase needs/second-opinion Needs second review by someone else labels Dec 21, 2021
@timuthy timuthy merged commit 0469cc9 into gardener:master Dec 22, 2021
@timuthy timuthy deleted the feature.lease-component branch December 22, 2021 13:15
timuthy added a commit to timuthy/etcd-druid that referenced this pull request Dec 23, 2021
* Add lease component

* Add HeartbeatDuration configuration

* Add label selector for member check

* Use etcd-member-unknown-threshold flag instead of LeaseDurationSeconds

* Address review feedback

* Address review feedback
rgroemmer pushed a commit to stackitcloud/etcd-druid that referenced this pull request Jan 27, 2022
* Add lease component

* Add HeartbeatDuration configuration

* Add label selector for member check

* Use etcd-member-unknown-threshold flag instead of LeaseDurationSeconds

* Address review feedback

* Address review feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/control-plane Control plane related kind/enhancement Enhancement, improvement, extension needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging size/xl Size of pull request is huge (see gardener-robot robot/bots/size.py)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants