Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not remove upstream DNS configuraton in cleanup #1280

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mkowalski
Copy link
Member

@mkowalski mkowalski commented Jul 24, 2021

This PR changes the logic of host cleanup so that the upstream DNS
configuration is not removed. This comes from the fact that by default
dev-scripts configure NetworkManager to use dnsmasq with DHCP-provided
DNS servers and removes the existing /etc/resolv.conf configuration.

This causes problems in environments configured without DHCP with DNS
servers configured manually without use of the NetworkManager. With this
PR user can use ADDN_DNS variable to provide their own DNS servers
that will be passed to the dnsmasq. At the same time when removing
dev-scripts from the system, the backup of the original /etc/resolv.conf
will be restored so that the initial system configuration is reverted.

@openshift-ci openshift-ci bot requested review from celebdor and flaper87 July 24, 2021 11:54
@mkowalski
Copy link
Member Author

/assign @hardys

@flaper87
Copy link
Contributor

/retest

Copy link
Contributor

@flaper87 flaper87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

I will leave the final approval to @hardys since you guys debugged this together!

@openshift-ci
Copy link

openshift-ci bot commented Jul 26, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flaper87

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 26, 2021
@hardys
Copy link

hardys commented Jul 27, 2021

The only problem with doing this unconditionally is some folks expect the dev-scripts changes to be reverted on make clean ref #1168

I wonder if we can somehow detect the resolv.conf isn't NM generated and save it somewhere to restore during host_cleanup.sh? (perhaps looking for presence of Generated by NetworkManager ?)

@mkowalski
Copy link
Member Author

The only problem with doing this unconditionally is some folks expect the dev-scripts changes to be reverted

Yes, this is the most reasonable assumption, but it is not the case already now - i.e. running make and then make clean leaves the host in a broken state if there is no DHCP server in the network.

Honestly, I'm not sure if there is a silver bullet here...

I wonder if we can somehow detect the resolv.conf isn't NM generated and save it somewhere to restore during host_cleanup.sh

Yes, if /etc/resolv.conf is NM-generated then it's marked as such. But if we go down this way, we will face the next question - "whether we have configured NM in such a way or was it the user before". Overall I'm trying to think bigger here - how do people use dev-scripts ? In it's current form I already have a dedicated machine that is not used for any other setups, so I kinda accept the fact that dev-scripts may make changes to the system that are difficult (if not impossible) to revert. Are there even people running dev-scripts on their day-to-day workstations ?

@hardys
Copy link

hardys commented Jul 30, 2021

The only problem with doing this unconditionally is some folks expect the dev-scripts changes to be reverted

Yes, this is the most reasonable assumption, but it is not the case already now - i.e. running make and then make clean leaves the host in a broken state if there is no DHCP server in the network.

Right, we're in agreement this is something which needs to be fixed, it's just a question of how :)

Honestly, I'm not sure if there is a silver bullet here...

I wonder if we can somehow detect the resolv.conf isn't NM generated and save it somewhere to restore during host_cleanup.sh

Yes, if /etc/resolv.conf is NM-generated then it's marked as such. But if we go down this way, we will face the next question - "whether we have configured NM in such a way or was it the user before". Overall I'm trying to think bigger here - how do people use dev-scripts ? In it's current form I already have a dedicated machine that is not used for any other setups, so I kinda accept the fact that dev-scripts may make changes to the system that are difficult (if not impossible) to revert. Are there even people running dev-scripts on their day-to-day workstations ?

I think the majority of folks do have a dedicated machine, but clearly @dustinblack had a reason for pushing his cleanup PR, so I'm trying to find a way to solve your problem without reintroducing his ;)

I don't think any of these changes are impossible to revert, we just need to backup and restore the resolv.conf, right?

That could even be done unconditionally, since in the NM-managed case it'll just get overwritten, but in the static-config case we'll restore the previous config?

@hardys
Copy link

hardys commented Jul 30, 2021

Another point to note is that even when folks have a dedicated machine, sometimes they will want to switch between dev-scripts testing and other setups (upstream metal3-dev-env, maybe assisted-test-infra etc)

@flaper87
Copy link
Contributor

@mkowalski I think there is value in revisiting this PR, if you have the bandwidth.

@mkowalski
Copy link
Member Author

A simple change now should be enough to handle multiple scenarios with as simple logic as possible. I tried to avoid using specific markers in the /etc/resolv.conf because I'm already aware of 2 possibilities (and there are probably many more)

  • # Generated by NetworkManager
  • # This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).

@mkowalski
Copy link
Member Author

/retest

None of them looks like failing because of this change

@mkowalski
Copy link
Member Author

/retest

@mkowalski
Copy link
Member Author

/test e2e-metal-ipi-serial-ovn-ipv6

This PR changes the logic of host cleanup so that the upstream DNS
configuration is not removed. This comes from the fact that by default
dev-scripts configure NetworkManager to use dnsmasq with DHCP-provided
DNS servers and removes the existing /etc/resolv.conf configuration.

This causes problems in environments configured without DHCP with DNS
servers configured manually without use of the NetworkManager. With this
PR user can use `ADDN_DNS` variable to provide their own DNS servers
that will be passed to the dnsmasq. At the same time when removing
dev-scripts from the system, the backup of the original /etc/resolv.conf
will be restored so that the initial system configuration is reverted.
@openshift-ci
Copy link

openshift-ci bot commented Dec 6, 2022

@mkowalski: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-metal-ipi-proxy-ipv4 262a631 link true /test e2e-metal-ipi-proxy-ipv4
ci/prow/e2e-metal-ipi-proxy-ipv6 262a631 link true /test e2e-metal-ipi-proxy-ipv6
ci/prow/e2e-metal-ipi 262a631 link true /test e2e-metal-ipi
ci/prow/e2e-metal-ipi-bm-bond 71929ca link false /test e2e-metal-ipi-bm-bond
ci/prow/e2e-metal-ipi-serial-ipv4 71929ca link true /test e2e-metal-ipi-serial-ipv4

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@bfournie
Copy link
Contributor

bfournie commented Jan 4, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants