Incorrect IP when using dns-resolver #904

Open
shumvgolove opened this issue Nov 19, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@shumvgolove

shumvgolove commented Nov 19, 2024

Describe the bug

Gatus doesn't show the correct IP in conditions when dns-resolver overrides an existing domain name, although the health check itself is performed against the correct IP.

What do you see?

Gatus returns incorrect IP:

✓ ~ [STATUS] == 200
X ~ [IP] (142.250.200.110) == any(:1, 127.0.0.1)

What do you expect to see?

Gatus should return the correct IP when using dns-resolver:

✓ ~ [STATUS] == 200
✓ ~ [IP] (127.0.0.1) == any(:1, 127.0.0.1)

List the steps that must be taken to reproduce this issue

  1. Create a DNS server that overrides the domain google.com with custom IPs (for example, using CoreDNS):

    Corefile

    .:54 {
        bind lo
        hosts {
            127.0.0.1 google.com
            ::1 google.com
            fallthrough
        }
        log
    }
    

    drill -p 54 A google.com

    ;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 22162
    ;; flags: qr rd ra ; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0 
    ;; QUESTION SECTION:
    ;; google.com.	IN	A
    
    ;; ANSWER SECTION:
    google.com.	3600	IN	A	127.0.0.1
    
    ;; AUTHORITY SECTION:
    
    ;; ADDITIONAL SECTION:
    
    ;; Query time: 0 msec
    ;; SERVER: ::1
    ;; WHEN: Tue Nov 19 09:57:36 2024
    ;; MSG SIZE  rcvd: 44
    

    drill -p 54 AAAA google.com

    ;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 7552
    ;; flags: qr rd ra ; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0 
    ;; QUESTION SECTION:
    ;; google.com.	IN	AAAA
    
    ;; ANSWER SECTION:
    google.com.	3600	IN	AAAA	::1
    
    ;; AUTHORITY SECTION:
    
    ;; ADDITIONAL SECTION:
    
    ;; Query time: 187 msec
    ;; SERVER: 127.0.0.1
    ;; WHEN: Tue Nov 19 10:01:17 2024
    ;; MSG SIZE  rcvd: 56
    
  2. Create the following healthcheck:

    endpoints:
      - name: test
        url: "https://google.com"
        client:
          dns-resolver: "tcp://127.0.0.1:54"
        interval: 30s
        conditions:
          - "[STATUS] == 200"
          - "[IP] == any(::1, 127.0.0.1)"
  3. Observe that the health check fails with the incorrect IP:

    ✓ ~ [STATUS] == 200
    X ~ [IP] (142.250.200.110) == any(:1, 127.0.0.1)
    

    Here, 142.250.200.110 is the actual Google IP, resolved through the system's global DNS.

  4. Observe that the Gatus process connects correctly to 127.0.0.1 (5848 is the main Gatus PID):

    strace -f -e trace=network -s 10000 -p 5848 2>&1 | grep 'connect' | grep '443'

    [pid  5848] connect(7, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation in progress)
    [pid  5824] connect(7, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation in progress)
    ...
    

Version

v5.13.1

Additional information

No response

@shumvgolove shumvgolove added the bug Something isn't working label Nov 19, 2024
@TwiN
Owner

TwiN commented Nov 20, 2024

Hmm.. This is because the IP from the [IP] placeholder is only retrieved if the placeholder is present in one of the conditions, and when it is present, it retrieves it using net.LookupIP, which completely bypasses the client.dns-resolver configuration.

if e.needsToRetrieveIP() {
    e.getIP(result)
}

func (e *Endpoint) getIP(result *Result) {
    if ips, err := net.LookupIP(result.Hostname); err != nil {
        result.AddError(err.Error())
        return
    } else {
        result.IP = ips[0].String()
    }
}

From a UX perspective, I completely understand why you'd expect client.dns-resolver to be used for the DNS lookups though, so you bring up a good point.

It shouldn't be too difficult to implement, given that the code for the resolver already exists, and that under the hood, net.LookupIP makes a call to DefaultResolver.LookupIPAddr.

gatus/client/config.go

Lines 240 to 260 in 0113175

if c.HasCustomDNSResolver() {
    dnsResolver, err := c.parseDNSResolver()
    if err != nil {
        // We're ignoring the error, because it should have been validated on startup ValidateAndSetDefaults.
        // It shouldn't happen, but if it does, we'll log it... Better safe than sorry ;)
        logr.Errorf("[client.getHTTPClient] THIS SHOULD NOT HAPPEN. Silently ignoring invalid DNS resolver due to error: %s", err.Error())
    } else {
        dialer := &net.Dialer{
            Resolver: &net.Resolver{
                PreferGo: true,
                Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
                    d := net.Dialer{}
                    return d.DialContext(ctx, dnsResolver.Protocol, dnsResolver.Host+":"+dnsResolver.Port)
                },
            },
        }
        c.httpClient.Transport.(*http.Transport).DialContext = func(ctx context.Context, network, addr string) (net.Conn, error) {
            return dialer.DialContext(ctx, network, addr)
        }
    }
}

We'd have to extract the piece of code that creates the resolver, and then we can reuse it to create a dialer that would work for both the HTTP client and the function used for resolving the IP.
