Netbird should connect to peers before setting up DNS #2002

Open
Thunderbottom opened this issue May 16, 2024 · 2 comments · May be fixed by #2291
Labels
bug Something isn't working client dns

Comments

@Thunderbottom
Contributor

Thunderbottom commented May 16, 2024

Describe the problem

In the latest version, netbird tries to resolve DNS before connecting to the peers. This causes DNS resolution to fail in cases where the DNS server in use is a private one behind a routing peer. Netbird then waits for the DNS resolution to time out before connecting to peers on the network, so in our case it takes at least 15 seconds to connect to the first peer.

After the peer connects, the DNS resolution works perfectly fine. But this delay in most cases is unbearable and causes usability issues for a lot of people.

Logs:

May 16 22:31:13 hades netbird[1405]: 2024-05-16T22:31:13+05:30 INFO signal/client/grpc.go:158: connected to the Signal Service stream
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 WARN [error: read udp 192.168.69.100:59667->192.168.0.2:53: i/o timeout, upstream: 192.168.0.2:53] client/internal/dns/upstream.go:102: got an error while connecting to upstream
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 ERRO client/internal/dns/upstream.go:134: all queries to the upstream nameservers failed with timeout
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 WARN [error: read udp 192.168.69.100:41074->192.168.0.2:53: i/o timeout, upstream: 192.168.0.2:53] client/internal/dns/upstream.go:102: got an error while connecting to upstream
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 ERRO client/internal/dns/upstream.go:134: all queries to the upstream nameservers failed with timeout
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 WARN [error: read udp 192.168.69.100:37390->192.168.0.2:53: i/o timeout, upstream: 192.168.0.2:53] client/internal/dns/upstream.go:102: got an error while connecting to upstream
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 ERRO client/internal/dns/upstream.go:134: all queries to the upstream nameservers failed with timeout
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 WARN client/internal/dns/upstream.go:265: Upstream resolving is Disabled for 30s
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 INFO [nameservers: [{192.168.0.2 udp 53}]] client/internal/dns/server.go:504: Temporarily deactivating nameservers group due to timeout
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 WARN [error: read udp 192.168.69.100:59543->192.168.0.2:53: i/o timeout, upstream: 192.168.0.2:53] client/internal/dns/upstream.go:102: got an error while connecting to upstream
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 ERRO client/internal/dns/upstream.go:134: all queries to the upstream nameservers failed with timeout
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 WARN [error: read udp 192.168.69.100:56609->192.168.0.2:53: i/o timeout, upstream: 192.168.0.2:53] client/internal/dns/upstream.go:102: got an error while connecting to upstream
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 ERRO client/internal/dns/upstream.go:134: all queries to the upstream nameservers failed with timeout
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 WARN [upstream: 192.168.0.2:53, error: read udp 192.168.69.100:45333->192.168.0.2:53: i/o timeout] client/internal/dns/upstream.go:102: got an error while connecting to upstream
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 ERRO client/internal/dns/upstream.go:134: all queries to the upstream nameservers failed with timeout
May 16 22:31:15 hades netbird[1405]: 2024-05-16T22:31:15+05:30 INFO client/internal/dns/resolvconf_linux.go:73: added 2 search domains. Search list: [local.netbird lan]
May 16 22:31:16 hades netbird[1405]: 2024-05-16T22:31:16+05:30 WARN [error: read udp 192.168.69.100:46766->192.168.0.2:53: i/o timeout, upstream: 192.168.0.2:53] client/internal/dns/upstream.go:102: got an error while connecting to upstream
May 16 22:31:16 hades netbird[1405]: 2024-05-16T22:31:16+05:30 ERRO client/internal/dns/upstream.go:134: all queries to the upstream nameservers failed with timeout
May 16 22:31:16 hades netbird[1405]: 2024-05-16T22:31:16+05:30 WARN [error: read udp 192.168.69.100:57562->192.168.0.2:53: i/o timeout, upstream: 192.168.0.2:53] client/internal/dns/upstream.go:102: got an error while connecting to upstream
May 16 22:31:16 hades netbird[1405]: 2024-05-16T22:31:16+05:30 ERRO client/internal/dns/upstream.go:134: all queries to the upstream nameservers failed with timeout
May 16 22:31:17 hades netbird[1405]: 2024-05-16T22:31:17+05:30 INFO management/client/grpc.go:147: connected to the Management Service stream
May 16 22:31:17 hades netbird[1405]: 2024-05-16T22:31:17+05:30 WARN client/internal/routemanager/client.go:154: the network 192.168.0.0/19 has not been assigned a routing peer as no peers from the list [<LIST>] are currently connected
May 16 22:31:18 hades netbird[1405]: 2024-05-16T22:31:18+05:30 INFO client/internal/routemanager/client.go:165: new chosen route is <ROUTE> with peer <PEER-ID> with score 2.974409 for network 192.168.0.2/32
May 16 22:31:18 hades netbird[1405]: 2024-05-16T22:31:18+05:30 INFO client/internal/dns/upstream.go:241: upstreams [192.168.0.2:53] are responsive again. Adding them back to system

In the logs above it took only a few seconds to connect, but usually, on netbird up, it takes at least 10-15 seconds.

To Reproduce

Steps to reproduce the behavior:

  1. Set up a private DNS on netbird using a routing peer.
  2. Connect to netbird.
  3. Notice that netbird tries to resolve DNS and fails before trying to connect to the peers.
  4. See error.

Expected behavior

The DNS resolution should take place after the peer connections are initialized. There's no need for netbird to replace and resolve DNS before connecting to peers.

Are you using NetBird Cloud?

Self-hosted NetBird control plane.

NetBird version

netbird version: 0.27.7

@mlsmaycon
Collaborator

Thanks for opening this bug @Thunderbottom , the current behavior is the following:

We configure DNS and test it right away. This initial test should fail faster. Decreasing the timeout will help, but not setting it up is the best approach.

@mlsmaycon mlsmaycon added bug Something isn't working client dns and removed triage-needed labels May 17, 2024
@hurricanehrndz hurricanehrndz linked a pull request Jul 19, 2024 that will close this issue
6 tasks
@LeszekBlazewski

Looking forward to the fix for this as well.

My use case is that I have an inbound Route 53 private hosted zone resolver which sits within a private VPC subnet, and access to it is allowed only from the range of IP addresses in which a netbird routing peer runs (so without sending the traffic via the peer, the DNS won't resolve).

As mentioned in the issue, the current behaviour on the latest version (0.28.7) is that it works sometimes and fails at other times (this happens randomly on both Linux and macOS).

3 participants