Intermittent Connection Issues Across Different Regions #3016

Open
ginsul opened this issue Dec 10, 2024 · 3 comments

ginsul commented Dec 10, 2024

Describe the problem

I had a coordinator server on AWS Virginia and an exit node in Singapore.

When I tried to connect using Windows, I randomly encountered the following conditions:

  • Sometimes every request to the internet timed out.
  • Sometimes I got a ping of >400ms, even though my public IP showed Singapore and the latency should have been only ~10ms.
  • Sometimes I got a ping of 10ms, with my public IP correctly showing Singapore.
  • When I had a 10ms ping and tried to disconnect and reconnect, the connection failed and seemed to hang.

I have already searched through the issues for a solution but still have no clue.
I captured logs using netbird -A debug for 2 minutes and have attached them.
I have already opened all inbound traffic in the AWS security group.

netbird.debug.1914716158.zip

Expected behavior

I should get a Singapore IP address with a ~10ms ping, and the connection should not hang when disconnecting and reconnecting.

NetBird version

netbird selfhosted 0.34.1

NetBird status -dA output:

Peers detail:
 ip-172-31-5-27.netbird.selfhosted:
  NetBird IP: 100.127.129.240
  Public key: mP047JXbCIyJ+qvk53Fd9HP2sDpRNlhxbXij4AQYQhI=
  Status: Connected
  -- detail --
  Connection type: P2P
  ICE candidate (Local/Remote): host/prflx
  ICE candidate endpoints (Local/Remote): 172.31.12.91:51820/172.31.5.27:51820
  Relay server address: rel://terbang.anon-5Ev56.domain:33080
  Last connection update: 12 minutes, 38 seconds ago
  Last WireGuard handshake: 2 minutes, 19 seconds ago
  Transfer status (received/sent) 940 B/1.5 KiB
  Quantum resistance: false
  Routes: -
  Latency: 380.987µs

 abcd.netbird.selfhosted:
  NetBird IP: 100.127.226.41
  Public key: 83ef9I0e6R6VtuReCzhSbzJ/UPZAHRle1k0T2kriZQA=
  Status: Disconnected
  -- detail --
  Connection type:
  ICE candidate (Local/Remote): -/-
  ICE candidate endpoints (Local/Remote): -/-
  Relay server address:
  Last connection update: 5 minutes, 56 seconds ago
  Last WireGuard handshake: -
  Transfer status (received/sent) 0 B/0 B
  Quantum resistance: false
  Routes: -
  Latency: 12.867329ms

OS: linux/amd64
Daemon version: 0.34.1
CLI version: 0.34.1
Management: Connected to https://terbang.anon-5Ev56.domain:33073
Signal: Connected to http://terbang.anon-5Ev56.domain:10000
Relays:
  [stun:terbang.anon-5Ev56.domain:3478] is Available
  [turn:terbang.anon-5Ev56.domain:3478?transport=udp] is Available
  [rel://terbang.anon-5Ev56.domain:33080] is Available
Nameservers:
FQDN: ip-172-31-12-91.netbird.selfhosted
NetBird IP: 100.127.83.39/16
Interface type: Kernel
Quantum resistance: false
Routes: 0.0.0.0/0
Peers count: 1/2 Connected
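
For anyone comparing several runs of this output, here is a minimal Python sketch that pulls each peer's status, connection type, and latency out of the "Peers detail:" section. It assumes the text layout pasted above (a hostname line ending in a colon, followed by `Key: value` lines, with the daemon section starting at `OS:`). The function name `parse_peers` and the trimmed `SAMPLE` are illustrative only and not part of NetBird.

```python
# Sketch only: the parsing rules are heuristics based on the output pasted
# in this issue, not on any documented NetBird format guarantee.
SAMPLE = """\
Peers detail:
 ip-172-31-5-27.netbird.selfhosted:
  NetBird IP: 100.127.129.240
  Status: Connected
  -- detail --
  Connection type: P2P
  Latency: 380.987µs

 abcd.netbird.selfhosted:
  NetBird IP: 100.127.226.41
  Status: Disconnected
  -- detail --
  Connection type:
  Latency: 12.867329ms

OS: linux/amd64
"""


def parse_peers(status_text):
    """Return one dict per peer found in the 'Peers detail:' section."""
    peers = []
    current = None
    in_peers = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line == "Peers detail:":
            in_peers = True
            continue
        if not in_peers or not line or line == "-- detail --":
            continue
        # Assumption: the daemon-level section begins with the "OS:" line.
        if line.startswith("OS:"):
            break
        # Heuristic: a peer header is a single token ending in ':'
        # (field names such as "Connection type:" contain spaces).
        if line.endswith(":") and " " not in line:
            current = {"name": line[:-1]}
            peers.append(current)
            continue
        key, sep, value = line.partition(":")
        if sep and current is not None:
            current[key.strip()] = value.strip()
    return peers


for peer in parse_peers(SAMPLE):
    print(peer["name"], peer.get("Status"), peer.get("Connection type") or "-", peer.get("Latency"))
```

Lines without a colon (such as the "Transfer status" line) are skipped by this sketch, and an empty value is stored as an empty string.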

Screenshots

(screenshot attached to the original issue)

Additional context

Any insights or suggestions would be greatly appreciated. Thank you.

ginsul commented Dec 10, 2024

Hi, as a temporary workaround, this setup seems to be working for now.
I'm still not sure why the latest version 0.34.1 (and the other versions from 0.30 to 0.34, which I have already tested) still has issues.

Current working setup:

Coordinator Server: 0.34.1
Node Server: 0.29.4
Client (Windows): 0.29.4

ginsul commented Dec 11, 2024

Hi,
The intermittent issue disappeared with version 0.29.4, but latency remains poor.
I suspect this is because the client is using the Coordinator Server as an Exit Node instead of the actual peer chosen as the Exit Node.

How can we enforce that the client does not use the Coordinator Server as the Exit Node?

rihards-simanovics commented Dec 15, 2024

Hey @ginsul, I'm not certain whether my issue #3042 is similar to yours, but on my network of ~15-18 peers (a mix of Ubuntu Linux and a couple of Windows peers), whenever a Windows 11 peer connects to a Linux server with a static public IP, almost all peers are suddenly Relayed instead of establishing a P2P connection. This doesn't make sense to me, especially since it used to work fine before. From the new Relay docs, it sounds as though a Relayed connection should only exist until a P2P connection is established, but in my case that never happens once the network "settles" after all peers are connected. I can only observe this on Windows 11 and macOS peers connecting to servers, not between the Linux server peers themselves.
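
To check whether peers ever move from Relayed to P2P after the network settles, one rough option is to tally the "Connection type:" lines from saved status output at a few points in time. The sketch below assumes the detailed status text uses the same "Connection type:" field shown earlier in this thread, and that relayed peers are labelled "Relayed" as in the wording above. The helper name `count_connection_types` is hypothetical.

```python
import re
from collections import Counter


def count_connection_types(status_text):
    """Count peers per 'Connection type:' value; empty values are tallied as 'none'."""
    counts = Counter()
    pattern = re.compile(r"^[ \t]*Connection type:[ \t]*(.*)$", re.MULTILINE)
    for match in pattern.finditer(status_text):
        # Disconnected peers show an empty connection type in the output above.
        counts[match.group(1).strip() or "none"] += 1
    return counts


# Illustrative input only, not real output from any of the networks discussed.
example = (
    "  Connection type: P2P\n"
    "  Connection type: Relayed\n"
    "  Connection type:\n"
    "  Connection type: Relayed\n"
)
print(dict(count_connection_types(example)))
```

Running this against status output captured right after connecting and again a few minutes later would show whether the Relayed count drops over time.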
