[Bug?] rrm_mode other than "t" causes network delay spikes on multiple devices #226

Open
MGunlogson opened this issue Jun 15, 2023 · 2 comments

Comments


MGunlogson commented Jun 15, 2023

The default is rrm_mode 'pat'. On my network, I've found that both p and a cause regular latency spikes. I've verified this on an Intel AX adapter (Windows) and an M1 Mac, both running the newest drivers and OS versions.

rrm_mode = 'PAT'

[Image: PAT]

rrm_mode = 'T'

[Image: T]

Suggestion - make 'T' default instead of 'PAT'

I noticed that Aruba Networks devices default to beacon table https://www.arubanetworks.com/techdocs/ArubaOS_64x_WebHelp/Content/ArubaFrameStyles/VirtualAPs/Radio_Resource_Management_(802.11k).htm

My guess is that both Passive and Active modes cause client devices to scan the band, which momentarily interrupts the connection. Since "table" mode just sends back the currently seen beacons without further scanning, it doesn't appear to cause this problem.

@MGunlogson MGunlogson changed the title BUG? rrm_mode other than "t" causes network delay spikes on multiple devices [BUG?] rrm_mode other than "t" causes network delay spikes on multiple devices Jun 15, 2023
@MGunlogson MGunlogson changed the title [BUG?] rrm_mode other than "t" causes network delay spikes on multiple devices [Bug?] rrm_mode other than "t" causes network delay spikes on multiple devices Jun 15, 2023

MGunlogson commented Jun 18, 2023

I've done more investigation on this.

I think update_beacon_reports should have a different update interval setting depending on probe type (active vs. passive vs. table). Reasons:

  • Table beacons cause no jitter on any of my devices. It seems safe to probe this as often as you want.
  • Active beacons cause ~60ms jitter on several of my devices. This may be acceptable if you only do it every few minutes.
  • Passive beacons stop packet transmission the longest, ~120ms, and the results seem worse than active anyway. So maybe it's a good idea to default this to off? Even with different intervals for each beacon type, I don't see an advantage to using Passive at all.

This page seems to confirm my suspicion that Active probes just work better than Passive ones.

Passive scan—Passive scanning is performed by simply changing the clients IEEE 802.11 radio to the channel being scanned and waiting for a periodic beacon from any APs on that channel. By default, APs send beacons every 100 ms. Because it may take 100 ms to hear a periodic beacon broadcast, most clients prefer an active scan. During a channel scan, the client is unable to transmit or receive client data traffic.

Beacon storms?

I saw this postulated in various threads on the OpenWrt forums, and I think it does happen. Active beacons are faster than a Passive scan (because the client doesn't have to wait for AP beacon frames). I think the ~60ms packet delay I'm seeing with Active beacons comes from every device probing at once. It's possible that staggering active probes across clients could reduce the jitter to the point where it's not noticeable.

I'm no C programmer, but I can make an attempt at a PR if you're interested. For:

  • Defaulting Passive beacons to off, i.e. setting the default rrm_mode to 'at'
  • Splitting update_beacon_reports into three different interval settings, one for each mode (Passive, Active, Table). Leave the interval for Table beacons short, but significantly increase the one for Passive and Active scans
  • Adding a small delay between Active probe requests for each client (maybe 50ms?) to prevent the network contention that happens when a bunch of clients try to reply at the same time
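
To make the proposal above a bit more concrete, here is a rough Python sketch of the scheduling idea (not DAWN's actual C code). Everything in it is hypothetical: the `ModeSchedule` type, the `request_times` helper, and the interval values are placeholders for illustration only. The 50ms stagger is the number floated above; the 5 minute Active interval and 20s Table interval are just guesses at "every few minutes" and "short".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeSchedule:
    """Hypothetical per-mode settings; times are integer milliseconds."""
    interval_ms: int      # how often a round of requests is started
    stagger_ms: int = 0   # delay between consecutive clients within a round


# Placeholder values, not DAWN defaults.
SCHEDULES = {
    "table": ModeSchedule(interval_ms=20_000),
    "active": ModeSchedule(interval_ms=300_000, stagger_ms=50),
}


def request_times(schedule, clients, horizon_ms):
    """Return sorted (time_ms, client) pairs for every request before horizon_ms."""
    events = []
    round_start = 0
    while round_start < horizon_ms:
        for n, client in enumerate(clients):
            # Each client in a round is offset by n * stagger_ms.
            t = round_start + n * schedule.stagger_ms
            if t < horizon_ms:
                events.append((t, client))
        round_start += schedule.interval_ms
    return sorted(events)


clients = ["laptop", "phone", "tablet"]
active_events = request_times(SCHEDULES["active"], clients, horizon_ms=600_000)
table_events = request_times(SCHEDULES["table"], clients[:2], horizon_ms=60_000)
print(active_events)
```

With a stagger of 0 (the Table entry) every client is asked at the same instant, which is fine if Table requests really cause no jitter. With the Active entry, the three clients are asked at 0, 50 and 100ms of each round, so their replies should not all land at once.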


nzchats commented Jun 27, 2023

I had a similar issue and setting rrm_mode to just 't' as suggested by @MGunlogson resolved it.

In my case, the issues showed up on a Lubuntu box that uses a TP-Link USB WiFi adapter, connecting to a TP-Link Archer C7 OpenWrt AP as primary and getting kicked to a backup Linksys WRT OpenWrt AP if required. I was getting very bad WiFi throughput since enabling 802.11v/k managed by DAWN on OpenWrt 22.03.2 on the SSID. Especially when downloading large files, it would just fail randomly, and the network generally felt sluggish/unstable. Even before that there were random issues, as v/k were enabled on a different SSID, but I didn't really investigate since this machine is used sparingly and I assumed it was just the USB WiFi adapter not playing nice with Linux.

An ICMP ping to the AP showed that every 18th-19th sequence would spike from ~1ms to over 100ms. TCP dumps unfortunately didn't give much insight, as nothing in them was obvious enough for me to interpret. I also noticed that when this happens, the WiFi icon on the Linux desktop would essentially go to full bars before resetting back to the AP's signal strength.

I assumed it might be a Linux issue, such as background scans, and tried to track down ways to disable them without much success. That included setting the WiFi manager to use a specific BSSID (set to just the primary AP), which is supposed to stop scans. This didn't resolve it.

Then I decided to watch the ping side by side with top running on the TP-Link router. Each time the ICMP ping spiked, dawn was at the top of the process list doing something. This seems to correlate with "update_beacon_reports", which is set to 20s.
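
For anyone wanting to check the same correlation on their own network, a small Python sketch like the one below can estimate how often the spikes recur from a list of ping round-trip times. The function names and the 50ms threshold are my own choices for illustration, the sample data is synthetic rather than measured, and it assumes one ping per second (the usual default for ping on Linux).

```python
from statistics import median


def spike_indices(rtts_ms, threshold_ms=50.0):
    """Return the 0-based positions of pings whose RTT exceeds the threshold."""
    return [i for i, rtt in enumerate(rtts_ms) if rtt > threshold_ms]


def estimated_period_s(rtts_ms, ping_interval_s=1.0, threshold_ms=50.0):
    """Median spacing between consecutive spikes, in seconds.

    Returns None when fewer than two spikes are present.
    """
    idx = spike_indices(rtts_ms, threshold_ms)
    if len(idx) < 2:
        return None
    gaps = [b - a for a, b in zip(idx, idx[1:])]
    return median(gaps) * ping_interval_s


# Synthetic data: ~1ms baseline with a 120ms spike on every 19th ping.
sample = [120.0 if i % 19 == 18 else 1.0 for i in range(95)]
period = estimated_period_s(sample)
print(period)
```

A spike roughly every 18 to 19 pings at one ping per second lands close to a 20s interval, which is why update_beacon_reports looks like the likely trigger. This only shows a matching period, not proof of cause.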

A quick Google search brought me to this post. I changed the rrm mode to 't' as suggested, and just like that all the ICMP pings are normal and connectivity is no longer sluggish. DAWN functions like band steering are still working great. Will update if anything else shows up.
