nftables: internal:0:0-0: Error: Could not process rule: Device or resource busy #23404

edsantiago · 2024-07-25T17:01:06Z

Very weird one-off:

→ Enter [AfterEach] TOP-LEVEL - /var/tmp/go/src/github.com[/containers/podman/test/e2e/common_test.go:120](https://github.com/containers/podman/blob/4831981cd98b83a54f8b71e795794b3eb36aaa9f/test/e2e/common_test.go#L120) @ 07/24/24 22:18:04.589
           # podman [options] stop --all -t 0
           internal:0:0-0: Error: Could not process rule: Device or resource busy
         
           time="2024-07-24T22:18:04Z" level=error msg="Unable to clean up network for container e654e08b54b76a7b7385a5a5fb7114d7abf252e312f5426845c2cd1b22d7a29b: \"netavark: nftables error: nft did not return successfully while applying ruleset\""

Seen twice in one f40 root run.

What's weird about it:

I've never seen this one before.
It failed on all ginkgo retries. That is: this isn't my pet no-retry PR, this was a real PR. Normally these kinds of failures are masked by the evil ginkgo flake retry.
It failed on two tests. The timestamps (548s, 551s) suggest that both tests may have been running in parallel at nearly the same time.

It is possible that this has been happening all along, but ginkgo-retry has been hiding it. We have no sane way to find out, aside from downloading and grepping all logs for all CI runs. Or, as I will suggest in a future Cabal, disabling flake retries.
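A minimal sketch of the "download and grep" approach, assuming the CI logs have already been saved to a local directory. The `find_busy_flakes` name and the `./ci-logs` path are hypothetical and not part of any podman tooling:

```shell
#!/bin/sh
# Hypothetical helper: list downloaded CI log files that contain the EBUSY
# nftables error, including runs where a later ginkgo retry passed.
find_busy_flakes() {
    logdir=$1
    # -r: recurse, -l: print only matching file names, -F: fixed-string match
    grep -rlF 'Could not process rule: Device or resource busy' "$logdir" | sort
}

# Usage (assumes logs were saved under ./ci-logs):
#   find_busy_flakes ./ci-logs
```

This only finds the symptom in logs that were kept; it says nothing about runs whose logs have expired.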


edsantiago · 2024-07-25T17:02:02Z

(If this is a netavark bug, could you please copy it there instead of moving? My flake log needs a podman issue number. Thanks.)

Luap99 · 2024-07-25T17:11:55Z

This sounds like the error we are seeing in https://bugzilla.redhat.com/show_bug.cgi?id=2013173, but I haven't yet checked whether netavark causes it or whether there is some other cause.

cc @mheon

Luap99 · 2024-07-26T16:12:10Z

So if I read https://wiki.nftables.org/wiki-nftables/index.php/Configuring_chains correctly, the EBUSY error just means the chain is not empty when we try to remove it.
In theory we delete all rules from the chain before we remove the chain, but maybe there is some chance that we missed a rule somehow?
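The ordering described above (empty the chain, then remove it) can be sketched as follows. This is an illustration only, not netavark's actual code: the `emit_chain_teardown` helper and the `EXAMPLE_CHAIN` name are hypothetical, and the function only prints the nft commands rather than running them.

```shell
#!/bin/sh
# Print an nft script that empties a chain and then deletes it.
# Deleting a chain that still holds rules is the case described above as EBUSY.
# The chain name used in the usage line is illustrative, not one netavark creates.
emit_chain_teardown() {
    family=$1
    table=$2
    chain=$3
    printf 'flush chain %s %s %s\n' "$family" "$table" "$chain"
    printf 'delete chain %s %s %s\n' "$family" "$table" "$chain"
}

# Usage (needs root; piping both commands into a single `nft -f -` call
# submits them together as one batch):
#   emit_chain_teardown inet netavark EXAMPLE_CHAIN | nft -f -
```

If a rule were added between the flush and the delete, or if the flush missed a rule, the delete would be the step that reports the error.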

mheon · 2024-07-26T21:13:39Z

There is also the potential of a race against something else adding rules, though that something can't be Netavark because of locking.

Luap99 · 2024-07-29T07:47:46Z

Well it should be safe to assume that on a CI VM nothing besides netavark would mess with our nftables chains... So if locking works then we have some bug where rules are not deleted properly

karuboniru · 2024-08-23T13:46:31Z

Might be related to this: containers/netavark#1068

$ cat /etc/containers/containers.conf
[containers]
userns = "auto"

[network]
firewall_driver = "nftables"

$ sudo podman system reset -f 
...

$ sudo podman run -it --rm -p 10.0.1.10:10222:10222/udp -p 10.0.1.10:10222:10222/tcp alpine:latest sh
/ # // left it running

// In another Terminal
$ sudo podman network reload --all
internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

internal:0:0-0: Error: Could not process rule: No such file or directory

ERRO[0000] netavark: nftables error: nft did not return successfully while applying ruleset 
c652403742bc95392bfea0da8e7d37cff1d057c2b744fb22a393b39daf07498d

This reliably reproduces the issue on my system (Fedora). Running sudo nft delete table inet netavark before reloading works around the error.

As for "Unable to clean up network for container": I believe I have seen that log message when restarting containers affected by the firewall issue, but I don't know whether it is related.

Luap99 · 2024-08-23T16:58:15Z

> This can stabily reproduce the issue on my system (Fedora), doing sudo nft delete table inet netavark before reloading can workaround such error.

I don't think it is related to this issue. This one is about a specific CI flake, and the error is EBUSY, not ENOENT as in your case. If it only triggers with the specific port forwarding setup from containers/netavark#1068, then that is most likely the cause of your problem. I will take a look next week.
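To make the distinction concrete, here is a small sketch that maps the message text printed by nft to the errno name. The `classify_nft_error` helper is hypothetical; the two message strings are the ones quoted in this thread.

```shell
#!/bin/sh
# Map the error text in an nft error line to its errno name.
# "Device or resource busy" is EBUSY (the CI flake in this issue);
# "No such file or directory" is ENOENT (the reload and WSL reports).
classify_nft_error() {
    case $1 in
        *'Device or resource busy'*)   echo EBUSY ;;
        *'No such file or directory'*) echo ENOENT ;;
        *)                             echo UNKNOWN ;;
    esac
}

# Usage:
#   classify_nft_error 'internal:0:0-0: Error: Could not process rule: Device or resource busy'
```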

Luap99 · 2024-11-01T16:00:18Z

I am going to close this one, as I don't think we have seen it since then. We can reopen if we see it again.

siegy22 · 2024-11-10T22:10:31Z

I am running into this issue on a fresh Fedora 41 installation (on WSL).

Fedora 41
podman-5.2.5-1.fc41.x86_64
netavark-1.13.0-1.fc41.x86_64

root@Wintrash:~# podman run --rm -it docker.io/library/busybox:latest
Trying to pull docker.io/library/busybox:latest...
Getting image source signatures
Copying blob a46fbb00284b done   |
Copying config 27a71e19c9 done   |
Writing manifest to image destination

internal:0:0-0: Error: Could not process rule: No such file or directory
internal:0:0-0: Error: Could not process rule: No such file or directory
Error: netavark: nftables error: nft did not return successfully while applying ruleset

Any ideas what to look for? It's not a race condition in my case, I can't run any container using podman.

Luap99 · 2024-11-11T09:54:12Z

File a new issue on netavark, try running the command with strace -f -s 300 and provide the output there. That should hopefully tell us where it fails.

edsantiago added the "flakes" (Flakes from Continuous Integration) label on Jul 25, 2024

Luap99 added the "network" (Networking related issue or feature) label on Jul 25, 2024

Luap99 closed this as not planned (won't fix, can't repro, duplicate, stale) on Nov 1, 2024

jorgeml mentioned this issue on Nov 19, 2024: "No route to host" after restarting container containers/netavark#1129 (closed)
