CI: system tests: make random_free_port() parallel-safe #23488
Comments
It's a Bats bug: parallelization doesn't work the way I expected it to (not a bug) and there's no documentation about it nor any way to get a job slot number (yes a bug). Filed bats-core/bats-core#968 |
New mechanism submitted: 7c3d294. Tested on my laptop, because CI is down. No failures seen yet. |
I am somewhat sure this fixes #23471 as well. |
It absolutely does not. I see pasta timeouts on my laptop even with this port reservation approach. |
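For readers following along, the general idea of a parallel-safe port reservation can be sketched roughly as below. This is not the code from 7c3d294, which is not shown in this thread. It is a minimal illustration that assumes an atomic `mkdir` in a shared lock directory is an acceptable way to claim a port, and that a loopback TCP probe is a good enough "is it free" check. The names `PORT_LOCK_DIR`, `port_is_free`, `reserve_free_port`, and `release_port` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: claim a port atomically so that parallel test jobs
# never hand out the same port twice. All names here are illustrative.

# Shared directory holding one subdirectory per reserved port (assumption).
PORT_LOCK_DIR="${PORT_LOCK_DIR:-${TMPDIR:-/tmp}/port-locks.$UID}"

# Succeeds if nothing accepts TCP connections on 127.0.0.1:$1.
port_is_free() {
    local port=$1
    # /dev/tcp is a bash redirection feature; a successful connect
    # means something is already listening on that port.
    ! (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null
}

# Prints a reserved port from the given LO-HI range, or fails.
reserve_free_port() {
    local range=${1:-5000-5999}
    local port
    mkdir -p "$PORT_LOCK_DIR"
    for port in $(shuf -i "$range"); do
        # mkdir is atomic: only one process can create this directory,
        # so only one job can win the reservation for a given port.
        if mkdir "$PORT_LOCK_DIR/$port" 2>/dev/null; then
            if port_is_free "$port"; then
                echo "$port"
                return 0
            fi
            # Port is in use by something else: give the lock back.
            rmdir "$PORT_LOCK_DIR/$port"
        fi
    done
    echo "no free port in range $range" >&2
    return 1
}

# Drops the reservation once the test is done with the port.
release_port() {
    rmdir "$PORT_LOCK_DIR/$1" 2>/dev/null
}
```

A test would call `port=$(reserve_free_port 5000-5999)` in setup and `release_port "$port"` in teardown. As the comment above notes, a reservation like this only prevents two jobs from choosing the same port; it does nothing about timeouts that have another cause.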
Before this it failed basically every time for me; now it no longer fails after 10+ runs locally. All I see from the current PR is
which seems to cause exit code 1 even though bats prints |
Are you on f40 and have you dnf-upgraded? This [cp,slashes] is fixed in bats-11 |
Took a little longer than I expected (i.e. more than two runs), but:
$ while :;do ./bats -T --rootless --tag='ci:parallel' 505 || break;done
...
✗ |505| TCP port range forwarding, IPv4, tap [131778]
tags: ci:parallel
(from function `bail-now' in file test/system/helpers.bash, line 189,
from function `die' in file test/system/helpers.bash, line 937,
from function `run_podman' in file test/system/helpers.bash, line 539,
from function `pasta_test_do' in file test/system/505-networking-pasta.bats, line 235,
in test file test/system/505-networking-pasta.bats, line 483)
`pasta_test_do' failed
[06:08:58.901317437] $ /home/esm/src/atomic/2018-02.podman/libpod/bin/podman info --format {{.Host.Pasta.Executable}}
[06:09:00.064937823] /usr/bin/pasta
[06:09:00.318915429] $ /home/esm/src/atomic/2018-02.podman/libpod/bin/podman run --rm --name=c-socat-t21-6gost7x6 --net=pasta -p [192.168.101.31]:5730-5732:5730-5732/tcp quay.io/libpod/testimage:20240123 sh -c for port in $(seq 5730 5732); do socat -u TCP4-LISTEN:${port},bind=[192.168.101.31] STDOUT & done; wait
[06:11:10.344876219] timeout: sending signal TERM to command ‘/home/esm/src/atomic/2018-02.podman/libpod/bin/podman’
timeout: sending signal KILL to command ‘/home/esm/src/atomic/2018-02.podman/libpod/bin/podman’
[06:11:10.348823337] [ rc=137 (** EXPECTED 0 **) ]
#/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#| FAIL: exit code is 137; expected 0
#\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# [teardown] |
I am on f39 and there is no bats update there. Anyhow, I just renamed the test for now and have been running this in a loop for 15 minutes without failure. Before that it failed on almost every run, so I am very certain that the port conflict caused the hangs in the udp case due to the incorrect behavior of REUSEADDR. #23471 (comment) I now also got the TCP hang after 15 minutes, but I think this is different from the udp hangs listed in #23471. |
Oh, sorry, I've just been treating all timeouts as the same bug, not sorting by tcp/udp. The slash bug is harmless; you can disable the
|
I thought I had, but guess not. This is a placeholder issue until I get it right.