Fix new github actions e2e failure #890

Merged: 2 commits, Jun 13, 2024

@thockin (Member) commented Jun 13, 2024

Tests passed 4 days ago, and nothing has been committed since then. They now fail with:

testcase webhook_fail_retry: FAIL

docker: Error response from daemon: invalid config for network bridge: invalid endpoint settings:
user specified IP address is supported on user defined networks only.
See 'docker run --help'.

We have exactly one testcase that needs this (docker run --ip). It used to work and now it doesn't. It looks like maybe --ip NEVER worked but just didn't error before.

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin

@k8s-ci-robot k8s-ci-robot requested review from nan-yu and stp-ip June 13, 2024 05:00
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 13, 2024
@thockin thockin changed the title trigger a testrun Fix new github actions e2e failure Jun 13, 2024
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 13, 2024
GitHub Actions fails with an error about "--ip can only be used on
user-defined subnets".  It looks like `--ip` never worked properly, but
wasn't a hard error before.

This is a simpler alternative to
11f4752 (included below), which tried
using docker networks.  That approach seems to work, but it is complicated
and can leak resources.  It needs more work.

Instead, this commit just swaps out the `nc` response script
on the fly, rather than restarting `nc` and trying to get the same IP.
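
The merged change itself is not shown on this page, so the following is only a minimal sketch of the general idea, assuming a listener that runs a script for each connection. The names `RESPONSE_SCRIPT`, `set_response`, and `handle_request` are hypothetical and do not come from `test_e2e.sh`; `handle_request` stands in for whatever the real listener does per request.

```shell
#!/bin/sh
# Sketch only: RESPONSE_SCRIPT, set_response and handle_request are
# hypothetical names, not taken from test_e2e.sh.
set -eu

WORKDIR="$(mktemp -d)"
trap 'rm -rf "$WORKDIR"' EXIT
RESPONSE_SCRIPT="$WORKDIR/response.sh"

# Install a new response atomically: write a temp file, then rename it over
# the script path, so a request never sees a half-written script.
set_response() {
    tmp="$(mktemp "$WORKDIR/response.XXXXXX")"
    cat > "$tmp" <<EOF
#!/bin/sh
printf 'HTTP/1.1 %s\r\n\r\n' '$1'
EOF
    chmod +x "$tmp"
    mv -f "$tmp" "$RESPONSE_SCRIPT"
}

# Stand-in for the per-connection work of a long-running listener: it runs
# whatever script is currently installed at RESPONSE_SCRIPT.
handle_request() {
    sh "$RESPONSE_SCRIPT"
}

set_response "500 Internal Server Error"
handle_request    # first phase: the webhook target fails

set_response "200 OK"
handle_request    # second phase: same listener, same address, now succeeds
```

Because the listener is never restarted, its address stays the same across both phases, which is what removes the need for `--ip` in this sketch.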

```diff
commit 11f4752
Good "git" signature for [email protected] with ED25519 key SHA256:PfQ0rwNUgsu5aRmerT0vkihWn/S3MXY3uoCPUiMdPrg
Author: Tim Hockin <[email protected]>
Date:   Wed Jun 12 20:12:54 2024 -0700

    debug test fail

    github actions fails with an error about "--ip can only be used on
    user-defined subnets"

diff --git a/test_e2e.sh b/test_e2e.sh
index d6ad730..b10e895 100755
--- a/test_e2e.sh
+++ b/test_e2e.sh
@@ -117,7 +117,7 @@ function assert_file_lines_ge() {

 function assert_metric_eq() {
     local val
-    val="$(curl --silent "http://localhost:$HTTP_PORT/metrics" \
+    val="$(curl --silent "http://$GITSYNC_IP:$HTTP_PORT/metrics" \
         | grep "^$1 " \
         | awk '{print $NF}')"
     if [[ "${val}" == "$2" ]]; then
@@ -138,6 +138,9 @@ function assert_fail() {
     )
 }

+DOCKER_SUBNET="192.168.0.0/24"
+GITSYNC_IP="192.168.0.254"
+
 # Helper: run a docker container.
 function docker_run() {
     RM="--rm"
@@ -148,6 +151,7 @@ function docker_run() {
         -d \
         ${RM} \
         --label git-sync-e2e="$RUNID" \
+        --network "e2e_$RUNID" \
         "$@"
     sleep 2 # wait for it to come up
 }
@@ -158,7 +162,8 @@ function docker_ip() {
         echo "usage: $0 <id>"
         return 1
     fi
-    docker inspect "$1" | jq -r .[0].NetworkSettings.IPAddress
+    docker inspect "$1" \
+        | jq -r ".[0].NetworkSettings.Networks.e2e_$RUNID.IPAddress"
 }

 function docker_kill() {
@@ -278,7 +283,8 @@ function GIT_SYNC() {
         -i \
         ${RM} \
         --label git-sync-e2e="$RUNID" \
-        --network="host" \
+        --network "e2e_$RUNID" \
+        --ip "$GITSYNC_IP" \
         -u git-sync:$(id -g) `# rely on GID, triggering "dubious ownership"` \
         -v "$ROOT":"$ROOT":rw \
         -v "$REPO":"$REPO":ro \
@@ -308,6 +314,9 @@ function remove_containers() {
         | while read CTR; do
             docker kill "$CTR" >/dev/null
         done
+    docker network prune -f \
+        --filter label=git-sync-e2e \
+        >/dev/null
 }

 #
@@ -2515,7 +2524,7 @@ function e2e::expose_http() {
     # do nothing, just wait for the HTTP to come up
     for i in $(seq 1 5); do
         sleep 1
-        if curl --silent --output /dev/null http://localhost:$HTTP_PORT; then
+        if curl --silent --output /dev/null "http://$GITSYNC_IP:$HTTP_PORT"; then
             break
         fi
         if [[ "$i" == 5 ]]; then
@@ -2524,23 +2533,23 @@ function e2e::expose_http() {
     done

     # check that health endpoint fails
-    if [[ $(curl --write-out %{http_code} --silent --output /dev/null http://localhost:$HTTP_PORT) -ne 503 ]] ; then
-        fail "health endpoint should have failed: $(curl --write-out %{http_code} --silent --output /dev/null http://localhost:$HTTP_PORT)"
+    if [[ $(curl --write-out %{http_code} --silent --output /dev/null "http://$GITSYNC_IP:$HTTP_PORT") -ne 503 ]] ; then
+        fail "health endpoint should have failed: $(curl --write-out %{http_code} --silent --output /dev/null http://$GITSYNC_IP:$HTTP_PORT)"
     fi
     wait_for_sync "${MAXWAIT}"

     # check that health endpoint is alive
-    if [[ $(curl --write-out %{http_code} --silent --output /dev/null http://localhost:$HTTP_PORT) -ne 200 ]] ; then
+    if [[ $(curl --write-out %{http_code} --silent --output /dev/null "http://$GITSYNC_IP:$HTTP_PORT") -ne 200 ]] ; then
         fail "health endpoint failed"
     fi

     # check that the metrics endpoint exists
-    if [[ $(curl --write-out %{http_code} --silent --output /dev/null http://localhost:$HTTP_PORT/metrics) -ne 200 ]] ; then
+    if [[ $(curl --write-out %{http_code} --silent --output /dev/null "http://$GITSYNC_IP:$HTTP_PORT/metrics") -ne 200 ]] ; then
         fail "metrics endpoint failed"
     fi

     # check that the pprof endpoint exists
-    if [[ $(curl --write-out %{http_code} --silent --output /dev/null http://localhost:$HTTP_PORT/debug/pprof/) -ne 200 ]] ; then
+    if [[ $(curl --write-out %{http_code} --silent --output /dev/null "http://$GITSYNC_IP:$HTTP_PORT/debug/pprof/") -ne 200 ]] ; then
         fail "pprof endpoint failed"
     fi
 }
@@ -2568,7 +2577,7 @@ function e2e::expose_http_after_restart() {
     # do nothing, just wait for the HTTP to come up
     for i in $(seq 1 5); do
         sleep 1
-        if curl --silent --output /dev/null http://localhost:$HTTP_PORT; then
+        if curl --silent --output /dev/null "http://$GITSYNC_IP:$HTTP_PORT"; then
             break
         fi
         if [[ "$i" == 5 ]]; then
@@ -2579,7 +2588,7 @@ function e2e::expose_http_after_restart() {
     sleep 2 # wait for first loop to confirm synced

     # check that health endpoint is alive
-    if [[ $(curl --write-out %{http_code} --silent --output /dev/null http://localhost:$HTTP_PORT) -ne 200 ]] ; then
+    if [[ $(curl --write-out %{http_code} --silent --output /dev/null "http://$GITSYNC_IP:$HTTP_PORT") -ne 200 ]] ; then
         fail "health endpoint failed"
     fi
     assert_link_exists "$ROOT/link"
@@ -3503,6 +3512,12 @@ function run_test() {
         set -o errexit
         set -o nounset
         set -o pipefail
+        docker network prune -f \
+            --filter label=git-sync-e2e \
+            >/dev/null
+        docker network create "e2e_$RUNID" \
+            --subnet "$DOCKER_SUBNET" \
+            --label git-sync-e2e="$RUNID"
         "$@"
     )
     eval "$retvar=$?"
```
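
The network-based approach in the diff above can be reduced to a few commands. The sketch below is illustrative only: `NET`, `CTR`, `IMAGE`, and the `run` wrapper are hypothetical, and `DRY_RUN=1` (the default here) prints the commands instead of executing them, so it runs without a Docker daemon. Cleanup uses an `EXIT` trap rather than the diff's `docker network prune`, as one possible way to reduce leaked networks.

```shell
#!/bin/sh
# Sketch only. NET, CTR, IMAGE and run() are hypothetical; the subnet and IP
# values mirror the diff. DRY_RUN=1 prints commands instead of running them.
set -eu

DRY_RUN="${DRY_RUN:-1}"
NET="e2e_demo"
CTR="e2e_demo_ctr"
SUBNET="192.168.0.0/24"
FIXED_IP="192.168.0.254"
IMAGE="alpine"    # placeholder image

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove the container and the network on exit so a failed run leaks neither.
cleanup() {
    run docker rm -f "$CTR" >/dev/null 2>&1 || true
    run docker network rm "$NET" >/dev/null 2>&1 || true
}
trap cleanup EXIT

# --ip is rejected on the default bridge network, so create a user-defined
# network with an explicit subnet first.
run docker network create --subnet "$SUBNET" --label git-sync-e2e=demo "$NET"

# A container attached to that network may request a fixed address.
run docker run -d --rm --name "$CTR" --network "$NET" --ip "$FIXED_IP" \
    "$IMAGE" sleep 60
```

Setting `DRY_RUN=0` would execute the same commands against a real daemon.
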
@thockin thockin merged commit aa0f015 into kubernetes:master Jun 13, 2024
4 of 5 checks passed