testing: Fix VerifyKubernetesImages on Kubernetes versions before v1.24 with docker cruntime #17647

medyagh merged 3 commits into kubernetes:master on Nov 28, 2023

Conversation

prezha
Contributor

@prezha prezha commented Nov 18, 2023

fixes #17646

TestStartStop/old-k8s-version/VerifyKubernetesImages uses constants.OldestKubernetesVersion, which is currently v1.16.0. if docker is used as the container runtime, that implies dockershim, in which case, as described in the issue, crictl fails to list images:

FATA[0000] validate service connection: validate CRI v1 image API for endpoint "unix:///var/run/dockershim.sock": rpc error: code = Unimplemented desc = unknown service runtime.v1.ImageService

why?

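we're currently using crictl v1.28.0, and cri-tools switched to the CRI v1 API in v1.24.0, while dockershim in k8s v1.16 only serves the older v1alpha2 API (hence the "unknown service runtime.v1.ImageService" error above)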

on a side note, based on the compatibility matrix, we can expect issues from version mismatches:

It's recommended to use the same cri-tools and Kubernetes minor version, because new features added to the Container Runtime Interface (CRI) may not be fully supported if they diverge.
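To make the mismatch concrete, here is a minimal sketch that computes how far apart the cri-tools and Kubernetes minor versions are. The helpers `minorOf` and `minorSkew` are hypothetical names for illustration, not minikube functions, and the parsing assumes plain `vMAJOR.MINOR.PATCH` strings with the same major version.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minorOf extracts the minor number from a version such as "v1.16.0".
// Hypothetical helper for this sketch; it ignores the major version.
func minorOf(version string) (int, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) < 2 {
		return 0, fmt.Errorf("unexpected version format: %q", version)
	}
	return strconv.Atoi(parts[1])
}

// minorSkew returns the absolute difference between the minor versions
// of cri-tools and Kubernetes.
func minorSkew(criTools, kubernetes string) (int, error) {
	c, err := minorOf(criTools)
	if err != nil {
		return 0, err
	}
	k, err := minorOf(kubernetes)
	if err != nil {
		return 0, err
	}
	if c < k {
		return k - c, nil
	}
	return c - k, nil
}

func main() {
	// Versions taken from the description above.
	skew, err := minorSkew("v1.28.0", "v1.16.0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("cri-tools and Kubernetes differ by %d minor versions\n", skew)
}
```

With the versions mentioned above (crictl v1.28.0 against Kubernetes v1.16.0) the skew is 12 minor versions, far from the "same minor version" recommendation.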

this pr should fix the tests for (really) old k8s versions, but we should also consider bumping the oldest supported version from the current v1.16 (i think we aim to support the last six versions?), or perhaps hold on until v1.29 and then start removing everything docker-related by supporting only k8s v1.24+

after

local test run:

$ make integration -e TEST_ARGS="-minikube-start-args='--driver=kvm2 --container-runtime=docker --alsologtostderr -v=7' -test.run TestStartStop/group/old-k8s-version --cleanup=false"
go test -ldflags="-X k8s.io/minikube/pkg/version.version=v1.32.0 -X k8s.io/minikube/pkg/version.isoVersion=v1.32.1-1699648094-17581 -X k8s.io/minikube/pkg/version.gitCommitID="8ac4c93b6f3318c0f631b68a0c8f5399f45a5807-dirty" -X k8s.io/minikube/pkg/version.storageProvisionerVersion=v5" -v -test.timeout=90m ./test/integration --tags="integration " -minikube-start-args='--driver=kvm2 --container-runtime=docker --alsologtostderr -v=7' -test.run TestStartStop/group/old-k8s-version --cleanup=false 2>&1 | tee "./out/testout_8ac4c93b6.txt"
Found 16 cores, limiting parallelism with --test.parallel=9
=== RUN   TestStartStop
=== PAUSE TestStartStop
=== CONT  TestStartStop
=== RUN   TestStartStop/group
=== RUN   TestStartStop/group/old-k8s-version
=== PAUSE TestStartStop/group/old-k8s-version
=== CONT  TestStartStop/group/old-k8s-version
=== RUN   TestStartStop/group/old-k8s-version/serial
=== RUN   TestStartStop/group/old-k8s-version/serial/FirstStart
    start_stop_delete_test.go:186: (dbg) Run:  out/minikube start -p old-k8s-version-470451 --memory=2200 --alsologtostderr --wait=true --kvm-network=default --kvm-qemu-uri=qemu:///system --disable-driver-mounts --keep-context=false --driver=kvm2 --container-runtime=docker --alsologtostderr -v=7 --kubernetes-version=v1.16.0
    start_stop_delete_test.go:186: (dbg) Done: out/minikube start -p old-k8s-version-470451 --memory=2200 --alsologtostderr --wait=true --kvm-network=default --kvm-qemu-uri=qemu:///system --disable-driver-mounts --keep-context=false --driver=kvm2 --container-runtime=docker --alsologtostderr -v=7 --kubernetes-version=v1.16.0: (5m33.768944361s)
=== RUN   TestStartStop/group/old-k8s-version/serial/DeployApp
    start_stop_delete_test.go:196: (dbg) Run:  kubectl --context old-k8s-version-470451 create -f testdata/busybox.yaml
    start_stop_delete_test.go:196: (dbg) TestStartStop/group/old-k8s-version/serial/DeployApp: waiting 8m0s for pods matching "integration-test=busybox" in namespace "default" ...
    helpers_test.go:344: "busybox" [55cac8da-e7f6-4ba0-a9bf-9bf6c6961e47] Pending
    helpers_test.go:344: "busybox" [55cac8da-e7f6-4ba0-a9bf-9bf6c6961e47] Pending / Ready:ContainersNotReady (containers with unready status: [busybox]) / ContainersReady:ContainersNotReady (containers with unready status: [busybox])
    helpers_test.go:344: "busybox" [55cac8da-e7f6-4ba0-a9bf-9bf6c6961e47] Running
    start_stop_delete_test.go:196: (dbg) TestStartStop/group/old-k8s-version/serial/DeployApp: integration-test=busybox healthy within 11.034185001s
    start_stop_delete_test.go:196: (dbg) Run:  kubectl --context old-k8s-version-470451 exec busybox -- /bin/sh -c "ulimit -n"
=== RUN   TestStartStop/group/old-k8s-version/serial/EnableAddonWhileActive
    start_stop_delete_test.go:205: (dbg) Run:  out/minikube addons enable metrics-server -p old-k8s-version-470451 --images=MetricsServer=registry.k8s.io/echoserver:1.4 --registries=MetricsServer=fake.domain
    start_stop_delete_test.go:215: (dbg) Run:  kubectl --context old-k8s-version-470451 describe deploy/metrics-server -n kube-system
=== RUN   TestStartStop/group/old-k8s-version/serial/Stop
    start_stop_delete_test.go:228: (dbg) Run:  out/minikube stop -p old-k8s-version-470451 --alsologtostderr -v=3
    start_stop_delete_test.go:228: (dbg) Done: out/minikube stop -p old-k8s-version-470451 --alsologtostderr -v=3: (12.169758301s)
=== RUN   TestStartStop/group/old-k8s-version/serial/EnableAddonAfterStop
    start_stop_delete_test.go:239: (dbg) Run:  out/minikube status --format={{.Host}} -p old-k8s-version-470451 -n old-k8s-version-470451
    start_stop_delete_test.go:239: (dbg) Non-zero exit: out/minikube status --format={{.Host}} -p old-k8s-version-470451 -n old-k8s-version-470451: exit status 7 (59.659779ms)

        -- stdout --
                Stopped

        -- /stdout --
    start_stop_delete_test.go:239: status error: exit status 7 (may be ok)
    start_stop_delete_test.go:246: (dbg) Run:  out/minikube addons enable dashboard -p old-k8s-version-470451 --images=MetricsScraper=registry.k8s.io/echoserver:1.4
=== RUN   TestStartStop/group/old-k8s-version/serial/SecondStart
    start_stop_delete_test.go:256: (dbg) Run:  out/minikube start -p old-k8s-version-470451 --memory=2200 --alsologtostderr --wait=true --kvm-network=default --kvm-qemu-uri=qemu:///system --disable-driver-mounts --keep-context=false --driver=kvm2 --container-runtime=docker --alsologtostderr -v=7 --kubernetes-version=v1.16.0
    start_stop_delete_test.go:256: (dbg) Done: out/minikube start -p old-k8s-version-470451 --memory=2200 --alsologtostderr --wait=true --kvm-network=default --kvm-qemu-uri=qemu:///system --disable-driver-mounts --keep-context=false --driver=kvm2 --container-runtime=docker --alsologtostderr -v=7 --kubernetes-version=v1.16.0: (7m27.39681397s)
    start_stop_delete_test.go:262: (dbg) Run:  out/minikube status --format={{.Host}} -p old-k8s-version-470451 -n old-k8s-version-470451
=== RUN   TestStartStop/group/old-k8s-version/serial/UserAppExistsAfterStop
    start_stop_delete_test.go:274: (dbg) TestStartStop/group/old-k8s-version/serial/UserAppExistsAfterStop: waiting 9m0s for pods matching "k8s-app=kubernetes-dashboard" in namespace "kubernetes-dashboard" ...
    helpers_test.go:344: "kubernetes-dashboard-84b68f675b-tmzt6" [e43db56b-9a8f-40af-bffa-6ad8a425a1a4] Running
    start_stop_delete_test.go:274: (dbg) TestStartStop/group/old-k8s-version/serial/UserAppExistsAfterStop: k8s-app=kubernetes-dashboard healthy within 5.015514078s
=== RUN   TestStartStop/group/old-k8s-version/serial/AddonExistsAfterStop
    start_stop_delete_test.go:287: (dbg) TestStartStop/group/old-k8s-version/serial/AddonExistsAfterStop: waiting 9m0s for pods matching "k8s-app=kubernetes-dashboard" in namespace "kubernetes-dashboard" ...
    helpers_test.go:344: "kubernetes-dashboard-84b68f675b-tmzt6" [e43db56b-9a8f-40af-bffa-6ad8a425a1a4] Running
    start_stop_delete_test.go:287: (dbg) TestStartStop/group/old-k8s-version/serial/AddonExistsAfterStop: k8s-app=kubernetes-dashboard healthy within 5.015978758s
    start_stop_delete_test.go:291: (dbg) Run:  kubectl --context old-k8s-version-470451 describe deploy/dashboard-metrics-scraper -n kubernetes-dashboard
=== RUN   TestStartStop/group/old-k8s-version/serial/VerifyKubernetesImages
    start_stop_delete_test.go:304: (dbg) Run:  out/minikube -p old-k8s-version-470451 kubectl --ssh -- "get nodes -o wide"
    start_stop_delete_test.go:304: (dbg) Run:  out/minikube -p old-k8s-version-470451 ssh sudo systemctl restart cri-docker.socket
    start_stop_delete_test.go:304: (dbg) Run:  out/minikube ssh -p old-k8s-version-470451 "sudo crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock images -o json"
    start_stop_delete_test.go:304: Found non-minikube image: gcr.io/k8s-minikube/busybox:1.28.4-glibc
=== RUN   TestStartStop/group/old-k8s-version/serial/Pause
    start_stop_delete_test.go:311: (dbg) Run:  out/minikube pause -p old-k8s-version-470451 --alsologtostderr -v=1
    start_stop_delete_test.go:311: (dbg) Run:  out/minikube status --format={{.APIServer}} -p old-k8s-version-470451 -n old-k8s-version-470451
    start_stop_delete_test.go:311: (dbg) Non-zero exit: out/minikube status --format={{.APIServer}} -p old-k8s-version-470451 -n old-k8s-version-470451: exit status 2 (326.366721ms)

        -- stdout --
                Paused

        -- /stdout --
    start_stop_delete_test.go:311: status error: exit status 2 (may be ok)
    start_stop_delete_test.go:311: (dbg) Run:  out/minikube status --format={{.Kubelet}} -p old-k8s-version-470451 -n old-k8s-version-470451
    start_stop_delete_test.go:311: (dbg) Non-zero exit: out/minikube status --format={{.Kubelet}} -p old-k8s-version-470451 -n old-k8s-version-470451: exit status 2 (290.241758ms)

        -- stdout --
                Stopped

        -- /stdout --
    start_stop_delete_test.go:311: status error: exit status 2 (may be ok)
    start_stop_delete_test.go:311: (dbg) Run:  out/minikube unpause -p old-k8s-version-470451 --alsologtostderr -v=1
    start_stop_delete_test.go:311: (dbg) Run:  out/minikube status --format={{.APIServer}} -p old-k8s-version-470451 -n old-k8s-version-470451
    start_stop_delete_test.go:311: (dbg) Run:  out/minikube status --format={{.Kubelet}} -p old-k8s-version-470451 -n old-k8s-version-470451
=== NAME  TestStartStop/group/old-k8s-version
    helpers_test.go:183: skipping cleanup of old-k8s-version-470451 (--cleanup=false)
--- PASS: TestStartStop (818.95s)
    --- PASS: TestStartStop/group (0.00s)
        --- PASS: TestStartStop/group/old-k8s-version (818.95s)
            --- PASS: TestStartStop/group/old-k8s-version/serial (818.95s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/FirstStart (333.77s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/DeployApp (11.35s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/EnableAddonWhileActive (0.69s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/Stop (12.17s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/EnableAddonAfterStop (0.18s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/SecondStart (447.67s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/UserAppExistsAfterStop (5.02s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/AddonExistsAfterStop (5.10s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/VerifyKubernetesImages (0.81s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/Pause (2.19s)
PASS
Tests completed in 13m38.949498407s (result code 0)
ok      k8s.io/minikube/test/integration        818.978s
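In the VerifyKubernetesImages step of the run above, the test restarts cri-docker.socket and points crictl at the cri-dockerd socket instead of the default endpoint. The log shows the commands but not the code that selects them, so the following is only a sketch of how such a selection might look. `crictlImagesCmd` is a hypothetical helper, and the condition (docker runtime, Kubernetes minor below 24) is an assumption based on the PR title.

```go
package main

import "fmt"

// criDockerdEndpoint is the endpoint that appears in the test log.
const criDockerdEndpoint = "unix:///var/run/cri-dockerd.sock"

// crictlImagesCmd returns the shell command used to list images.
// Hypothetical helper: for docker on Kubernetes minors before 24 it
// targets cri-dockerd explicitly, otherwise it uses crictl's default.
func crictlImagesCmd(k8sMinor int, runtime string) string {
	if runtime == "docker" && k8sMinor < 24 {
		return "sudo crictl --runtime-endpoint " + criDockerdEndpoint + " images -o json"
	}
	return "sudo crictl images -o json"
}

func main() {
	fmt.Println(crictlImagesCmd(16, "docker"))
	fmt.Println(crictlImagesCmd(28, "containerd"))
}
```

Keeping the choice in one small function would make the workaround easy to delete once versions before v1.24 are no longer tested.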

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 18, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: prezha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 18, 2023
@prezha
Contributor Author

prezha commented Nov 18, 2023

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Nov 18, 2023
@afbjorklund
Collaborator

The cri-tools and cni-plugins versions need to be modified, to match the kubernetes version.

This was a change made in 1.28 (to fork the packages per release), but also applies to older...

@spowelljr spowelljr changed the title from "handle kubernetes versions before v1.24 with docker as container runtime" to "testing: Fix VerifyKubernetesImages on Kubernetes versions before v1.24 with docker cruntime" Nov 21, 2023
@@ -352,7 +352,23 @@ func testPulledImages(ctx context.Context, t *testing.T, profile, version string
     t.Helper()
     defer PostMortemLogs(t, profile)
 
-    rr, err := Run(t, exec.CommandContext(ctx, Target(), "ssh", "-p", profile, "sudo crictl images -o json"))
+    cmd := "sudo crictl images -o json"
Member

@spowelljr spowelljr Nov 21, 2023

What do you think about using minikube image list --format=json instead? We're kind of hacking behind the scenes when this functionality is already built into minikube. The command will use the cruntime to get the image list and will prevent a lot of this complication.

Contributor Author

@spowelljr thanks for the review and comment!

my thinking was:

- i wanted to make a minimal change, in the hope that we'll deprecate dockershim at some point soon and can then just delete this "workaround" block (as part of a bigger refactor/pruning of related code)
- i'd also like to avoid testing one minikube functionality by using another (the "image list" subcommand in this case): if a test fails, it would be less obvious what specifically failed, potentially making debugging a bit harder
- then i thought about using docker images directly (docker being the cruntime in this case), but that would require output processing similar to what minikube image list already does, effectively duplicating it, which made little sense
- so, initially, i opted to just keep the crictl images that we use in this test...

but, at the end of the day, these are integration tests, so it wouldn't harm much i guess (as long as we keep in mind the above), so i amended the pr using minikube image list instead :)
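For readers unfamiliar with the approach, here is a rough sketch of consuming the JSON from `minikube image list --format=json` in a test. It is not the code from this PR. The `listedImage` struct and the `id`/`repoTags` field names are assumptions about the output shape, the `repoTags` helper is hypothetical, and the sample input is made up from image names that appear elsewhere in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listedImage holds the fields this sketch assumes are present in each
// entry of the `minikube image list --format=json` output (assumption).
type listedImage struct {
	ID       string   `json:"id"`
	RepoTags []string `json:"repoTags"`
}

// repoTags decodes a JSON array of images and returns all of their tags.
func repoTags(out []byte) ([]string, error) {
	var images []listedImage
	if err := json.Unmarshal(out, &images); err != nil {
		return nil, fmt.Errorf("decode image list: %w", err)
	}
	var tags []string
	for _, img := range images {
		tags = append(tags, img.RepoTags...)
	}
	return tags, nil
}

func main() {
	// Illustrative input only, not captured command output.
	sample := []byte(`[
		{"id": "aaa", "repoTags": ["registry.k8s.io/echoserver:1.4"]},
		{"id": "bbb", "repoTags": ["gcr.io/k8s-minikube/busybox:1.28.4-glibc"]}
	]`)
	tags, err := repoTags(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, tag := range tags {
		fmt.Println(tag)
	}
}
```

Because the subcommand goes through the configured cruntime, the test would not need to know which CRI socket or API version is in use, which is the simplification suggested in the review.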

Contributor Author

local test results:

$ make integration -e TEST_ARGS="-minikube-start-args='--driver=kvm2 --container-runtime=docker --alsologtostderr -v=7' -test.run TestStartStop/group/old-k8s-version --cleanup=false"
...
--- PASS: TestStartStop (638.84s)
    --- PASS: TestStartStop/group (0.00s)
        --- PASS: TestStartStop/group/old-k8s-version (638.84s)
            --- PASS: TestStartStop/group/old-k8s-version/serial (638.84s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/FirstStart (160.85s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/DeployApp (9.26s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/EnableAddonWhileActive (0.78s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/Stop (12.21s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/EnableAddonAfterStop (0.21s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/SecondStart (442.66s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/UserAppExistsAfterStop (5.02s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/AddonExistsAfterStop (5.08s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/VerifyKubernetesImages (0.25s)
                --- PASS: TestStartStop/group/old-k8s-version/serial/Pause (2.53s)
PASS
Tests completed in 10m38.840530637s (result code 0)
ok      k8s.io/minikube/test/integration        638.871s

Member

Not relying on another minikube command is a valid concern, and I'm fine with either solution. Once we bump the minimum Kubernetes version I'm fine reverting this PR as well. But if you're happy with it in its current state, I'm happy to merge it.

Contributor Author

sure - both would work
i think we can stick with minikube image list for now and switch back to crictl images once we bump the minimum supported k8s version to 1.24+
i've added a TODO comment there, so we don't forget - i think we're good to go with this one ;)

@minikube-pr-bot

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 17647) |
+----------------+----------+---------------------+
| minikube start | 50.3s    | 52.2s               |
| enable ingress | 25.9s    | 24.6s               |
+----------------+----------+---------------------+

Times for minikube start: 50.9s 48.2s 51.0s 51.4s 49.9s
Times for minikube (PR 17647) start: 51.0s 51.0s 54.3s 53.2s 51.4s

Times for minikube ingress: 23.1s 27.1s 24.6s 26.7s 28.2s
Times for minikube (PR 17647) ingress: 24.6s 24.1s 27.1s 23.7s 23.7s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 17647) |
+----------------+----------+---------------------+
| minikube start | 23.2s    | 23.1s               |
| enable ingress | 20.4s    | 20.1s               |
+----------------+----------+---------------------+

Times for minikube start: 24.6s 22.5s 22.3s 25.3s 21.1s
Times for minikube (PR 17647) start: 24.6s 22.4s 25.4s 22.1s 20.9s

Times for minikube ingress: 20.8s 20.8s 18.8s 20.8s 20.8s
Times for minikube (PR 17647) ingress: 18.3s 18.4s 20.9s 20.9s 21.9s

docker driver with containerd runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 17647) |
+----------------+----------+---------------------+
| minikube start | 22.9s    | 21.8s               |
| enable ingress | 31.3s    | 31.3s               |
+----------------+----------+---------------------+

Times for minikube start: 24.1s 23.7s 23.8s 20.5s 22.6s
Times for minikube (PR 17647) start: 20.8s 21.1s 23.1s 22.8s 21.1s

Times for minikube ingress: 31.3s 31.3s 31.3s 31.3s 31.3s
Times for minikube (PR 17647) ingress: 31.3s 30.3s 31.4s 32.3s 31.3s
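The bot does not say how the table values are derived, but they are consistent with the arithmetic mean of the five listed runs. A small sketch that reproduces the kvm2 `minikube start` row under that assumption (the `mean` helper is hypothetical):

```go
package main

import "fmt"

// mean returns the arithmetic mean of the samples, or 0 for an empty slice.
func mean(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sum := 0.0
	for _, s := range samples {
		sum += s
	}
	return sum / float64(len(samples))
}

func main() {
	// Start times in seconds, copied from the kvm2 driver section above.
	base := []float64{50.9, 48.2, 51.0, 51.4, 49.9}
	pr := []float64{51.0, 51.0, 54.3, 53.2, 51.4}

	fmt.Printf("minikube start: %.1fs\n", mean(base))
	fmt.Printf("minikube (PR 17647) start: %.1fs\n", mean(pr))
}
```

The means come out at about 50.3s and 52.2s, matching the table. With only five samples per side, a difference of this size is likely within run-to-run noise, which fits a PR that only changes test code.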

@minikube-pr-bot

These are the flake rates of all failed tests.

| Environment | Failed Tests | Flake Rate (%) |
|-------------|--------------|----------------|
| none_Linux | TestDownloadOnly/v1.16.0/binaries (gopogh) | 2.68 (chart) |
| none_Linux | TestDownloadOnly/v1.16.0/json-events (gopogh) | 2.68 (chart) |

To see the flake rates of all tests by environment, click here.

@medyagh medyagh merged commit 638da15 into kubernetes:master Nov 28, 2023
23 of 38 checks passed
@medyagh
Member

medyagh commented Nov 28, 2023

thanks @prezha
