nodejs extension: Failed to extend build image #42
Comments
From Ozzy: Google suggests that the error might come from (micro)dnf failing to complete the install. I do spot a couple of errors further up about groupadd not being known, so maybe the image needs shadow-utils. @mhdawson FYI: @BarDweller |
As the Tekton pipeline using the nodejs extension works locally on kind + Tekton on my MacBook, why does it fail when it is executed on RHTAP + ocp4? Is it because kind bootstraps a k8s cluster on Ubuntu? Remark: I executed the Tekton pipeline locally and the extension phase installs the following package without problems:
I don't see such error messages locally
but instead
|
The list of what is installed is here: https://github.com/paketo-community/ubi-nodejs-extension/blob/926ce866b8142996dda3cadaefa5c2233c3df852/generate.go#L19. It is not specifically installed, as it is not in that list. It may be a dependency of one of the packages that is installed. Is the list of "Installing: XXX" exactly the same in the two cases up until you see the failure? |
From a quick comparison, the lists are the same for the test executed on RHTAP and the one executed locally. |
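A comparison like this can also be scripted rather than eyeballed. The following is only a minimal sketch: it assumes each installed package appears in the log on a line containing `Installing: <name>` (the pattern quoted above), and the helper names `installed_packages` and `compare_install_lists` are hypothetical, not part of any tool used in this thread.

```python
import re

# Assumption: the installer logs one "Installing: <name>" entry per package.
INSTALLING = re.compile(r"Installing:\s*(\S+)")


def installed_packages(log_text: str) -> list[str]:
    """Return package names found on 'Installing: <name>' lines, in log order."""
    names = []
    for line in log_text.splitlines():
        match = INSTALLING.search(line)
        if match:
            names.append(match.group(1))
    return names


def compare_install_lists(local_log: str, remote_log: str) -> tuple[list[str], list[str]]:
    """Return (only_in_local, only_in_remote) as sorted lists of package names."""
    local = set(installed_packages(local_log))
    remote = set(installed_packages(remote_log))
    return sorted(local - remote), sorted(remote - local)
```

Two empty lists mean both environments attempted to install the same packages, which points to the environment rather than the package set.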
I will try to run the test case on ocp4 + tekton to see what we have as error and if this is related to runAsUser: 0 and runAsGroup: 0 |
After setting up an environment where I could reproduce the problem, and after discussions with @cmoulliard, this is my understanding of where this one stands:
Next step is that @cmoulliard needs to get some help from Ops Container engineers to figure out how to set up the environment properly. |
The problem can easily be reproduced without using Buildpacks, as mentioned by @mhdawson. Deploy the following PipelineRun on a minikube or kind k8s cluster and on ocp4, and you will see that the process works on the local k8s cluster
but fails on ocp4 cluster
Script used:
|
We can even reproduce the problem without using Tekton.
|
There is something that I don't understand on ocp4. We are getting the error even if: Log of the execution of the pod
|
I found an interesting error message if we install the same packages using
|
Why do we have to install the package |
at least for the prototype build/run images, we required |
As for "cpio: cap_set_file failed": this has to be environmental, given that we don't see the error in other environments (e.g. docker is OK, your local kind is OK, etc.). Googling for the error brings back stuff from 5+ years ago with Fedora, but the error is explained as basically the filesystem not supporting the operation being requested. Given we're now in 2023, I suspect this comes down to one of two possibilities: 1) whatever storage type or filesystem is being used in the container env doesn't support those operations (feels unlikely, but plausible, as I doubt this kind of operation is common), or 2) the pod needs additional permissions granted to allow this type of operation.
I did a quick google on "cpio: cap_set_file pod permissions" and came across containers/podman#5364, which suggests a particular permission to grant, but I suspect if we go that way we're going to play whack-a-mole with each perm. Then there is https://discuss.linuxcontainers.org/t/cpio-cap-set-file/472, which suggests using a privileged container (which wouldn't check this stuff anyway). Privileged might be a bit "too open" for the tastes of ocp/rhtap though, so it may be worth trying the individual perm.
Building apps in a pod where the pod is expected to install RPMs is always going to need decent perms, so I wonder where the middle ground sits between an app hosting cluster that has to be restricted for safety, and a dev cluster/build cluster that needs a little more freedom. |
I created an SCC to add more capabilities, but without success
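To check possibility 2 from inside the failing pod, one option is to look at the effective capability set of the process. This is only a sketch under an assumption: that the capability behind `cap_set_file` is `CAP_SETFCAP` (capability number 31, which capabilities(7) describes as the one needed to set file capabilities). The function names are hypothetical.

```python
# Assumption: cpio's cap_set_file call needs CAP_SETFCAP, which is
# capability number 31 in linux/capability.h.
CAP_SETFCAP = 31


def parse_cap_eff(status_text: str) -> int:
    """Return the effective capability mask from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, e.g. "00000000a80425fb".
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_capability(mask: int, cap_number: int) -> bool:
    """Return True if bit `cap_number` is set in the capability mask."""
    return bool((mask >> cap_number) & 1)


def can_set_file_caps(status_path: str = "/proc/self/status") -> bool:
    """Report whether the current process holds CAP_SETFCAP effectively."""
    with open(status_path) as status_file:
        return has_capability(parse_cap_eff(status_file.read()), CAP_SETFCAP)


if __name__ == "__main__":
    print("CAP_SETFCAP effective:", can_set_file_caps())
```

Running this in the local kind pod and in the ocp4 pod would show whether the two environments differ on that one bit. It says nothing about possibility 1 (filesystem support), which would need a separate check.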
|
Does buildpacks execute Looking at our buildah task in the OpenShift Pipelines catalog, it only asks for |
I think that it runs within the main container. Can you confirm, @BarDweller? |
I'd suspect it's within the main container. We're talking about the extender lifecycle binary that internally uses Kaniko to extend the container that's running using Dockerfiles, so unless Kaniko is doing something odd, it's likely in the main container. I think you worked on the prototype code for this, @cmoulliard? |
I ran my tests using the latest released lifecycle, which includes the extension support |
@adambkaplan I ran a new test using the SCC definition you proposed, and that fails too on ocp4.
|
I finally fixed the issue using the following config, which is part of the pod's container definition. REMARK: Passing an SCC to a pod using the ServiceAccount is not the way to go according to: https://issues.redhat.com/browse/OCPBUGS-19439
Question: Should we use this securityContext config for the extender running on k8s/ocp, or explore another approach (and if so, which one)? @adambkaplan @BarDweller @mhdawson As we can expect, that will fail on RHTAP as it is not supported there. |
I only glanced at this but I think actually you don't want to grant (I may be wrong in some details of this...ultimately basically user namespaces are the real solution to a lot of things like this though) |
Can you elaborate on what a user namespace is and how one could be created/managed on OpenShift and RHTAP? @cgwalters |
Issue
When the Tekton Buildpacks extension PipelineRun is executed on RHTAP, a nodejs build raises this error.