Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support cachi2 config passing in prefetch-dependencies #1169

Merged
merged 1 commit into from
Jul 22, 2024

Conversation

eskultety
Copy link
Contributor

Allow consumers to pass a cachi2 config to the tool in the prefetch task.

Before you complete this pull request ...

Look for any open pull requests in the repository with the title "e2e-tests update" and
see if there are recent e2e-tests updates that will be applicable to your change.

@chmeliik
Copy link
Contributor

The cachi2 config doesn't contain any options that would compromise SBOM accuracy, right?

Maybe the goproxy_url, since that would allow the user to make cachi2 download through a private proxy. But I think the purl spec doesn't account for that anyway :-/

@eskultety
Copy link
Contributor Author

The cachi2 config doesn't contain any options that would compromise SBOM accuracy, right?

Maybe the goproxy_url, since that would allow the user to make cachi2 download through a private proxy. But I think the purl spec doesn't account for that anyway :-/

From PURL spec you wouldn't really know. As for the options, I'd imagine the idea to be that options can only modify the behaviour without deliberately causing a regressive behaviour on the output, IOW I can imagine existence of options improving the quality of the SBOM, not decreasing it, especially as a side effect.

@chmeliik
Copy link
Contributor

The cachi2 config doesn't contain any options that would compromise SBOM accuracy, right?
Maybe the goproxy_url, since that would allow the user to make cachi2 download through a private proxy. But I think the purl spec doesn't account for that anyway :-/

From PURL spec you wouldn't really know. As for the options, I'd imagine the idea to be that options can only modify the behaviour without deliberately causing a regressive behaviour on the output, IOW I can imagine existence of options improving the quality of the SBOM, not decreasing it, especially as a side effect.

So what does all this mean for the goproxy_url option?

@eskultety
Copy link
Contributor Author

The cachi2 config doesn't contain any options that would compromise SBOM accuracy, right?
Maybe the goproxy_url, since that would allow the user to make cachi2 download through a private proxy. But I think the purl spec doesn't account for that anyway :-/

From PURL spec you wouldn't really know. As for the options, I'd imagine the idea to be that options can only modify the behaviour without deliberately causing a regressive behaviour on the output, IOW I can imagine existence of options improving the quality of the SBOM, not decreasing it, especially as a side effect.

So what does all this mean for the goproxy_url option?

What it probably means is that cachi2 will either have to abuse the repository_url/download_url qualifier in the PURL if goproxy_url has been set OR have to invent a custom qualifier, e.g. proxy= in the Golang PURL (slightly preferable). However, just like with any privately hosted artifacts, cachi2 is only ever going to be able to report the "actual" locations where a machinery will need to be in place to convert them to "apparent" locations so as not to disclose private unresolvable URLs.

Zooming out slightly, should the goproxy_url be considered a blocker for this PR?

@chmeliik
Copy link
Contributor

Zooming out slightly, should the goproxy_url be considered a blocker for this PR?

Not a blocker as long as it's tracked. Removing the option could be a good start 🙂

@eskultety
Copy link
Contributor Author

Zooming out slightly, should the goproxy_url be considered a blocker for this PR?

Not a blocker as long as it's tracked. Removing the option could be a good start 🙂

Okay, I can 'sed' it out for now.

@chmeliik
Copy link
Contributor

Zooming out slightly, should the goproxy_url be considered a blocker for this PR?

Not a blocker as long as it's tracked. Removing the option could be a good start 🙂

Okay, I can 'sed' it out for now.

I meant removing it from cachi2 until it's needed / until cachi2 can report it. But yeah, trying to avoid usage of it until it can be removed from cachi2 is probably a good idea.

@eskultety
Copy link
Contributor Author

eskultety commented Jul 17, 2024

Zooming out slightly, should the goproxy_url be considered a blocker for this PR?

Not a blocker as long as it's tracked. Removing the option could be a good start 🙂

Okay, I can 'sed' it out for now.

I meant removing it from cachi2 until it's needed / until cachi2 can report it. But yeah, trying to avoid usage of it until it can be removed from cachi2 is probably a good idea.

Oh, then I misunderstood. That would break backwards compatibility though, in general we don't know whether any option is in active use anywhere (like yeah, nobody is likely using this one, but still...), so disabling it in the tekton task seems like a more practical workaround that can be replaced with a proper solution at any time.

@chmeliik
Copy link
Contributor

chmeliik commented Jul 17, 2024

Oh, then I misunderstood. That would break backwards compatibility though, in general we don't know whether any option is in active use anywhere (like yeah, nobody is likely using this one, but still...), so disabling it in the tekton task seems like a more practical workaround that can be replaced with a proper solution at any time.

Yeah, that's fine. I'll just mention that the sed command will have to be pretty gnarly if we want it robust (one could e.g. use YAML multiline strings to try and avoid detection, pass the file formatted as JSON, ...). Maybe we just shouldn't bother with it and accept that it's an inherent shortcoming in the current version of cachi2

@eskultety
Copy link
Contributor Author

eskultety commented Jul 17, 2024

Oh, then I misunderstood. That would break backwards compatibility though, in general we don't know whether any option is in active use anywhere (like yeah, nobody is likely using this one, but still...), so disabling it in the tekton task seems like a more practical workaround that can be replaced with a proper solution at any time.

Yeah, that's fine. I'll just mention that the sed command will have to be pretty gnarly if we want it robust (one could e.g. use YAML multiline strings to try and avoid detection, pass the file formatted as JSON, ...).

Hmm, is yq pre-installed in the prefetch env? Because if it is and provided the file is formatted correctly within the format spec, how about this:

$ cat $config_file
requests_timeout: |
  1200
  foo 300

foo: >
  bar
  baz

$ yq --output-format props $config_file | sed '/requests_timeout/d' | yq --input-format props --output-format yaml > config.yaml

$ cat config.yaml
foo: |
  bar baz

That way we're relying on yq doing the heavy lifting and with props output format it seems we should always get 1 line per option. In case you want some extra handling of random garbage being passed then I guess we could prepend

yq -pa e $config_file

EDIT: yq can apparently manipulate the content directly, so not even sed is needed -> yq e del(.requests_timeout) $config

and see if yq can chew it and spit out something, not sure how far we want to take this though. Oh yeah, and this all just assumes we have more than just basic shell builtins and commands available.

@chmeliik
Copy link
Contributor

Hmm, is yq pre-installed in the prefetch env?

yq would make this very easy, yeah: yq < $config_file 'del(.goproxy_url)'

But it is not installed in the prefetch env (the cachi2 container)

- image: quay.io/redhat-appstudio/cachi2:0.8.0@sha256:5cf15d6f3fb151a3e12c8a17024062b7cc62b0c3e1b165e4a9fa5bf7a77bdc30

There is a konflux-ci/yq container though:

image: quay.io/konflux-ci/yq@sha256:974dea6375ee9df561ffd3baf994db2b61777a71f3bcf0050c5dca91ac9b3430

To make use of it, you would have to:

  • add a step that uses this image, deletes the option from the config file and writes it to a shared directory
    • first, define that shared directory, as is done for example here (the volumes section declares the volume, stepTemplate makes sure it gets mounted to every step):
      volumes:
      - name: trusted-ca
      configMap:
      items:
      - key: $(params.caTrustConfigMapKey)
      path: ca-bundle.crt
      name: $(params.caTrustConfigMapName)
      optional: true
      - name: workdir
      emptyDir: {}
      workspaces:
      - name: git-basic-auth
      description: |
      A Workspace containing a .gitconfig and .git-credentials file or username and password.
      These will be copied to the user's home before any cachi2 commands are run. Any
      other files in this Workspace are ignored. It is strongly recommended
      to bind a Secret to this Workspace over other volume types.
      optional: true
      - name: netrc
      description: |
      Workspace containing a .netrc file. Cachi2 will use the credentials in this file when
      performing http(s) requests.
      optional: true
      stepTemplate:
      volumeMounts:
      - mountPath: /var/workdir
      name: workdir
  • in the second step (the current prefetch-dependencies step), pass --config-file=<path in shared directory>

@eskultety eskultety force-pushed the cachi2-config-file branch from 7cba0d3 to 226b65b Compare July 18, 2024 11:51
@eskultety
Copy link
Contributor Author

@acmiel thanks for the guidance, I tried adjusting the task as advised.

@eskultety eskultety force-pushed the cachi2-config-file branch from 1d42045 to 81965b5 Compare July 18, 2024 11:53
@eskultety eskultety force-pushed the cachi2-config-file branch 2 times, most recently from ca4addd to e046b0c Compare July 19, 2024 11:00
@chmeliik
Copy link
Contributor

/ok-to-test

Some behaviour configuring options are (rightfully) not exposed via CLI
options, e.g. setting a timeout on HTTP requests which may be useful
for users to set on slower connections and large artifact downloads
where the default backend timeouts are simply not long enough.
Allow consumers to pass a configuration YAML file to cachi2 to tweak
supported behavioural traits.

Signed-off-by: Erik Skultety <[email protected]>
@eskultety eskultety force-pushed the cachi2-config-file branch from e046b0c to 17a20ee Compare July 19, 2024 12:54
Copy link
Contributor

@chmeliik chmeliik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@chmeliik chmeliik added this pull request to the merge queue Jul 19, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 19, 2024
@chmeliik chmeliik added this pull request to the merge queue Jul 22, 2024
Merged via the queue into konflux-ci:main with commit 85ef583 Jul 22, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants