Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3026: Add support for s3 data importer inheriting service account credentials #3433

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

russellcain
Copy link

What type of PR is this?

  • /kind feature

What this PR does / why we need it:

  • Allows for consistent use of credentials configured in the service account, for S3 data volume sources.

Which issue(s) this PR fixes:

Special notes for your reviewer:

  • Hi -- I'll add in some tests shortly, just wanted to get eyes on the core of the change to make sure I'm not barking up the wrong trunk. If we could start the conversation in the interim, I'd appreciate that.

Does this PR introduce a user-facing change?:

  • It does allow a new value in the DataVolumeSourceS3 configuration. The needed code-gen files are included in this pull request.

Release note:

Related to https://github.com/kubevirt/containerized-data-importer/issues/3429

This change stems from the desire for a spun-up S3 reader to utilize the aws credentials configured at the Service Account level. The belief is that this would promote consistency in access / minimize any duplication of aws credential keys in the configuration of this S3 CDI. For this PR, we just provide the user with the ability to specify the `ServiceAccountName` instead of an `AccessKey`/`SecretKey` combo. No core/default functionality will change for users which wish to stick with that key combo approach.  

At the importer level, this update utilizes AWS' IRSA credential chain approach. This is a stable approach that has go sdk approach. Additionally, this PR added the recommended `CredentialsChainVerboseErrors` to the aws session configuration. If reviewers worry that this will be too noisy, I am fine with it being removed.

Signed-off-by: _____

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Sep 12, 2024
@kubevirt-bot
Copy link
Contributor

Hi @russellcain. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mhenriks for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@russellcain
Copy link
Author

/hold

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 12, 2024
@russellcain russellcain changed the title 3026: Add support for s3 data importer inheriting service account credentials [WIP] 3026: Add support for s3 data importer inheriting service account credentials Sep 12, 2024
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 12, 2024
@russellcain russellcain force-pushed the 3026/IRSA-integration-for-s3-service-accounts branch from e1f84ca to 81b3c8e Compare September 13, 2024 16:53
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Sep 13, 2024
@russellcain
Copy link
Author

(recent force push to include signoff)
@aglitke and @ShellyKa13 thanks for taking the time to look this over! any and all pointers on how you'd like tests to be done are appreciated

@mhenriks
Copy link
Member

/test

@kubevirt-bot
Copy link
Contributor

@mhenriks: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

  • /test pull-containerized-data-importer-e2e-ceph
  • /test pull-containerized-data-importer-e2e-ceph-wffc
  • /test pull-containerized-data-importer-e2e-destructive
  • /test pull-containerized-data-importer-e2e-hpp-latest
  • /test pull-containerized-data-importer-e2e-hpp-previous
  • /test pull-containerized-data-importer-e2e-istio
  • /test pull-containerized-data-importer-e2e-nfs
  • /test pull-containerized-data-importer-e2e-upg
  • /test pull-containerized-data-importer-fossa
  • /test pull-containerized-data-importer-non-csi-hpp

The following commands are available to trigger optional jobs:

  • /test pull-cdi-apidocs
  • /test pull-cdi-generate-verify
  • /test pull-cdi-goveralls
  • /test pull-cdi-linter
  • /test pull-cdi-unit-test
  • /test pull-cdi-unit-test-s390x
  • /test pull-cdi-verify-go-mod

Use /test all to run all jobs.

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mhenriks
Copy link
Member

/test all

@kubevirt-bot kubevirt-bot added dco-signoff: no Indicates the PR's author has not DCO signed all their commits. and removed dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Sep 16, 2024
Signed-off-by: russellcain <[email protected]>
@russellcain russellcain force-pushed the 3026/IRSA-integration-for-s3-service-accounts branch from a82147e to a87f77d Compare September 17, 2024 16:16
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Sep 17, 2024
@russellcain
Copy link
Author

/test all

@kubevirt-bot
Copy link
Contributor

@russellcain: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@russellcain
Copy link
Author

@mhenriks would you mind kicking off the tests for me, please?

@mhenriks
Copy link
Member

/test all

@awels
Copy link
Member

awels commented Sep 19, 2024

/test pull-cdi-goveralls

@@ -255,6 +255,7 @@ func newDataSource(source string, contentType string, volumeMode v1.PersistentVo
ep, _ := util.ParseEnvVar(common.ImporterEndpoint, false)
acc, _ := util.ParseEnvVar(common.ImporterAccessKeyID, false)
sec, _ := util.ParseEnvVar(common.ImporterSecretKey, false)
serviceAccount, _ := util.ParseEnvVar(common.ImporterServiceAccountName, false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah rats! thanks for closing this gap in my understanding -- let me take a stab at that right now

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay so fingers crossed that I understood how this flows in 04d29c3

I dont need to add anything here, right?
https://github.com/russellcain/containerized-data-importer/blob/41ba02d03ce0205b38fdfc9cbae4a1dc17b3abff/tools/cdi-source-update-poller/main.go#L48
looks like more so this parses env vars (which the SA doesnt fall into) and args passed at invocation (also not applicable)?

should i be adding a check about the source here? and then set it to any value if the PodSpec has a SA set?

ds *importer.S3DataSource
err error
)
if serviceAccount != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the value of serviceAccount ever used in the aws API? Or does the aws lib assume that the pod is running as the privileged service account?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latter! hopefully i was too far off base, but the approach was that this would be used as a "flag" given the service account should have been properly set up by the time it is invoked here? (or fail on referencing an un-instantiated SA)

@kubevirt-bot kubevirt-bot added dco-signoff: no Indicates the PR's author has not DCO signed all their commits. and removed dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Sep 19, 2024
@russellcain russellcain force-pushed the 3026/IRSA-integration-for-s3-service-accounts branch from 04d29c3 to 4b97bf6 Compare September 19, 2024 21:17
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Sep 19, 2024
@russellcain
Copy link
Author

/retest

@kubevirt-bot
Copy link
Contributor

@russellcain: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@awels
Copy link
Member

awels commented Sep 24, 2024

/test all

@coveralls
Copy link

Coverage Status

coverage: 59.217% (-0.04%) from 59.26%
when pulling 4b97bf6 on russellcain:3026/IRSA-integration-for-s3-service-accounts
into d08e245 on kubevirt:main.

@@ -1263,6 +1264,10 @@ func makeImportEnv(podEnvVar *importPodEnvVar, uid types.UID) []corev1.EnvVar {
Name: common.CacheMode,
Value: podEnvVar.cacheMode,
},
{
Name: common.ImporterServiceAccountName,
Value: podEnvVar.serviceAccountName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple things

  1. if the worker pod does not need to know the SA name and this param is just a flag for setting sessionConfig.CredentialsChainVerboseErrors = aws.Bool(true) then why not make this a boolean value with key something like S3_CREDENTIALS_CHAIN_AUTH

  2. You have to set the service account name in the importer pod spec somewhere in here: https://github.com/russellcain/containerized-data-importer/blob/3026/IRSA-integration-for-s3-service-accounts/pkg/controller/import-controller.go#L884-L949

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi -- thanks for explaining this, i think i am almost on the same page so appreciate the patience on your end.

  • for 1, does this handle my concern posted here? I agree that, to the best of my knowledge, the SA name itself will not be meaningfully used by the worker pod and instead is just a flag to not set sessionConfig.Credentials. I guess i was hoping to make a smaller footprint / mimic other patterns where referencing/supplying a SA name in the pod spec communicates that the values of that account are consumed by the pod (since that is happening in this case, just not through an explicit invocation). If I'm talking myself into circles, please let me know, but does this make sense?

    • I also see the merit in an explicit boolean flag -- if what im explaining above comes across as anything hand-wavy, then I'd be alright with someone knowing what they are signing up for with the S3_CREDENTIALS_CHAIN_AUTH value being set. I guess I'd just be a bit worried that the hypothetical user would set this value without having set a SA? and the approach of specifying the SA name slightly safeguards against that? I might be over-worrying here.
  • for 2, happy to make this change, but wanted to confirm my understanding. are you saying we need the value set at the top level spec definition for how I'm attempting to consume it later on in cmd/cdi-importer/importer.go? that makes sense to me, but would still want to make sure i got eyes on https://github.com/kubevirt/containerized-data-importer/pull/3433/files#r1767598831

thanks again for taking your time to help me with this, @mhenriks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spec: corev1.PodSpec{
			Containers:        makeImporterContainerSpec(args),
			InitContainers:    makeImporterInitContainersSpec(args),
			Volumes:           makeImporterVolumeSpec(args),
			RestartPolicy:     corev1.RestartPolicyOnFailure,
			NodeSelector:      args.workloadNodePlacement.NodeSelector,
			Tolerations:       args.workloadNodePlacement.Tolerations,
			Affinity:          args.workloadNodePlacement.Affinity,
			PriorityClassName: args.priorityClassName,
			ImagePullSecrets:  args.imagePullSecrets,
-------------> ServiceAccountName: args.serviceAccountName
		},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some documentation here explaining what happens when the serviceAccountName is added to the pod, what the pod-identity-webhook is mutating in the pod spec

@russellcain
Copy link
Author

will wait back on a response to this conversation before i push up my changes. thanks again, @mhenriks

@russellcain
Copy link
Author

thanks for the added context, @chomatdam -- can you let me know if my changes in 69501ee seem to line up with the approach you described, please @mhenriks? thanks again for your guidance on this

@russellcain
Copy link
Author

@mhenriks hi - would you mind kicking off the tests for this / taking a quick look over 69501ee when you have a chance, please? thank you!

Copy link
Member

@mhenriks mhenriks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting there!

@@ -255,6 +255,7 @@ func newDataSource(source string, contentType string, volumeMode v1.PersistentVo
ep, _ := util.ParseEnvVar(common.ImporterEndpoint, false)
acc, _ := util.ParseEnvVar(common.ImporterAccessKeyID, false)
sec, _ := util.ParseEnvVar(common.ImporterSecretKey, false)
use_s3_credential_chain_auth, _ := strconv.ParseBool(os.Getenv(common.UseS3CredentialsChainAuth))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use_s3_credential_chain_auth is a little verbose compared to other variable names maybe just s3ChainAuth?

extraHeaders []string
secretExtraHeaders []string
cacheMode string
use_s3_credential_chain_auth bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about s3ChainAuth as variable name?

@@ -117,6 +118,7 @@ type importerPodArgs struct {
workloadNodePlacement *sdkapi.NodePlacement
vddkImageName *string
priorityClassName string
serviceAccountName string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this getting assigned?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mhenriks
Copy link
Member

mhenriks commented Oct 8, 2024

/test all

@russellcain
Copy link
Author

@mhenriks may I have another review / test run, please?

@russellcain
Copy link
Author

russellcain commented Dec 10, 2024

@mhenriks hi, i was hoping to get another test run in and another crack at it before the year is out, if possible, please -- thank you

@awels
Copy link
Member

awels commented Dec 10, 2024

/test all

@kubevirt-bot
Copy link
Contributor

@russellcain: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cdi-goveralls 0f92ea5 link false /test pull-cdi-goveralls
pull-cdi-unit-test-s390x 0f92ea5 link false /test pull-cdi-unit-test-s390x
pull-cdi-unit-test 0f92ea5 link false /test pull-cdi-unit-test
pull-cdi-linter 0f92ea5 link false /test pull-cdi-linter
pull-containerized-data-importer-e2e-nfs 0f92ea5 link true /test pull-containerized-data-importer-e2e-nfs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@russellcain russellcain changed the title [WIP] 3026: Add support for s3 data importer inheriting service account credentials 3026: Add support for s3 data importer inheriting service account credentials Dec 20, 2024
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 20, 2024
@russellcain
Copy link
Author

/unhold

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has DCO signed all their commits. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CDI importer requires static credentials for S3 and GCS (renewed)
6 participants