Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use serviceAccount for Ray from the propeller configuration #5821

Closed
wants to merge 1 commit into from

Conversation

peterghaddad
Copy link
Contributor

@peterghaddad peterghaddad commented Oct 7, 2024

Tracking issue

Closes #5820

Why are the changes needed?

Currently, the serviceAccount pulls from the securityContext of the k8s cluster that the FlytePropeller is deployed at.
Instead, have the option to configure the propeller to use a specific serviceAccountName on k8s when the configuration value is set to a value other than "default".

What changes were proposed in this pull request?

How was this patch tested?

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Signed-off-by: peterghaddad <[email protected]>
Copy link

codecov bot commented Oct 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 36.35%. Comparing base (b9900fe) to head (385c34e).
Report is 69 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5821   +/-   ##
=======================================
  Coverage   36.35%   36.35%           
=======================================
  Files        1304     1304           
  Lines      110147   110147           
=======================================
+ Hits        40042    40045    +3     
+ Misses      65938    65935    -3     
  Partials     4167     4167           
Flag Coverage Δ
unittests-datacatalog 51.37% <ø> (ø)
unittests-flyteadmin 55.60% <ø> (ø)
unittests-flytecopilot 12.17% <ø> (ø)
unittests-flytectl 62.26% <ø> (+0.04%) ⬆️
unittests-flyteidl 7.17% <ø> (ø)
unittests-flyteplugins 53.35% <100.00%> (ø)
unittests-flytepropeller 42.02% <ø> (ø)
unittests-flytestdlib 55.37% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@peterghaddad
Copy link
Contributor Author

In flyteplugins/go/tasks/plugins/k8s/ray/ray.go, I noticed that submitterPodTemplate := buildSubmitterPodTemplate(headPodSpec, objectMeta, taskCtx) uses the headPodSpec defined early in the source. See here. Even when the configuration is changed it doesn't propagate down when calling podSpec.ServiceAccountName = cfg.ServiceAccount

Should we be using the built out headGroupSpec instead of the one originally defined? See here.

@eapolinario any thoughts here? We also observed that modifying the serviceAccount through the podTemplate in the Flyte sdk does not propagate.

@eapolinario
Copy link
Contributor

Even when the configuration is changed it doesn't propagate down when calling podSpec.ServiceAccountName = cfg.ServiceAccount

The issue is that the call to GetServiceAccountNameFromTaskExecutionMetadata always gets the security context from task execution metadata, which has the run_as set.

So even though we set the service account in

podSpec.ServiceAccountName = cfg.ServiceAccount
this ends up being overridden in the call to constructRayJob.

As you're proposing in this PR, the crux of the change should be around

serviceAccountName := flytek8s.GetServiceAccountNameFromTaskExecutionMetadata(taskCtx.TaskExecutionMetadata())
if len(serviceAccountName) == 0 {
serviceAccountName = cfg.ServiceAccount
}

What if we used only used the service account name from the config if that set at all (i.e. instead of relying on setting its default value to "default" make that "") and if that's not set then we call serviceAccountName = flytek8s.GetServiceAccountNameFromTaskExecutionMetadata(taskCtx.TaskExecutionMetadata()).

@peterghaddad , wdyt?

Also, cc: @pingsutw

@peterghaddad
Copy link
Contributor Author

Even when the configuration is changed it doesn't propagate down when calling podSpec.ServiceAccountName = cfg.ServiceAccount

The issue is that the call to GetServiceAccountNameFromTaskExecutionMetadata always gets the security context from task execution metadata, which has the run_as set.

So even though we set the service account in

podSpec.ServiceAccountName = cfg.ServiceAccount

this ends up being overridden in the call to constructRayJob.
As you're proposing in this PR, the crux of the change should be around

serviceAccountName := flytek8s.GetServiceAccountNameFromTaskExecutionMetadata(taskCtx.TaskExecutionMetadata())
if len(serviceAccountName) == 0 {
serviceAccountName = cfg.ServiceAccount
}

What if we used only used the service account name from the config if that set at all (i.e. instead of relying on setting its default value to "default" make that "") and if that's not set then we call serviceAccountName = flytek8s.GetServiceAccountNameFromTaskExecutionMetadata(taskCtx.TaskExecutionMetadata()).
@peterghaddad , wdyt?

Also, cc: @pingsutw

I like this approach. I'll submit a PR for this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Add serviceAccountName override for Ray
2 participants