feat(backend): Add Semaphore and Mutex fields to Workflow CR #11370

DharmitD · 2024-11-12T14:36:24Z

Resolves #6553

Description of your changes:
This PR introduces support for Pipeline-level Semaphores and Mutexes in the KFP backend.

Changes Introduced:

Added the ability to specify a semaphore for pipelines, which controls the number of concurrent instances of a pipeline that can run. The semaphore is configured via a fixed ConfigMap named semaphore-config. The semaphore key is provided through the pipeline configuration.
Added mutex support for pipelines, ensuring that only one instance of the pipeline can run at a time if the specified mutex is defined. Mutex names are defined per pipeline, and each pipeline instance respects the specified mutex.
The Workflow CR now includes a Synchronization field, where semaphore and mutex are appropriately set.
If a pipeline has a semaphore, the backend maps the semaphore to the semaphore-config ConfigMap using the key provided by the user. Mutexes are represented by their name, ensuring mutual exclusion.

This PR should be merged only after #11340 gets merged.
Testing instructions

Build the API Server image and push to an image registry
Upload main.yaml file from here
Check in KFP UI Pipeline Spec tab if the following snippet is present:

platforms:
  kubernetes:
    pipelineConfig:
      mutexName: mutex
      semaphoreKey: semaphore

After the pipeline run is initiated, use the following command to verify that the Workflow CR has the appropriate synchronization settings:

oc get workflow -o yaml $(oc get workflow --no-headers | awk '{print $1}') | yq .spec.synchronization

The expected output should include the semaphore and mutex references:

synchronization:
    mutex:
      name: mutex
    semaphore:
      configMapKeyRef:
        key: semaphore
        name: semaphore-config
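
The mapping from the two pipelineConfig values to that output can be sketched in Go roughly as follows. This is only an illustration: the struct types are local stand-ins that mirror the YAML shape above rather than the Argo Workflows API types, and buildSynchronization is a hypothetical helper name. Only the ConfigMap name semaphore-config is taken from this PR.

```go
package main

import "fmt"

// Local stand-ins mirroring the YAML shape shown above.
// They are NOT the Argo Workflows API types.
type ConfigMapKeyRef struct {
	Name string
	Key  string
}

type SemaphoreRef struct {
	ConfigMapKeyRef *ConfigMapKeyRef
}

type Mutex struct {
	Name string
}

type Synchronization struct {
	Semaphore *SemaphoreRef
	Mutex     *Mutex
}

// semaphoreConfigMapName is the fixed ConfigMap name described in this PR.
const semaphoreConfigMapName = "semaphore-config"

// buildSynchronization (hypothetical helper) returns nil when neither value
// is set, and otherwise fills in only the locks that were requested.
func buildSynchronization(semaphoreKey, mutexName string) *Synchronization {
	if semaphoreKey == "" && mutexName == "" {
		return nil
	}
	out := &Synchronization{}
	if semaphoreKey != "" {
		out.Semaphore = &SemaphoreRef{
			ConfigMapKeyRef: &ConfigMapKeyRef{
				Name: semaphoreConfigMapName,
				Key:  semaphoreKey,
			},
		}
	}
	if mutexName != "" {
		out.Mutex = &Mutex{Name: mutexName}
	}
	return out
}

func main() {
	s := buildSynchronization("semaphore", "mutex")
	fmt.Println(s.Semaphore.ConfigMapKeyRef.Name, s.Semaphore.ConfigMapKeyRef.Key, s.Mutex.Name)
}
```

Returning nil when both inputs are empty is a design choice of this sketch, so that pipelines without either setting leave the synchronization field unset.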

Checklist:

You have signed off your commits
The title for your pull request (PR) should follow our title convention. Learn more about the pull request title convention used in this repository.

google-oss-prow · 2024-11-12T14:36:34Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign chensun for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

backend/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gregsheremeta · 2024-11-13T12:19:21Z

add fixes #6553 to the PR description

gregsheremeta · 2024-11-13T12:20:38Z

The semaphore is configured via a fixed ConfigMap named semaphore-config

We should edit the kubeflow manifest to deploy a skeleton of this configmap. You can do that in here or in a follow-up PR.

gregsheremeta · 2024-11-13T12:22:14Z

The Workflow CR now includes a Synchronization field

I would probably delete this line (and maybe edit the PR title), because that reads like things you enhanced on Workflow itself. We're just setting fields on it...

gregsheremeta · 2024-11-13T12:25:09Z

platforms:
  kubernetes:
    pipelineConfig:
      mutexName: mutex
      semaphoreKey: semaphore

The expected output should include the semaphore and mutex references:

What does Argo Workflows do when both are set?

A better verification would be to do two separate test pipelines -- one where you use mutex, and one where you use semaphore. And then in addition to verifying the Workflow yaml, also verify that multiple runs are being locked like they should be.

backend/src/apiserver/template/v2_template.go

gregsheremeta · 2024-11-13T12:28:55Z

backend/src/apiserver/template/v2_template.go

@@ -77,9 +77,20 @@ func (t *V2Spec) ScheduledWorkflow(modelJob *model.Job) (*scheduledworkflow.Sche
 		}
 	}

+	var pipeline_options argocompiler.Options
+	for _, platform := range t.platformSpec.Platforms {
+		if platform.PipelineConfig.SemaphoreKey != "" || platform.PipelineConfig.MutexName != "" {


We should specifically check for the "kubernetes" platform here. There could be others, and we would ignore all others

Check that if you put some other non-kubernetes platform in there that this line doesn't cause a panic. I don't think it will, but just check :)

Done. It now iterates over the platformSpec.Platforms map and adds a check to ensure that only the "kubernetes" platform is processed. It also includes the nil checks for platform and platform.PipelineConfig to prevent nil pointer dereference errors.
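
A minimal sketch of that kind of guarded lookup, for illustration only. PlatformSpec and PipelineConfig here are hypothetical stand-ins whose field names follow the diff above, and kubernetesPipelineConfig is an assumed helper name, not a function from this PR. Because the platforms are keyed by name, the sketch uses a direct map lookup rather than a loop.

```go
package main

import "fmt"

// Hypothetical stand-ins for the platform spec types;
// field names follow the diff above.
type PipelineConfig struct {
	SemaphoreKey string
	MutexName    string
}

type PlatformSpec struct {
	PipelineConfig *PipelineConfig
}

// kubernetesPipelineConfig returns the pipeline config of the "kubernetes"
// platform, or nil if that platform or its config is absent. All other
// platforms are ignored, and nil entries cannot cause a panic.
func kubernetesPipelineConfig(platforms map[string]*PlatformSpec) *PipelineConfig {
	platform, ok := platforms["kubernetes"]
	if !ok || platform == nil {
		return nil
	}
	return platform.PipelineConfig
}

func main() {
	platforms := map[string]*PlatformSpec{
		"other":      nil, // a non-kubernetes entry, deliberately nil
		"kubernetes": {PipelineConfig: &PipelineConfig{MutexName: "mutex"}},
	}
	if cfg := kubernetesPipelineConfig(platforms); cfg != nil {
		fmt.Println("mutex:", cfg.MutexName)
	}
}
```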

gregsheremeta · 2024-11-13T12:30:48Z

backend/src/apiserver/template/v2_template.go

+		if platform.PipelineConfig.SemaphoreKey != "" || platform.PipelineConfig.MutexName != "" {
+			pipeline_options = argocompiler.Options{
+				SemaphoreKey: platform.PipelineConfig.SemaphoreKey,
+				MutexName:    platform.PipelineConfig.MutexName,


Let's add these individually, and only if they are specified in the IR. Users will typically only add one or the other. (So don't add mutex if the user only wants a semaphore, and vice versa.)

Done, added these individually.

gregsheremeta · 2024-11-13T12:32:14Z

backend/src/apiserver/template/v2_template.go

@@ -300,9 +311,20 @@ func (t *V2Spec) RunWorkflow(modelRun *model.Run, options RunWorkflowOptions) (u
 		}
 	}

+	var pipeline_options *argocompiler.Options


this looks copied from above. Is there any way to reuse it?

It is copied, but the context is different.

In the first case, argocompiler.Options is a non-pointer value, so it holds a copy of the data. In the second, *argocompiler.Options is a pointer that stores the memory address of the value.

I thought about how this code could be put into a function and called twice instead of being duplicated. Here's some pseudocode I thought might work:

func setPipelineOptions(platformSpec map[string]*PlatformSpec, pipelineOptions interface{}) {
	for key, platform := range platformSpec {
		if key == "kubernetes" && platform != nil && platform.PipelineConfig != nil {
			if platform.PipelineConfig.SemaphoreKey != "" {
				SemaphoreKey = platform.PipelineConfig.SemaphoreKey
			}
			if platform.PipelineConfig.MutexName != "" {
				MutexName = platform.PipelineConfig.MutexName
			}
			break
		}
	}
}

and then just calling it each time with the pointer and the non-pointer. But it gets slightly complicated with passing a pointer into a function (needs to be de-referenced).

However, there's a pattern of repetition of the same code between pointer and non-pointer references in this file. For instance here, and here.

That is why I would suggest keeping this repetition as is. We can later refactor this file to move the repeated code into functions, as a separate task outside the scope of this PR.
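
For reference, one possible shape for such a later refactor is sketched below, under stated assumptions: Options stands in for argocompiler.Options with only the two fields from the diff, PlatformSpec and PipelineConfig are hypothetical stand-ins, and pipelineOptions is an invented helper name. The helper returns the options by value, so the call site that needs a pointer can take the address of its own copy and no dereferencing is needed inside the helper.

```go
package main

import "fmt"

// Hypothetical stand-ins; field names follow the diffs in this review.
type PipelineConfig struct {
	SemaphoreKey string
	MutexName    string
}

type PlatformSpec struct {
	PipelineConfig *PipelineConfig
}

// Options stands in for argocompiler.Options; only the two fields
// discussed in this PR are modeled.
type Options struct {
	SemaphoreKey string
	MutexName    string
}

// pipelineOptions (hypothetical) reads the "kubernetes" platform config and
// returns Options by value, setting each field only when it is specified.
func pipelineOptions(platforms map[string]*PlatformSpec) Options {
	var opts Options
	platform := platforms["kubernetes"]
	if platform == nil || platform.PipelineConfig == nil {
		return opts
	}
	if key := platform.PipelineConfig.SemaphoreKey; key != "" {
		opts.SemaphoreKey = key
	}
	if name := platform.PipelineConfig.MutexName; name != "" {
		opts.MutexName = name
	}
	return opts
}

func main() {
	platforms := map[string]*PlatformSpec{
		"kubernetes": {PipelineConfig: &PipelineConfig{SemaphoreKey: "semaphore"}},
	}

	// Call site that wants a value.
	byValue := pipelineOptions(platforms)

	// Call site that wants a pointer: take the address of a local copy.
	opts := pipelineOptions(platforms)
	byPointer := &opts

	fmt.Println(byValue.SemaphoreKey, byPointer.SemaphoreKey)
}
```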

backend/src/v2/compiler/argocompiler/argo.go

Signed-off-by: ddalvi <[email protected]>

DharmitD · 2024-11-18T06:51:19Z

/hold until #11384 and #11340 get merged

rimolive · 2024-11-27T14:44:46Z

/lgtm

google-oss-prow bot requested a review from HumairAK November 12, 2024 14:36

google-oss-prow bot requested a review from rimolive November 12, 2024 14:36

google-oss-prow bot added the size/M label Nov 12, 2024

DharmitD changed the title ~~feat(backend): Add Semaphore and Mutex fields to Workflow Spec~~ WIP:feat(backend): Add Semaphore and Mutex fields to Workflow Spec Nov 12, 2024

google-oss-prow bot added the do-not-merge/work-in-progress label Nov 12, 2024

DharmitD force-pushed the sem-mut-backend branch from 5c420db to 449cdda Compare November 13, 2024 05:06

gregsheremeta suggested changes Nov 13, 2024


DharmitD changed the title ~~WIP:feat(backend): Add Semaphore and Mutex fields to Workflow Spec~~ WIP:feat(backend): Add Semaphore and Mutex fields to Workflow CR Nov 13, 2024

feat(backend)Add Semaphore and Mutex fields to Workflow Spec

497bc6b

Signed-off-by: ddalvi <[email protected]>

DharmitD force-pushed the sem-mut-backend branch from 449cdda to 497bc6b Compare November 18, 2024 06:13

DharmitD changed the title ~~WIP:feat(backend): Add Semaphore and Mutex fields to Workflow CR~~ feat(backend): Add Semaphore and Mutex fields to Workflow CR Nov 18, 2024

google-oss-prow bot removed the do-not-merge/work-in-progress label Nov 18, 2024

google-oss-prow bot added the do-not-merge/hold label Nov 18, 2024

google-oss-prow bot assigned rimolive Nov 27, 2024

google-oss-prow bot added the lgtm label Nov 27, 2024
