Batch operation rate limit #366

Merged
merged 1 commit into from
Oct 30, 2023

@stephanos stephanos commented Oct 19, 2023

What was changed

Added a new flag --rps to the 4 batch operations.

Note that although batch-reset is a batch operation, it performs its work locally rather than on the server, so the flag does not apply to it.

Why?

To fix temporalio/temporal#4926.

Checklist

  1. Closes OSS-1681

  2. How was this tested:

  3. Any docs updates needed? See "Batch operation rate limit" (documentation#2413)

@stephanos stephanos force-pushed the batch-operation-rate-limit branch 4 times, most recently from 09e85ca to 20ef19e Compare October 30, 2023 16:35
@stephanos stephanos force-pushed the batch-operation-rate-limit branch from 20ef19e to f1c1d96 Compare October 30, 2023 16:36
FlagQueryTerminate = "Terminate Workflow Executions with given List Filter."
FlagEventIDDefinition = "The Event Id for any Event after WorkflowTaskStarted you want to reset to (exclusive). It can be WorkflowTaskCompleted, WorkflowTaskFailed or others."
FlagQueryResetBatch = "Visibility Query of Search Attributes describing the Workflow Executions to reset. See https://docs.temporal.io/docs/tctl/workflow/list#--query."

To align it with the other flags, I changed this into FlagQueryReset and standardized the text.

@@ -83,8 +83,10 @@ func ListBatchJobs(c *cli.Context) error {
// BatchTerminate terminate a list of workflows
func BatchTerminate(c *cli.Context) error {
operator := common.GetCurrentUserFromEnv()
rps := float32(c.Float64(common.FlagRPS))

AFAICT, the CLI library doesn't have a built-in Float32 flag, so we need to downcast here. That is fine: the precision required here will most likely never exceed 2 decimal places.
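
The downcast described in this comment can be illustrated with a minimal sketch. It uses the Go standard library's `flag` package as a stand-in for the CLI library used in the PR, and `parseRPS` is a hypothetical helper name that does not appear in the change itself.

```go
package main

import (
	"flag"
	"fmt"
)

// parseRPS is a hypothetical helper: it reads an "rps" flag as float64
// (the only float flag type offered here) and downcasts it to float32.
func parseRPS(args []string) (float32, error) {
	fs := flag.NewFlagSet("batch", flag.ContinueOnError)
	rps := fs.Float64("rps", 0, "operations per second (0 means unset)")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	// Narrowing conversion; values with a couple of decimals survive
	// well within float32 precision.
	return float32(*rps), nil
}

func main() {
	rps, err := parseRPS([]string{"--rps", "12.25"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%.2f\n", rps)
}
```

Treating `0` as "unset" is an assumption of this sketch, not something the comment states.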

@@ -16,24 +16,24 @@ require (
github.com/temporalio/tctl-kit v0.0.0-20230328153839-577f95d16fa0
github.com/temporalio/ui-server/v2 v2.18.2
github.com/urfave/cli/v2 v2.25.7
- go.temporal.io/api v1.24.0
+ go.temporal.io/api v1.25.0

Upgraded to new API version to pull in the max_operations_per_second field.

@stephanos stephanos marked this pull request as ready for review October 30, 2023 18:53
@stephanos stephanos merged commit 700234c into main Oct 30, 2023
16 checks passed
@stephanos stephanos deleted the batch-operation-rate-limit branch October 30, 2023 19:54
stephanos added a commit to temporalio/temporal that referenced this pull request Oct 30, 2023
<!-- Describe what has changed in this PR -->
**What changed?**

Allow rate limiting a batch operation. Fixes
#4926.

<!-- Tell your future self why have you made these changes -->
**Why?**

Batch operations run server-side and may affect millions of
executions. This in turn may overload workers and disrupt normal
operations.

<!-- How have you verified this change? Tested locally? Added a unit
test? Checked in staging env? -->
**How did you test it?**

I started the Server locally and initiated a batch operation from the
CLI ([CLI changes can be found
here](temporalio/cli#366)):

- [x] uses provided limit
- [x] uses server limit when not provided
- [x] caps it at server limit 
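
The three cases in the checklist above could be expressed roughly as follows. This is a sketch under stated assumptions, not the Server's actual implementation: `effectiveRPS` is a hypothetical name, and treating a non-positive value as "not provided" is an assumption.

```go
package main

import "fmt"

// effectiveRPS is a hypothetical helper that picks the rate limit to apply.
// A requested value <= 0 is treated as "not provided" (an assumption).
func effectiveRPS(requested, serverLimit float32) float32 {
	if requested <= 0 {
		// No limit provided: fall back to the server limit.
		return serverLimit
	}
	if requested > serverLimit {
		// Provided limit exceeds the server limit: cap it.
		return serverLimit
	}
	// Provided limit is within bounds: use it as given.
	return requested
}

func main() {
	const serverLimit = 50
	fmt.Println(effectiveRPS(10, serverLimit))  // uses provided limit
	fmt.Println(effectiveRPS(0, serverLimit))   // uses server limit when not provided
	fmt.Println(effectiveRPS(500, serverLimit)) // caps it at server limit
}
```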

<!-- Assuming the worst case, what can be broken when deploying this
change to production? -->
**Potential risks**

Rate limiting could be incorrect and either slow down processing or
disrupt the cluster.

<!-- Is this PR a hotfix candidate or require that a notification be
sent to the broader community? (Yes/No) -->
**Is hotfix candidate?**

No
stephanos added a commit that referenced this pull request Nov 6, 2023
Reverts #366 - but leaving in the dependency updates.

It was merged too early; I'll bring it back when the new Server release
is available.
Development

Successfully merging this pull request may close these issues.

Add a rate limit option for batch operations