[DISCUSS] Handling duplicate calls to provision API #445
Comments
Guard rails are a good idea. I think we should block another provision call if the status is
The solution of checking the status of the previous provision looks good.
Trying to understand more here. Do we have to build a functionality like what we have for
If the status of the previous provision is
I am inclined towards option 1.
No, it's not about retries; it's merely to catch a race condition.
But relying on the status won't work as once you change it to provisioning... so we really need some other way to "lock" a thread for provisioning.
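One way to picture the "lock" idea from the comment above is an atomic compare-and-set on the workflow status, so that checking for NOT_STARTED and moving to PROVISIONING happen as a single step. The sketch below is only an illustration under assumptions: `WorkflowStateStore`, `provision`, and `run_concurrent_calls` are hypothetical names, and the in-memory dictionary stands in for wherever the workflow state is actually persisted. It is not the project's implementation.

```python
import threading
from enum import Enum


class State(Enum):
    NOT_STARTED = "NOT_STARTED"
    PROVISIONING = "PROVISIONING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


class WorkflowStateStore:
    """Hypothetical in-memory stand-in for the persisted workflow state."""

    def __init__(self):
        self._states = {}
        self._lock = threading.Lock()

    def get(self, workflow_id):
        with self._lock:
            return self._states.get(workflow_id, State.NOT_STARTED)

    def compare_and_set(self, workflow_id, expected, new):
        """Atomically move to `new` only if the current state is `expected`."""
        with self._lock:
            current = self._states.get(workflow_id, State.NOT_STARTED)
            if current is not expected:
                return False
            self._states[workflow_id] = new
            return True


def provision(store, workflow_id, do_provision):
    """Run `do_provision` only if this caller wins the NOT_STARTED transition."""
    # Check and transition are one atomic step, so two callers cannot both
    # observe NOT_STARTED and both start provisioning.
    if not store.compare_and_set(workflow_id, State.NOT_STARTED, State.PROVISIONING):
        return False  # rejected: already PROVISIONING, COMPLETED, or FAILED
    try:
        do_provision()
    except Exception:
        store.compare_and_set(workflow_id, State.PROVISIONING, State.FAILED)
        raise
    store.compare_and_set(workflow_id, State.PROVISIONING, State.COMPLETED)
    return True


def run_concurrent_calls(n_calls=8):
    """Fire several provision calls at once for the same workflow."""
    store = WorkflowStateStore()
    created = []  # resources created by provisioning
    results = []  # True for an accepted call, False for a rejected one
    barrier = threading.Barrier(n_calls)

    def call():
        barrier.wait()  # release all callers together to provoke the race
        accepted = provision(store, "wf-1", lambda: created.append("resource"))
        results.append(accepted)

    threads = [threading.Thread(target=call) for _ in range(n_calls)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, created, store.get("wf-1")
```

In this sketch only one caller is accepted and the rest are rejected without creating anything. It also happens to implement the stricter FAILED policy (provisioning is only allowed from NOT_STARTED). With state stored in an index, the same effect would presumably need a conditional write rather than a local lock; that part is an assumption and is not shown here.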
Is your feature request related to a problem?
If a user calls the provision API twice using the same workflow, its provisioned resources are duplicated. This can be confusing, as the combination of workflow step name and workflow step ID is no longer unique, and choosing the correct resource ID requires guesswork. Also, the state changes to COMPLETED when the first workflow finishes, even though more resources are still being added.
Additionally, the extra copy (or copies) consumes cluster resources.
What solution would you like?
Prevent a workflow from being provisioned if its status is PROVISIONING. This status may need to be checked multiple times (like double-checked locking) to prevent the race condition where two provision calls both see NOT_STARTED before one of them begins provisioning.
While addressing this, we also should discuss how to handle FAILED. We could prevent provisioning and require users to use the deprovision API first (basically only allow provisioning from the NOT_STARTED state), or we could try to "pick up from where we left off", adding logic to skip provisioning steps if the resource already exists (complex and brittle).
What alternatives have you considered?
Leaving the API as is, and/or encouraging use of the create API with the provisioning parameter, which does not have this shortcoming.
Do you have any additional context?
We should probably handle this similarly to trying to create the same Anomaly Detector twice.