[Scheduled Actions V2] state machine protobufs WIP #6901

Draft
wants to merge 2 commits into
base: main

lina-temporal
Contributor

What changed?

  • initial protobufs for the V2 scheduler

Why?

  • These are incomplete and are being written as needed, in tandem with the implementation. I believe this is a good starting point, as it covers the state I need to handle the basic case of buffering scheduled actions.

How did you test it?

  • I didn't

Potential risks

  • New fields in new structs, except for RequestId on BufferedStart. Since that is a completely new field that the V1 scheduler won't look at, I don't think there's a risk.

Comment on lines 63 to 64
// Generator is buffering actions.
SCHEDULER_GENERATOR_STATE_BUFFERING = 2;
Member

Based on my review of #6905, you don't need this separate state.

Contributor Author

Agreed, will remove.

@@ -47,6 +47,47 @@ enum SchedulerState {
SCHEDULER_STATE_EXECUTING = 2;
}

enum Scheduler2State {
Member

We don't need SchedulerState anymore. It's part of Tianyu's work that we're replacing.

Contributor Author

Sure, I was planning to remove Tianyu's prototype as part of a later PR; I can send it now and fix this up so we don't have the naming conflict.

Comment on lines 71 to 76
// Executor is awaiting actions to be buffered and eligible for execution.
SCHEDULER_EXECUTOR_STATE_WAITING = 1;
// Executor is starting actions.
SCHEDULER_EXECUTOR_STATE_EXECUTING = 2;
// Executor is backing off from executing actions.
SCHEDULER_EXECUTOR_STATE_BACKING_OFF = 3;
Member

Why do you need two distinct states for waiting and backing off?

Contributor Author

Yeah, I think they can be removed.

Comment on lines 83 to 88
// Backfiller is awaiting backfill to be requested.
SCHEDULER_BACKFILLER_STATE_WAITING = 1;
// Backfiller is actively starting actions.
SCHEDULER_BACKFILLER_STATE_EXECUTING = 2;
// Backfiller is backing off from starting actions.
SCHEDULER_BACKFILLER_STATE_BACKING_OFF = 3;
Member

Same as above.

@@ -48,6 +48,9 @@ message BufferedStart {
temporal.api.enums.v1.ScheduleOverlapPolicy overlap_policy = 3;
// Trigger-immediately or backfill
bool manual = 4;
// An ID generated when the action is buffered for deduplication during
// execution. Only used by the V2 Scheduler (otherwise left empty).
Member

Maybe worth calling the scheduler "state machine scheduler" instead of v2?

Contributor Author

Sure, that's probably a more lasting name for it :)
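
The field comment quoted in this thread says the request ID is generated when the action is buffered and used for deduplication during execution. A minimal Go sketch of that idea follows. It is an illustration under assumptions, not the PR's implementation: `BufferedStart` here is a trimmed-down stand-in for the protobuf message, and `Buffer`, `Starter`, and `newRequestID` are hypothetical helpers.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// BufferedStart is a hypothetical, trimmed-down stand-in for the protobuf
// message under review; only the fields needed for the sketch are kept.
type BufferedStart struct {
	Manual    bool
	RequestID string // Assigned once, when the action is buffered.
}

// newRequestID returns a random hex-encoded ID (illustrative only).
func newRequestID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// Buffer assigns the request ID at buffering time, so every later execution
// attempt of the same action carries the same ID.
func Buffer(manual bool) BufferedStart {
	return BufferedStart{Manual: manual, RequestID: newRequestID()}
}

// Starter is a hypothetical executor-side component that remembers which
// request IDs it has already acted on.
type Starter struct {
	seen map[string]bool
}

func NewStarter() *Starter { return &Starter{seen: make(map[string]bool)} }

// Start reports whether the action was actually started. A repeated request
// ID is treated as a duplicate and ignored.
func (s *Starter) Start(b BufferedStart) bool {
	if s.seen[b.RequestID] {
		return false
	}
	s.seen[b.RequestID] = true
	return true
}

func main() {
	s := NewStarter()
	action := Buffer(false)
	fmt.Println(s.Start(action)) // First attempt starts the action.
	fmt.Println(s.Start(action)) // A retry with the same ID is deduplicated.
}
```

The point of generating the ID at buffering time rather than at execution time is that a retried execution reuses the same ID instead of minting a fresh one.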

@@ -155,3 +158,59 @@ message HsmSchedulerState {
google.protobuf.Timestamp next_invocation_time = 3;

}

// V2 Scheduler state
message HsmSchedulerV2State {
Member

I'd remove Hsm from the name since it's an implementation detail.

Member

I'd also remove the name State from all of these messages to avoid confusion with the enums that are similarly named.

Contributor Author

Sure, will fix both

message HsmSchedulerV2State {
temporal.server.api.enums.v1.Scheduler2State state = 1;

// scheduler request parameters and metadata.
Member

Could you please be consistent in docstrings? It's good practice to capitalize the first letter of each sentence and use punctuation.

string schedule_id = 7;

// Implemented as a sequence number. Useful for substate machines to
// invalidate transactions based on update requests
Member

Not sure I'd say we're invalidating transactions; it's more like invalidating "work" or a task, right? But it's also used as an optimistic locking mechanism for concurrent update requests.

Contributor Author

It's used for optimistic locking, yeah; I'm trying to give an example of the specific sort of condition that would bump the token in each different struct. If that's confusing, I can simplify the comment.

Comment on lines 191 to 193
// Implemented as a sequence number. Useful for invalidating a stale
// Generator persisted state write.
int64 conflict_token = 4;
Member

Hmm... can we use the conflict token of the scheduler?

Contributor Author

I'm trying to avoid making an assumption that we can have transactions across multiple columns/documents in CHASM, since I don't think we have that in HSM today (with MachineTransition operating on a single item). If our framework lets me do a transaction across multiple state machines, we could use only the one in the top-level scheduler.

Member

MachineTransition is a single transition within a transaction on the entire tree. You can rely on that here. CHASM will have similar semantics.

Contributor Author

Got it - will update!

3 participants