feat(fleet): add control loop and state management #3116

Merged
merged 7 commits into from
Dec 5, 2024

Conversation

TBonnin
Collaborator

@TBonnin TBonnin commented Dec 5, 2024

  • Implement Supervisor class to manage node states and transitions
  • Add methods to start, fail, outdate, terminate, and remove nodes
  • Introduce NodeProvider interface for node operations
  • Update node state transitions and error handling
  • Add integration tests for Supervisor functionality
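As a rough illustration of what managing node states and transitions could look like, here is a minimal sketch. It is not the code from this PR: only the STARTING, FINISHING, TERMINATED and ERROR states are visible in the snippets on this page, so the other state names, the `FleetNode` shape, the transition table, and the `transition`/`fail` helpers are all assumptions made for the example.

```typescript
// Hypothetical state names: only STARTING, FINISHING, TERMINATED and ERROR
// appear in this PR's snippets; the others are assumed for illustration.
type NodeState = 'PENDING' | 'STARTING' | 'RUNNING' | 'OUTDATED' | 'FINISHING' | 'TERMINATED' | 'ERROR';

// Hypothetical transition table: which states each state may move to.
const allowedTransitions: Record<NodeState, readonly NodeState[]> = {
    PENDING: ['STARTING', 'ERROR'],
    STARTING: ['RUNNING', 'ERROR'],
    RUNNING: ['OUTDATED', 'ERROR'],
    OUTDATED: ['FINISHING', 'ERROR'],
    FINISHING: ['TERMINATED', 'ERROR'],
    TERMINATED: [],
    ERROR: ['TERMINATED']
};

// Hypothetical node shape, reduced to what the example needs.
interface FleetNode {
    id: number;
    state: NodeState;
    error: string | null;
}

// Returns a copy of the node in the new state, or throws if the move is not allowed.
function transition(node: FleetNode, to: NodeState, error: string | null = null): FleetNode {
    if (!allowedTransitions[node.state].includes(to)) {
        throw new Error(`Invalid transition: ${node.state} -> ${to}`);
    }
    return { ...node, state: to, error };
}

// Convenience wrapper mirroring a "fail" method: move to ERROR and record the reason.
function fail(node: FleetNode, reason: string): FleetNode {
    return transition(node, 'ERROR', reason);
}
```

Keeping the allowed transitions in a single table means an invalid move fails loudly in one place instead of being checked ad hoc in each method.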

What's not yet implemented:

  • routing logic
  • Render NodeProvider
  • wiring fleet in jobs

There are still a few TODOs that I will address in following PRs

How to test

Still not wired. You can run the tests ;)

@TBonnin TBonnin requested a review from a team December 5, 2024 03:30

linear bot commented Dec 5, 2024

import { setTimeout } from 'node:timers/promises';
import type { NodeProvider } from './node-providers/node_provider.js';

type Action =
Collaborator Author

@TBonnin TBonnin Dec 5, 2024

I know Action is an overloaded term in Nango, but I couldn't think of a better name. Suggestions are welcome.

Collaborator

Operation is more accurate, I feel, but still very generic.

Collaborator Author

Good idea. Changing it to Operation.

STARTING: 5 * 60 * 1000,
FINISHING: 24 * 60 * 60 * 1000,
TERMINATED: 7 * 24 * 60 * 60 * 1000,
ERROR: 7 * 24 * 60 * 60 * 1000
Collaborator Author

These are arbitrary values. Happy to change them.
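For readers wondering how per-state durations like these might be used, here is a small sketch of a timeout check. The four values are copied from the snippet above (5 minutes, 24 hours, and 7 days twice, all in milliseconds). The surrounding object may contain other entries that are not visible on this page, and the `TimedNode` shape, its `lastStateTransitionAt` field, and the `hasTimedOut` helper are assumptions for the example, not the PR's actual code.

```typescript
// Values copied from the snippet above, in milliseconds.
// Only the four visible entries are reproduced here.
const STATE_TIMEOUT_MS = {
    STARTING: 5 * 60 * 1000, // 5 minutes
    FINISHING: 24 * 60 * 60 * 1000, // 24 hours
    TERMINATED: 7 * 24 * 60 * 60 * 1000, // 7 days
    ERROR: 7 * 24 * 60 * 60 * 1000 // 7 days
} as const;

type TimedState = keyof typeof STATE_TIMEOUT_MS;

// Hypothetical node shape: the field name is an assumption for this sketch.
interface TimedNode {
    state: TimedState;
    lastStateTransitionAt: Date;
}

// True when the node has stayed in its current state longer than that state's timeout.
function hasTimedOut(node: TimedNode, now: Date = new Date()): boolean {
    const elapsedMs = now.getTime() - node.lastStateTransitionAt.getTime();
    return elapsedMs > STATE_TIMEOUT_MS[node.state];
}
```

Passing `now` as a parameter keeps the check deterministic, which makes it easy to exercise in tests without faking the clock.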

Collaborator

@bodinsamuel bodinsamuel left a comment

Not going to lie, it's a lot of code and a lot of "hypothetical" code, and not everything makes sense to me right now. But it looks okay overall.

expect(failedNode.state).toBe('ERROR');
expect(failedNode.error).toBe('my error');
});
it('should be able to register a node', async () => {
Collaborator

I'm not sure what you have against line breaks 🤣

Collaborator Author

Do you mean an empty line between the tests? :)

Collaborator

Yes, and in general between logical code blocks too, but I guess it's keyboard-driven development :p

@TBonnin TBonnin force-pushed the tbonnin/nan-2301/fleet-control-loop branch from e5cd89c to 82cb468 Compare December 5, 2024 14:00
@TBonnin TBonnin force-pushed the tbonnin/nan-2301/fleet-init-models branch from 544a1c3 to 0a2ded7 Compare December 5, 2024 14:01
@TBonnin TBonnin force-pushed the tbonnin/nan-2301/fleet-control-loop branch from 82cb468 to 353f8f3 Compare December 5, 2024 14:30
Base automatically changed from tbonnin/nan-2301/fleet-init-models to master December 5, 2024 14:36
Ex:
in case of error, a new node with the same routingId/deploymentId will be created
in case of a runner idled because of inactivity, a new node with the same routingId/deploymentId will be created
@TBonnin TBonnin force-pushed the tbonnin/nan-2301/fleet-control-loop branch from 353f8f3 to 868255c Compare December 5, 2024 16:48
@TBonnin TBonnin merged commit 9eeb271 into master Dec 5, 2024
21 checks passed
@TBonnin TBonnin deleted the tbonnin/nan-2301/fleet-control-loop branch December 5, 2024 17:28
this.state = 'stopped';
}

private async plan(cursor?: number): Promise<Result<Operation[]>> {
Contributor

@nalanj nalanj Dec 5, 2024

Would it be easier going forward for operations to be an object with a type and a function attached, and then executing the plan just loops and calls the functions?

Collaborator Author

@TBonnin TBonnin Dec 5, 2024

Why not. The advantage of having values instead of opaque functions is that you can optimize the plan before executing it. I don't think the Render API can do this, and even if it can I am not planning to support it, but we could, for instance, take all the CREATE operations from the plan and combine them into a single call to the node provider instead of issuing them one by one.
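To make the "plan as values" argument concrete, here is a minimal sketch of the kind of optimization described above: folding several CREATE operations into one batched operation before execution. The operation shapes, the `CREATE_MANY` type, and the `optimize` function are all hypothetical and are not taken from the PR. As noted in the comment, the current provider is not expected to support batching, so this only shows what plain data makes possible.

```typescript
// Hypothetical operation shapes: the PR's real Operation type is not shown on this page.
type Operation =
    | { type: 'CREATE'; routingId: string; deploymentId: number }
    | { type: 'TERMINATE'; nodeId: number };

type CreateOperation = Extract<Operation, { type: 'CREATE' }>;

// Hypothetical batched form that a provider with a bulk API could accept.
type BatchedOperation =
    | { type: 'CREATE_MANY'; nodes: { routingId: string; deploymentId: number }[] }
    | Operation;

// Because operations are plain values, the plan can be inspected and rewritten
// before anything is executed. Note that this sketch moves the batched create
// to the front of the plan, which assumes ordering between creates and other
// operations does not matter.
function optimize(plan: Operation[]): BatchedOperation[] {
    const creates = plan.filter((op): op is CreateOperation => op.type === 'CREATE');
    if (creates.length < 2) {
        return plan; // nothing to batch
    }
    const others = plan.filter((op) => op.type !== 'CREATE');
    const nodes = creates.map(({ routingId, deploymentId }) => ({ routingId, deploymentId }));
    return [{ type: 'CREATE_MANY', nodes }, ...others];
}
```

With operations as opaque functions, this kind of rewrite would not be possible, since the executor could only call each function in turn.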

3 participants