-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update links to API code about hard-coded allowed transitions #2387
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
app/api/util.ts
Outdated
|
||
// TODO: ask greg about this | ||
// reboot and stop are kind of weird! in theory you can do it from a bunch of different states | ||
// https://github.com/oxidecomputer/omicron/blob/6dd9802/nexus/src/app/instance.rs#L790-L798 | ||
reboot: ['running'], | ||
stop: ['running', 'starting'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @gjcolombo — Charlie and I are double-checking the set of instance states where reboot and stop are allowed and were wondering how to interpret this function.
https://github.com/oxidecomputer/omicron/blob/6dd9802/nexus/src/app/instance.rs#L790-L798
I see there's different logic depending on whether a VMM is found, but I don't know how to think about that from a user point of view. I see that there a bunch of states where you should never fail to find a VMM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That match is pretty spooky, especially in light of some recent improvements to Nexus (omicron#5749 and omicron#5854). I filed oxidecomputer/omicron#6390 to see if we can clean it up a bit.
For purposes of this PR I almost think you want to link both to Nexus and to this bit of Propolis: https://github.com/oxidecomputer/propolis/blob/b278193/bin/propolis-server/src/lib/vm/request_queue.rs
The idea underlying the code you linked is that when an instance has a running Propolis--roughly between when you successfully ask to start the instance and the time that Propolis comes back and says it's Destroyed--the Propolis process is the source of truth as to what state changes can and can't be accepted. The match is saying, "OK, I have this request to change the instance's state, and I have a Propolis to talk to; given what I know about the state that Propolis last reported, should I even bother forwarding it this request to get its disposition?"
For requests to run, stop, and reboot, the answer is basically always "let Propolis decide what to do." (The main edge case is Migrating: which Propolis gets the request then? I think the simplest answer is "neither, the request should be rejected until we know where to send it"; omicron#6390 tracks this.)
The migration-request side of the match could also use some tweaking, but the external API doesn't allow explicit requests to migrate anymore (omicron#6311), so that's probably not as much of a concern here.
On the console side, I think the logic you have for these states (right side lines 102/103) is right, though we could also allow a request to stop a rebooting instance (Propolis will queue it), or even reboot a rebooting instance (Propolis will queue the second reboot to execute when the first is done; note though that once you've queued one request to reboot, Propolis will silently drop new ones until the queued request begins to be processed).
Hope this helps a bit--LMK what else I can clarify or fill you in on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's perfect, thanks. Generally I'm ok with being slightly more restrictive than what's technically possible, though in the UI we sometimes say "an instance can only be rebooted in X states", and in that case I'd prefer to be accurate since we're educating the user there about the system. So I think I'll add rebooting to the lists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically correct: the best kind of correct!
(Yeah, that looks pretty silly. Leaving rebooting off the lists is the right move.)
cc6b570
to
f9cb0d8
Compare
oxidecomputer/console@33b7a50...8dcddce * [8dcddcef](oxidecomputer/console@8dcddcef) oxidecomputer/console#2392 * [f53bb383](oxidecomputer/console@f53bb383) oxidecomputer/console#2387 * [6a005a0b](oxidecomputer/console@6a005a0b) oxidecomputer/console#2389 * [2f83a1fb](oxidecomputer/console@2f83a1fb) tweak firewall rule copy (closes oxidecomputer/console#2348) * [22bdac78](oxidecomputer/console@22bdac78) oxidecomputer/console#2386 * [5353a6ea](oxidecomputer/console@5353a6ea) oxidecomputer/console#2385 * [7cf1ed54](oxidecomputer/console@7cf1ed54) oxidecomputer/console#2383 * [c52cc37b](oxidecomputer/console@c52cc37b) oxidecomputer/console#2382 * [e442f6bd](oxidecomputer/console@e442f6bd) oxidecomputer/console#2378
oxidecomputer/console@33b7a50...8dcddce * [8dcddcef](oxidecomputer/console@8dcddcef) oxidecomputer/console#2392 * [f53bb383](oxidecomputer/console@f53bb383) oxidecomputer/console#2387 * [6a005a0b](oxidecomputer/console@6a005a0b) oxidecomputer/console#2389 * [2f83a1fb](oxidecomputer/console@2f83a1fb) tweak firewall rule copy (closes oxidecomputer/console#2348) * [22bdac78](oxidecomputer/console@22bdac78) oxidecomputer/console#2386 * [5353a6ea](oxidecomputer/console@5353a6ea) oxidecomputer/console#2385 * [7cf1ed54](oxidecomputer/console@7cf1ed54) oxidecomputer/console#2383 * [c52cc37b](oxidecomputer/console@c52cc37b) oxidecomputer/console#2382 * [e442f6bd](oxidecomputer/console@e442f6bd) oxidecomputer/console#2378
Closes #2334