Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Begin making Accessor interface more robust #548

Closed
wants to merge 3 commits into from

Conversation

pfmooney
Copy link
Collaborator

This fixes leaky (and broken) behavior for when nodes are dropped an should be removed from the hierarchy. A small base of tests have been added to cover a few of those cases.

The ability to disable access for swaths of the hierarchy is still missing, but this should lay the groundwork for more easily adding that in the future.

FYI: The coming overhaul of the block IO interfaces depends on this work.

This fixes leaky (and broken) behavior for when nodes are dropped an
should be removed from the hierarchy.  A small base of tests have been
added to cover a few of those cases.

The ability to disable access for swaths of the hierarchy is still
missing, but this should lay the groundwork for more easily adding that
in the future.
Comment on lines -270 to -271
let acc_mem = machine.acc_mem.child();
let acc_msi = machine.acc_msi.child();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer(s) please note: Creating these children this way was a bug! The bus(es) created in the topology spawned their own children from the local acc_mem and acc_msi. Those locals are dropped when the function returns, effectively orphaning those buses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again for my edification: this would break in the new code because dropping the locals would orphan the children (so that all further attempts to obtain guards will fail)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, precisely. (Ask me how I know)

@gjcolombo gjcolombo self-requested a review October 11, 2023 15:34
Copy link
Contributor

@gjcolombo gjcolombo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes LGTM with a couple of small suggestions and for-my-understanding questions, and then one bigger question about the future semantics of obtaining a guard and revoking access to a device while guards are active.

Comment on lines +393 to +396
drop(child_tguard);
drop(child_ent);
drop(parent_tguard);
drop(parent_ent);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The drops threw me at first but then I scrolled down to the tests and saw what they're for. Might be good to have a clarifying comment here as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


let mut queue = VecDeque::new();
queue.push_back((ID_ROOT, leaf));
queue.push_back((ID_ROOT, leaf_ent.id));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider asserting here that adopt_root.id == ID_ROOT--I think the caller already asserts this, but this is where the invariant really matters.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Done

pub fn poison(&self) -> Option<Arc<T>> {
self.0.poison()
}

/// Attempt to gain access to the underlying resource.
///
/// Will return [None] if any ancestor node disables access, or if the node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I'm clear on the semantics here: if I call access and get back a guard, and then someone else calls poison, my guard is still designed to be valid, correct?

I don't think there's anything concrete to change in this PR, but this kind of thing makes me nervous about bugs along the lines of "the guest did a config write that disabled access to resource X, we processed that and returned to the guest, but whoops it turns out some other vCPU already held a reference to X". It's not memory-unsafe because the guards get a strong ref to the resource, but it might be semantically unsafe depending on what it means to poison a resource tree or disable access to it. WDYT? Is this something that's likely to come up or am I just casting FUD?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Poison doesn't come with strong guarantees around ensuring that no nodes have outstanding guards. When I return to this area to actually wire up the access control (for implementing things like disabling bus-mastering in PCI), I'll be taking a closer look at what kind of guarantees we do want to make.

As part of that, I'd also like to investigate using the Accessor stuff to streamline access to things like the PIO/MMIO busses as well. Such things are protected by centralized locks today (requiring cloning of the inside resources in order to access them), but I think smarts in the Accessor stuff allow for lock-less/alloc-less access if we're willing to accept a higher cost when changing the MMIO/PIO/etc map.

Comment on lines -270 to -271
let acc_mem = machine.acc_mem.child();
let acc_msi = machine.acc_msi.child();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again for my edification: this would break in the new code because dropping the locals would orphan the children (so that all further attempts to obtain guards will fail)?

sub.push(rsub);
}

right.print(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this assert that the output is something we expect?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shrug Nothing about the output is really committed at this point. I added a test case just so I could snag the output without actually editing in a print call into the source.

assert_eq!(guard.load(Ordering::Relaxed), tval);
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One related test case that I don't see here (maybe I overlooked it) is orphanage of one node not affecting any of its siblings. I don't know why that wouldn't work, but probably not bad coverage to have since that's the common scenario for bus connections.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Let me know if the added test lives up to what you were expecting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@pfmooney
Copy link
Collaborator Author

Merged as 9808671

@pfmooney pfmooney closed this Oct 12, 2023
@pfmooney pfmooney deleted the acc-fixes branch October 12, 2023 01:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants