Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch back to Counter based SpinLockIRQ #534

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

kitlith
Copy link
Contributor

@kitlith kitlith commented Oct 31, 2019

Fixes #190. Fixes #192.

Why this doesn't suck

Interrupt handler wrappers are now generated by a macro, so we have a single place to inject code into interrupt handlers. This allows us to tell SpinLockIRQ that interrupts are already disabled by incrementing the counter and allows us to decrement the counter again (without disabling interrupts yet) right before entering userspace again.

This also contains a fix for a mistake in #528, because I failed to copy-paste some checks into try_lock. >_> If this PR doesn't go through, then that should still be fixed anyway.

It still deadlocks later on... :/
This was interesting to figure out.

In short, when a scheduler based process switch occurs, there is a lock
held exclusively for enabling and disabling interrupts. This lock is not
dropped before the process switch.

If this is not the first time this process is run, it evens out by
dropping the copy of this lock held by the 'new' process. However, if it
is *not* the first time the process is run, there is no lock, and the
counter never gets decremented.

This commit fixes the issue by decrementing the counter in the new
process right before the iret which reenabled interrupts.
Forgot to copy paste after fixing issues in lock.

Doesn't seem to be used early enough to matter, but in case it ever
is...
@todo
Copy link

todo bot commented Oct 31, 2019

Disable preemption.

// TODO: Disable preemption.
// TODO: Spin acquire
// lock
let internalguard = self.internal.lock();
SpinLockIRQGuard(ManuallyDrop::new(internalguard))
}
/// Disables interrupts and locks the mutex.
pub fn try_lock(&self) -> Option<SpinLockIRQGuard<'_, T>> {
// Disable irqs


This comment was generated by todo based on a TODO comment in 495cb50 in #534. cc @kitlith.

@todo
Copy link

todo bot commented Oct 31, 2019

Spin acquire

// TODO: Spin acquire
// lock
let internalguard = self.internal.lock();
SpinLockIRQGuard(ManuallyDrop::new(internalguard))
}
/// Disables interrupts and locks the mutex.
pub fn try_lock(&self) -> Option<SpinLockIRQGuard<'_, T>> {
// Disable irqs
unsafe { disable_interrupts(); }


This comment was generated by todo based on a TODO comment in 495cb50 in #534. cc @kitlith.

@todo
Copy link

todo bot commented Oct 31, 2019

Disable preemption.

// TODO: Disable preemption.
// TODO: Spin acquire
// lock
match self.internal.try_lock() {
Some(internalguard) => Some(SpinLockIRQGuard(ManuallyDrop::new(internalguard))),
None => {
// We couldn't lock. Restore irqs and return None
unsafe { enable_interrupts(); }
None
}


This comment was generated by todo based on a TODO comment in 495cb50 in #534. cc @kitlith.

@todo
Copy link

todo bot commented Oct 31, 2019

Spin acquire

// TODO: Spin acquire
// lock
match self.internal.try_lock() {
Some(internalguard) => Some(SpinLockIRQGuard(ManuallyDrop::new(internalguard))),
None => {
// We couldn't lock. Restore irqs and return None
unsafe { enable_interrupts(); }
None
}
}


This comment was generated by todo based on a TODO comment in 495cb50 in #534. cc @kitlith.

@@ -592,6 +594,7 @@ macro_rules! generate_trap_gate_handler {

if $interrupt_context {
let _ = INSIDE_INTERRUPT_COUNT.fetch_sub(1, Ordering::SeqCst);
unsafe { decrement_lock_count(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsafe needs to have a // Safety comment, like this

@@ -347,6 +347,8 @@ fn jump_to_entrypoint(ep: usize, userspace_stack_ptr: usize, arg1: usize, arg2:
const_assert_eq!((GdtIndex::UTlsRegion as u16) << 3 | 0b11, 0x3B);
const_assert_eq!((GdtIndex::UTlsElf as u16) << 3 | 0b11, 0x43);
const_assert_eq!((GdtIndex::UStack as u16) << 3 | 0b11, 0x4B);
// We're about to enable interrupts by doing an iret.
unsafe { crate::sync::spin_lock_irq::decrement_lock_count(); }
Copy link
Member

@roblabla roblabla Oct 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDEM, needs a // Safety comment explaining why this operation is safe.

Also, I'm a bit confused at why this is necessary? Is this what was talked about on discord, because of the interrupt_manager SpinLockIRQ? If so:

  1. It should definitely have a big fat comment explaining this in detail
  2. I think we can get rid of interrupt_manager, removing the need for this.

Copy link
Contributor Author

@kitlith kitlith Oct 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is interrupt_manager necessary?

the guard for the queue is dropped before the context switch (if it wasn't, that'd probably cause deadlocks or cause need to force_unlock, etc.), which must still have interrupts disabled. So it's not removable as such.

Alternatives to what I did do include wrapping context_switch in decrements/increments of the counter and/or removing the guard object and using calls to increment/decrement. I think it's a bit better to associate it with the iret, but shrug.

Copy link
Contributor Author

@kitlith kitlith Oct 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, here's another alternative: making context_switch and co less unsafe by having it manage interrupt enabling/disabling by itself instead of making everything else manage it.

if self.1 {
unsafe { interrupts::sti(); }
}
unsafe { enable_interrupts(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the old code didn't have one, but a safety comment here would be nice ^^.

Some(internalguard) => Some(SpinLockIRQGuard(ManuallyDrop::new(internalguard))),
None => {
// We couldn't lock. Restore irqs and return None
unsafe { enable_interrupts(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Safety comment

}
pub fn try_lock(&self) -> Option<SpinLockIRQGuard<'_, T>> {
// Disable irqs
unsafe { disable_interrupts(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Safety comment

}
pub fn lock(&self) -> SpinLockIRQGuard<'_, T> {
// Disable irqs
unsafe { disable_interrupts(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Safety comment

Copy link
Member

@roblabla roblabla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, needs a few changes.

I think disable_interrupts/enable_interrupts don't actually need to be wrapped in unsafe ?

/// Decrement the interrupt disable counter without possibly re-enabling interrupts.
///
/// Used to decrement counter while keeping interrupts disabled inside an interrupt.
/// Look at documentation for INTERRUPT_DISABLE_COUNTER to know more.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Safety comment explaining when it is safe to call this function.

@kitlith
Copy link
Contributor Author

kitlith commented Oct 31, 2019

They don't, per-se. Even the one which I have marked unsafe consists of entirely safe code. (I haven't marked the other two as unsafe because they were previously safe, but I am willing to change that.)

However, I consider calling them unsafe, as manually incrementing/decrementing the counter at the wrong time is going to break the counter and cause deadlocks.

Could possibly make it less fragile by disabling interrupts every time disable_interrupts is called, but that probably means that stuff will just break in a different, non-obvious way.

@todo
Copy link

todo bot commented Oct 31, 2019

Is it valid for interrupts to be active here?

The interrupt wrappers call decrement_lock_count before calling this function, because it has the possibility to never return. Upon a spurious wake up, interrupts will end up being enabled.


// TODO: Is it valid for interrupts to be active here?
// BODY: The interrupt wrappers call decrement_lock_count before calling this function,
// BODY: because it has the possibility to never return.
// BODY: Upon a spurious wake up, interrupts will end up being enabled.
let _ = scheduler::unschedule(&lock, lock.lock());
}
}


This comment was generated by todo based on a TODO comment in c62dc63 in #534. cc @kitlith.

@roblabla
Copy link
Member

roblabla commented Nov 1, 2019

The safety comments on the unsafe fn should be modeled like clippy asks (clippy version used in SunriseOS doesn't have that lint right now, but we'll update soon-ish). E.G. it should be a # Safety section in the markdown (and not # Unsafety).

As for the // Safety comments, they should go inside the unsafe block. I'm planning on making a tool that looks for unsafe blocks and checks that they contain a safety comment ^^'.

Looks like when I went looking for examples, I found the wrong example.
It's funny, I could've sworn there was already some tool for this, but
couldn't find it when I went to look for it. *shrug*
Copy link
Member

@roblabla roblabla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! This feels good. Just have a couple nits on the documentation front. I'll merge it once those are included!

Thanks a lot for this 😄

kernel/src/i386/interrupt_service_routines.rs Outdated Show resolved Hide resolved
kernel/src/i386/interrupt_service_routines.rs Outdated Show resolved Hide resolved
Co-Authored-By: Robin Lambertz <[email protected]>
@kitlith kitlith force-pushed the recursive_irq_guard branch from 4bd1401 to 9ebe582 Compare November 9, 2019 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Find sane design for SpinLockIRQ safety SpinLockIRQ is totally broken
2 participants