Re: [PATCH v3 1/3] rust: Introduce irq module

From: Benno Lossin
Date: Thu Aug 15 2024 - 17:47:19 EST

On 15.08.24 23:31, Lyude Paul wrote:
> On Thu, 2024-08-15 at 17:05 -0400, Lyude Paul wrote:
>> The type system approach is slightly more complicated, but I'm now realizing
>> it is probably the correct solution actually. Thanks for pointing that out!
>> So: Functions like wait_event_lock_interruptible_irq() work because they drop
>> the spinlock in question before re-enabling interrupts, then re-disable
>> interrupts and re-acquire the lock before checking the condition. This is
>> where a soundness issue with my current series lies.
>>
>> For the sake of explanation, let's pretend we have an imaginary Rust function
>> "irqs_on_and_sleep(irq: IrqDisabled<'_>)" that re-enables IRQs explicitly,
>> sleeps, then turns them back off. This leads to a soundness issue if we have
>> IrqDisabled be `Copy`:
>>
>>   with_irqs_disabled(|irq| {
>>       let some_guard = some_spinlockirq.lock_with(irq);
>>       // ^ Let's call this type Guard<'1, …>
>>       irqs_on_and_sleep(irq);
>>       // ^ because `irq` is just copied here, the lifetime '1 doesn't end here.
>>       // Since we re-enabled interrupts while holding a SpinLockIrq, we would
>>       // potentially deadlock here.
>>       some_function(some_guard.some_data);
>>   });
>>
>> So - I'm thinking we might want to make it so that IrqDisabled does not have
>> `Copy` - and that resources acquired with it should share the lifetime of an
>> immutable reference to it. Let's now pretend `.lock_with()` takes an &'1
>> IrqDisabled, and the irqs_on_and_sleep() function from before returns an
>> IrqDisabled.
>>
>>   with_irqs_disabled(|irq| { // <- still passed by value here
>>       let some_guard = some_spinlockirq.lock_with(&irq); // <- Guard<'1, …>
>>       let irq = irqs_on_and_sleep(irq); // The lifetime of '1 ends here
>>       some_function(some_guard.some_data);
>>       // Success! ^ this fails to compile, as '1 no longer lives long enough
>>       // for the guard to still be usable.
>>       // Deadlock averted :)
>>   });
>>
>> Then if we were to add bindings for things like
>> wait_event_lock_interruptible_irq() - we could have those take both the
>> IrqDisabled token and the Guard<'1, …> by value - and then return them
>> afterwards. Which I believe would fix the soundness issue :)
>> How does that sound to everyone?
> I should note though - after thinking about this for a moment, I realized that
> there are still some issues with this. For instance: Since
> with_irqs_disabled() can still be nested, a nested with_irqs_disabled() call
> could create another IrqDisabled with its own lifetime - and thus we wouldn't
> be able to do this same lifetime trick with any resources acquired outside the
> nested call.
> Granted - we -do- still have lockdep for this, so in such a situation with a
> lockdep-enabled kernel we would certainly get a warning when this happens. I
> think one option we might have if we wanted to go a bit further with safety
> here: maybe we could do something like this:
>
>   pub fn with_irqs_disabled<T>(cb: impl for<'a> FnOnce(IrqDisabled<'a>) -> T) -> T {
>       // With this function, we would assert that IRQs are not enabled at the start
>   }
>
> (I am a bit new to HRTBs, so the syntax here might not be right - but
> hopefully you can still follow what I mean)
>
>   pub fn with_nested_irqs_disabled<T>(
>       irq: impl for<'a> Option<&'a mut IrqDisabled<'a>>,

This doesn't make sense, since `impl` can only be used on traits and
`Option` is not a trait.
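
As a rough illustration, here is a userspace sketch of how that parameter
could be spelled as a plain type instead. `IrqDisabled` and its `mock()`
constructor below are stand-ins invented for this example; nothing here
touches real interrupt state or matches the API in the patch. The two
lifetimes are kept separate as an assumption of this sketch.

```rust
use core::marker::PhantomData;

/// Stand-in for the token from the series (hypothetical, no real IRQ state).
pub struct IrqDisabled<'a>(PhantomData<&'a ()>);

impl IrqDisabled<'_> {
    /// Mock constructor for this sketch only; a real token must not be
    /// constructible like this.
    pub fn mock() -> Self {
        IrqDisabled(PhantomData)
    }
}

/// `Option` is a type, so it is written directly. Only the callback, which
/// is bounded by a trait (`FnOnce`), uses `impl for<'c>`.
pub fn with_nested_irqs_disabled<T>(
    irq: Option<&mut IrqDisabled<'_>>,
    cb: impl for<'c> FnOnce(IrqDisabled<'c>) -> T,
) -> T {
    // The mutable borrow of the outer token (if any) is held for the whole
    // call. A real implementation would check or change IRQ state here.
    let _outer = irq;
    cb(IrqDisabled::mock())
}
```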

>       cb: impl for<'a> FnOnce(IrqDisabled<'a>) -> T,
>   ) -> T {
>       // With this function, we would assert that IRQs are disabled
>       // if irq.is_some(), otherwise we would assert they're disabled
>       // Since we require a mutable reference, this would still invalidate any
>       // borrows which rely on the previous IrqDisabled token
>   }

I don't see the utility of this: if you already have an `IrqDisabled`,
then you don't need to call `with_irqs_disabled`. If you don't have one,
IRQs might still be disabled, but you have no way of knowing.
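
To make the first case concrete, a minimal userspace sketch: code that
already holds a token just forwards it. The token type, the mocked
`with_irqs_disabled` and the `inner`/`outer` helpers are all made up for
illustration, and passing the token by reference is an assumption (it
avoids relying on `Copy`).

```rust
use core::marker::PhantomData;

/// Mock token (hypothetical stand-in, does not touch real IRQ state).
pub struct IrqDisabled<'a>(PhantomData<&'a ()>);

/// Mock of `with_irqs_disabled`: it only hands a token to the closure.
pub fn with_irqs_disabled<T>(cb: impl for<'a> FnOnce(IrqDisabled<'a>) -> T) -> T {
    cb(IrqDisabled(PhantomData))
}

/// Hypothetical helper that must run with IRQs disabled, so it asks for
/// the token as proof.
pub fn inner(_irq: &IrqDisabled<'_>, x: u32) -> u32 {
    x + 1
}

/// Already holds a token, so it forwards it rather than nesting another
/// `with_irqs_disabled` call.
pub fn outer(irq: &IrqDisabled<'_>, x: u32) -> u32 {
    inner(irq, x) * 2
}
```

A caller without a token would write `with_irqs_disabled(|irq| outer(&irq, 1))`.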

> Granted - I have no idea how ergonomic something like this would be since on
> the C side of things: we don't really require that the user know the prior IRQ
> state for things like irqsave/irqrestore functions.

I think this is a bad idea ergonomically, since it will infect a lot of
functions that don't care about IRQs.
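
A small userspace sketch of what I mean by "infect" (all names are
hypothetical, and the token is a mock): only `leaf` looks at the IRQ
state, but `middle` and `top` still have to accept and forward the
parameter.

```rust
use core::marker::PhantomData;

/// Mock token (hypothetical stand-in, does not touch real IRQ state).
pub struct IrqDisabled<'a>(PhantomData<&'a ()>);

impl IrqDisabled<'_> {
    /// Mock constructor for this sketch only.
    pub fn mock() -> Self {
        IrqDisabled(PhantomData)
    }
}

/// The only function that actually cares whether a token exists.
pub fn leaf(irq: Option<&mut IrqDisabled<'_>>) -> bool {
    irq.is_some()
}

/// Does unrelated work, but must take `irq` just to pass it along.
pub fn middle(irq: Option<&mut IrqDisabled<'_>>, n: u32) -> (u32, bool) {
    (n * 2, leaf(irq))
}

/// Same again one level up.
pub fn top(irq: Option<&mut IrqDisabled<'_>>, n: u32) -> (u32, bool) {
    middle(irq, n + 1)
}
```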
