Re: [PATCH v2 00/14] hrtimer Rust API

From: Dirk Behme
Date: Sat Oct 12 2024 - 03:50:14 EST


On 12.10.24 09:41, Boqun Feng wrote:
On Sat, Oct 12, 2024 at 07:19:41AM +0200, Dirk Behme wrote:
On 12.10.24 01:21, Boqun Feng wrote:
On Fri, Oct 11, 2024 at 05:43:57PM +0200, Dirk Behme wrote:
Hi Andreas,

Am 11.10.24 um 16:52 schrieb Andreas Hindborg:

Dirk, thanks for reporting!

:)

Boqun Feng <boqun.feng@xxxxxxxxx> writes:

On Tue, Oct 01, 2024 at 02:37:46PM +0200, Dirk Behme wrote:
On 18.09.2024 00:27, Andreas Hindborg wrote:
Hi!

This series adds support for using the `hrtimer` subsystem from Rust code.

I tried breaking up the code into smaller patches; hopefully that will
ease the review process a bit.

Just fyi, having all 14 patches applied I get [1] on the first (doctest)
Example from hrtimer.rs.

This is from lockdep:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/locking/lockdep.c#n4785

Having just a quick look I'm not sure what the root cause is. Maybe mutex in
interrupt context? Or a more subtle one?

I think it's calling mutex inside an interrupt context as shown by the
callstack:

] __mutex_lock+0xa0/0xa4
] ...
] hrtimer_interrupt+0x1d4/0x2ac

, it is because:

+//! struct ArcIntrusiveTimer {
+//! #[pin]
+//! timer: Timer<Self>,
+//! #[pin]
+//! flag: Mutex<bool>,
+//! #[pin]
+//! cond: CondVar,
+//! }

has a Mutex<bool>, which actually should be a SpinLockIrq [1]. Note that
irq-off is needed for the lock, because otherwise we will hit a self
deadlock due to interrupts:

	spin_lock(&a);
	> timer interrupt
	  spin_lock(&a);
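
The self-deadlock pattern above can be imitated in userspace. This is only a rough analogy, not kernel code: `std::sync::Mutex` stands in for the spinlock, a plain function call on the same thread stands in for the timer interrupt, and `try_lock` is used so the sketch reports the conflict instead of hanging. The names `handler_can_take_lock` and `interrupted_while_locked` are made up for illustration.

```rust
use std::sync::{Mutex, TryLockError};

/// Stands in for the timer callback: it needs the same lock as the code
/// it interrupted. Returns `true` if the lock was free and was taken.
fn handler_can_take_lock(lock: &Mutex<u64>) -> bool {
    match lock.try_lock() {
        Ok(mut guard) => {
            *guard += 1;
            true
        }
        // The lock is already held. A real spin_lock() would spin here
        // forever, because the holder cannot run until the handler returns.
        Err(TryLockError::WouldBlock) => false,
        Err(TryLockError::Poisoned(_)) => false,
    }
}

/// Simulates "the interrupt arrives while the lock is held" by calling the
/// handler on the same thread before the guard is dropped.
fn interrupted_while_locked(lock: &Mutex<u64>) -> bool {
    let _guard = lock.lock().unwrap();
    handler_can_take_lock(lock)
}
```

Calling `interrupted_while_locked(&lock)` returns `false`, whereas calling `handler_can_take_lock(&lock)` with the lock free returns `true`. Disabling interrupts while the lock is held removes the first case entirely, which is the point of the irq-off lock variant.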

Also notice that the IrqDisabled<'_> token can simply be created by
::new(), because irq context guarantees that interrupts are disabled
(i.e. we don't support nested interrupts*).

I updated the example based on the work in [1]. I think we need to
update `CondVar::wait` to support waiting with irq disabled.

Yes, I agree. This answers one of the open questions I had in the discussion
with Boqun :)

What do you think about the other open question: in this *special* case,
could we go *without* any lock? The 'while *guard != 5' loop in the main
thread only reads guard, so it doesn't matter whether it *reads* the old
or the new value. The read/modify/write of guard in the callback runs in
interrupt context, i.e. with interrupts disabled, so it can't be
interrupted (excluding nested interrupts). So the modification of guard
doesn't need a lock to protect it from being interrupted, as long as
there is no modification of guard "outside" the interrupt-disabled
context.

What do you think?


Reading while another CPU is writing is a data race, which is UB.

Could you help me understand where exactly you see UB in Andreas' 'while
*guard != 5' loop in case no locking is used? As mentioned I'm under the

Sure, but could you provide the exact code you have in mind? If you
don't use a lock here, you cannot have a guard. I need the exact code
to point out where the compiler may "mis-compile" (as a result of the
UB).


I thought we are talking about something like

 #[pin_data]
 struct ArcIntrusiveTimer {
     #[pin]
     timer: Timer<Self>,
     #[pin]
-    flag: SpinLockIrq<u64>,
+    flag: u64,
     #[pin]
     cond: CondVar,
 }

?

Best regards

Dirk

impression that it doesn't matter if the old or new guard value is read in
this special case.


For one thing, if the compiler believes no one is accessing the value
because the code uses an immutable reference, it can "optimize" the loop
away:

	while *var != 5 {
	    do_something();
	}

into

	if *var != 5 {
	    loop { do_something(); }
	}

But as I said, I need to see the exact code to suggest a relevant
mis-compile. Also note that even when a mis-compile seems impossible
at the moment, UB is UB: compilers are free to do anything they
want (or don't want). So a "mis-compile" example only helps us
understand the potential result of UB.
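
For comparison, here is a userspace sketch of a lock-free counter that has no data race. It is not a proposal for the hrtimer example (the `CondVar` there still needs a lock guard) and it uses std threads rather than kernel APIs; it only shows that a shared counter polled without a lock has to be an atomic, so that every loop iteration is a real load the compiler cannot hoist. The names `tick` and `wait_for_five` are made up for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Userspace stand-in for the timer callback: bumps the shared counter
/// and returns the new value.
fn tick(flag: &AtomicU64) -> u64 {
    flag.fetch_add(1, Ordering::Release) + 1
}

/// Spawns a "timer" thread that ticks five times and polls the counter
/// from the calling thread without taking any lock.
fn wait_for_five() -> u64 {
    let flag = Arc::new(AtomicU64::new(0));
    let timer_flag = Arc::clone(&flag);

    let timer = thread::spawn(move || {
        for _ in 0..5 {
            thread::sleep(Duration::from_millis(1));
            tick(&timer_flag);
        }
    });

    // Each iteration performs an atomic load, so the read cannot be moved
    // out of the loop the way a read through `&u64` could be.
    while flag.load(Ordering::Acquire) != 5 {
        thread::yield_now();
    }

    timer.join().unwrap();
    flag.load(Ordering::Acquire)
}
```

`wait_for_five()` returns once the counter reaches 5. Replacing `AtomicU64` with a plain `u64` written from the other thread would bring back exactly the data race discussed above.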

Regards,
Boqun

Best regards

Dirk


Regards,
Boqun

Thanks

Dirk


Without
this, when we get back from `bindings::schedule_timeout` in
`CondVar::wait_internal`, interrupts are enabled:

```rust
use kernel::{
    hrtimer::{Timer, TimerCallback, TimerPointer, TimerRestart},
    impl_has_timer, new_condvar, new_spinlock, new_spinlock_irq,
    irq::IrqDisabled,
    prelude::*,
    sync::{Arc, ArcBorrow, CondVar, SpinLock, SpinLockIrq},
    time::Ktime,
};

#[pin_data]
struct ArcIntrusiveTimer {
    #[pin]
    timer: Timer<Self>,
    #[pin]
    flag: SpinLockIrq<u64>,
    #[pin]
    cond: CondVar,
}

impl ArcIntrusiveTimer {
    fn new() -> impl PinInit<Self, kernel::error::Error> {
        try_pin_init!(Self {
            timer <- Timer::new(),
            flag <- new_spinlock_irq!(0),
            cond <- new_condvar!(),
        })
    }
}

impl TimerCallback for ArcIntrusiveTimer {
    type CallbackTarget<'a> = Arc<Self>;
    type CallbackTargetParameter<'a> = ArcBorrow<'a, Self>;

    fn run(this: Self::CallbackTargetParameter<'_>, irq: IrqDisabled<'_>) -> TimerRestart {
        pr_info!("Timer called\n");
        let mut guard = this.flag.lock_with(irq);
        *guard += 1;
        this.cond.notify_all();
        if *guard == 5 {
            TimerRestart::NoRestart
        } else {
            TimerRestart::Restart
        }
    }
}

impl_has_timer! {
    impl HasTimer<Self> for ArcIntrusiveTimer { self.timer }
}

let has_timer = Arc::pin_init(ArcIntrusiveTimer::new(), GFP_KERNEL)?;
let _handle = has_timer.clone().schedule(Ktime::from_ns(200_000_000));

kernel::irq::with_irqs_disabled(|irq| {
    let mut guard = has_timer.flag.lock_with(irq);

    while *guard != 5 {
        pr_info!("Not 5 yet, waiting\n");
        has_timer.cond.wait(&mut guard); // <-- we arrive back here with interrupts enabled!
    }
});
```
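
As a userspace analogy for why `wait` matters here: a condition variable wait drops the lock while sleeping and takes it again before returning, so whatever state the lock implies (in the kernel case, interrupts being disabled) has to be restored on the way back. The sketch below uses `std::sync::Condvar`, not the kernel `CondVar`, and says nothing about how the kernel side should be implemented; `wait_until_five` is a made-up name.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Waits on a condition variable until a "timer" thread has bumped the
/// shared counter five times, then returns the value that was observed.
fn wait_until_five() -> u64 {
    let shared = Arc::new((Mutex::new(0u64), Condvar::new()));
    let timer_shared = Arc::clone(&shared);

    let timer = thread::spawn(move || {
        let (flag, cond) = &*timer_shared;
        for _ in 0..5 {
            thread::sleep(Duration::from_millis(1));
            // The temporary guard is dropped at the end of this statement.
            *flag.lock().unwrap() += 1;
            cond.notify_all();
        }
    });

    let (flag, cond) = &*shared;
    let mut guard = flag.lock().unwrap();
    while *guard != 5 {
        // `wait` unlocks `flag` while sleeping and re-locks it before
        // returning, so `guard` is valid again on the next check.
        guard = cond.wait(guard).unwrap();
    }
    let seen = *guard;
    drop(guard);

    timer.join().unwrap();
    seen
}
```

With std, re-acquiring the mutex is all `wait` has to do on return. With a SpinLockIrq guard, the equivalent step would also have to leave interrupts disabled, which is the gap pointed out in the example above.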

I think an update of `CondVar::wait` should be part of the patch set [1].


Best regards,
Andreas


[1] https://lore.kernel.org/rust-for-linux/20240916213025.477225-1-lyude@xxxxxxxxxx/