Re: [PATCH v2 00/14] hrtimer Rust API

From: Boqun Feng
Date: Sun Oct 13 2024 - 17:06:57 EST


On Sun, Oct 13, 2024 at 07:39:29PM +0200, Dirk Behme wrote:
> On 13.10.24 00:26, Boqun Feng wrote:
> > On Sat, Oct 12, 2024 at 09:50:00AM +0200, Dirk Behme wrote:
> > > On 12.10.24 09:41, Boqun Feng wrote:
> > > > On Sat, Oct 12, 2024 at 07:19:41AM +0200, Dirk Behme wrote:
> > > > > On 12.10.24 01:21, Boqun Feng wrote:
> > > > > > On Fri, Oct 11, 2024 at 05:43:57PM +0200, Dirk Behme wrote:
> > > > > > > Hi Andreas,
> > > > > > >
> > > > > > > > On 11.10.24 at 16:52, Andreas Hindborg wrote:
> > > > > > > >
> > > > > > > > Dirk, thanks for reporting!
> > > > > > >
> > > > > > > :)
> > > > > > >
> > > > > > > > Boqun Feng <boqun.feng@xxxxxxxxx> writes:
> > > > > > > >
> > > > > > > > > On Tue, Oct 01, 2024 at 02:37:46PM +0200, Dirk Behme wrote:
> > > > > > > > > > On 18.09.2024 00:27, Andreas Hindborg wrote:
> > > > > > > > > > > Hi!
> > > > > > > > > > >
> > > > > > > > > > > This series adds support for using the `hrtimer` subsystem from Rust code.
> > > > > > > > > > >
> > > > > > > > > > > I tried breaking up the code in some smaller patches, hopefully that will
> > > > > > > > > > > ease the review process a bit.
> > > > > > > > > >
> > > > > > > > > > Just fyi, having all 14 patches applied I get [1] on the first (doctest)
> > > > > > > > > > Example from hrtimer.rs.
> > > > > > > > > >
> > > > > > > > > > This is from lockdep:
> > > > > > > > > >
> > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/locking/lockdep.c#n4785
> > > > > > > > > >
> > > > > > > > > > Having just a quick look I'm not sure what the root cause is. Maybe mutex in
> > > > > > > > > > interrupt context? Or a more subtle one?
> > > > > > > > >
> > > > > > > > > I think it's calling mutex inside an interrupt context as shown by the
> > > > > > > > > callstack:
> > > > > > > > >
> > > > > > > > > ] __mutex_lock+0xa0/0xa4
> > > > > > > > > ] ...
> > > > > > > > > ] hrtimer_interrupt+0x1d4/0x2ac
> > > > > > > > >
> > > > > > > > > , it is because:
> > > > > > > > >
> > > > > > > > > +//! struct ArcIntrusiveTimer {
> > > > > > > > > +//! #[pin]
> > > > > > > > > +//! timer: Timer<Self>,
> > > > > > > > > +//! #[pin]
> > > > > > > > > +//! flag: Mutex<bool>,
> > > > > > > > > +//! #[pin]
> > > > > > > > > +//! cond: CondVar,
> > > > > > > > > +//! }
> > > > > > > > >
> > > > > > > > > has a Mutex<bool>, which actually should be a SpinLockIrq [1]. Note that
> > > > > > > > > irq-off is needed for the lock, because otherwise we will hit a self
> > > > > > > > > deadlock due to interrupts:
> > > > > > > > >
> > > > > > > > > spin_lock(&a);
> > > > > > > > > > timer interrupt
> > > > > > > > > spin_lock(&a);
> > > > > > > > >
> > > > > > > > > Also notice that the IrqDisabled<'_> token can be simply created by
> > > > > > > > > ::new(), because irq contexts should guarantee interrupt disabled (i.e.
> > > > > > > > > we don't support nested interrupts*).
> > > > > > > >
> > > > > > > > I updated the example based on the work in [1]. I think we need to
> > > > > > > > update `CondVar::wait` to support waiting with irq disabled.
> > > > > > >
> > > > > > > Yes, I agree. This answers one of the open questions I had in the discussion
> > > > > > > with Boqun :)
> > > > > > >
> > > > > > > What do you think regarding the other open question: In this *special* case
> > > > > > > here, what do you think to go *without* any lock? I mean the 'while *guard
> > > > > > > != 5' loop in the main thread is read only regarding guard. So it doesn't
> > > > > > > matter if it *reads* the old or the new value. And the read/modify/write of
> > > > > > > guard in the callback is done with interrupts disabled anyhow as it runs in
> > > > > > > interrupt context. And with this can't be interrupted (excluding nested
> > > > > > > interrupts). So this modification of guard doesn't need to be protected from
> > > > > > > > being interrupted by a lock if there is no modification of guard "outside"
> > > > > > > > the interrupt locked context.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > >
> > > > > > > Reading while another CPU is writing is a data race, which is UB.
> > > > >
> > > > > Could you help to understand where exactly you see UB in Andreas' 'while
> > > > > *guard != 5' loop in case no locking is used? As mentioned I'm under the
> > > >
> > > > > Sure, but could you provide the code of what you mean exactly? If you
> > > > > don't use a lock here, you cannot have a guard. I need the exact code
> > > > to point out where the compiler may "mis-compile" (a result of being
[...]
> > > I thought we are talking about anything like
> > >
> > > #[pin_data]
> > > struct ArcIntrusiveTimer {
> > > #[pin]
> > > timer: Timer<Self>,
> > > #[pin]
> > > - flag: SpinLockIrq<u64>,
> > > + flag: u64,
> > > #[pin]
> > > cond: CondVar,
> > > }
> > >
> > > ?
> > >
> >
> > Yes, but have you tried to actually use that for the example from
> > Andreas? I think you will find that you cannot write to `flag` inside
> > the timer callback, because you only have an `Arc<ArcIntrusiveTimer>`, so
> > there is no mutable reference to `ArcIntrusiveTimer`. You can of course use
> > unsafe to create a mutable reference to `flag`, but it won't be sound,
> > since you are getting a mutable reference from an immutable reference.
>
> Yes, of course. But, hmm, wouldn't that unsoundness be independent of the
> topic we discuss here? I mean we are talking about getting the compiler to

What do you mean? If the code is unsound, you won't want to use it in an
example, right?

> read/modify/write 'flag' in the TimerCallback. *How* we tell it to do so
> should be independent of the result we want to look at regarding the
> locking requirements of 'flag'?
>
> Anyhow, my root motivation was to simplify Andreas example to not use a lock
> where not strictly required. And with this make Andreas example independent

Well, if you don't want to use a lock then you need to use atomics,
otherwise it's likely UB, but atomics are still WIP, so that's why I
suggested that Andreas use a lock first. But I guess I didn't realise the
lock needs to be irq-safe when I suggested that.
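
To illustrate the lock-free variant, here is a minimal userspace sketch.
It assumes std's `AtomicU64` and a plain thread standing in for the timer
callback; `Shared`, `callback` and `run` are hypothetical names, and none
of this is the kernel atomics API (which is still WIP):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Userspace stand-in for the example struct: `flag` is an atomic
/// instead of a lock-protected integer.
struct Shared {
    flag: AtomicU64,
}

/// Plays the role of the timer callback. It only gets shared access
/// (`&Shared`, reached through an `Arc`), yet can update `flag` soundly.
fn callback(shared: &Shared) -> u64 {
    // fetch_add returns the previous value, so add 1 for the new one.
    shared.flag.fetch_add(1, Ordering::Release) + 1
}

/// Runs `callback` `target` times on another thread and waits for it.
fn run(target: u64) -> u64 {
    let shared = Arc::new(Shared {
        flag: AtomicU64::new(0),
    });

    let writer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            for _ in 0..target {
                callback(&shared);
            }
        })
    };

    // Reader side: the equivalent of `while *guard != 5`, but with an
    // atomic load instead of a guard, so there is no data race.
    while shared.flag.load(Ordering::Acquire) != target {
        thread::yield_now();
    }

    writer.join().unwrap();
    shared.flag.load(Ordering::Acquire)
}

fn main() {
    println!("flag = {}", run(5));
}
```

The point is only that both the concurrent read and the write through a
shared reference go through the atomic, so no `unsafe` and no mutable
reference are needed.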

Regards,
Boqun

> on mutex lockdep issues, SpinLockIrq changes and possible required CondVar
> updates. But maybe we find another way to simplify it and decrease the
> dependencies. In the end it's just example code ;)
>
> Best regards
>
> Dirk
>
>
> > Regards,
> > Boqun
> >
[...]