Re: [PATCH v2 00/14] hrtimer Rust API
From: Alice Ryhl
Date: Mon Oct 14 2024 - 07:59:13 EST
On Mon, Oct 14, 2024 at 1:53 PM Dirk Behme <dirk.behme@xxxxxxxxx> wrote:
>
> Hi Alice,
>
> On 14.10.24 11:38, Alice Ryhl wrote:
> > On Mon, Oct 14, 2024 at 8:58 AM Dirk Behme <dirk.behme@xxxxxxxxx> wrote:
> >>
> >> On 13.10.24 23:06, Boqun Feng wrote:
> >>> On Sun, Oct 13, 2024 at 07:39:29PM +0200, Dirk Behme wrote:
> >>>> On 13.10.24 00:26, Boqun Feng wrote:
> >>>>> On Sat, Oct 12, 2024 at 09:50:00AM +0200, Dirk Behme wrote:
> >>>>>> On 12.10.24 09:41, Boqun Feng wrote:
> >>>>>>> On Sat, Oct 12, 2024 at 07:19:41AM +0200, Dirk Behme wrote:
> >>>>>>>> On 12.10.24 01:21, Boqun Feng wrote:
> >>>>>>>>> On Fri, Oct 11, 2024 at 05:43:57PM +0200, Dirk Behme wrote:
> >>>>>>>>>> Hi Andreas,
> >>>>>>>>>>
> >>>>>>>>>> Am 11.10.24 um 16:52 schrieb Andreas Hindborg:
> >>>>>>>>>>>
> >>>>>>>>>>> Dirk, thanks for reporting!
> >>>>>>>>>>
> >>>>>>>>>> :)
> >>>>>>>>>>
> >>>>>>>>>>> Boqun Feng <boqun.feng@xxxxxxxxx> writes:
> >>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Oct 01, 2024 at 02:37:46PM +0200, Dirk Behme wrote:
> >>>>>>>>>>>>> On 18.09.2024 00:27, Andreas Hindborg wrote:
> >>>>>>>>>>>>>> Hi!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This series adds support for using the `hrtimer` subsystem from Rust code.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I tried breaking up the code in some smaller patches, hopefully that will
> >>>>>>>>>>>>>> ease the review process a bit.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Just fyi, having all 14 patches applied I get [1] on the first (doctest)
> >>>>>>>>>>>>> Example from hrtimer.rs.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This is from lockdep:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/locking/lockdep.c#n4785
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Having just a quick look I'm not sure what the root cause is. Maybe mutex in
> >>>>>>>>>>>>> interrupt context? Or a more subtle one?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think it's calling mutex inside an interrupt context as shown by the
> >>>>>>>>>>>> callstack:
> >>>>>>>>>>>>
> >>>>>>>>>>>> ] __mutex_lock+0xa0/0xa4
> >>>>>>>>>>>> ] ...
> >>>>>>>>>>>> ] hrtimer_interrupt+0x1d4/0x2ac
> >>>>>>>>>>>>
> >>>>>>>>>>>> , it is because:
> >>>>>>>>>>>>
> >>>>>>>>>>>> +//! struct ArcIntrusiveTimer {
> >>>>>>>>>>>> +//! #[pin]
> >>>>>>>>>>>> +//! timer: Timer<Self>,
> >>>>>>>>>>>> +//! #[pin]
> >>>>>>>>>>>> +//! flag: Mutex<bool>,
> >>>>>>>>>>>> +//! #[pin]
> >>>>>>>>>>>> +//! cond: CondVar,
> >>>>>>>>>>>> +//! }
> >>>>>>>>>>>>
> >>>>>>>>>>>> has a Mutex<bool>, which actually should be a SpinLockIrq [1]. Note that
> >>>>>>>>>>>> irq-off is needed for the lock, because otherwise we will hit a self
> >>>>>>>>>>>> deadlock due to interrupts:
> >>>>>>>>>>>>
> >>>>>>>>>>>> spin_lock(&a);
> >>>>>>>>>>>> > timer interrupt
> >>>>>>>>>>>> spin_lock(&a);
> >>>>>>>>>>>>
> >>>>>>>>>>>> Also notice that the IrqDisabled<'_> token can be simply created by
> >>>>>>>>>>>> ::new(), because irq contexts should guarantee interrupt disabled (i.e.
> >>>>>>>>>>>> we don't support nested interrupts*).
> >>>>>>>>>>>
> >>>>>>>>>>> I updated the example based on the work in [1]. I think we need to
> >>>>>>>>>>> update `CondVar::wait` to support waiting with irq disabled.
> >>>>>>>>>>
> >>>>>>>>>> Yes, I agree. This answers one of the open questions I had in the discussion
> >>>>>>>>>> with Boqun :)
> >>>>>>>>>>
> >>>>>>>>>> What do you think regarding the other open question: In this *special* case
> >>>>>>>>>> here, what do you think about going *without* any lock? I mean the 'while *guard
> >>>>>>>>>> != 5' loop in the main thread is read only regarding guard. So it doesn't
> >>>>>>>>>> matter if it *reads* the old or the new value. And the read/modify/write of
> >>>>>>>>>> guard in the callback is done with interrupts disabled anyhow as it runs in
> >>>>>>>>>> interrupt context. And with this can't be interrupted (excluding nested
> >>>>>>>>>> interrupts). So this modification of guard doesn't need to be protected from
> >>>>>>>>>> being interrupted by a lock if there is no modification of guard "outside"
> >>>>>>>>>> the interrupt locked context.
> >>>>>>>>>>
> >>>>>>>>>> What do you think?
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Reading while another CPU is writing is a data race, which is UB.
> >>>>>>>>
> >>>>>>>> Could you help to understand where exactly you see UB in Andreas' 'while
> >>>>>>>> *guard != 5' loop in case no locking is used? As mentioned I'm under the
> >>>>>>>
> >>>>>>> Sure, but could you provide the code of what you mean exactly, if you
> >>>>>>> don't use a lock here, you cannot have a guard. I need the exact code
> >>>>>>> to point out where the compiler may "mis-compile" (a result of being
> >>> [...]
> >>>>>> I thought we are talking about anything like
> >>>>>>
> >>>>>> #[pin_data]
> >>>>>> struct ArcIntrusiveTimer {
> >>>>>> #[pin]
> >>>>>> timer: Timer<Self>,
> >>>>>> #[pin]
> >>>>>> - flag: SpinLockIrq<u64>,
> >>>>>> + flag: u64,
> >>>>>> #[pin]
> >>>>>> cond: CondVar,
> >>>>>> }
> >>>>>>
> >>>>>> ?
> >>>>>>
> >>>>>
> >>>>> Yes, but have you tried to actually use that for the example from
> >>>>> Andreas? I think you will find that you cannot write to `flag` inside
> >>>>> the timer callback, because you only have an `Arc<ArcIntrusiveTimer>`, so
> >>>>> no mutable reference to `ArcIntrusiveTimer`. You can of course use
> >>>>> unsafe to create a mutable reference to `flag`, but it won't be sound,
> >>>>> since you are getting a mutable reference from an immutable reference.
> >>>>
> >>>> Yes, of course. But, hmm, wouldn't that unsoundness be independent of the
> >>>> topic we discuss here? I mean we are talking about getting the compiler to
> >>>
> >>> What do you mean? If the code is unsound, you won't want to use it in an
> >>> example, right?
> >>
> >> Yes, sure. But ;)
> >>
> >> In a first step I just wanted to answer the question if we do need a
> >> lock at all in this special example. And that we could do even with
> >> unsound read/modify/write I would guess. And then, in a second step,
> >> if the answer would be "we don't need the lock", then we could think
> >> about how to make the flag handling sound. So I'm talking just about
> >> answering that question, not about the final example code. Step by step :)
> >>
> >>
> >>>> read/modify/write 'flag' in the TimerCallback. *How* we tell it to do so
> >>>> should be independent of the result what we want to look at regarding the
> >>>> locking requirements of 'flag'?
> >>>>
> >>>> Anyhow, my root motivation was to simplify Andreas' example to not use a lock
> >>>> where not strictly required. And with this make Andreas' example independent
> >>>
> >>> Well, if you don't want to use a lock then you need to use atomics,
> >>> otherwise it's likely UB,
> >>
> >> And here we are back to the initial question :) Why would it be UB
> >> without lock (and atomics)?
> >>
> >> Some (pseudo) assembly:
> >>
> >> Let's start with the main thread:
> >>
> >> ldr x1, [x0]
> >> <work with x1>
> >>
> >> x0 and x1 are registers. x0 contains the address of flag in the main
> >> memory. I.e. that instruction reads (ldr == load) the content of that
> >> memory location (flag) into x1. x1 then contains flag which can be
> >> used then. This is what I mean with "the main thread is read only". If
> >> flag, i.e. x1, does contain the old or new flag value doesn't matter.
> >> I.e. for the read only operation it doesn't matter if it is protected
> >> by a lock as the load (ldr) can't be interrupted.
> >
> > If the compiler generates a single load, then sure.
>
> Yes :)
>
> > But for an
> > unsynchronized load, the compiler may generate two separate load
> > instructions and assume that both loads read the same value.
>
> Ok, yes, if we get this from the compiler I agree that we need the
> lock, even if it's just for the read. If I get the chance the next time
> I will try to have a look at the compiler's output to get a better
> idea of this.
Usually I would say that for cases like this, the correct approach is
to use relaxed atomic loads and stores. That compiles down to ordinary
load/store instructions as desired without letting the compiler split
the load.
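
As a rough userspace sketch of that idea (not the hrtimer example itself):
this uses std::sync::atomic and a plain thread in place of the timer
callback, and the names `Shared`, `callback` and `run` are made up for
illustration. It also polls instead of waiting on a CondVar, which is an
assumption of the sketch and not something the series does.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the example struct: `flag` is an atomic
/// instead of a lock-protected integer.
struct Shared {
    flag: AtomicU64,
}

/// Stand-in for the timer callback. It only needs `&Shared`, so no
/// mutable reference has to be conjured out of the `Arc`. Because this
/// is the only writer, a relaxed load followed by a relaxed store is
/// enough for the increment.
fn callback(shared: &Shared) {
    let old = shared.flag.load(Ordering::Relaxed);
    shared.flag.store(old + 1, Ordering::Relaxed);
}

/// Runs `callback` `ticks` times on another thread while this thread
/// polls the flag. Returns the value seen after the writer has finished.
fn run(ticks: u64) -> u64 {
    let shared = Arc::new(Shared {
        flag: AtomicU64::new(0),
    });

    let writer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            for _ in 0..ticks {
                callback(&shared);
            }
        })
    };

    // Each iteration performs one atomic load. Reading an old or a new
    // value is fine here, and there is no data race.
    while shared.flag.load(Ordering::Relaxed) != ticks {
        thread::yield_now();
    }

    writer.join().unwrap();
    shared.flag.load(Ordering::Relaxed)
}
```

Calling `run(5)` mirrors the 'while *guard != 5' loop from the example.
Note that Relaxed only makes the access to `flag` itself well-defined; it
does not order accesses to any other memory, so this only fits when the
flag is the only thing being communicated.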
Alice