Re: [PATCH RFC 00/11] rust: Implicit lock class creation & Arc Lockdep integration

From: Boqun Feng
Date: Fri Jul 14 2023 - 11:21:20 EST


On Fri, Jul 14, 2023 at 01:59:26PM +0000, Alice Ryhl wrote:
> Asahi Lina <lina@xxxxxxxxxxxxx> writes:
> > On 14/07/2023 19.13, Alice Ryhl wrote:
> > > Asahi Lina <lina@xxxxxxxxxxxxx> writes:
> > >> Begone, lock classes!
> > >>
> > >> As discussed in meetings/etc, we would really like to support implicit
> > >> lock class creation for Rust code. Right now, lock classes are created

Thanks for looking into this! Could you also copy locking maintainers in
the next version?

> > >> using macros and passed around (similar to C). Unfortunately, Rust
> > >> macros don't look like Rust functions, which means adding lockdep to a
> > >> type is a breaking API change. This makes Rust mutex creation rather
> > >> ugly, with the new_mutex!() macro and friends.
> > >>
> > >> Implicit lock classes have to be unique per instantiation code site.
> > >> Notably, with Rust generics and monomorphization, this is not the same
> > >> as unique per generated code instance. If this weren't the case, we
> > >> could use inline functions and asm!() magic to try to create lock
> > >> classes that have the right uniqueness semantics. But that doesn't work,
> > >> since it would create too many lock classes for the same actual lock
> > >> creation in the source code.
> > >>
> > >> But Rust does have one trick we can use: it can track the caller
> > >> location (as file:line:column), across multiple functions. This works
> > >> using an implicit argument that gets passed around, which is exactly the
> > >> thing we do for lock classes. The tricky bit is that, while the value of
> > >> these Location objects has the semantics we want (unique value per
> > >> source code location), there is no guarantee that they are deduplicated
> > >> in memory.
> > >>
> > >> So we use a hash table, and map Location values to lock classes. Et
> > >> voila, implicit lock class support!
> > >>
> > >> This lets us clean up the Mutex & co APIs and make them look a lot more
> > >> Rust-like, but it also means we can now throw Lockdep into more APIs
> > >> without breaking the API. And so we can pull a neat trick: adding
> > >> Lockdep support into Arc<T>. This catches cases where the Arc Drop
> > >> implementation could create a locking correctness violation only when
> > >> the reference count drops to 0 at that particular drop site, which is
> > >> otherwise not detectable unless that condition actually happens at
> > >> runtime. Since Drop is "magic" in Rust and Drop codepaths very difficult
> > >> to audit, this helps a lot.
> > >>
> > >> For the initial RFC, this implements the new API only for Mutex. If this
> > >> looks good, I can extend it to CondVar & friends in the next version.
> > >> This series also folds in a few related minor dependencies / changes
> > >> (like the pin_init mutex stuff).
> > >
> > > I'm not convinced that this is the right compromise. Moving lockdep
> > > class creation to runtime sounds unfortunate, especially since this
> > > makes them fallible due to memory allocations (I think?).
> > >
> > > I would be inclined to keep using macros for this.
> >
> > Most people were very enthusiastic about this change in the meetings...
> > it wasn't even my own idea ^^
>
> I don't think I was in that meeting. Anyway,
>
> > I don't think the fallibility is an issue. Lockdep is a debugging tool,
> > and it doesn't have to handle all possible circumstances perfectly. If
> > you are debugging normal lock issues you probably shouldn't be running
> > out of RAM, and if you are debugging OOM situations the lock keys would
> > normally have been created long before you reach an OOM situation, since
> > they would be created the first time a relevant lock class is used. More
> > objects of the same class don't cause any more allocations. And the code
> > has a fallback for the OOM case, where it just uses the Location object
> > as a static lock class. That's not ideal and degrades the quality of the
> > lockdep results, but it shouldn't completely break anything.
>
> If you have a fallback when the allocation fails, that helps ...
>
> You say that Location objects are not necessarily unique per file
> location. In practice, how often are they not unique? Always just using
> the Location object as a static lock class seems like it would
> significantly simplify this proposal.
>

Agreed. For example, `caller_lock_class_inner` has a Mutex critical
section in it (for the hash table synchronization), that makes it
impossible to be called in preemption disabled contexts, which limits
the usage.

Regards,
Boqun

> > The advantages of being able to throw lockdep checking into arbitrary
> > types, like the Arc<T> thing, are pretty significant. It closes a major
> > correctness checking issue we have with Rust and its automagic Drop
> > implementations that are almost impossible to properly audit for
> > potential locking issues. I think that alone makes this worth it, even
> > if you don't use it for normal mutex creation...
>
> I do agree that there is value in being able to more easily detect
> potential deadlocks involving destructors of ref-counted values. I once
> had a case of that myself, though lockdep was able to catch it without
> this change because it saw the refcount hit zero in the right place.
>
> Alice
>