Re: [PATCH RFC] drivers/core: Replace lockdep_set_novalidate_class() with unique class keys

From: Boqun Feng
Date: Tue Feb 14 2023 - 00:55:17 EST


On Tue, Feb 14, 2023 at 01:27:33PM +0800, Hillf Danton wrote:
> On Mon, 13 Feb 2023 18:09:16 -0800 Boqun Feng <boqun.feng@xxxxxxxxx>
> > On Mon, Feb 13, 2023 at 09:03:14PM -0500, Alan Stern wrote:
> > > On Mon, Feb 13, 2023 at 05:51:11PM -0800, Boqun Feng wrote:
> > > > Basically if you have two lock instances A and B with the same class,
> > > > and you know that locking ordering is always A -> B, then you can do
> > > >
> > > > mutex_lock(A);
> > > > mutex_lock_nest_lock(B, A); // lock B.
> > > >
> > > > to tell the lockdep this is not deadlock, plus lockdep will treat the
> > > > acquisition of A and the precondition of acquisition B, so the following
> > > > is not a deadlock as well:
> > > >
> > > > T1:
> > > > mutex_lock(A);
> > > > mutex_lock(C);
> > > > mutex_lock_nest_lock(B, A);
> > > >
> > > > T2:
> > > > mutex_lock(A);
> > > > mutex_lock_nest_lock(B, A);
> > > > mutex_lock(C);
> > >
> > > Why isn't this treated as a deadlock? It sure looks like a deadlock to
> > > me. Is this an example where lockdep just doesn't get the right answer?
>
> Syzbot reported deadlock[1] with A ignored.
>
> [1] https://lore.kernel.org/linux-mm/20230130073136.59-1-hdanton@xxxxxxxx/
>

Right, I think that's a false positive, however it's not related to
mutex_lock_nest_lock(). Anyway mutex_lock_nest_lock() cannot help that
case since these are three different lock class.

Actually, reading the code again, I think I made a mistake, for
mutex_lock_nest_lock(), the following *is* a deadlock to lockdep:

T1:
mutex_lock(A);
mutex_lock(C);
mutex_lock_nest_lock(B, A);

T2:
mutex_lock(A);
mutex_lock_nest_lock(B, A);
mutex_lock(C);

and this *is not* a deadlock to lockdep:

T1:
mutex_lock(A);
mutex_lock_nest_lock(C, A);
mutex_lock_nest_lock(B, A);

T2:
mutex_lock(A);
mutex_lock_nest_lock(B, A);
mutex_lock_nest_lock(C, A);

The current semantics of _nest_lock() is tricky, it only provides the
"nest" effect if it is the next lock acquired after the "parent" lock.
Maybe we can change and make it clear a little bit to make it more
useful.

Ah, actually someone found it 7 years ago:

https://lore.kernel.org/lkml/20150810095247.GA4606@xxxxxxxxxxxxxxxxxxxxxxx/

;-)

Regards,
Boqun

> > Because A serializes B and C, so that particular piece of code doesn't
> > cause deadlock. Note that you can still use you normal mutex_lock() for
> > B, so if there is more code:
> >
> > T3:
> > mutex_lock(C);
> > mutex_lock(B);
> >
> > lockdep will report deadlock.
> >
> > Regards,
> > Boqun
> >
> > > Alan Stern