Re: [PATCH v4 72/73] xfs: Convert mru cache to XArray

From: Dave Chinner
Date: Fri Dec 08 2017 - 17:37:07 EST


On Fri, Dec 08, 2017 at 12:35:07PM -0500, Alan Stern wrote:
> On Fri, 8 Dec 2017, Byungchul Park wrote:
>
> > I'm sorry to hear that.. If I were you, I would also get
> > annoyed. And.. thanks for explanation.
> >
> > But, I think assigning lock classes properly and checking
> > relationship of the classes to detect deadlocks is reasonable.
> >
> > In my opinion about the common lockdep stuff, there are 2
> > problems on it.
> >
> > 1) Firstly, it's hard to assign lock classes *properly*. By
> > default, it relies on the caller site of lockdep_init_map(),
> > but we need to assign another class manually, where ordering
> > rules are complicated so cannot rely on the caller site. That
> > *only* can be done by experts of the subsystem.

Sure, but that's not the issue here. The issue here is the lack of
communication with subsystem experts and that the annotation
complexity warnings given immediately by the subsystem experts were
completely ignored...

> > I think if they want to get benifit from lockdep, they have no
> > choice but to assign classes manually with the domain knowledge,
> > or use *lockdep_set_novalidate_class()* to invalidate locks
> > making the developers annoyed and not want to use the checking
> > for them.
>
> Lockdep's no_validate class is used when the locking patterns are too
> complicated for lockdep to understand. Basically, it tells lockdep to
> ignore those locks.

Let me just point out two things here:

1. Using lockdep_set_novalidate_class() for anything other
than device->mutex will throw checkpatch warnings. Nice. (*)

2. lockdep_set_novalidate_class() is completely undocumented
- it's the first I've ever heard of this functionality. i.e.
nobody has ever told us there is a mechanism to turn off
validation of an object; we've *always* been told to "change
your code and/or fix your annotations" when discussing
lockdep deficiencies. (**)

> The device core uses that class. The tree of struct devices, each with
> its own lock, gets used in many different and complicated ways.
> Lockdep can't understand this -- it doesn't have the ability to
> represent an arbitrarily deep hierarchical tree of locks -- so we tell
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

That largely describes the in-memory structure of XFS, except we
have a forest of lock trees, not just one....

> it to ignore the device locks.
>
> It sounds like XFS may need to do the same thing with its semaphores.

Who-ever adds semaphore checking to lockdep can add those
annotations. The externalisation of the development cost of new
lockdep functionality is one of the problems here.

-Dave.

(*) checkpatch.pl is considered mostly harmful round here, too,
but that's another rant....

(**) the frequent occurrence of "core code/devs aren't held to the
same rules/standard as everyone else" is another rant I have stored
up for a rainy day.

--
Dave Chinner
david@xxxxxxxxxxxxx