Re: [PATCH 2/9] xfs: introduce and use KM_NOLOCKDEP to silence reclaim lockdep false positives

From: Dave Chinner
Date: Tue Dec 20 2016 - 16:39:17 EST


On Mon, Dec 19, 2016 at 02:06:19PM -0800, Darrick J. Wong wrote:
> On Tue, Dec 20, 2016 at 08:24:13AM +1100, Dave Chinner wrote:
> > On Thu, Dec 15, 2016 at 03:07:08PM +0100, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@xxxxxxxx>
> > >
> > > Now that the page allocator offers __GFP_NOLOCKDEP let's introduce
> > > KM_NOLOCKDEP alias for the xfs allocation APIs. While we are at it
> > > also change KM_NOFS users introduced by b17cb364dbbb ("xfs: fix missing
> > > KM_NOFS tags to keep lockdep happy") and use the new flag for them
> > > instead. There is really no reason to make these allocation contexts
> > > weaker just because of lockdep, which might not even be enabled
> > > in most cases.
> > >
> > > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> >
> > I'd suggest that it might be better to drop this patch for now -
> > it's not necessary for the context flag changeover but does
> > introduce a risk of regressions if the conversion is wrong.
>
> I was just about to write in that while I didn't see anything obviously
> wrong with the NOFS removals, I also don't know for sure that we can't
> end up recursively in those code paths (specifically the directory
> traversal thing).

The issue is with code paths that can be called from both inside and
outside transaction context - lockdep complains when it sees an
allocation path that is used with both GFP_NOFS and GFP_KERNEL
context, because it can't tell whether the GFP_KERNEL usage is safe
or not.
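
As a rough illustration of that check, here is a userspace model (not
lockdep's real code or data structures; every name below is made up
for the example). It assumes the warning boils down to one lock class
being seen both taken from reclaim and held across an allocation that
is allowed to enter filesystem reclaim:

```c
#include <stdbool.h>

/* Hypothetical per-lock-class usage record, for illustration only. */
struct model_lock_class {
	bool taken_in_reclaim;   /* lock was acquired from the reclaim path */
	bool held_over_fs_alloc; /* lock was held across an FS-capable allocation */
};

/* Record that reclaim acquired a lock of this class. */
static void model_note_reclaim_acquire(struct model_lock_class *c)
{
	c->taken_in_reclaim = true;
}

/*
 * Record an allocation made while a lock of this class is held.
 * may_enter_fs models a GFP_KERNEL-style allocation; nolockdep models
 * an annotation telling the checker to skip this allocation.
 */
static void model_note_alloc(struct model_lock_class *c, bool may_enter_fs,
			     bool nolockdep)
{
	if (may_enter_fs && !nolockdep)
		c->held_over_fs_alloc = true;
}

/* The model reports a problem only when both usages have been seen. */
static bool model_would_warn(const struct model_lock_class *c)
{
	return c->taken_in_reclaim && c->held_over_fs_alloc;
}
```

In this model a GFP_NOFS-style call (may_enter_fs == false) and an
annotated call (nolockdep == true) both keep the checker quiet, but
only the second leaves the allocation itself unrestricted.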

So things like the directory buffer path, which can be called from
readdir without a transaction context, have various KM_NOFS flags
scattered through it so that lockdep doesn't get all upset every
time readdir is called...

There are other cases like this - btree manipulation via bunmapi()
can be called without transaction context to remove delayed alloc
extents, and that puts all of the btree cursor and incore extent
list handling in the same boat (all those allocations are KM_NOFS),
etc.

So it's not really recursion that is the problem here - it's
different allocation contexts that lockdep can't know about unless
it's told about them. We've done that with KM_NOFS in the past; in
future we should use this KM_NOLOCKDEP flag, though I'd prefer a
better name for it, e.g. KM_NOTRANS, to indicate that the allocation
can occur both inside and outside of transaction context.
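
To make the difference between the two flags concrete, a minimal
userspace sketch of the flag translation might look like the
following. It is not the XFS code: the MODEL_* names and bit values
are invented for the example, the KM_NOTRANS alias is only the naming
suggestion above, and transaction context is reduced to a boolean
argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; these are not the kernel's values. */
#define MODEL_GFP_FS		0x1u	/* allocation may recurse into the fs */
#define MODEL_GFP_NOLOCKDEP	0x2u	/* skip lockdep reclaim tracking */

#define MODEL_KM_NOFS		0x1u
#define MODEL_KM_NOLOCKDEP	0x2u
#define MODEL_KM_NOTRANS	MODEL_KM_NOLOCKDEP	/* hypothetical alias */

/* Translate model KM_* flags into model GFP_* flags. */
static unsigned int model_flags_convert(unsigned int km_flags,
					bool in_transaction)
{
	unsigned int gfp = MODEL_GFP_FS;	/* start GFP_KERNEL-like */

	/* Reject flags this model does not know about. */
	assert((km_flags & ~(MODEL_KM_NOFS | MODEL_KM_NOLOCKDEP)) == 0);

	/* Inside a transaction, or with KM_NOFS, never recurse into the fs. */
	if (in_transaction || (km_flags & MODEL_KM_NOFS))
		gfp &= ~MODEL_GFP_FS;

	/* The annotation only affects lockdep, not the reclaim context. */
	if (km_flags & MODEL_KM_NOLOCKDEP)
		gfp |= MODEL_GFP_NOLOCKDEP;

	return gfp;
}
```

Under these assumptions a MODEL_KM_NOTRANS allocation made outside a
transaction keeps MODEL_GFP_FS set, whereas MODEL_KM_NOFS clears it
unconditionally, which is the weakening the patch description objects
to.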

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx