Re: [4.15-rc9] fs_reclaim lockdep trace
From: Peter Zijlstra
Date: Mon Jan 29 2018 - 05:28:06 EST
On Sun, Jan 28, 2018 at 02:55:28PM +0900, Tetsuo Handa wrote:
> This warning seems to be caused by commit d92a8cfcb37ecd13
> ("locking/lockdep: Rework FS_RECLAIM annotation") which moved the
> location of
>
> /* this guy won't enter reclaim */
> if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> return false;
>
> check added by commit cf40bd16fdad42c0 ("lockdep: annotate reclaim context
> (__GFP_NOFS)").
I'm not entirly sure I get what you mean here. How did I move it? It was
part of lockdep_trace_alloc(), if __GFP_NOMEMALLOC was set, it would not
mark the lock as held.
The new code has it in fs_reclaim_acquire/release to the same effect, if
__GFP_NOMEMALLOC, we'll not acquire/release the lock.
> Since __kmalloc_reserve() from __alloc_skb() adds
> __GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() is
> failing to return false despite PF_MEMALLOC context (and resulted in
> lockdep warning).
But that's correct right, __GFP_NOMEMALLOC should negate PF_MEMALLOC.
That's what the name says.
> Since there was no PF_MEMALLOC safeguard as of cf40bd16fdad42c0, checking
> __GFP_NOMEMALLOC might make sense. But since this safeguard was added by
> commit 341ce06f69abfafa ("page allocator: calculate the alloc_flags for
> allocation only once"), checking __GFP_NOMEMALLOC no longer makes sense.
> Thus, let's remove __GFP_NOMEMALLOC check and allow __need_fs_reclaim() to
> return false.
This does not in fact explain what's going on, it just points to
'random' patches.
Are you talking about this:
+ /* Avoid recursion of direct reclaim */
+ if (p->flags & PF_MEMALLOC)
+ goto nopage;
bit?
> Reported-by: Dave Jones <davej@xxxxxxxxxxxxxxxxx>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Nick Piggin <npiggin@xxxxxxxxx>
> ---
> mm/page_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 76c9688..7804b0e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3583,7 +3583,7 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> return false;
>
> /* this guy won't enter reclaim */
> - if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> + if (current->flags & PF_MEMALLOC)
> return false;
I'm _really_ uncomfortable doing that. Esp. without a solid explanation
of how this raelly can't possibly lead to trouble. Which the above semi
incoherent rambling is not.
Your backtrace shows the btrfs shrinker doing an allocation, that's the
exact kind of thing we need to be extremely careful with.