Re: [PATCH] mm: warn about allocations which stall for too long

From: Tetsuo Handa
Date: Fri Sep 23 2016 - 10:36:55 EST


Michal Hocko wrote:
> @@ -3659,6 +3661,15 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> else
> no_progress_loops++;
>
> + /* Make sure we know about allocations which stall for too long */
> + if (!(gfp_mask & __GFP_NOWARN) && time_after(jiffies, alloc_start + stall_timeout)) {

Should we check !__GFP_NOWARN ? I think __GFP_NOWARN is likely used with
__GFP_NORETRY, and __GFP_NORETRY is already checked by now.

I think printing warning regardless of __GFP_NOWARN is better because
this check is similar to hungtask warning.

> + pr_warn("%s: page alloction stalls for %ums: order:%u mode:%#x(%pGg)\n",
> + current->comm, jiffies_to_msecs(jiffies-alloc_start),
> + order, gfp_mask, &gfp_mask);
> + stall_timeout += 10 * HZ;
> + dump_stack();

Can we move this pr_warn() + dump_stack() to a separate function like

static void __warn_memalloc_stall(unsigned int order, gfp_t gfp_mask, unsigned long alloc_start)
{
pr_warn("%s: page alloction stalls for %ums: order:%u mode:%#x(%pGg)\n",
current->comm, jiffies_to_msecs(jiffies-alloc_start),
order, gfp_mask, &gfp_mask);
dump_stack();
}

in order to allow SystemTap scripts to perform additional actions by name (e.g.

# stap -g -e 'probe kernel.function("__warn_memalloc_stall").return { panic(); }

) rather than by line number, and surround __warn_memalloc_stall() call with
mutex in order to serialize warning messages because it is possible that
multiple allocation requests are stalling?

> + }
> +
> if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
> did_some_progress > 0, no_progress_loops))
> goto retry;
> --
> 2.9.3