Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
From: Michal Hocko
Date: Wed Sep 13 2017 - 07:54:51 EST
On Mon 11-09-17 19:36:59, Mikulas Patocka wrote:
>
>
> On Mon, 11 Sep 2017, Michal Hocko wrote:
>
> > On Mon 11-09-17 02:52:53, Mikulas Patocka wrote:
> > > I am occasionally getting these warnings in khugepaged. It is an old
> > > machine with a 550MHz CPU and 512 MB RAM.
> > >
> > > Note that khugepaged has nice value 19, so when the machine is loaded
> > > with some work, khugepaged is stalled and this stall produces a warning
> > > in the allocator.
> > >
> > > khugepaged does allocations with __GFP_NOWARN, but the flag __GFP_NOWARN
> > > is masked off when calling warn_alloc. This patch removes the masking of
> > > __GFP_NOWARN, so that the warning is suppressed.
> > >
> > > khugepaged: page allocation stalls for 10273ms, order:10, mode:0x4340ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
> > > CPU: 0 PID: 3936 Comm: khugepaged Not tainted 4.12.3 #1
> > > Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
> > > Call Trace:
> > > ? warn_alloc+0xb9/0x140
> > > ? __alloc_pages_nodemask+0x724/0x880
> > > ? arch_irq_stat_cpu+0x1/0x40
> > > ? detach_if_pending+0x80/0x80
> > > ? khugepaged+0x10a/0x1d40
> > > ? pick_next_task_fair+0xd2/0x180
> > > ? wait_woken+0x60/0x60
> > > ? kthread+0xcf/0x100
> > > ? release_pte_page+0x40/0x40
> > > ? kthread_create_on_node+0x40/0x40
> > > ? ret_from_fork+0x19/0x30
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Fixes: 63f53dea0c98 ("mm: warn about allocations which stall for too long")
> >
> > That commit did not introduce this behavior. It deliberately skipped
> > the warning on __GFP_NOWARN. The behavior was introduced later by
> > 822519634142 ("mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
> > warnings"). I disagreed [1] but the overall consensus was that such a
> > warning won't be harmful. Could you be more specific about why you
> > consider it wrong, please?
>
> I consider the warning wrong, because it warns when nothing goes wrong.
> I've got 7 of these warnings in 4 weeks of uptime. The warnings typically
> happen when I run some compilation.
>
> A process with low priority is expected to be running slowly when there's
> some high-priority process, so there's no need to warn that the
> low-priority process runs slowly.

I would tend to agree. It is certainly noise in the log, and the kind of
thing I was worried about when objecting to the patch previously.

> What else can be done to avoid the warning? Skip the warning if the
> process has lower priority?

No, I wouldn't play with priorities. Either we agree that NOWARN
allocations simply do _not_ warn, or we simply explain to users that some
of those warnings might not be that critical and that an overloaded system
might show them.

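
To make the first option concrete, here is a minimal userspace sketch (not
kernel code) of the difference between masking __GFP_NOWARN off before the
stall warning and passing it through, as the patch description above puts
it. The MODEL_* names, the helper functions and the bit values are made up
for illustration and are not taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values for this model only (assumption, not from gfp.h). */
#define MODEL_GFP_IO     0x40u
#define MODEL_GFP_NOWARN 0x200u

/* Hypothetical stand-in for warn_alloc(): true means "a warning is printed". */
static bool model_warn_alloc(unsigned int gfp_mask)
{
	/* A NOWARN request suppresses the warning. */
	if (gfp_mask & MODEL_GFP_NOWARN)
		return false;
	return true;
}

/* Behaviour described in the thread: the flag is masked off for stalls,
 * so the stall warning fires even for NOWARN allocations. */
static bool stall_warns_masked(unsigned int gfp_mask)
{
	return model_warn_alloc(gfp_mask & ~MODEL_GFP_NOWARN);
}

/* Behaviour the patch asks for: the flag is passed through unchanged. */
static bool stall_warns_respecting_nowarn(unsigned int gfp_mask)
{
	return model_warn_alloc(gfp_mask);
}

/* Small self-check that can be called from a main() of your choice. */
static void model_self_check(void)
{
	unsigned int khugepaged_like = MODEL_GFP_IO | MODEL_GFP_NOWARN;

	assert(stall_warns_masked(khugepaged_like));
	assert(!stall_warns_respecting_nowarn(khugepaged_like));
}
```

In this model a NOWARN allocation still warns on a stall when the flag is
masked off, and stays silent when the flag is respected. Allocations
without NOWARN warn in both variants.
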
Let's see what others think about this.
--
Michal Hocko
SUSE Labs