Re: BUG: scheduling while atomic: cron/668/0x10c9a0c0

From: Mel Gorman
Date: Fri Jun 03 2016 - 04:41:53 EST


On Fri, Jun 03, 2016 at 09:57:22AM +0200, Geert Uytterhoeven wrote:
> Hi Andrew, Mel,
>
> On Thu, Jun 2, 2016 at 8:43 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, 2 Jun 2016 13:19:36 +0100 Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> >> > >Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> >> >
> >> > Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
> >> >
> >>
> >> Thanks.
> >
> > I queued this. A tested-by:Geert would be nice?
> >
> >
> > From: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> > Subject: mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies
> >
> > The optimistic fast path may use cpuset_current_mems_allowed instead of
> > a NULL nodemask supplied by the caller for cpuset allocations. The
> > preferred zone is calculated on this basis for statistic purposes and as a
> > starting point in the zonelist iterator.
> >
> > However, if the context can ignore memory policies due to being atomic or
> > being able to ignore watermarks then the starting point in the zonelist
> > iterator is no longer correct. This patch resets the zonelist iterator in
> > the allocator slowpath if the context can ignore memory policies. This
> > will alter the zone used for statistics but only after it is known that it
> > makes sense for that context. Resetting it before entering the slowpath
> > would potentially allow an ALLOC_CPUSET allocation to be accounted for
> > against the wrong zone. Note that while nodemask is not explicitly set to
> > the original nodemask, it would only have been overwritten if
> > cpuset_enabled() and it was reset before the slowpath was entered.
> >
> > Link: http://lkml.kernel.org/r/20160602103936.GU2527@xxxxxxxxxxxxxxxxxxx
> > Fixes: c33d6c06f60f710 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
>
> My understanding was that this was an additional patch, not fixing
> the problem per se?
>

It doesn't fix the problem you had; it is a follow-on patch that
potentially affects atomic allocations that can ignore memory policies.

> Indeed, after applying this patch (without the other one that added
> "z = ac->preferred_zoneref;" to the reset_fair block of
> get_page_from_freelist()) I still get crashes...
>

The other patch, the one that adds "z = ac->preferred_zoneref;" to the
reset_fair block, is the only one required to fix the crash. This patch
handles a separate corner case: atomic allocations that can ignore
memory policies.
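
For illustration only, here is a simplified user-space model of that
corner case. It is not the kernel code: every toy_* name, the flag
values and the bitmask representation of a nodemask are assumptions made
up for this sketch. It only shows the idea from the changelog: the fast
path picks a starting zone using the cpuset mask, and the slowpath
recomputes it when the context can ignore memory policies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags, loosely modelled on the names in this thread. */
#define TOY_ALLOC_NO_WATERMARKS 0x1u
#define TOY_ALLOC_CPUSET        0x2u

/* One entry in a toy zonelist; only the owning node matters here. */
struct toy_zoneref {
	int node;
};

struct toy_alloc_context {
	const struct toy_zoneref *zonelist;	/* candidates in preference order */
	size_t nr_zones;
	unsigned long nodemask;			/* bit n set: node n allowed; 0: no restriction */
	const struct toy_zoneref *preferred_zoneref;
};

/* Return the first zonelist entry permitted by mask, or NULL if none. */
static const struct toy_zoneref *
toy_first_zone(const struct toy_zoneref *zl, size_t n, unsigned long mask)
{
	for (size_t i = 0; i < n; i++)
		if (mask == 0 || (mask & (1UL << zl[i].node)))
			return &zl[i];
	return NULL;
}

/* Contexts that may ignore watermarks or are not bound by the cpuset. */
static int toy_can_ignore_policies(unsigned int alloc_flags)
{
	return (alloc_flags & TOY_ALLOC_NO_WATERMARKS) ||
	       !(alloc_flags & TOY_ALLOC_CPUSET);
}

/* Slowpath step: recompute the starting point only when policies can be ignored. */
static void toy_slowpath_reset(struct toy_alloc_context *ac,
			       unsigned int alloc_flags)
{
	assert(ac != NULL);
	if (toy_can_ignore_policies(alloc_flags))
		ac->preferred_zoneref = toy_first_zone(ac->zonelist,
						       ac->nr_zones,
						       ac->nodemask);
}

/*
 * Scenario: zonelist covers nodes 0, 1, 2; the cpuset allows node 2 only;
 * the caller passed no nodemask. Returns the node of the preferred zone
 * after the slowpath step has run with the given flags.
 */
static int toy_demo_preferred_node(unsigned int alloc_flags)
{
	const struct toy_zoneref zl[] = { { 0 }, { 1 }, { 2 } };
	const unsigned long cpuset_mask = 1UL << 2;
	struct toy_alloc_context ac = { zl, 3, 0, NULL };

	/* Fast path: starting point chosen with the cpuset mask applied. */
	ac.preferred_zoneref = toy_first_zone(zl, 3, cpuset_mask);

	/* The caller's (empty) nodemask is back in place before the slowpath. */
	ac.nodemask = 0;
	toy_slowpath_reset(&ac, alloc_flags);

	return ac.preferred_zoneref->node;
}
```

In this model a call with TOY_ALLOC_CPUSET set keeps the cpuset-based
starting point (node 2), while a call without it, or with
TOY_ALLOC_NO_WATERMARKS, restarts from the head of the zonelist (node 0).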

> Now testing with both applied...

Thanks.

--
Mel Gorman
SUSE Labs