Re: BUG: scheduling while atomic: cron/668/0x10c9a0c0

From: Geert Uytterhoeven
Date: Fri Jun 03 2016 - 05:00:42 EST


Hi Mel,

On Fri, Jun 3, 2016 at 10:41 AM, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jun 03, 2016 at 09:57:22AM +0200, Geert Uytterhoeven wrote:
>> On Thu, Jun 2, 2016 at 8:43 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>> > On Thu, 2 Jun 2016 13:19:36 +0100 Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>> >> > >Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> >> >
>> >> > Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
>> >>
>> >> Thanks.
>> >
>> > I queued this. A tested-by:Geert would be nice?
>> >
>> > From: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> > Subject: mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies
>> >
>> > The optimistic fast path may use cpuset_current_mems_allowed instead of
>> > a NULL nodemask supplied by the caller for cpuset allocations. The
>> > preferred zone is calculated on this basis for statistics purposes and as a
>> > starting point in the zonelist iterator.
>> >
>> > However, if the context can ignore memory policies due to being atomic or
>> > being able to ignore watermarks then the starting point in the zonelist
>> > iterator is no longer correct. This patch resets the zonelist iterator in
>> > the allocator slowpath if the context can ignore memory policies. This
>> > will alter the zone used for statistics but only after it is known that it
>> > makes sense for that context. Resetting it before entering the slowpath
>> > would potentially allow an ALLOC_CPUSET allocation to be accounted for
>> > against the wrong zone. Note that while nodemask is not explicitly set to
>> > the original nodemask, it would only have been overwritten if
>> > cpuset_enabled() and it was reset before the slowpath was entered.
>> >
>> > Link: http://lkml.kernel.org/r/20160602103936.GU2527@xxxxxxxxxxxxxxxxxxx
>> > Fixes: c33d6c06f60f710 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
>>
>> My understanding was that this was an additional patch, not fixing
>> the problem in-se?
>
> It doesn't fix the problem you had, it is a follow-on patch that
> potentially affects.

Thanks for confirming!

>> Indeed, after applying this patch (without the other one that added
>> "z = ac->preferred_zoneref;" to the reset_fair block of
>> get_page_from_freelist()) I still get crashes...
>
> The patch you have is the only one required for the crash. This patch
> handles a corner case with atomic allocations that can ignore memory
> policies.

OK.

In the meantime my tests completed successfully with both patches applied.

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds