Re: [v5 1/2] mm: disable interrupts while initializing deferred pages
From: Andrew Morton
Date: Tue Mar 13 2018 - 16:12:05 EST
On Tue, 13 Mar 2018 15:45:46 -0400 Pavel Tatashin <pasha.tatashin@xxxxxxxxxx> wrote:
> > >
> > > We must remove cond_resched() because we can't sleep anymore. They were
> > > added to fight NMI timeouts, so I will replace them with
> > > touch_nmi_watchdog() in a follow-up fix.
> >
> > This makes no sense. Any code section where we can add cond_resched()
> > was never subject to NMI timeouts because that code cannot be running with
> > disabled interrupts.
> >
>
> Hi Andrew,
>
> I was talking about this patch:
>
> 9b6e63cbf85b89b2dbffa4955dbf2df8250e5375
> mm, page_alloc: add scheduling point to memmap_init_zone
>
> Which adds cond_resched() to memmap_init_zone() to avoid NMI timeouts.
>
> memmap_init_zone() is used both early in boot, when non-deferred struct
> pages are initialized, and possibly later, during memory hotplug.
>
> As I understand it, the latter case could cause the timeout on non-preemptible
> kernels.
>
> My understanding is that the same logic was used here when the cond_resched()s
> were added.
>
> Please correct me if I am wrong.

Yes, the message is a bit confusing and the terminology is perhaps
vague. And it's been a while since I played with this stuff, so from
(dated) memory:

Soft lockup: kernel has run for too long without rescheduling
Hard lockup: kernel has run for too long with interrupts disabled

Both of these are detected by the NMI watchdog handler.

9b6e63cbf85b89b2d fixes a soft lockup by adding a manual rescheduling
point. Replacing that with touch_nmi_watchdog() won't work (I think).
Presumably calling touch_softlockup_watchdog() will "work", in that it
suppresses the warning. But it won't fix the thing which the warning
is actually warning about: starvation of the CPU scheduler. That's
what the cond_resched() does.
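
To make that distinction concrete, here is a toy userspace model in C.
It is not kernel code: struct cpu_model, model_tick(),
model_touch_watchdog(), model_cond_resched() and run_loop() are all
invented for illustration, and treating a reschedule as also refreshing
the watchdog timestamp is a simplifying assumption. The model counts
watchdog warnings and tracks the longest stretch during which nothing
else could have been scheduled.

```c
/* Toy model only. Every name below is hypothetical, not a kernel API. */
enum strategy { STRAT_NONE, STRAT_TOUCH, STRAT_RESCHED };

struct cpu_model {
	unsigned long now;            /* simulated time, in ticks */
	unsigned long last_touch;     /* last refresh of the watchdog timestamp */
	unsigned long last_resched;   /* last time other tasks could run */
	unsigned long threshold;      /* ticks without a refresh before a warning */
	unsigned long warnings;       /* number of watchdog warnings raised */
	unsigned long max_starvation; /* longest stretch without a reschedule */
};

/* Advance time by one tick, record starvation, run the watchdog check. */
void model_tick(struct cpu_model *cpu)
{
	unsigned long starved;

	cpu->now++;
	starved = cpu->now - cpu->last_resched;
	if (starved > cpu->max_starvation)
		cpu->max_starvation = starved;

	if (cpu->now - cpu->last_touch > cpu->threshold) {
		cpu->warnings++;
		cpu->last_touch = cpu->now; /* re-arm after warning */
	}
}

/* Stand-in for touching the watchdog: silences it, schedules nothing. */
void model_touch_watchdog(struct cpu_model *cpu)
{
	cpu->last_touch = cpu->now;
}

/*
 * Stand-in for a rescheduling point: other tasks get to run, and
 * (assumption of this model) the watchdog timestamp is refreshed too.
 */
void model_cond_resched(struct cpu_model *cpu)
{
	cpu->last_resched = cpu->now;
	cpu->last_touch = cpu->now;
}

/* Run a long loop, applying the chosen strategy every `interval` ticks. */
struct cpu_model run_loop(enum strategy s, unsigned long iterations,
			  unsigned long interval, unsigned long threshold)
{
	struct cpu_model cpu = { .threshold = threshold };
	unsigned long i;

	for (i = 1; i <= iterations; i++) {
		model_tick(&cpu);
		if (i % interval != 0)
			continue;
		if (s == STRAT_TOUCH)
			model_touch_watchdog(&cpu);
		else if (s == STRAT_RESCHED)
			model_cond_resched(&cpu);
	}
	return cpu;
}
```

With 1000 iterations, an interval of 10 and a threshold of 100, this
model reports 9 warnings and 1000 ticks of starvation when nothing is
done, 0 warnings but still 1000 ticks of starvation when only the
timestamp is touched, and 0 warnings with at most 10 ticks of
starvation when the loop reschedules.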

I'm not sure what to suggest, really. Your changelog isn't the best:
"Vlastimil Babka reported about a window issue during which when
deferred pages are initialized, and the current version of on-demand
initialization is finished, allocations may fail". Well... where is
this mysterious window? Without such detail it's hard for others to
suggest alternative approaches.