Re: [RFC PATCH] time/nohz: allow the boot CPU to be nohz_full

From: Frederic Weisbecker
Date: Wed Jan 16 2019 - 12:54:46 EST


On Mon, Jan 14, 2019 at 04:47:45PM +1000, Nicholas Piggin wrote:
> We have a supercomputer site testing nohz_full to reduce jitter with
> good results, but they want CPU0 to be nohz_full. That happens to be
> the boot CPU, which is disallowed by the nohz_full code.
>
> They have existing job scheduling code which wants this, I don't know
> too much detail beyond that, but I hope the kernel can be made to
> work with their config.
>
> This patch has the boot CPU take over the jiffies update in the low
> res timer before SMP is brought up, after which the nohz CPU will take
> over.
>
> It also modifies the housekeeping check code a bit to ensure at least
> one !nohz CPU is in the present map so it comes up at boot, rather
> than having the nohz code take the boot CPU out of the nohz mask.
>
> This keeps jiffies incrementing on the nohz_full boot CPU before SMP
> init, but I'm not sure if this is covering all races and platform
> considerations. Sorry I don't know the timer code too well, I would
> appreciate any help.
>
> Thanks,
> Nick

We used to allow that and that broke hibernation :)

So, since we need to have at least one CPU alive to handle the
timekeeping updates on behalf of nohz CPUs, we forbid that CPU from
going idle or offline, for simplicity. Now, hibernation requires
disabling the non-boot CPUs. So if the timekeeper is not the boot CPU,
it is going to refuse the hotplug operation and break hibernation.
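
To make the failure mode concrete, here is a minimal userspace model of
the constraint described above. It is only a sketch and not the kernel
code: struct toy_system, toy_init(), toy_cpu_down(), toy_hibernate(),
NR_CPUS = 4 and BOOT_CPU = 0 are all made up for illustration, and the
-EBUSY return value is an assumption about how the refusal is reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS  4   /* hypothetical machine size */
#define BOOT_CPU 0   /* hypothetical boot CPU number */

/* Toy model, not kernel code: every name here is hypothetical. */
struct toy_system {
	bool online[NR_CPUS];
	int timekeeper_cpu;	/* CPU updating jiffies for the nohz CPUs */
};

/* Bring all CPUs online and pick the timekeeper. */
static void toy_init(struct toy_system *sys, int timekeeper_cpu)
{
	assert(timekeeper_cpu >= 0 && timekeeper_cpu < NR_CPUS);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sys->online[cpu] = true;
	sys->timekeeper_cpu = timekeeper_cpu;
}

/* Offline one CPU. The timekeeper refuses, so timekeeping never stops. */
static int toy_cpu_down(struct toy_system *sys, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (cpu == sys->timekeeper_cpu)
		return -EBUSY;
	sys->online[cpu] = false;
	return 0;
}

/* Hibernation has to take every non-boot CPU offline first. */
static int toy_hibernate(struct toy_system *sys)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == BOOT_CPU || !sys->online[cpu])
			continue;
		int err = toy_cpu_down(sys, cpu);
		if (err)
			return err;	/* a non-boot timekeeper aborts hibernation */
	}
	return 0;
}
```

In this model, toy_hibernate() succeeds when the timekeeper is the boot
CPU, because the boot CPU is never asked to go offline. With any other
timekeeper, the loop reaches that CPU, toy_cpu_down() refuses, and the
whole hibernation attempt fails.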