Re: [PATCH v2] sched/isolation: Make use of more than one housekeeping cpu
From: Frederic Weisbecker
Date: Tue Mar 04 2025 - 08:24:29 EST
On Tue, Feb 18, 2025 at 06:46:18PM +0000, Phil Auld wrote:
> The existing code uses housekeeping_any_cpu() to select a cpu for
> a given housekeeping task. However, this often ends up calling
> cpumask_any_and() which is defined as cpumask_first_and() which has
> the effect of always using the first cpu among those available.
>
> The same applies when multiple NUMA nodes are involved. In that
> case the first cpu in the local node is chosen which does provide
> a bit of spreading but with multiple HK cpus per node the same
> issues arise.
>
> We have numerous cases where a single HK cpu just cannot keep up
> and the remote_tick warning fires. It can also lead to other
> things (orchestration sw, HA keepalives etc) on the HK cpus getting
> starved, which leads to other issues. In these cases we recommend
> increasing the number of HK cpus. But... that only helps the
> userspace tasks somewhat. It does not help the actual housekeeping
> part.
>
> Spread the HK work out by having housekeeping_any_cpu() and
> sched_numa_find_closest() use cpumask_any_and_distribute()
> instead of cpumask_any_and().
>
> Signed-off-by: Phil Auld <pauld@xxxxxxxxxx>
> Reviewed-by: Waiman Long <longman@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Waiman Long <longman@xxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Link: https://lore.kernel.org/lkml/20250211141437.GA349314@xxxxxxxxxxxxxxxxxx/
Acked-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
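
For readers following the thread, here is a minimal userspace sketch of the
behavioural difference the changelog describes. It is not the kernel
implementation: mask_t, NO_CPU, model_first_and() and
model_any_and_distribute() are hypothetical names made up for illustration,
a 64-bit word stands in for struct cpumask, and the caller-supplied "prev"
only approximates the state that cpumask_any_and_distribute() keeps
internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: one 64-bit word stands in for struct cpumask. */
typedef uint64_t mask_t;

#define NO_CPU 64	/* models "no cpu found" (nr_cpu_ids in the kernel) */

/* Models first-match selection: lowest set bit of (a & b). */
static int model_first_and(mask_t a, mask_t b)
{
	mask_t m = a & b;

	for (int cpu = 0; cpu < 64; cpu++)
		if (m & ((mask_t)1 << cpu))
			return cpu;
	return NO_CPU;
}

/*
 * Models distributed selection: start searching just after the
 * previous pick, wrap around, and remember the new pick in *prev.
 */
static int model_any_and_distribute(mask_t a, mask_t b, int *prev)
{
	mask_t m = a & b;

	assert(prev != NULL);
	for (int i = 1; i <= 64; i++) {
		int cpu = (*prev + i) % 64;

		if (m & ((mask_t)1 << cpu)) {
			*prev = cpu;
			return cpu;
		}
	}
	return NO_CPU;
}
```

With housekeeping cpus 0-3 (mask 0xF) and every cpu online, repeated calls to
model_first_and() return cpu 0 every time, while model_any_and_distribute()
starting from prev = 0 walks 1, 2, 3, 0 and so on, which is the spreading
the patch is after.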