Re: [PATCH 11/20] sched: Handle CPU isolation on last resort fallback rq selection
From: Michal Hocko
Date: Fri Sep 27 2024 - 03:26:43 EST
On Fri 27-09-24 00:48:59, Frederic Weisbecker wrote:
> When a kthread or any other task has an affinity mask that is fully
> offline or unallowed, the scheduler reaffines the task to all possible
> CPUs as a last resort.
>
> This default decision doesn't mix up very well with nohz_full CPUs that
> are part of the possible cpumask but don't want to be disturbed by
> unbound kthreads or even detached pinned user tasks.
>
> Make the fallback affinity setting aware of nohz_full. This applies to
> all architectures supporting nohz_full except arm32. However this
> architecture that overrides the task possible mask is unlikely to be
> willing to integrate new development.
>
> Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
Thanks, this makes sense to me. It is up to the scheduler maintainers
whether this makes sense in general, though.
Thanks for looking into this, Frederic!
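For anybody following along, here is a rough user-space model of how I read the new fallback selection. It is only a sketch, not kernel code: the names cpumask_model_t, struct fallback_model and fallback_mask_model() are made up for illustration, masks are plain 64-bit bitmaps, and the comparison is by value whereas the patch compares the mask pointer against cpu_possible_mask.

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_model_t;

struct fallback_model {
	/* Stands in for cpu_possible_mask. */
	cpumask_model_t possible;
	/* Stands in for housekeeping_cpumask(HK_TYPE_TICK). */
	cpumask_model_t housekeeping;
};

/*
 * Mirrors the decision made in task_cpu_fallback_mask():
 * - if the architecture narrowed the task's possible mask, trust it
 *   and return it unchanged (the architecture must handle isolation);
 * - otherwise fall back to the housekeeping CPUs, so that nohz_full
 *   CPUs are not picked as a last resort.
 */
static cpumask_model_t
fallback_mask_model(const struct fallback_model *m,
		    cpumask_model_t task_possible)
{
	if (task_possible != m->possible)
		return task_possible;

	return m->housekeeping;
}
```

With 8 possible CPUs (0xff) of which CPUs 0-1 are housekeeping (0x03), a task whose possible mask is the full 0xff falls back to 0x03, while a task with an architecture-restricted mask such as 0x0f keeps 0x0f.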
> ---
> kernel/sched/core.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 43e453ab7e20..d4b759c1cbf1 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3421,6 +3421,21 @@ void kick_process(struct task_struct *p)
> }
> EXPORT_SYMBOL_GPL(kick_process);
>
> +static const struct cpumask *task_cpu_fallback_mask(struct task_struct *p)
> +{
> + const struct cpumask *mask;
> +
> + mask = task_cpu_possible_mask(p);
> + /*
> + * Architectures that override the task possible mask
> + * must handle CPU isolation.
> + */
> + if (mask != cpu_possible_mask)
> + return mask;
> + else
> + return housekeeping_cpumask(HK_TYPE_TICK);
> +}
> +
> /*
> * ->cpus_ptr is protected by both rq->lock and p->pi_lock
> *
> @@ -3489,7 +3504,7 @@ static int select_fallback_rq(int cpu, struct task_struct *p)
> *
> * More yuck to audit.
> */
> - do_set_cpus_allowed(p, task_cpu_possible_mask(p));
> + do_set_cpus_allowed(p, task_cpu_fallback_mask(p));
> state = fail;
> break;
> case fail:
> --
> 2.46.0
--
Michal Hocko
SUSE Labs