Re: [PATCH] Revert "sched/deadline: Remove cpu_active_mask from cpudl_find()"

From: Juri Lelli
Date: Thu Jun 25 2020 - 09:34:54 EST


Hi,

On 24/06/20 23:13, Sai Harshini Nimmala wrote:
> The original commit 9659e1ee removes checking the cpu_active_mask
> while finding the best cpu to place a deadline task, citing the reason that
> this mask rarely changes and removing the check will give performance
> gains.
> However, on hotplugging, the cpu dying path has a brief duration between
> the CPUHP_TEARDOWN_CPU and CPUHP_AP_SCHED_STARTING hotplug states where
> the DL task can be scheduled on this cpu because the corresponding cpu
> bit in cp->free_cpus has not been cleared yet. Without the
> cpu_active_mask check we could end up putting a DL task on such cpus
> leading to a BUG.
> The cpu_active_mask will be updated promptly before either of these
> states and will provide a more accurate check for the use case above.
>
> Signed-off-by: Puja Gupta <pujag@xxxxxxxxxxxxxx>
> Signed-off-by: Sai Harshini Nimmala <snimmala@xxxxxxxxxxxxxx>
> ---
> kernel/sched/cpudeadline.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index 5cc4012..0346837 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -120,7 +120,8 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p,
>  	const struct sched_dl_entity *dl_se = &p->dl;
>
>  	if (later_mask &&
> -	    cpumask_and(later_mask, cp->free_cpus, p->cpus_ptr)) {
> +	    cpumask_and(later_mask, cp->free_cpus, p->cpus_ptr) &&
> +	    cpumask_and(later_mask, later_mask, cpu_active_mask)) {
>  		return 1;
>  	} else {
>  		int best_cpu = cpudl_maximum(cp);

So, I believe the patch you want to revert only removed the condition
above.
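
For reference, the effect of that condition can be sketched in plain
userspace C. This is only an illustration, not kernel code: mask_t,
cpu_bit() and find_free_active() are hypothetical names, a 64-bit word
stands in for struct cpumask, and the else branch of cpudl_find() is not
modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t mask_t;

static mask_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (mask_t)1 << cpu;
}

/*
 * Models only the first branch of cpudl_find() as it reads with the hunk
 * applied: intersect the free CPUs with the task's affinity, then with
 * the active mask. Returns 1 and leaves the candidates in *later_mask
 * if any CPU survives both intersections, 0 otherwise.
 */
static int find_free_active(mask_t free_cpus, mask_t cpus_allowed,
			    mask_t active, mask_t *later_mask)
{
	*later_mask = free_cpus & cpus_allowed;
	if (!*later_mask)
		return 0;

	*later_mask &= active;
	return *later_mask != 0;
}
```

With CPU 2 still set in the free mask but already cleared from the
active mask, the helper never reports CPU 2 as a candidate, which is the
behaviour the changelog is asking for.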

> @@ -128,6 +129,7 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p,
>  		WARN_ON(best_cpu != -1 && !cpu_present(best_cpu));
>
>  		if (cpumask_test_cpu(best_cpu, p->cpus_ptr) &&
> +		    cpumask_test_cpu(best_cpu, cpu_active_mask) &&
>  		    dl_time_before(dl_se->deadline, cp->elements[0].dl)) {
>  			if (later_mask)
>  				cpumask_set_cpu(best_cpu, later_mask);

Did you actually experience issues with this second part as well? I'm
thinking the WARN_ON should have fired in that case, no?
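
To spell out what I'm comparing, here is a rough userspace sketch of the
two tests in this hunk. It is not kernel code: mask_t, test_cpu(),
warn_would_fire() and best_cpu_ok() are made-up names, and the present
and active masks are plain inputs, so the sketch does not decide whether
a dying CPU still counts as present at that point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t mask_t;

/* Wraparound-safe "a is earlier than b", the same idea as dl_time_before(). */
static int dl_time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

static int test_cpu(int cpu, mask_t mask)
{
	return cpu >= 0 && cpu < 64 && ((mask >> cpu) & 1);
}

/* Mirrors the WARN_ON condition: a valid best_cpu that is not present. */
static int warn_would_fire(int best_cpu, mask_t present)
{
	return best_cpu != -1 && !test_cpu(best_cpu, present);
}

/* Mirrors the if-condition with the proposed active-mask test added. */
static int best_cpu_ok(int best_cpu, mask_t cpus_allowed, mask_t active,
		       uint64_t task_deadline, uint64_t heap_max_deadline)
{
	return test_cpu(best_cpu, cpus_allowed) &&
	       test_cpu(best_cpu, active) &&
	       dl_time_before(task_deadline, heap_max_deadline);
}
```

The WARN_ON and the added test look at different masks, so which of the
two triggers first depends on the order in which the hotplug path
updates them.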

Thanks,

Juri