Re: [PATCH v8 06/14] smp: Enable preemption early in smp_call_function_many_cond()
From: Thomas Gleixner
Date: Fri Jun 26 2026 - 10:43:06 EST
On Tue, Jun 16 2026 at 19:11, Chuyi Zhou wrote:
> Disabling preemption entirely during smp_call_function_many_cond() was
> primarily for the following reasons:
>
> - To prevent the remote online CPU from going offline. Specifically, we
> want to ensure that no new csds are queued after smpcfd_dying_cpu() has
> finished. Therefore, preemption must be disabled until all necessary IPIs
> are sent.
>
> - To prevent current CPU from going offline. Being migrated to another CPU
> and calling csd_lock_wait() may cause UAF due to smpcfd_dead_cpu() during
> the current CPU offline process.
>
> - To protect the per-cpu cfd_data from concurrent modification by other
> tasks on the current CPU. cfd_data contains cpumasks and per-cpu csds.
> Before enqueueing a csd, we block on the csd_lock() to ensure the
> previous async csd->func() has completed, and then initialize csd->func and
> csd->info. After sending the IPI, we spin-wait for the remote CPU to call
> csd_unlock(). Actually the csd_lock mechanism already guarantees csd
> serialization. If preemption occurs during csd_lock_wait, other concurrent
> smp_call_function_many_cond calls will simply block until the previous
> csd->func() completes:
Please format properly.
> task A task B
>
> sd->func = fun_a
> send ipis
>
> preempted by B
> --------------->
> csd_lock(csd); // block until last
> // fun_a finished
>
> csd->func = func_b;
> csd->info = info;
> ...
> send ipis
>
> switch back to A
> <---------------
>
> csd_lock_wait(csd); // block until remote finish func_*
>
> Previous patches replaced the per-cpu cfd->cpumask with task-local cpumask,
The per CPU cfd->cpumask has been replaced with a task local cpumask....
> and the percpu csd is allocated only once and is never freed to ensure
> we can safely access csd. Now we can enable preemption before
> csd_lock_wait() which makes the potentially unpredictable csd_lock_wait()
> preemptible and migratable.
With that in place enable preemption before ....
> + this_cpu = get_cpu();
> task_mask = smp_task_ipi_mask(current);
> cfd = this_cpu_ptr(&cfd_data);
> if (task_mask)
> @@ -953,6 +952,17 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> local_irq_restore(flags);
> }
>
> + /*
> + * Waiting for completion can take time, especially with many CPUs.
> + * On a PREEMPT kernel a per-task cpumask is used to track CPUs with
> + * pending IPI requests. This allows preemption to be enabled before
> + * waiting. On a !PREEMPT kernel the cpumask is shared and the call
> + * must block until completion to avoid modifications by another caller
> + * on this CPU.
> + */
> + if (task_mask)
> + put_cpu();
What's this conditional for?.
If CONFIG_PREEMPTION is disabled preempt_enable() never results in
preemption, which means the shared per CPU mask is implicitely protected
and get/put_cpu() are completely unrelated to that.
So please make this unconditional end rewrite this completely misleading
comment.
Thanks,
tglx