[PATCH v6 06/12] smp: Enable preemption early in smp_call_function_many_cond

From: Chuyi Zhou

Date: Thu May 28 2026 - 11:27:58 EST


Preemption has so far been disabled across the whole of
smp_call_function_many_cond(), primarily for the following reasons:

- To prevent the remote online CPU from going offline. Specifically, we
want to ensure that no new csds are queued after smpcfd_dying_cpu() has
finished. Therefore, preemption must be disabled until all necessary IPIs
are sent.

- To prevent the current CPU from going offline. If the task were migrated
to another CPU and then called csd_lock_wait(), it could hit a
use-after-free (UAF) because of smpcfd_dead_cpu() running while the
original CPU goes offline.

- To protect the per-cpu cfd_data from concurrent modification by other
tasks on the current CPU. cfd_data contains cpumasks and per-cpu csds.
Before enqueueing a csd, we block in csd_lock() until the previous
asynchronous csd->func() has completed, and then initialize csd->func and
csd->info. After sending the IPI, we spin-wait for the remote CPU to call
csd_unlock(). The csd_lock mechanism therefore already serializes access
to each csd. If preemption occurs during csd_lock_wait(), other concurrent
smp_call_function_many_cond() calls simply block until the previous
csd->func() completes:

    task A                          task B

    csd->func = func_a
    send ipis

                preempted by B
               --------------->
                                    csd_lock(csd); // block until last
                                                   // func_a finished

                                    csd->func = func_b;
                                    csd->info = info;
                                        ...
                                    send ipis

                switch back to A
               <---------------

    csd_lock_wait(csd); // block until remote finish func_*
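
The interleaving above can be replayed in a small userspace model. This is
only a sketch under stated assumptions, not the kernel implementation:
struct csd_sketch, remote_cpu_poll(), send_ipi() and replay_interleaving()
are hypothetical stand-ins, and the remote CPU is simulated by a function
polled from the spin loop. Only the names csd_lock(), csd_lock_wait() and
CSD_FLAG_LOCK mirror kernel/smp.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CSD_FLAG_LOCK 0x01u

typedef void (*smp_call_func_t)(void *info);

/* Hypothetical userspace stand-in for a call_single_data_t. */
struct csd_sketch {
	atomic_uint flags;
	smp_call_func_t func;
	void *info;
};

/* One-slot queue of the simulated remote CPU (assumption of this model). */
static struct csd_sketch *remote_queue;

/* Simulated remote CPU: run the queued callback, then release the csd. */
static void remote_cpu_poll(void)
{
	struct csd_sketch *csd = remote_queue;

	if (!csd)
		return;
	remote_queue = NULL;
	csd->func(csd->info);
	atomic_fetch_and_explicit(&csd->flags, ~CSD_FLAG_LOCK,
				  memory_order_release);
}

/* Spin until the remote side has dropped CSD_FLAG_LOCK. */
static void csd_lock_wait(struct csd_sketch *csd)
{
	while (atomic_load_explicit(&csd->flags, memory_order_acquire) &
	       CSD_FLAG_LOCK)
		remote_cpu_poll();	/* lets the simulated remote CPU progress */
}

/* Wait for the previous user of the csd, then take it. */
static void csd_lock(struct csd_sketch *csd)
{
	csd_lock_wait(csd);
	atomic_fetch_or_explicit(&csd->flags, CSD_FLAG_LOCK,
				 memory_order_relaxed);
}

static void send_ipi(struct csd_sketch *csd)
{
	assert(remote_queue == NULL);
	remote_queue = csd;
}

static char call_log[8];
static int call_count;

static void func_a(void *info) { (void)info; call_log[call_count++] = 'a'; }
static void func_b(void *info) { (void)info; call_log[call_count++] = 'b'; }

/* Replays the diagram's interleaving; returns the number of callbacks run. */
int replay_interleaving(void)
{
	static struct csd_sketch csd;	/* zero-initialized, i.e. unlocked */

	call_count = 0;

	/* task A */
	csd_lock(&csd);
	csd.func = func_a;
	csd.info = NULL;
	send_ipi(&csd);

	/* preempted by B: csd_lock() returns only after func_a has finished */
	csd_lock(&csd);
	assert(call_count == 1 && call_log[0] == 'a');
	csd.func = func_b;
	csd.info = NULL;
	send_ipi(&csd);

	/* switch back to A: returns once the remote side finished func_b */
	csd_lock_wait(&csd);
	return call_count;
}
```

In this model task B cannot overwrite csd->func while func_a is still
pending, and task A's wait ends only after the csd has been released, which
is the serialization property the commit message relies on.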

Previous patches replaced the per-cpu cfd->cpumask with a task-local cpumask,
and the per-cpu csd is now allocated only once and never freed, so it can
always be accessed safely. We can therefore enable preemption before
csd_lock_wait(), which makes the potentially long and unpredictable
csd_lock_wait() preemptible and migratable.
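
The resulting get_cpu()/put_cpu() pairing can be illustrated with a
userspace model. This is a minimal sketch, assuming a plain counter in place
of the kernel's preemption count: preempt_count, many_cond_sketch() and its
parameters are hypothetical, and get_cpu()/put_cpu() here are stand-ins that
only adjust that counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's preemption counter (assumption of this model). */
static int preempt_count;

static int get_cpu(void)
{
	preempt_count++;	/* "disable preemption" */
	return 0;		/* pretend we always run on CPU 0 */
}

static void put_cpu(void)
{
	assert(preempt_count > 0);
	preempt_count--;	/* "enable preemption" */
}

/*
 * Mirrors only the control flow around the wait phase. Returns the
 * preemption depth seen while waiting, or -1 if no wait was requested.
 */
int many_cond_sketch(bool have_task_mask, bool wait)
{
	int depth_while_waiting = -1;

	get_cpu();
	/* ... pick the cpumask, lock and fill the csds, send the IPIs ... */

	if (have_task_mask)
		put_cpu();	/* per-task mask: preemptible from here on */

	if (wait)
		depth_while_waiting = preempt_count;	/* csd_lock_wait() loop */

	if (!have_task_mask)
		put_cpu();	/* shared mask: pinned until the wait is over */

	return depth_while_waiting;
}
```

With a task-local mask the wait phase runs at depth 0 (preemptible), while
the shared-mask fallback waits at depth 1; in both cases every get_cpu() is
balanced by exactly one put_cpu().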

Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
kernel/smp.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 9ef136bacda0..5cb09a84263b 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -859,15 +859,14 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
unsigned int scf_flags,
smp_cond_func_t cond_func)
{
- int cpu, last_cpu, this_cpu = smp_processor_id();
+ int cpu, last_cpu, this_cpu;
struct call_function_data *cfd;
bool wait = scf_flags & SCF_WAIT;
struct cpumask *cpumask, *task_mask;
int nr_cpus = 0;
bool run_remote = false;

- lockdep_assert_preemption_disabled();
-
+ this_cpu = get_cpu();
task_mask = smp_task_ipi_mask(current);
cfd = this_cpu_ptr(&cfd_data);
if (task_mask)
@@ -953,6 +952,17 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
local_irq_restore(flags);
}

+ /*
+ * Waiting for completion can take time, especially with many CPUs.
+ * On a PREEMPTIBLE kernel a per-task cpumask is used to track CPUs
+ * with pending IPI requests. This allows preemption to be enabled
+ * while waiting for completion. On a !PREEMPTIBLE kernel the
+ * cpumask is shared and the call must block until completion to
+ * avoid modifications by another caller on this CPU.
+ */
+ if (task_mask)
+ put_cpu();
+
if (run_remote && wait) {
for_each_cpu(cpu, cpumask) {
call_single_data_t *csd;
@@ -961,6 +971,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
csd_lock_wait(csd);
}
}
+
+ if (!task_mask)
+ put_cpu();
}

/**
@@ -972,8 +985,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
* on other CPUs.
*
* You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler. Preemption
- * must be disabled when calling this function.
+ * hardware interrupt handler or from a bottom half handler.
*
* @func is not called on the local CPU even if @mask contains it. Consider
* using on_each_cpu_cond_mask() instead if this is not desirable.
--
2.20.1