[PATCH 1/2] smp: Non busy-waiting IPI queue

From: Frederic Weisbecker
Date: Wed Apr 02 2014 - 12:26:44 EST

Some IPI users, such as the nohz subsystem, need to be able to send
an async IPI (async = not waiting for the completion of any other IPI)
from contexts with interrupts disabled. And we want the IPI subsystem
to handle concurrent calls by itself.

Currently the nohz machinery uses the scheduler IPI for this purpose
because it can be triggered from any context and doesn't need any
serialization from the caller. But this is an abuse of a scheduler
fast path. We are bloating it with a job that should use its own IPI.

The current set of IPI functions can't be called when interrupts are
disabled, otherwise we risk a deadlock when two CPUs wait for each
other's IPI completion.

OTOH smp_call_function_single_async() can be called when interrupts
are disabled. But then it's up to the caller to serialize the given
IPI. This can't be called concurrently without special care.

So we need a version of the async IPI that takes care of concurrent
calls.

The proposed solution is to synchronize the IPI with a specific flag
that prevents the IPI from being sent if it is already pending but not
yet executed. Ordering is maintained such that, if the IPI is not sent
because it's already pending, we guarantee it will see the new state of
the data we expect it to when it executes.
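
As an illustration only, the pending-flag protocol described above can be
modelled in userspace with C11 atomics. Everything named qsd_model,
model_queue, model_interrupt and model_demo below is hypothetical and not
part of the patch; the IPI itself is replaced by a counter, and the model is
single-threaded, so it shows the state transitions rather than real
concurrency. The sequentially consistent C11 operations stand in for the
kernel's cmpxchg() and xchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct queue_single_data. */
struct qsd_model;
typedef void (*qsd_func_t)(struct qsd_model *q);

struct qsd_model {
	qsd_func_t func;
	atomic_int pending;	/* 1 while an "IPI" is queued but not yet run */
	int ipis_sent;		/* counts simulated IPIs, illustration only */
};

static atomic_int shared_state;	/* data the callback is expected to see */
static int observed;		/* what the callback actually saw */

/* Models smp_queue_function_single(): send only if nothing is pending. */
static int model_queue(struct qsd_model *q, qsd_func_t func)
{
	int expected = 0;

	/* 0 -> 1 transition; a loser knows an IPI is already in flight. */
	if (!atomic_compare_exchange_strong(&q->pending, &expected, 1))
		return 0;

	q->func = func;
	q->ipis_sent++;		/* a real implementation would raise the IPI */
	return 0;
}

/* Models the IPI handler: clear the flag first, then run the callback. */
static void model_interrupt(struct qsd_model *q)
{
	atomic_exchange(&q->pending, 0);
	q->func(q);
}

static void model_callback(struct qsd_model *q)
{
	(void)q;
	observed = atomic_load(&shared_state);
}

/* Two queue attempts before the handler runs: one IPI, latest data seen. */
static int model_demo(int *seen)
{
	struct qsd_model q = { 0 };

	atomic_store(&shared_state, 1);
	model_queue(&q, model_callback);	/* sends the "IPI" */
	atomic_store(&shared_state, 2);
	model_queue(&q, model_callback);	/* already pending: not sent */

	model_interrupt(&q);
	assert(atomic_load(&q.pending) == 0);

	*seen = observed;
	return q.ipis_sent;
}
```

In this model the second model_queue() call is dropped, yet the callback
still observes the value stored just before it, which is the ordering
property the paragraph above asks for.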

This model is close to the irq_work design. It's also partly inspired by
suggestions from Peter Zijlstra.
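
As with irq_work, a caller would presumably embed the queue_single_data in
its own object and recover that object from the callback argument. The
sketch below is a userspace mock under that assumption: the struct is a
simplified mirror of the one in this patch (without the call_single_data
member), container_of() is redefined locally, and struct tick_kick,
tick_kick_func and run_embedded_callback are hypothetical names used only
for illustration.

```c
#include <stddef.h>

/* Simplified userspace mirror of the patch's struct (mock, not kernel code). */
struct queue_single_data;
typedef void (*smp_queue_func_t)(struct queue_single_data *qsd);

struct queue_single_data {
	smp_queue_func_t func;
	int pending;
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical client object that embeds the queue descriptor. */
struct tick_kick {
	int cpu;
	int kicks;
	struct queue_single_data qsd;
};

/* Callback: recover the enclosing object from the embedded member. */
static void tick_kick_func(struct queue_single_data *qsd)
{
	struct tick_kick *tk = container_of(qsd, struct tick_kick, qsd);

	tk->kicks++;
}

/* Stand-in for the IPI handler invoking the registered callback. */
static int run_embedded_callback(struct tick_kick *tk)
{
	tk->qsd.func = tick_kick_func;
	tk->qsd.func(&tk->qsd);
	return tk->kicks;
}
```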

Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Jens Axboe <jens.axboe@xxxxxxxxxx>
Cc: Kevin Hilman <khilman@xxxxxxxxxx>
Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
---
include/linux/smp.h | 12 ++++++++++++
kernel/smp.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 56 insertions(+)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 633f5ed..155dc86 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -29,6 +29,18 @@ extern unsigned int total_cpus;
 int smp_call_function_single(int cpuid, smp_call_func_t func, void *info,
 			     int wait);

+struct queue_single_data;
+typedef void (*smp_queue_func_t)(struct queue_single_data *qsd);
+
+struct queue_single_data {
+	struct call_single_data data;
+	smp_queue_func_t func;
+	int pending;
+};
+
+int smp_queue_function_single(int cpuid, smp_queue_func_t func,
+			      struct queue_single_data *qsd);
+
 /*
  * Call a function on all processors
  */
diff --git a/kernel/smp.c b/kernel/smp.c
index 06d574e..bfe7b36 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -265,6 +265,50 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd)

+void generic_smp_queue_function_single_interrupt(void *info)
+{
+	struct queue_single_data *qsd = info;
+
+	WARN_ON_ONCE(xchg(&qsd->pending, 0) != 1);
+	qsd->func(qsd);
+}
+
+/**
+ * smp_queue_function_single - Queue an asynchronous function to run on a
+ *				specific CPU unless it's already pending.
+ * @func: The function to run. This must be fast and non-blocking.
+ * @qsd: The data contained in the interested object if any
+ *
+ * Like smp_call_function_single_async() but the call to @func is serialized
+ * and won't be queued if it is already pending. In the latter case, ordering
+ * is still guaranteed such that the pending call will see the new data we
+ * expect it to.
+ *
+ * This must not be called on offline CPUs.
+ *
+ * Returns 0 when @func is successfully queued or already pending, else a negative
+ * status code.
+ */
+int smp_queue_function_single(int cpu, smp_queue_func_t func, struct queue_single_data *qsd)
+{
+	int err;
+
+	if (cmpxchg(&qsd->pending, 0, 1))
+		return 0;
+
+	qsd->func = func;
+
+	preempt_disable();
+	err = generic_exec_single(cpu, &qsd->data, generic_smp_queue_function_single_interrupt, qsd, 0);
+	preempt_enable();
+
+	/* Reset in case of error. This must not be called on offline CPUs */
+	if (err)
+		qsd->pending = 0;
+
+	return err;
+}
+
 /*
  * smp_call_function_any - Run a function on any of the given cpus
  * @mask: The mask of cpus it can run on.
