Re: [PATCH] smp: Allow smp_call_function_single_async() to insert locked csd

From: Peter Zijlstra
Date: Wed Dec 11 2019 - 10:41:20 EST


On Wed, Dec 04, 2019 at 03:48:23PM -0500, Peter Xu wrote:
> Previously we will raise an warning if we want to insert a csd object
> which is with the LOCK flag set, and if it happens we'll also wait for
> the lock to be released. However, this operation does not match
> perfectly with how the function is named - the name with "_async"
> suffix hints that this function should not block, while we will.
>
> This patch changed this behavior by simply return -EBUSY instead of
> waiting, at the meantime we allow this operation to happen without
> warning the user to change this into a feature when the caller wants
> to "insert a csd object, if it's there, just wait for that one".
>
> This is pretty safe because in flush_smp_call_function_queue() for
> async csd objects (where csd->flags&SYNC is zero) we'll first do the
> unlock then we call the csd->func(). So if we see the csd->flags&LOCK
> is true in smp_call_function_single_async(), then it's guaranteed that
> csd->func() will be called after this smp_call_function_single_async()
> returns -EBUSY.
>
> Update the comment of the function too to refect this.
>
> CC: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CC: Nadav Amit <namit@xxxxxxxxxx>
> CC: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> CC: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> CC: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> CC: linux-kernel@xxxxxxxxxxxxxxx
> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> ---
>
> The story starts from a test where we've encountered the WARN_ON() on
> a customized kernel and the csd_wait() took merely forever to
> complete (so we've got a WARN_ON plusing a hang host). The current
> solution (which is downstream-only for now) is that from the caller's
> side we use a boolean to store whether the csd is executed, we do:
>
> if (atomic_cmpxchg(&in_progress, 0, 1))
> smp_call_function_single_async(..);
>
> While at the end of csd->func() we clear the bit. However imho that's
> mostly what csd->flags&LOCK is doing. So I'm thinking maybe it would
> worth it to raise this patch for upstream too so that it might help
> other users of smp_call_function_single_async() when they need the
> same semantic (and, I do think we shouldn't wait in _async()s...)

hrtick_start() seems to employ something similar.

> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>

duplicate tag :-)

> ---
> kernel/smp.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 7dbcb402c2fc..dd31e8228218 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -329,6 +329,11 @@ EXPORT_SYMBOL(smp_call_function_single);
> * (ie: embedded in an object) and is responsible for synchronizing it
> * such that the IPIs performed on the @csd are strictly serialized.
> *
> + * If the function is called with one csd which has not yet been
> + * processed by previous call to smp_call_function_single_async(), the
> + * function will return immediately with -EBUSY showing that the csd
> + * object is still in progress.
> + *
> * NOTE: Be careful, there is unfortunately no current debugging facility to
> * validate the correctness of this serialization.
> */
> @@ -338,14 +343,17 @@ int smp_call_function_single_async(int cpu, call_single_data_t *csd)
>
> preempt_disable();
>
> - /* We could deadlock if we have to wait here with interrupts disabled! */
> - if (WARN_ON_ONCE(csd->flags & CSD_FLAG_LOCK))
> - csd_lock_wait(csd);
> + if (csd->flags & CSD_FLAG_LOCK) {
> + err = -EBUSY;
> + goto out;
> + }
>
> csd->flags = CSD_FLAG_LOCK;
> smp_wmb();
>
> err = generic_exec_single(cpu, csd, csd->func, csd->info);
> +
> +out:
> preempt_enable();
>
> return err;

Yes.. I think this will work.

I worry though; usage such as in __blk_mq_complete_request() /
raise_blk_irq(), which explicitly clears csd.flags before calling it
seems more dangerous than usual.

liquidio_napi_drv_callback() does that same.