Re: [PATCH v4 1/9] smp: Run functions concurrently in smp_call_function_many()
From: Nadav Amit
Date: Mon Aug 26 2019 - 12:26:09 EST
> On Aug 23, 2019, at 3:41 PM, Nadav Amit <namit@xxxxxxxxxx> wrote:
>
> Currently, on_each_cpu() and similar functions do not exploit the
> potential of concurrency: the function is first executed remotely and
> only then is it executed locally. Functions such as TLB flush can take
> considerable time, so this provides an opportunity for performance
> optimization.
>
> To do so, introduce __smp_call_function_many(), which allows callers
> to provide local and remote functions that should be executed, and runs
> them concurrently. Keep the semantics of smp_call_function_many() as
> they are today for backward compatibility: in this case, the called
> function is not executed locally.
>
> __smp_call_function_many() does not use the optimized version for a
> single remote target that smp_call_function_single() implements. For
> a synchronous function call, smp_call_function_single() keeps a
> call_single_data (which is used for synchronization) on the stack.
> Interestingly, it seems that not using this optimization provides
> greater performance improvements (greater speedup with a single remote
> target than with multiple ones). Presumably, holding data structures
> that are intended for synchronization on the stack can introduce
> overheads due to TLB misses and false-sharing when the stack is used for
> other purposes.
>
> Adding support for running the functions concurrently required removing
> a micro-optimization in on_each_cpu() that disabled/enabled IRQs instead
> of saving/restoring them. The benefit of running the local and remote
> code concurrently is expected to be greater.
>
> Reviewed-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
> ---
> include/linux/smp.h | 34 ++++++++---
> kernel/smp.c | 138 +++++++++++++++++++++-----------------------
> 2 files changed, 92 insertions(+), 80 deletions(-)
>
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 6fc856c9eda5..d18d54199635 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -32,11 +32,6 @@ extern unsigned int total_cpus;
> int smp_call_function_single(int cpuid, smp_call_func_t func, void *info,
> int wait);
>
> -/*
> - * Call a function on all processors
> - */
> -void on_each_cpu(smp_call_func_t func, void *info, int wait);
> -
> /*
> * Call a function on processors specified by mask, which might include
> * the local one.
> @@ -44,6 +39,17 @@ void on_each_cpu(smp_call_func_t func, void *info, int wait);
> void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
> void *info, bool wait);
>
> +/*
> + * Call a function on all processors. May be used during early boot while
> + * early_boot_irqs_disabled is set.
> + */
> +static inline void on_each_cpu(smp_call_func_t func, void *info, int wait)
> +{
> +	preempt_disable();
> +	on_each_cpu_mask(cpu_online_mask, func, info, wait);
> +	preempt_enable();
> +}

Err... I made this change at the last minute before sending and
apparently forgot to build it: it does not compile.

Let me know if there is anything else wrong with this version, though.
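
The ordering change described in the commit message can be illustrated
outside the kernel. The following is only a user-space sketch, assuming
POSIX threads as a stand-in for remote CPUs; call_many_concurrent(),
remote_thread(), count_call() and MAX_REMOTE are hypothetical names and
are not part of the patch or of kernel/smp.c. It starts the "remote"
work first, runs the "local" function while that work is in flight, and
only then waits, instead of waiting for the remote side before running
locally.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_REMOTE 16	/* hypothetical cap on simulated remote CPUs */

typedef void (*call_func_t)(void *info);

struct remote_call {
	call_func_t func;
	void *info;
};

/* Thread entry point: stands in for a remote CPU handling the call. */
static void *remote_thread(void *arg)
{
	struct remote_call *rc = arg;

	rc->func(rc->info);
	return NULL;
}

/*
 * Start remote_func on nr_remote threads, run local_func (if any) on the
 * calling thread while they are running, then wait for all of them.
 * Returns 0 on success and -1 on a bad argument or thread creation error.
 */
static int call_many_concurrent(int nr_remote, call_func_t remote_func,
				call_func_t local_func, void *info)
{
	pthread_t threads[MAX_REMOTE];
	struct remote_call rc = { remote_func, info };
	int started, i, ret = 0;

	assert(remote_func != NULL);
	if (nr_remote < 0 || nr_remote > MAX_REMOTE)
		return -1;

	/* Kick off the remote work first (analogous to sending the IPIs). */
	for (started = 0; started < nr_remote; started++) {
		if (pthread_create(&threads[started], NULL,
				   remote_thread, &rc)) {
			ret = -1;
			break;
		}
	}

	/* Local work overlaps with the remote work instead of following it. */
	if (local_func)
		local_func(info);

	/* Synchronous wait, analogous to wait=true. */
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return ret;
}

/* Example callback: atomically counts how many times it was invoked. */
static void count_call(void *info)
{
	atomic_int *counter = info;

	atomic_fetch_add(counter, 1);
}
```

Calling call_many_concurrent(3, count_call, count_call, &counter) with a
zero-initialized atomic_int leaves the counter at 4 (three remote calls
plus the local one). Passing NULL as local_func mirrors the backward
compatible smp_call_function_many() behavior, where nothing runs
locally. Build with -pthread.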