Re: [PATCH v4 1/2] nmi_backtrace: Allow excluding an arbitrary CPU

From: Chen-Yu Tsai
Date: Mon Aug 07 2023 - 03:35:46 EST


On Fri, Aug 4, 2023 at 10:01 PM Douglas Anderson <dianders@xxxxxxxxxxxx> wrote:
>
> The APIs that allow backtracing across CPUs have always had a way to
> exclude the current CPU. This convenience means callers didn't need to
> find a place to allocate a CPU mask just to handle the common case.
>
> Let's extend the API to take a CPU ID to exclude instead of just a
> boolean. This isn't any more complex for the API to handle and allows
> the hardlockup detector to exclude a different CPU (the one it already
> did a trace for) without needing to find space for a CPU mask.
>
> Arguably, this new API also encourages safer behavior. Specifically if
> the caller wants to avoid tracing the current CPU (maybe because they
> already traced the current CPU) this makes it more obvious to the
> caller that they need to make sure that the current CPU ID can't
> change.
>
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> ---
>
> Changes in v4:
> - Renamed trigger_allbutself_cpu_backtrace() for when trigger is unsupported.
>
> Changes in v3:
> - ("nmi_backtrace: Allow excluding an arbitrary CPU") new for v3.
>
> arch/arm/include/asm/irq.h | 2 +-
> arch/arm/kernel/smp.c | 4 ++--
> arch/loongarch/include/asm/irq.h | 2 +-
> arch/loongarch/kernel/process.c | 4 ++--
> arch/mips/include/asm/irq.h | 2 +-
> arch/mips/kernel/process.c | 4 ++--
> arch/powerpc/include/asm/irq.h | 2 +-
> arch/powerpc/kernel/stacktrace.c | 4 ++--
> arch/powerpc/kernel/watchdog.c | 4 ++--
> arch/sparc/include/asm/irq_64.h | 2 +-
> arch/sparc/kernel/process_64.c | 6 +++---
> arch/x86/include/asm/irq.h | 2 +-
> arch/x86/kernel/apic/hw_nmi.c | 4 ++--
> include/linux/nmi.h | 14 +++++++-------
> kernel/watchdog.c | 2 +-
> lib/nmi_backtrace.c | 6 +++---
> 16 files changed, 32 insertions(+), 32 deletions(-)
>

[...]

> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index e3e6a64b98e0..7cf7801856a1 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -157,31 +157,31 @@ static inline void touch_nmi_watchdog(void)
> #ifdef arch_trigger_cpumask_backtrace
> static inline bool trigger_all_cpu_backtrace(void)
> {
> - arch_trigger_cpumask_backtrace(cpu_online_mask, false);
> + arch_trigger_cpumask_backtrace(cpu_online_mask, -1);
> return true;
> }
>
> -static inline bool trigger_allbutself_cpu_backtrace(void)
> +static inline bool trigger_allbutcpu_cpu_backtrace(int exclude_cpu)
> {
> - arch_trigger_cpumask_backtrace(cpu_online_mask, true);
> + arch_trigger_cpumask_backtrace(cpu_online_mask, exclude_cpu);
> return true;
> }
>
> static inline bool trigger_cpumask_backtrace(struct cpumask *mask)
> {
> - arch_trigger_cpumask_backtrace(mask, false);
> + arch_trigger_cpumask_backtrace(mask, -1);
> return true;
> }
>
> static inline bool trigger_single_cpu_backtrace(int cpu)
> {
> - arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
> + arch_trigger_cpumask_backtrace(cpumask_of(cpu), -1);
> return true;
> }
>
> /* generic implementation */
> void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
> - bool exclude_self,
> + int exclude_cpu,
> void (*raise)(cpumask_t *mask));
> bool nmi_cpu_backtrace(struct pt_regs *regs);
>
> @@ -190,7 +190,7 @@ static inline bool trigger_all_cpu_backtrace(void)
> {
> return false;
> }
> -static inline bool trigger_allbutself_cpu_backtrace(void)
> +static inline bool trigger_allbutcpu_cpu_backtrace(void)
^
The parameter list of this fallback stub is still wrong: it takes
"void", but it should take "int exclude_cpu" to match the version above.

This patch in Andrew's queue is causing build errors on next-20230807 on arm64:

kernel/watchdog.c: In function ‘watchdog_timer_fn’:
kernel/watchdog.c:521:25: error: too many arguments to function ‘trigger_allbutcpu_cpu_backtrace’
  521 |                         trigger_allbutcpu_cpu_backtrace(smp_processor_id());
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from kernel/watchdog.c:17:
./include/linux/nmi.h:193:20: note: declared here
  193 | static inline bool trigger_allbutcpu_cpu_backtrace(void)
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make[3]: *** [scripts/Makefile.build:243: kernel/watchdog.o] Error 1
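
For illustration, here is a minimal standalone sketch of the shape I would
expect, with both branches of the #ifdef sharing the same parameter list.
It is not the actual header: HAVE_ARCH_BACKTRACE is a made-up stand-in for
arch_trigger_cpumask_backtrace, and dump_other_cpus() is a hypothetical
caller shaped like the call in watchdog_timer_fn().

```c
#include <stdbool.h>

/*
 * Standalone illustration only, not include/linux/nmi.h.
 * HAVE_ARCH_BACKTRACE is a made-up macro standing in for
 * arch_trigger_cpumask_backtrace.
 */
#ifdef HAVE_ARCH_BACKTRACE
static inline bool trigger_allbutcpu_cpu_backtrace(int exclude_cpu)
{
	(void)exclude_cpu;	/* a real version would pass this to the arch hook */
	return true;
}
#else
/* Fallback stub: same parameter list, so callers build either way. */
static inline bool trigger_allbutcpu_cpu_backtrace(int exclude_cpu)
{
	(void)exclude_cpu;	/* nothing to exclude when unsupported */
	return false;
}
#endif

/* Hypothetical caller: compiles against either branch above. */
static bool dump_other_cpus(int this_cpu)
{
	return trigger_allbutcpu_cpu_backtrace(this_cpu);
}
```

With the stub declared as taking "void" instead, the call in
dump_other_cpus() fails to compile in the same way as the log above.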


ChenYu