Re: [PATCH] lkdtm/bugs: add test for hung smp_call_function_single()
From: Mark Rutland
Date: Tue Apr 23 2024 - 05:47:45 EST
On Fri, Apr 19, 2024 at 02:53:59PM -0700, Kees Cook wrote:
> On Fri, Apr 19, 2024 at 11:34:52AM +0100, Mark Rutland wrote:
> > The CONFIG_CSD_LOCK_WAIT_DEBUG option enables debugging of hung
> > smp_call_function*() calls (e.g. when the target CPU gets stuck within
> > the callback function). Testing this option requires triggering such
> > hangs.
> >
> > This patch adds an lkdtm test with a hung smp_call_function_single()
> > callback, which can be used to test CONFIG_CSD_LOCK_WAIT_DEBUG and NMI
> > backtraces (as CONFIG_CSD_LOCK_WAIT_DEBUG will attempt an NMI backtrace
> > of the hung target CPU).
[...]
> > I wrote this because I needed to guide someone through debugging a hung
> > smp_call_function() call, and I needed examples with/without an NMI
> > backtrace. It seems like it'd be useful for testing the CSD lockup
> > detector and NMI backtrace code in future.
>
> Like the other lockup detector tests, I suspect we should skip it by default
> in the selftests? Something like this:
>
> diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
> index 368973f05250..32baddc2c85d 100644
> --- a/tools/testing/selftests/lkdtm/tests.txt
> +++ b/tools/testing/selftests/lkdtm/tests.txt
> @@ -31,6 +31,7 @@ SLAB_FREE_CROSS
> SLAB_FREE_PAGE
> #SOFTLOCKUP Hangs the system
> #HARDLOCKUP Hangs the system
> +#CSDLOCKUP Hangs the system
> #SPINLOCKUP Hangs the system
> #HUNG_TASK Hangs the system
> EXEC_DATA
Ah, I wasn't aware of that file, yes.
> > I'm not sure about the CSDLOCKUP name, but everything else I tried
> > didn't seem great either:
> >
> > * IPILOCKUP sounds like it's testing IPIs generally
> > * SMPCALLLOCKUP and similar look weirdly long
> > * SMP_CALL_LOCKUP and similar look different to {HARD,SOFT,SPIN}LOCKUP
> >
> > ... and I'm happy to defer to Kees for the naming. ;)
>
> It looks like it's only useful with the CSD lockup detector? If that's true,
> sure, this name is fine.
I think it's also useful for testing other things (e.g. RCU stall detection),
so how about we go with SMP_CALL_LOCKUP, as that says what the test does rather
than what specifically it can be used to test?
Mark.