Re: [PATCH] lkdtm/bugs: add test for hung smp_call_function_single()

From: Kees Cook
Date: Tue Apr 23 2024 - 13:22:25 EST


On Tue, Apr 23, 2024 at 10:47:29AM +0100, Mark Rutland wrote:
> On Fri, Apr 19, 2024 at 02:53:59PM -0700, Kees Cook wrote:
> > On Fri, Apr 19, 2024 at 11:34:52AM +0100, Mark Rutland wrote:
> > > The CONFIG_CSD_LOCK_WAIT_DEBUG option enables debugging of hung
> > > smp_call_function*() calls (e.g. when the target CPU gets stuck within
> > > the callback function). Testing this option requires triggering such
> > > hangs.
> > >
> > > This patch adds an lkdtm test with a hung smp_call_function_single()
> > > callback, which can be used to test CONFIG_CSD_LOCK_WAIT_DEBUG and NMI
> > > backtraces (as CONFIG_CSD_LOCK_WAIT_DEBUG will attempt an NMI backtrace
> > > of the hung target CPU).
>
> [...]
>
> > > I wrote this because I needed to guide someone through debugging a hung
> > > smp_call_function() call, and I needed examples with/without an NMI
> > > backtrace. It seems like it'd be useful for testing the CSD lockup
> > > detector and NMI backtrace code in future.
> >
> > Like the other lockup detector, I suspect we should skip it by default
> > in the selftests? Something like this:
> >
> > diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
> > index 368973f05250..32baddc2c85d 100644
> > --- a/tools/testing/selftests/lkdtm/tests.txt
> > +++ b/tools/testing/selftests/lkdtm/tests.txt
> > @@ -31,6 +31,7 @@ SLAB_FREE_CROSS
> >  SLAB_FREE_PAGE
> >  #SOFTLOCKUP Hangs the system
> >  #HARDLOCKUP Hangs the system
> > +#CSDLOCKUP Hangs the system
> >  #SPINLOCKUP Hangs the system
> >  #HUNG_TASK Hangs the system
> >  EXEC_DATA
>
> Ah, I wasn't aware of that file, yes.
>
> > > I'm not sure about the CSDLOCKUP name, but everything else I tried
> > > didn't seem great either:
> > >
> > > * IPILOCKUP sounds like it's testing IPIs generally
> > > * SMPCALLLOCKUP and similar look weirdly long
> > > * SMP_CALL_LOCKUP and similar look different to {HARD,SOFT,SPIN}LOCKUP
> > >
> > > ... and I'm happy to defer to Kees for the naming. ;)
> >
> > It looks like it's only useful with CSD lockup detector? If that's true,
> > sure, this name is fine.
>
> I think it's also useful for testing other things (e.g. RCU stall detection),
> so how about we go with SMP_CALL_LOCKUP, as that says what the test does rather
> than what specifically it can be used to test?

Yeah, that works for me. Thanks!

-Kees

--
Kees Cook