Re: [PATCH] kunit: fix Kconfig for build-in tests USB4 and Nitro Enclaves

From: Nico Pache
Date: Sat Aug 13 2022 - 08:04:27 EST


On Fri, Aug 12, 2022 at 2:43 AM David Gow <davidgow@xxxxxxxxxx> wrote:
>
> (+joefradley@xxxxxxxxxx to comment on what Android is doing here)
>
> On Thu, Aug 11, 2022 at 8:44 PM Nico Pache <npache@xxxxxxxxxx> wrote:
> >
> > On Wed, Aug 10, 2022 at 8:20 PM David Gow <davidgow@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Aug 11, 2022 at 7:41 AM Nico Pache <npache@xxxxxxxxxx> wrote:
> > > >
> > > > Both the USB4 and Nitro Enclaves KUNIT tests are now able to be compiled
> > > > if KUNIT is compiled as a module. This leads to issues if KUNIT is being
> > > > packaged separately from the core kernel and when KUNIT is run baremetal
> > > > without the required driver compiled into the kernel.
> > > >
> > > > Fixes: 635dcd16844b ("thunderbolt: test: Use kunit_test_suite() macro")
> > > > Fixes: fe5be808fa6c ("nitro_enclaves: test: Use kunit_test_suite() macro")
> > > > Signed-off-by: Nico Pache <npache@xxxxxxxxxx>
> > > > ---
> > >
> > > Hmm... I'm not quite sure I understand the case that's broken here. Is it:
> > > - KUnit is built as a module (CONFIG_KUNIT=m)
> > > - USB4/nitro_enclaves are also built as modules, with the test enabled.
> > > - The kunit module is not available at runtime, so neither driver
> > > module can load (due to missing kunit dependencies)
> > Exactly, except the issue is also when the USB/NE=y not just when they
> > are modules. This is currently creating an issue with our build system
> > during the depmod stage and has been preventing us from generating
> > Fedora builds.
> .
> Yeah: there's a nasty tradeoff here in that having these depend on
> KUNIT=y does (obviously) mean that it's not possible to run these
> tests with KUNIT=m. I'd agree that being able to ruin some tests is
> better than none, but there are quite a lot of tests which are doing
> the same sort of tricks as USB4/nitro_enclaves to embed tests in the
> same module as the code being tested. In particular, I think apparmor
> is doing something similar, and the incoming AMDGPU tests also build
> all of the tests into amdgpu.ko. If we require KUNIT=y for these,
> we're leaving a lot of tests on the table for KUNIT=m cases, which
> would otherwise work.
>
> The ideal solution would be to split the tests for these systems out
> into their own separate modules, but that's often quite tricky due to
> the sheer number of otherwise internal symbols which need exporting.

I have already started trying to separate out all the built-in tests.
There are a few which may have to stay as built-in if I can't find a
viable solution, but expect a patch-set doing just this soon :)

> > >
> > > If so, that's not a case (i.e., the kunit.ko module being unavailable
> > > if it was built) we've tried to support thus far. I guess a de-facto
> > > rule for supporting it would be to depend on KUNIT=y for any KUnit
> > > tests which are built into the same module as the driver they're
> > > testing.
> > Yeah, although it's not been a case you've been trying to support, it
> > has been working so far :) This has been the case (built-in tests
> > utilizing 'depends on KUNIT=y') since we began supporting KUNIT in our
> > testing infrastructure and it would be nice to keep that as a de-facto
> > rule :)
>
> Okay: let's try to stick with that for now, then (unless there are any
> objections from the people working on those particular tests), and
> look to either reinstate it if we find a better way of dealing with
> the missing/disabled kunit.ko case, or the tests can be split into a
> separate module. Personally, I don't expect we'll get either of those
> working in the short-term, but it's definitely a problem we'll have to
> confront more eventually.
>
> In the meantime, I think the KUnit position on this will be to note
> this as a consequence of building KUnit tests into bigger modules, and
> leave the final decision up to the maintainers of those
> subsystems/tests. This may result in there being some tests you have
> to explicitly disable (rather than being able to use KUNIT_ALL_TESTS)
> if an important module decides that they really want their tests to
> run when KUNIT=m (which may not happen, we'll see...)
Yep, explicitly disabling them is also an option for us, but I
submitted this patchset because I noticed a change in the pattern for
built-in tests (depends on KUNIT=y). So if people object to some tests
not having kunit module support, we do have the ability to work around
it.

Thanks for your review!

Cheers,
-- Nico

>
> > >
> > > Alternatively, maybe we could do some horrible hacks to compile stub
> > > versions of various KUnit assertion symbols in unconditionally, which
> > > forward to the real ones if KUnit is available.
> > >
> > > (Personally, I'd love it if we could get rid of CONFIG_KUNIT=m
> > > altogether, and it's actually broken right at the moment[1]. There are
> > > still some cases (unloading / reloading KUnit with different filter
> > > options) which require it, though.)
> > Personally I'd hate to see KUNIT=m go as that is how we have been able
> > to support running Kunit tests so far.
> >
> > A little background on how we utilize Kunit. We build with KUNIT=m and
> > KUNIT_ALL_TESTS=m and run the tests baremetal.
> > Our build system creates 3 packages (kernel, kernel-modules, and
> > kernel-modules-internal), this allows us to ship the kernel and its
> > modules, while also isolating packages that we dont want to ship
> > outside of QE and developers. We then have our own infrastructure in
> > place to run and collect the output of these tests in our pipelined
> > environments. We dont utilize UML because we dont support that feature
> > in RHEL.
> >
> > Fedora uses this same methodology for running KUNIT, so we are
> > frequently running kunit on an 'upstream' variant.
> >
> > I'm not sure how many organizations are supporting continuous KUNIT
> > testing, or how they are achieving it, but dropping module support
> > would prevent us from doing the CI testing we have in place.
> >
> > Cheers!
> > -- Nico
>
> Fair enough -- we definitely won't get rid of it unless there's a
> replacement which works as well if not better.
>
> The reason it's tempting to get rid of KUNIT=m is simply that there's
> a chunk of KUnit code which needs to be built-in, even if the rest of
> it is in a module. So a kernel with KUNIT=m still has a fair bit of
> the overhead of KUNIT=y, and this is likely to get more significant as
> more such features land (e.g., static stubbing:
> https://lore.kernel.org/lkml/20220318021314.3225240-2-davidgow@xxxxxxxxxx/
> ).
>
> Traditionally, our expectation has been that a separate, KUnit-enabled
> kernel config / build makes sense, as that allows the
> release/production build to run without any testing-related overheads
> at all. That being said, I know Android are looking to enable KUnit in
> all GKI builds, and to implement a separate kunit.enable option to
> effectively "disable" it at runtime. This doesn't remove all of the
> overhead, but does allow KUnit to always be present without the risk
> of compromising the integrity of the running kernel by running tests
> in production.
>
> Regardless of whether any of those seem interesting to you, we won't
> be getting rid of KUNIT=m in the short-term (and definitely will be
> supporting individual test modules, even if we later want to have the
> core executor built-in).
>
> One other note is that KUNIT=m is actually broken right at the moment:
> the fix is here:
> https://patchwork.kernel.org/project/linux-kselftest/patch/20220713005221.1926290-1-davidgow@xxxxxxxxxx/
>
> Cheers,
> -- David
>