Re: [PATCH] kunit: fix Kconfig for build-in tests USB4 and Nitro Enclaves

From: Joe Fradley
Date: Fri Aug 12 2022 - 12:16:03 EST


On Thu, Aug 11, 2022 at 11:43 PM David Gow <davidgow@xxxxxxxxxx> wrote:
>
> (+joefradley@xxxxxxxxxx to comment on what Android is doing here)
>
> On Thu, Aug 11, 2022 at 8:44 PM Nico Pache <npache@xxxxxxxxxx> wrote:
> >
> > On Wed, Aug 10, 2022 at 8:20 PM David Gow <davidgow@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Aug 11, 2022 at 7:41 AM Nico Pache <npache@xxxxxxxxxx> wrote:
> > > >
> > > > Both the USB4 and Nitro Enclaves KUNIT tests are now able to be compiled
> > > > if KUNIT is compiled as a module. This leads to issues if KUNIT is being
> > > > packaged separately from the core kernel and when KUNIT is run baremetal
> > > > without the required driver compiled into the kernel.
> > > >
> > > > Fixes: 635dcd16844b ("thunderbolt: test: Use kunit_test_suite() macro")
> > > > Fixes: fe5be808fa6c ("nitro_enclaves: test: Use kunit_test_suite() macro")
> > > > Signed-off-by: Nico Pache <npache@xxxxxxxxxx>
> > > > ---
> > >
> > > Hmm... I'm not quite sure I understand the case that's broken here. Is it:
> > > - KUnit is built as a module (CONFIG_KUNIT=m)
> > > - USB4/nitro_enclaves are also built as modules, with the test enabled.
> > > - The kunit module is not available at runtime, so neither driver
> > > module can load (due to missing kunit dependencies)
> > Exactly, except the issue is also when the USB/NE=y not just when they
> > are modules. This is currently creating an issue with our build system
> > during the depmod stage and has been preventing us from generating
> > Fedora builds.
> .
> Yeah: there's a nasty tradeoff here in that having these depend on
> KUNIT=y does (obviously) mean that it's not possible to run these
> tests with KUNIT=m. I'd agree that being able to ruin some tests is
> better than none, but there are quite a lot of tests which are doing
> the same sort of tricks as USB4/nitro_enclaves to embed tests in the
> same module as the code being tested. In particular, I think apparmor
> is doing something similar, and the incoming AMDGPU tests also build
> all of the tests into amdgpu.ko. If we require KUNIT=y for these,
> we're leaving a lot of tests on the table for KUNIT=m cases, which
> would otherwise work.
>
> The ideal solution would be to split the tests for these systems out
> into their own separate modules, but that's often quite tricky due to
> the sheer number of otherwise internal symbols which need exporting.
>
> > >
> > > If so, that's not a case (i.e., the kunit.ko module being unavailable
> > > if it was built) we've tried to support thus far. I guess a de-facto
> > > rule for supporting it would be to depend on KUNIT=y for any KUnit
> > > tests which are built into the same module as the driver they're
> > > testing.
> > Yeah, although it's not been a case you've been trying to support, it
> > has been working so far :) This has been the case (built-in tests
> > utilizing 'depends on KUNIT=y') since we began supporting KUNIT in our
> > testing infrastructure and it would be nice to keep that as a de-facto
> > rule :)
>
> Okay: let's try to stick with that for now, then (unless there are any
> objections from the people working on those particular tests), and
> look to either reinstate it if we find a better way of dealing with
> the missing/disabled kunit.ko case, or the tests can be split into a
> separate module. Personally, I don't expect we'll get either of those
> working in the short-term, but it's definitely a problem we'll have to
> confront more eventually.
>
> In the meantime, I think the KUnit position on this will be to note
> this as a consequence of building KUnit tests into bigger modules, and
> leave the final decision up to the maintainers of those
> subsystems/tests. This may result in there being some tests you have
> to explicitly disable (rather than being able to use KUNIT_ALL_TESTS)
> if an important module decides that they really want their tests to
> run when KUNIT=m (which may not happen, we'll see...)
>
> > >
> > > Alternatively, maybe we could do some horrible hacks to compile stub
> > > versions of various KUnit assertion symbols in unconditionally, which
> > > forward to the real ones if KUnit is available.
> > >
> > > (Personally, I'd love it if we could get rid of CONFIG_KUNIT=m
> > > altogether, and it's actually broken right at the moment[1]. There are
> > > still some cases (unloading / reloading KUnit with different filter
> > > options) which require it, though.)
> > Personally I'd hate to see KUNIT=m go as that is how we have been able
> > to support running Kunit tests so far.
> >
> > A little background on how we utilize Kunit. We build with KUNIT=m and
> > KUNIT_ALL_TESTS=m and run the tests baremetal.
> > Our build system creates 3 packages (kernel, kernel-modules, and
> > kernel-modules-internal), this allows us to ship the kernel and its
> > modules, while also isolating packages that we dont want to ship
> > outside of QE and developers. We then have our own infrastructure in
> > place to run and collect the output of these tests in our pipelined
> > environments. We dont utilize UML because we dont support that feature
> > in RHEL.
> >
> > Fedora uses this same methodology for running KUNIT, so we are
> > frequently running kunit on an 'upstream' variant.
> >
> > I'm not sure how many organizations are supporting continuous KUNIT
> > testing, or how they are achieving it, but dropping module support
> > would prevent us from doing the CI testing we have in place.
> >
> > Cheers!
> > -- Nico
>
> Fair enough -- we definitely won't get rid of it unless there's a
> replacement which works as well if not better.
>
> The reason it's tempting to get rid of KUNIT=m is simply that there's
> a chunk of KUnit code which needs to be built-in, even if the rest of
> it is in a module. So a kernel with KUNIT=m still has a fair bit of
> the overhead of KUNIT=y, and this is likely to get more significant as
> more such features land (e.g., static stubbing:
> https://lore.kernel.org/lkml/20220318021314.3225240-2-davidgow@xxxxxxxxxx/
> ).
>
> Traditionally, our expectation has been that a separate, KUnit-enabled
> kernel config / build makes sense, as that allows the
> release/production build to run without any testing-related overheads
> at all. That being said, I know Android are looking to enable KUnit in
> all GKI builds, and to implement a separate kunit.enable option to
> effectively "disable" it at runtime. This doesn't remove all of the
> overhead, but does allow KUnit to always be present without the risk
> of compromising the integrity of the running kernel by running tests
> in production.

Like David mentioned, internally for GKI we have KUNIT=y with running
built-in tests permanently disabled and only allowing module test
execution if a kernel command line option (kunit.enable) is set. I
hope to have an upstream patch of this for review soon. If you're
willing to have the extra KUnit overhead in your production build,
this could be an option for you as well.

>
> Regardless of whether any of those seem interesting to you, we won't
> be getting rid of KUNIT=m in the short-term (and definitely will be
> supporting individual test modules, even if we later want to have the
> core executor built-in).
>
> One other note is that KUNIT=m is actually broken right at the moment:
> the fix is here:
> https://patchwork.kernel.org/project/linux-kselftest/patch/20220713005221.1926290-1-davidgow@xxxxxxxxxx/
>
> Cheers,
> -- David