Re: Shared scaffolding to unit-test kernel code in userspace (was Re: [Intel-gfx] [PATCH] drm/i915: Add link training test)

From: Ander Conselvan De Oliveira
Date: Tue Sep 29 2015 - 04:48:02 EST


On Wed, 2015-09-23 at 11:38 +0200, Daniel Vetter wrote:
> On Wed, Sep 23, 2015 at 11:18 AM, Ander Conselvan De Oliveira <conselvan2@xxxxxxxxx> wrote:
> > On Wed, 2015-09-23 at 10:24 +0200, Daniel Vetter wrote:
> > > On Tue, Sep 15, 2015 at 04:08:53PM +0300, Ander Conselvan De Oliveira wrote:
> > > > On Mon, 2015-09-14 at 16:38 +0300, Ander Conselvan De Oliveira wrote:
> > > > > On Mon, 2015-09-14 at 15:11 +0200, Daniel Vetter wrote:
> > > > > > On Mon, Sep 14, 2015 at 02:51:51PM +0300, Ander Conselvan de Oliveira wrote:
> > > > > > > ---
> > > > > > >
> > > > > > > On Fri, 2015-09-11 at 17:11 +0300, Ander Conselvan de Oliveira wrote:
> > > > > > > > On Wed, 2015-09-09 at 11:33 +0100, Thomas Wood wrote:
> > > > > > > > > On 8 September 2015 at 13:28, Ander Conselvan de Oliveira
> > > > > > > > > <ander.conselvan.de.oliveira@xxxxxxxxx> wrote:
> > > > > > > > > >
> > > > > > > >
> > > > > > > > > > diff --git a/link-training-test/Makefile b/link-training-test/Makefile
> > > > > > > > >
> > > > > > > > > If this is meant to be part of the test suite, then it needs to be in
> > > > > > > > > the tests directory and use the igt test infrastructure. Otherwise it
> > > > > > > > > should be placed in tools or tools/link-training-test.
> > > > > > > >
> > > > > > > > I made the test use the igt infrastructure, but I'm not sure if this is
> > > > > > > > a good fit for it. The dependency on the kernel is on build time, but
> > > > > > > > once compiled this can be run on any machine. This can also introduce
> > > > > > > > build failures if the test is not kept in sync with the driver source.
> > > > > > > > Ideally that a failure to build this would be reported as the test
> > > > > > > > failing, but I have no idea of how to achieve that.
> > > > > > >
> > > > > > > Alternatively, this could be in the kernel source tree directly. This
> > > > > > > patch adds a test subdir to the i915 source dir, containing the link
> > > > > > > training test. The test is compiled as part of the normal build using
> > > > > > > the extra-y variable so that it doesn't get linked to the final kernel.
> > > > > > >
> > > > > > > When make is run from the tests directory, a thin wrapper around the
> > > > > > > tests is built and linked to the object file compiled as part of the
> > > > > > > kernel build. Running make run_tests from the test dir runs the test
> > > > > > > and reports success or failure.
> > > > > > >
> > > > > > > Any thoughts?
> > > > > >
> > > > > > I think there's some precedence in other subsystems to integrate unit
> > > > > > tests directly in the kernel, e.g. locking selftest or similar things.
> > > > > > Usual approach is to either have a special module (but that often means
> > > > > > piles of EXPORT_SYMBOL only for that selftest module). Or just a y/n
> > > > > > Kconfig option which enables that code and runs all the self/unit tests
> > > > > > when the module loads.
> > > > > >
> > > > > > I'd go with that approach since it's simpler. And we'd only need to tell
> > > > > > QA to enable that Kconfig option for more testing.
> > > > >
> > > > > I'll have a look into that Kconfig approach, but there's a couple of things
> > > > > I like about having the unit test as user space binaries:
> > > > >
> > > > > - there's no need to boot the newly compiled kernel, so doing a test run
> > > > > is super fast;
> > > > > - the binaries can be debugged with gdb just like other user space stuff.
> > > >
> > > > I implemented the test using the Kconfig approach, and it seems to work well
> > > > without impacting the points above. I added the call to run the test as the
> > > > first thing in i915_init(), and with the driver compiled built-in, running
> > > > the kernel under qemu will run the tests. And qemu can also provide a gdb
> > > > remote target.
> > > >
> > > > One thing might be a problem though. With the previous approach, the
> > > > functions overriden by the test where simply reimplemented in the new binary.
> > > > But now the test is linked to the entire driver, so that's not possible. To
> > > > work around that, I had to add function pointers to all the functions called
> > > > by the link training state machine to intel_dp. I don't think that method
> > > > scales well.
> > > >
> > > > I'll send update patches for reference as replies to this mail.
> > >
> > > I had a few discussions about this at XDC and I know think doing this is
> > > userspace is better:
> > > - Faster to run tests (since no module reloading required).
> > > - Nouveau is developed in userspace and would like to reuse shared link
> > > training code too if possible from the dp helpers.
> > >
> > > So I'm leaning towards scaffolding in userspace now. Might be good to
> > > check out how the nouveau userspace runtime works just to steal a few
> > > tricks.
> >
> > I chatted with Martin Peres about this and had a look at the code. There are a few interesting
> > tricks in Nouveau indeed. They have user space implementation for some kernel internals such as
> > mutexes, ioremap on top of libpciaccess, etc. and an extensive set of stubs.
> >
> > The one thing that wouldn't be easy to take advantage of is the build integration. Their
> > upstream
> > repository is not actually a kernel tree. It contains their drm code, the same that ends up in
> > the
> > kernel plus all the user space stuff. There's a script that converts the commits from that
> > repository for inclusion in the kernel tree.
>
> Could we perhaps move at least some of the scafffolding for kernel
> functions to upstream? I guess nouveau has some means to pull changes from
> upstream too, so maybe we could push that some place nice. And I'm pretty
> sure we're not the only ones thinking about unit-testing specific
> kernelcode in a userspace environment.

Maybe this would add a burden on nouveau development without any added benefit? Since we need only a
small subset of what they already have, perhaps it would make more sense to first replicate that
small part for our consumption. I believe what need to be re-implemented and/or stubbed will vary
significantly with the code being tested, so it might make more sense to just let it grow
organically.

I think the more important question is how to include the right code when compiling the user space
binaries. The trick nouveau uses is to have only one header that includes code from the kernel. That
header is replaced with a version intended for user space if compiling user space binaries. We would
have to use the same trick for every subsystem under test, including some include path mangling. At
least drm and i915 would be affected for the tests I'm proposing.

Ander


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/