Re: [PATCH] Documentation: kunit: provide guidance for testing many inputs

From: David Gow
Date: Fri Nov 06 2020 - 23:21:50 EST


On Tue, Nov 3, 2020 at 5:37 AM Daniel Latypov <dlatypov@xxxxxxxxxx> wrote:
>
> usage.rst goes into a detailed about faking out classes, but currently

Nit: a detailed what?

> lacks wording about how one might idiomatically test a range of inputs.
>
> Give an example of how one might test a hash function via macros/helper
> funcs and a table-driven test and very briefly discuss pros and cons.
>
> Also highlight the KUNIT_EXPECT_*_MSG() variants (that aren't mentioned
> elsewhere [1]) which are particularly useful in these situations.
>
> It is also criminally underused at the moment, only appearing in 2
> tests (both written by people involved in KUnit).
>
> [1] not even on
> https://www.kernel.org/doc/html/latest/dev-tools/kunit/api/test.html

I suspect we'll eventually want to document the _MSG variants here as
well, though it will bloat the page somewhat. In any case, it can be
left to a separate patch.

>
> Signed-off-by: Daniel Latypov <dlatypov@xxxxxxxxxx>
> ---

Thanks for writing this -- it's definitely a common test pattern which
it'd be nice to encourage and explain a bit better.

Cheers,
-- David

> Documentation/dev-tools/kunit/usage.rst | 66 +++++++++++++++++++++++++
> 1 file changed, 66 insertions(+)
>
> diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
> index 62142a47488c..317390df2b96 100644
> --- a/Documentation/dev-tools/kunit/usage.rst
> +++ b/Documentation/dev-tools/kunit/usage.rst
> @@ -451,6 +451,72 @@ We can now use it to test ``struct eeprom_buffer``:
> destroy_eeprom_buffer(ctx->eeprom_buffer);
> }
>
> +Testing various inputs
> +----------------------
Nit: "various" isn't hugely descriptive here. Maybe something like
"Testing against multiple inputs" would be better?

> +
> +Testing just a few inputs might not be enough to have confidence that the code
> +works correctly, e.g. for a hash function.
> +
> +In such cases, it can be helpful to have a helper macro or function, e.g. this
> +fictitious example for ``md5sum(1)``
> +
> +.. code-block:: c
> +
> + /* Note: the cast is to satisfy overly strict type-checking. */
> + #define TEST_MD5(in, want) \
> + md5sum(in, out); \
> + KUNIT_EXPECT_STREQ_MSG(test, (char *)out, want, "md5sum(%s)", in);
> +
> + char out[16];
> + TEST_MD5("hello world", "5eb63bbbe01eeed093cb22bb8f5acdc3");
> + TEST_MD5("hello world!", "fc3ff98e8c6a0d3087d515c0473f8677");
> +
> +Note the use of ``KUNIT_EXPECT_STREQ_MSG`` to give more context when it fails
> +and make it easier to track down. (Yes, in this example, ``want`` is likely
> +going to be unique enough on its own).
> +
> +The ``_MSG`` variants are even more useful when the same expectation is called
> +multiple times (in a loop or helper function) and thus the line number isn't
> +enough to identify what failed, like below.
> +
> +In some cases, it can be helpful to write a *table-driven test* instead, e.g.
> +
> +.. code-block:: c
> +
> + int i;
> + char out[16];
> +
> + struct md5_test_case {
> + const char *str;
> + const char *md5;
> + };
> +
> + struct md5_test_case cases[] = {
> + {
> + .str = "hello world",
> + .md5 = "5eb63bbbe01eeed093cb22bb8f5acdc3",
> + },
> + {
> + .str = "hello world!",
> + .md5 = "fc3ff98e8c6a0d3087d515c0473f8677",
> + },
> + };
> + for (i = 0; i < ARRAY_SIZE(cases); ++i) {
> + md5sum(cases[i].str, out);
> + KUNIT_EXPECT_STREQ_MSG(test, (char *)out, cases[i].md5,
> + "md5sum(%s)", cases[i].str);
> + }
> +
> +
> +There's more boilerplate involved, but it can:
> +
> +* be more readable when there are multiple inputs/outputs thanks to field names,
> +
> + * E.g. see ``fs/ext4/inode-test.c`` for an example of both.
> +* reduce duplication if test cases can be shared across multiple tests.
> +
> + * E.g. if we had a magical ``undo_md5sum`` function, we could reuse ``cases``.
> +

This is a bit of a nitpick, but I don't think this is quite conveying
the usefulness of table-based testing. Maybe it's that a hypothetical
"undo_md5sum" is too unrealistic an example? Maybe, instead of having
both the macro-based and table-driven examples based around md5sum(),
the table-based one could use something more obviously invertible /
reusable, and include both in the example code. E.g, something akin to
toupper() and tolower() or some other conversion function. I think
having a better example here is probably more useful than having both
the table- and macro- driven examples test the same thing.


> .. _kunit-on-non-uml:
>
> KUnit on non-UML architectures
>
> base-commit: 77c8473edf7f7664137f555cfcdc8c460bbd947d
> --
> 2.29.1.341.ge80a0c044ae-goog
>