Re: [PATCH 2/4] kcov: add interrupt handling self test

From: Marco Elver
Date: Wed Jun 05 2024 - 05:10:28 EST


On Tue, 4 Jun 2024 at 15:45, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> Add a boot self test that can catch sprious coverage from interrupts.
> The coverage callback filters out interrupt code, but only after the
> handler updates preempt count. Some code periodically leaks out
> of that section and leads to spurious coverage.
> Add a best-effort (but simple) test that is likely to catch such bugs.
> If the test is enabled on CI systems that use KCOV, they should catch
> any issues fast.
>
> Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> Cc: x86@xxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: syzkaller@xxxxxxxxxxxxxxxx
>
> ---
>
> In my local testing w/o the previous fix,
> it immidiatly produced the following splat:
>
> kcov: running selftest
> BUG: TASK stack guard page was hit at ffffc90000147ff8
> Oops: stack guard page: 0000 [#1] PREEMPT SMP KASAN PTI
> ...
> kvm_set_cpu_l1tf_flush_l1d+0x5/0x20
> sysvec_call_function+0x15/0xb0
> asm_sysvec_call_function+0x1a/0x20
> kcov_init+0xe4/0x130
> do_one_initcall+0xbc/0x470
> kernel_init_freeable+0x4fc/0x930
> kernel_init+0x1c/0x2b0
> ---
> kernel/kcov.c | 28 ++++++++++++++++++++++++++++
> lib/Kconfig.debug | 9 +++++++++
> 2 files changed, 37 insertions(+)
>
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index c3124f6d5536..04136f80042f 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -1057,6 +1057,30 @@ u64 kcov_common_handle(void)
> }
> EXPORT_SYMBOL(kcov_common_handle);
>
> +#ifdef CONFIG_KCOV_TEST
> +static void __init selftest(void)
> +{
> + volatile int i;
> +
> + pr_err("running self test\n");
> + /*
> + * Test that interrupts don't produce spurious coverage.
> + * The coverage callback filters out interrupt code, but only
> + * after the handler updates preempt count. Some code periodically
> + * leaks out of that section and leads to spurious coverage.
> + * It's hard to call the actual interrupt handler directly,
> + * so we just loop here for ~400 ms waiting for a timer interrupt.

Where do the 400 ms come from? I only see that it loops a long time,
but that the timing is entirely dependent on how fast the CPU executes
the loop.

> + * We set kcov_mode to enable tracing, but don't setup the area,
> + * so any attempt to trace will crash.
> + */
> + current->kcov_mode = KCOV_MODE_TRACE_PC;
> + for (i = 0; i < (1 << 28); i++)
> + ;

Can't you check jiffies, and e.g. check that actual ~100-500ms have elapsed?

timeout = jiffies + msecs_to_jiffies(300);
while (!time_after(jiffies, timeout)) {
cpu_relax();
}

> + current->kcov_mode = 0;
> + pr_err("done running self test\n");
> +}
> +#endif
> +
> static int __init kcov_init(void)
> {
> int cpu;
> @@ -1076,6 +1100,10 @@ static int __init kcov_init(void)
> */
> debugfs_create_file_unsafe("kcov", 0600, NULL, NULL, &kcov_fops);
>
> +#ifdef CONFIG_KCOV_TEST
> + selftest();
> +#endif
> +
> return 0;
> }
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 59b6765d86b8..79836a15b6cb 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -2171,6 +2171,15 @@ config KCOV_IRQ_AREA_SIZE
> soft interrupts. This specifies the size of those areas in the
> number of unsigned long words.
>
> +config KCOV_TEST

s/TEST/SELFTEST/

It may be confused with a longer standalone test.

> + bool "Test CONFIG_KCOV feature"

Maybe "Perform short selftests on boot" (similar to CONFIG_KCSAN_SELFTEST).

> + depends on KCOV
> + help
> + Sanity check for KCOV coverage collection.
> + Runs built-in self test on boot to detect some common issues.
> +
> + If unsure, say N.
> +
> menuconfig RUNTIME_TESTING_MENU
> bool "Runtime Testing"
> default y
> --
> 2.45.1.467.gbab1589fc0-goog
>