Re: [PATCH] kasan: allow sampling page_alloc allocations for HW_TAGS

From: Andrew Morton
Date: Thu Oct 27 2022 - 16:52:37 EST


On Thu, 27 Oct 2022 22:10:09 +0200 andrey.konovalov@xxxxxxxxx wrote:

> From: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
>
> Add a new boot parameter called kasan.page_alloc.sample, which makes
> Hardware Tag-Based KASAN tag only every Nth page_alloc allocation.
>
> As Hardware Tag-Based KASAN is intended to be used in production, its
> performance impact is crucial. As page_alloc allocations tend to be big,
> tagging and checking all such allocations introduces a significant
> slowdown in some testing scenarios. The new flag allows to alleviate
> that slowdown.
>
> Enabling page_alloc sampling has a downside: KASAN will miss bad accesses
> to a page_alloc allocation that has not been tagged.
>

The Documentation:

> --- a/Documentation/dev-tools/kasan.rst
> +++ b/Documentation/dev-tools/kasan.rst
> @@ -140,6 +140,10 @@ disabling KASAN altogether or controlling its features:
> - ``kasan.vmalloc=off`` or ``=on`` disables or enables tagging of vmalloc
> allocations (default: ``on``).
>
> +- ``kasan.page_alloc.sample=<sampling frequency>`` makes KASAN tag only
> + every Nth page_alloc allocation, where N is the value of the parameter
> + (default: ``1``).
> +

explains what this does but not why it does it.

Let's tell people that this is here to mitigate the performance overhead.

And how is this performance impact observed? Does the kernel just get
slower overall?

If someone gets a KASAN report using this mitigation, should their next
step be to set kasan.page_alloc.sample back to 1 and rerun, in order to
get a more accurate report before reporting it upstream? I'm thinking
"no"?

Finally, it would be helpful if the changelog were to give us some
sense of the magnitude of the impact with kasan.page_alloc.sample=1.
Does the kernel get 3x slower? 50x?
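
For reference, the "tag only every Nth allocation" behaviour described in
the documentation hunk could be sketched in userspace C roughly as below.
This is an illustration only, not the code from the patch: the names
sample_interval, alloc_count and should_tag_allocation() are hypothetical,
and a real implementation would presumably want a per-CPU counter rather
than a single global one.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kasan.page_alloc.sample boot parameter. */
static unsigned long sample_interval = 1;

/* Hypothetical global allocation counter (assumption: not per-CPU). */
static unsigned long alloc_count;

/* Returns true when the current page_alloc allocation should be tagged. */
static bool should_tag_allocation(void)
{
	/* Default of 1 (or a bogus 0): tag every allocation. */
	if (sample_interval <= 1)
		return true;

	/* Tag the first allocation of every group of sample_interval. */
	return (alloc_count++ % sample_interval) == 0;
}
```

With sample_interval set to 4, only one allocation in four is tagged, so a
bad access to any of the other three goes undetected. That is the downside
the changelog mentions.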