Re: [PATCH RFC 2/3] kfence: Optimize branches prediction when sample interval is zero

From: Marco Elver
Date: Mon Jan 24 2022 - 03:20:40 EST


On Mon, 24 Jan 2022 at 03:37, Peng Liu <liupeng256@xxxxxxxxxx> wrote:
>
> In order to release a uniform kernel with KFENCE, it is good to
> compile it with CONFIG_KFENCE_SAMPLE_INTERVAL = 0. Production
> users who don't want to use KFENCE then get a kernel that behaves
> just like the original version without KFENCE, while KFENCE users
> can enable it by setting the kernel boot parameter
> kfence.sample_interval. Hence, setting the KFENCE sample interval
> default to zero is convenient.
>
> The current KFENCE supports adjusting the sample interval via the
> kernel boot parameter. However, branch prediction in kfence_alloc
> is not good for the situation where CONFIG_KFENCE_SAMPLE_INTERVAL = 0
> and the boot parameter kfence.sample_interval != 0, because with
> CONFIG_KFENCE_SAMPLE_INTERVAL = 0 the current kfence_alloc is hinted
> as likely to return NULL. To optimize branch prediction in this
> situation, check kfence_enabled first.

This patch doesn't make any sense. You're adding an unconditional LOAD
to the fast path.
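
To spell it out, with this patch the inlined fast path becomes roughly
(a sketch of the C, not the exact codegen):

	/* memory load + test + branch, on every allocation: */
	if (!kfence_enabled)
		return NULL;
	/* patched NOP/JMP, no data access: */
	if (!static_branch_unlikely(&kfence_allocation_key))
		return NULL;

i.e. every allocation now reads kfence_enabled, even on systems that
never enable KFENCE, whereas the static key alone touches no data at
all when disabled.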

And the choice of static_branch_unlikely() if
CONFIG_KFENCE_SAMPLE_INTERVAL == 0 is very much deliberate, as it
generates code that is preferable in the common case (KFENCE is
disabled).

Please see include/linux/jump_label.h:430. But even then, CPUs are
very good at dealing with unconditional branches, so the difference
really is a wash.
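
Simplifying the table there, for a disabled (false) key the two
variants emit roughly:

	/* static_branch_unlikely(), key false: */
	NOP		/* fall through; branch body placed out of line */

	/* static_branch_likely(), key false: */
	JMP L		/* unconditional jump over the inline body */
	<branch body>
	L: ...

Neither variant needs a data load to decide whether KFENCE is enabled.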

But that new LOAD is not acceptable.

Sorry, but Nack.

> Signed-off-by: Peng Liu <liupeng256@xxxxxxxxxx>
> ---
> include/linux/kfence.h | 5 ++++-
> mm/kfence/core.c | 2 +-
> 2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/kfence.h b/include/linux/kfence.h
> index aec4f6b247b5..bf91b76b87ee 100644
> --- a/include/linux/kfence.h
> +++ b/include/linux/kfence.h
> @@ -17,6 +17,7 @@
> #include <linux/atomic.h>
> #include <linux/static_key.h>
>
> +extern bool kfence_enabled;
> extern unsigned long kfence_num_objects;
> /*
> * We allocate an even number of pages, as it simplifies calculations to map
> @@ -115,7 +116,9 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags);
> */
> static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
> {
> -#if defined(CONFIG_KFENCE_STATIC_KEYS) || CONFIG_KFENCE_SAMPLE_INTERVAL == 0
> + if (!kfence_enabled)
> + return NULL;
> +#if defined(CONFIG_KFENCE_STATIC_KEYS)
> if (!static_branch_unlikely(&kfence_allocation_key))
> return NULL;
> #else
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index 4655bcc0306e..2301923182b8 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -48,7 +48,7 @@
>
> /* === Data ================================================================= */
>
> -static bool kfence_enabled __read_mostly;
> +bool kfence_enabled __read_mostly;
>
> static unsigned long kfence_sample_interval __read_mostly = CONFIG_KFENCE_SAMPLE_INTERVAL;
>
> --
> 2.18.0.huawei.25
>