Re: [PATCH RFC 01/10] mm: add Kernel Electric-Fence infrastructure

From: Marco Elver
Date: Tue Sep 15 2020 - 10:28:38 EST


On Tue, Sep 15, 2020 at 03:57PM +0200, SeongJae Park wrote:
[...]
>
> So interesting feature! I left some trivial comments below.

Thank you!

> [...]
> > diff --git a/lib/Kconfig.kfence b/lib/Kconfig.kfence
> > new file mode 100644
> > index 000000000000..7ac91162edb0
> > --- /dev/null
> > +++ b/lib/Kconfig.kfence
> > @@ -0,0 +1,58 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +config HAVE_ARCH_KFENCE
> > + bool
> > +
> > +config HAVE_ARCH_KFENCE_STATIC_POOL
> > + bool
> > + help
> > + If the architecture supports using the static pool.
> > +
> > +menuconfig KFENCE
> > + bool "KFENCE: low-overhead sampling-based memory safety error detector"
> > + depends on HAVE_ARCH_KFENCE && !KASAN && (SLAB || SLUB)
> > + depends on JUMP_LABEL # To ensure performance, require jump labels
> > + select STACKTRACE
> > + help
> > + KFENCE is low-overhead sampling-based detector for heap out-of-bounds
> > + access, use-after-free, and invalid-free errors. KFENCE is designed
> > + to have negligible cost to permit enabling it in production
> > + environments.
> > +
> > + See <file:Documentation/dev-tools/kfence.rst> for more details.
>
> This patch doesn't provide the file yet. Why don't you add the reference with
> the patch introducing the file?

Sure, will fix for v3.

> > +
> > + Note that, KFENCE is not a substitute for explicit testing with tools
> > + such as KASAN. KFENCE can detect a subset of bugs that KASAN can
> > + detect (therefore enabling KFENCE together with KASAN does not make
> > + sense), albeit at very different performance profiles.
> [...]
> > diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> > new file mode 100644
> > index 000000000000..e638d1f64a32
> > --- /dev/null
> > +++ b/mm/kfence/core.c
> > @@ -0,0 +1,730 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#define pr_fmt(fmt) "kfence: " fmt
> [...]
> > +
> > +static inline struct kfence_metadata *addr_to_metadata(unsigned long addr)
> > +{
> > + long index;
> > +
> > + /* The checks do not affect performance; only called from slow-paths. */
> > +
> > + if (!is_kfence_address((void *)addr))
> > + return NULL;
> > +
> > + /*
> > + * May be an invalid index if called with an address at the edge of
> > + * __kfence_pool, in which case we would report an "invalid access"
> > + * error.
> > + */
> > + index = ((addr - (unsigned long)__kfence_pool) / (PAGE_SIZE * 2)) - 1;
>
> The outermost parentheses seem unnecessary.

Will fix.
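
For reference, a minimal user-space sketch of the address-to-index mapping
discussed above. The names, the 4096-byte page size, and the object count
are assumptions for illustration only, not the patch's code; it takes a byte
offset into the pool rather than an address. It shows that division binds
tighter than subtraction, so the outer parentheses can be dropped, and that
an offset at either edge of the pool yields an out-of-range index.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE   4096UL  /* assumed page size */
#define SKETCH_NUM_OBJECTS 4       /* stand-in for CONFIG_KFENCE_NUM_OBJECTS */

/*
 * Map a byte offset into the pool to an object index.
 * Returns -1 if the offset does not belong to any object.
 */
static long sketch_offset_to_index(unsigned long offset)
{
	/* '/' binds tighter than '-': no outer parentheses needed. */
	long index = (long)(offset / (SKETCH_PAGE_SIZE * 2)) - 1;

	if (index < 0 || index >= SKETCH_NUM_OBJECTS)
		return -1;

	return index;
}
```

With these assumed values, any offset inside the first two pages maps to
-1, and the first valid object starts at offset 2 * SKETCH_PAGE_SIZE.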

> > + if (index < 0 || index >= CONFIG_KFENCE_NUM_OBJECTS)
> > + return NULL;
> > +
> > + return &kfence_metadata[index];
> > +}
> > +
> > +static inline unsigned long metadata_to_pageaddr(const struct kfence_metadata *meta)
> > +{
> > + unsigned long offset = ((meta - kfence_metadata) + 1) * PAGE_SIZE * 2;
>
> The innermost parentheses seem unnecessary.

Will fix.
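
As a rough illustration of the reverse direction, here is a user-space
sketch of the metadata-to-offset calculation without the inner parentheses.
The struct, array, and constants are hypothetical stand-ins, not the patch's
definitions. Since '-' and '+' associate left to right, the pointer
difference is still evaluated before the '+ 1'.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE   4096UL  /* assumed page size */
#define SKETCH_NUM_OBJECTS 4       /* stand-in for CONFIG_KFENCE_NUM_OBJECTS */

/* Hypothetical stand-in for struct kfence_metadata. */
struct sketch_metadata {
	unsigned long addr;
};

static struct sketch_metadata sketch_meta[SKETCH_NUM_OBJECTS];

/* Byte offset into the pool of the page that belongs to @meta. */
static unsigned long sketch_meta_to_offset(const struct sketch_metadata *meta)
{
	/* Parsed as ((meta - sketch_meta) + 1) * SKETCH_PAGE_SIZE * 2. */
	return (unsigned long)(meta - sketch_meta + 1) * SKETCH_PAGE_SIZE * 2;
}
```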

> > + unsigned long pageaddr = (unsigned long)&__kfence_pool[offset];
> > +
> > + /* The checks do not affect performance; only called from slow-paths. */
> > +
> > + /* Only call with a pointer into kfence_metadata. */
> > + if (KFENCE_WARN_ON(meta < kfence_metadata ||
> > + meta >= kfence_metadata + ARRAY_SIZE(kfence_metadata)))
>
> Is there a reason to use ARRAY_SIZE(kfence_metadata) instead of
> CONFIG_KFENCE_NUM_OBJECTS?

They're equivalent. We can switch it, although I don't see one being
superior to the other (maybe we save on compile time?).
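
To illustrate the equivalence, a small user-space sketch, assuming the
array is declared with the object count as its size. All names here are
hypothetical, and the ARRAY_SIZE stand-in is simplified (the kernel macro
additionally rejects non-array arguments). The compile-time assertion
checks that both spellings give the same bound.

```c
#include <stddef.h>

#define SKETCH_NUM_OBJECTS 255  /* stand-in for CONFIG_KFENCE_NUM_OBJECTS */

/* Simplified stand-in for the kernel's ARRAY_SIZE(). */
#define SKETCH_ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical stand-in for struct kfence_metadata. */
struct sketch_metadata {
	unsigned long addr;
};

static struct sketch_metadata sketch_meta[SKETCH_NUM_OBJECTS];

/* Both bounds are compile-time constants with the same value. */
_Static_assert(SKETCH_ARRAY_SIZE(sketch_meta) == SKETCH_NUM_OBJECTS,
	       "array size and object count must agree");

/* Returns 1 if @meta points into sketch_meta, 0 otherwise. */
static int sketch_meta_in_bounds(const struct sketch_metadata *meta)
{
	return meta >= sketch_meta &&
	       meta < sketch_meta + SKETCH_ARRAY_SIZE(sketch_meta);
}
```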

> > + return 0;
> > +
> > + /*
> > + * This metadata object only ever maps to 1 page; verify the calculation
> > + * happens and that the stored address was not corrupted.
> > + */
> > + if (KFENCE_WARN_ON(ALIGN_DOWN(meta->addr, PAGE_SIZE) != pageaddr))
> > + return 0;
> > +
> > + return pageaddr;
> > +}
> [...]
> > +void __init kfence_init(void)
> > +{
> > + /* Setting kfence_sample_interval to 0 on boot disables KFENCE. */
> > + if (!kfence_sample_interval)
> > + return;
> > +
> > + if (!kfence_initialize_pool()) {
> > + pr_err("%s failed\n", __func__);
> > + return;
> > + }
> > +
> > + schedule_delayed_work(&kfence_timer, 0);
> > + WRITE_ONCE(kfence_enabled, true);
> > + pr_info("initialized - using %zu bytes for %d objects", KFENCE_POOL_SIZE,
> > + CONFIG_KFENCE_NUM_OBJECTS);
> > + if (IS_ENABLED(CONFIG_DEBUG_KERNEL))
> > + pr_cont(" at 0x%px-0x%px\n", (void *)__kfence_pool,
> > + (void *)(__kfence_pool + KFENCE_POOL_SIZE));
>
> Why don't you use PTR_FMT that is defined in 'kfence.h'?

It's unnecessary, since all this is conditional on
IS_ENABLED(CONFIG_DEBUG_KERNEL) and we can just avoid the indirection
through PTR_FMT.

> > + else
> > + pr_cont("\n");
> > +}
> [...]
> > diff --git a/mm/kfence/kfence.h b/mm/kfence/kfence.h
> > new file mode 100644
> > index 000000000000..25ce2c0dc092
> > --- /dev/null
> > +++ b/mm/kfence/kfence.h
> > @@ -0,0 +1,104 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef MM_KFENCE_KFENCE_H
> > +#define MM_KFENCE_KFENCE_H
> > +
> > +#include <linux/mm.h>
> > +#include <linux/slab.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/types.h>
> > +
> > +#include "../slab.h" /* for struct kmem_cache */
> > +
> > +/* For non-debug builds, avoid leaking kernel pointers into dmesg. */
> > +#ifdef CONFIG_DEBUG_KERNEL
> > +#define PTR_FMT "%px"
> > +#else
> > +#define PTR_FMT "%p"
> > +#endif
> > +
> > +/*
> > + * Get the canary byte pattern for @addr. Use a pattern that varies based on the
> > + * lower 3 bits of the address, to detect memory corruptions with higher
> > + * probability, where similar constants are used.
> > + */
> > +#define KFENCE_CANARY_PATTERN(addr) ((u8)0xaa ^ (u8)((unsigned long)addr & 0x7))
> > +
> > +/* Maximum stack depth for reports. */
> > +#define KFENCE_STACK_DEPTH 64
> > +
> > +/* KFENCE object states. */
> > +enum kfence_object_state {
> > + KFENCE_OBJECT_UNUSED, /* Object is unused. */
> > + KFENCE_OBJECT_ALLOCATED, /* Object is currently allocated. */
> > + KFENCE_OBJECT_FREED, /* Object was allocated, and then freed. */
>
> Aligning the comments would look better (same for the comments below).

Will fix.

> > +};
> [...]
> > diff --git a/mm/kfence/report.c b/mm/kfence/report.c
> > new file mode 100644
> > index 000000000000..8c28200e7433
> > --- /dev/null
> > +++ b/mm/kfence/report.c
> > @@ -0,0 +1,201 @@
> > +// SPDX-License-Identifier: GPL-2.0
> [...]
> > +/* Get the number of stack entries to skip get out of MM internals. */
> > +static int get_stack_skipnr(const unsigned long stack_entries[], int num_entries,
> > + enum kfence_error_type type)
> > +{
> > + char buf[64];
> > + int skipnr, fallback = 0;
> > +
> > + for (skipnr = 0; skipnr < num_entries; skipnr++) {
> > + int len = scnprintf(buf, sizeof(buf), "%ps", (void *)stack_entries[skipnr]);
> > +
> > + /* Depending on error type, find different stack entries. */
> > + switch (type) {
> > + case KFENCE_ERROR_UAF:
> > + case KFENCE_ERROR_OOB:
> > + case KFENCE_ERROR_INVALID:
> > + if (!strncmp(buf, KFENCE_SKIP_ARCH_FAULT_HANDLER, len))
>
> It seems KFENCE_SKIP_ARCH_FAULT_HANDLER is not defined yet?

Correct, it'll be defined in <asm/kfence.h> in the x86 and arm64
patches. Leaving this is fine, since no architecture has selected
HAVE_ARCH_KFENCE in this patch yet; as a result, we also can't break the
build even if this is undefined.

Thanks,
-- Marco