Re: [PATCH v3 03/10] arm64, kfence: enable KFENCE for ARM64

From: Mark Rutland
Date: Thu Oct 08 2020 - 06:45:14 EST


On Thu, Oct 08, 2020 at 11:40:52AM +0200, Marco Elver wrote:
> On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> [...]
> > > > If you need virt_to_page() to work, the address has to be part of the
> > > > linear/direct map.
> [...]
> >
> > What's the underlying requirement here? Is this a performance concern,
> > codegen/codesize, or something else?
>
> It used to be performance, since is_kfence_address() is used in the
> fast path. However, with some further tweaks we just did to
> is_kfence_address(), our benchmarks show a pointer load can be
> tolerated.

Great!

I reckon that this is something we can optimize in future if necessary
(e.g. with some form of code-patching for immediate values), but it's
good to have a starting point that works everywhere!
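
For anyone following along, here is a rough userspace sketch of what a
range check against a dynamically allocated pool can look like. It is
not the actual is_kfence_address() implementation: sketch_pool,
SKETCH_POOL_SIZE and sketch_is_pool_address() are made-up names, and the
pool size is an arbitrary value picked for illustration. The point is
only that the base is read from a variable (the "pointer load") rather
than being a link-time constant.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed pool size, for illustration only (512 pages of 4 KiB). */
#define SKETCH_POOL_SIZE ((uintptr_t)512 * 4096)

/* Hypothetical stand-in for the pool base; NULL until the pool is allocated. */
char *sketch_pool;

/*
 * Returns true if addr lies inside [sketch_pool, sketch_pool + SKETCH_POOL_SIZE).
 * The subtraction is done on unsigned integers, so an address below the
 * pool wraps around to a huge offset and fails the single compare.
 */
bool sketch_is_pool_address(const void *addr)
{
	uintptr_t off = (uintptr_t)addr - (uintptr_t)sketch_pool;

	/* The NULL test keeps low addresses from matching before setup. */
	return off < SKETCH_POOL_SIZE && sketch_pool != NULL;
}
```

With a static pool the base would be an immediate the compiler can fold
into the comparison; with a dynamic pool it costs one extra load, which
is the case code-patching could later optimize.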

[...]

> > I'm not too worried about allocating this dynamically, but:
> >
> > * The arch code needs to set up the translation tables for this, as we
> > cannot safely change the mapping granularity live.
> >
> > * As above I'm fairly certain x86 needs to use a carveout from the
> > linear map to function correctly anyhow, so we should follow the same
> > approach for both arm64 and x86. That might be a static carveout that
> > we figure out the aliasing for, or something entirely dynamic.
>
> We're going with dynamically allocating the pool (for both x86 and
> arm64), since any benefits we used to measure from the static pool are
> no longer measurable (after removing a branch from
> is_kfence_address()). It should hopefully simplify a lot of things,
> given all the caveats that you pointed out.
>
> For arm64, the only thing left then is to fix up the case if the
> linear map is not forced to page granularity.

The simplest way to do this is to modify arm64's arch_add_memory() to
force the entire linear map to be mapped at page granularity when KFENCE
is enabled, something like:

| diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
| index 936c4762dadff..f6eba0642a4a3 100644
| --- a/arch/arm64/mm/mmu.c
| +++ b/arch/arm64/mm/mmu.c
| @@ -1454,7 +1454,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
|  {
|         int ret, flags = 0;
|  
| -       if (rodata_full || debug_pagealloc_enabled())
| +       if (rodata_full || debug_pagealloc_enabled() ||
| +           IS_ENABLED(CONFIG_KFENCE))
|                 flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
|  
|         __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),

... and given that RODATA_FULL_DEFAULT_ENABLED is the default, I
suspect it's not worth trying to only do that for the KFENCE region
unless someone complains.
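
To illustrate why the granularity matters, here is a small userspace
sketch. It assumes KFENCE protects an individual page by invalidating
the translation entry that covers it, and it assumes 4 KiB pages with
2 MiB block mappings. The names (sketch_range, sketch_unmapped_range()
and the SKETCH_* constants) are hypothetical and exist only for this
example.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE  UINT64_C(0x1000)    /* 4 KiB page (assumed) */
#define SKETCH_BLOCK_SIZE UINT64_C(0x200000)  /* 2 MiB block mapping (assumed) */

/* Half-open address range [start, end). */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Range that becomes inaccessible when the single translation entry
 * covering addr is invalidated. granule must be a power of two.
 */
struct sketch_range sketch_unmapped_range(uint64_t addr, uint64_t granule)
{
	struct sketch_range r;

	r.start = addr & ~(granule - 1);
	r.end = r.start + granule;
	return r;
}
```

For an address such as 0x40201000, a page-granular mapping loses only
[0x40201000, 0x40202000), whereas a block mapping would take out the
whole of [0x40200000, 0x40400000), including unrelated memory.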

Thanks,
Mark.