Re: [PATCH v3 03/10] arm64, kfence: enable KFENCE for ARM64

From: Marco Elver
Date: Wed Oct 14 2020 - 15:12:58 EST


On Thu, 8 Oct 2020 at 12:45, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> On Thu, Oct 08, 2020 at 11:40:52AM +0200, Marco Elver wrote:
> > On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > [...]
> > > > > If you need virt_to_page() to work, the address has to be part of the
> > > > > linear/direct map.
> > [...]
> > >
> > > What's the underlying requirement here? Is this a performance concern,
> > > codegen/codesize, or something else?
> >
> > It used to be performance, since is_kfence_address() is used in the
> > fast path. However, with some further tweaks we just did to
> > is_kfence_address(), our benchmarks show a pointer load can be
> > tolerated.
>
> Great!
>
> I reckon that this is something we can optimize in futue if necessary
> (e.g. with some form of code-patching for immediate values), but it's
> good to have a starting point that works everywhere!
>
> [...]
>
> > > I'm not too worried about allocating this dynamically, but:
> > >
> > > * The arch code needs to set up the translation tables for this, as we
> > > cannot safely change the mapping granularity live.
> > >
> > > * As above I'm fairly certain x86 needs to use a carevout from the
> > > linear map to function correctly anyhow, so we should follow the same
> > > approach for both arm64 and x86. That might be a static carevout that
> > > we figure out the aliasing for, or something entirely dynamic.
> >
> > We're going with dynamically allocating the pool (for both x86 and
> > arm64), since any benefits we used to measure from the static pool are
> > no longer measurable (after removing a branch from
> > is_kfence_address()). It should hopefully simplify a lot of things,
> > given all the caveats that you pointed out.
> >
> > For arm64, the only thing left then is to fix up the case if the
> > linear map is not forced to page granularity.
>
> The simplest way to do this is to modify arm64's arch_add_memory() to
> force the entire linear map to be mapped at page granularity when KFENCE
> is enabled, something like:
>
[...]
>
> ... and I given that RODATA_FULL_DEFAULT_ENABLED is the default, I
> suspect it's not worth trying to only for that for the KFENCE region
> unless someone complains.

We've got most of this sorted now for v5 -- thank you!

The only thing we're wondering now, is if there are any corner cases
with using memblock_alloc'd memory for the KFENCE pool? (We'd like to
avoid page alloc's MAX_ORDER limit.) We have a version that passes
tests on x86 and arm64, but checking just in case. :-)

Thanks,
-- Marco