Re: [PATCH v2 5/5] arm64: entry: Enable random_kstack_offset support

From: Mark Rutland
Date: Mon Mar 30 2020 - 07:26:28 EST


On Thu, Mar 26, 2020 at 09:31:32AM -0700, Kees Cook wrote:
> On Thu, Mar 26, 2020 at 11:15:21AM +0000, Mark Rutland wrote:
> > On Wed, Mar 25, 2020 at 01:22:07PM -0700, Kees Cook wrote:
> > > On Wed, Mar 25, 2020 at 01:21:27PM +0000, Mark Rutland wrote:
> > > > On Tue, Mar 24, 2020 at 01:32:31PM -0700, Kees Cook wrote:
> > > > > Allow for a randomized stack offset on a per-syscall basis, with roughly
> > > > > 5 bits of entropy.
> > > > >
> > > > > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> > > >
> > > > Just to check, do you have an idea of the impact on arm64? Patch 3 had
> > > > figures for x86 where it reads the TSC, and it's unclear to me how
> > > > get_random_int() compares to that.
> > >
> > > I didn't do a measurement on arm64 since I don't have a good bare-metal
> > > test environment. I know Andy Lutomirki has plans for making
> > > get_random_get() as fast as possible, so that's why I used it here.
> >
> > Ok. I suspect I also won't get the chance to test that in the next few
> > days, but if I do I'll try to share the results.
>
> Okay, thanks! I can try a rough estimate under emulation, but I assume
> that'll be mostly useless. :)
>
> > My concern here was that, get_random_int() has to grab a spinlock and
> > mess with IRQ masking, so has the potential to block for much longer,
> > but that might not be an issue in practice, and I don't think that
> > should block these patches.
>
> Gotcha. I was already surprised by how "heavy" the per-cpu access was
> when I looked at the resulting assembly (there looked to be preempt
> stuff, etc). But my hope was that this is configurable so people can
> measure for themselves if they want it, and most people who want this
> feature have a high tolerance for performance trade-offs. ;)
>
> > > I couldn't figure out if there was a comparable instruction like rdtsc
> > > in aarch64 (it seems there's a cycle counter, but I found nothing in
> > > the kernel that seemed to actually use it)?
> >
> > AArch64 doesn't have a direct equivalent. The generic counter
> > (CNTxCT_EL0) is the closest thing, but its nominal frequency is
> > typically much lower than the nominal CPU clock frequency (unlike TSC
> > where they're the same). The cycle counter (PMCCNTR_EL0) is part of the
> > PMU, and can't be relied on in the same way (e.g. as perf reprograms it
> > to generate overflow events, and it can stop for things like WFI/WFE).
>
> Okay, cool; thanks for the details! It's always nice to confirm I didn't
> miss some glaringly obvious solution. ;)
>
> For a potential v2, should I add your reviewed-by or wait for your
> timing analysis, etc?

I'd rather not give an R-b until I've seen numbers, but please don't
block waiting for that. For the moment, feel free to add:

Acked-by: Mark Rutland <mark.rutland@xxxxxxx>

... and it's down to Will and Catalin to make the call for arm64.

Thanks,
Mark.