Re: [kernel-hardening] Re: [RFC PATCH 6/6] arm64: add VMAP_STACK and detect out-of-bounds SP

From: Robin Murphy
Date: Fri Jul 14 2017 - 11:04:07 EST


On 14/07/17 15:39, Robin Murphy wrote:
> On 14/07/17 15:06, Mark Rutland wrote:
>> On Fri, Jul 14, 2017 at 01:27:14PM +0100, Ard Biesheuvel wrote:
>>> On 14 July 2017 at 11:48, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
>>>> On 14 July 2017 at 11:32, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>>>>> On Thu, Jul 13, 2017 at 07:28:48PM +0100, Ard Biesheuvel wrote:
>>
>>>>>> OK, so here's a crazy idea: what if we
>>>>>> a) carve out a dedicated range in the VMALLOC area for stacks
>>>>>> b) for each stack, allocate a naturally aligned window of 2x the stack
>>>>>> size, and map the stack inside it, leaving the remaining space
>>>>>> unmapped
>>
>>>>> The logical ops (TST) and conditional branches (TB(N)Z, CB(N)Z) operate
>>>>> on XZR rather than SP, so to do this we need to get the SP value into a
>>>>> GPR.
>>>>>
>>>>> Previously, I assumed this meant we needed to corrupt a GPR (and hence
>>>>> stash that GPR in a sysreg), so I started writing code to free sysregs.
>>>>>
>>>>> However, I now realise I was being thick, since we can stash the GPR
>>>>> in the SP:
>>>>>
>>>>> sub sp, sp, x0 // sp = orig_sp - x0
>>>>> add x0, sp, x0 // x0 = x0 - (orig_sp - x0) == orig_sp
>>
>> That comment is off, and should say x0 = x0 + (orig_sp - x0) == orig_sp
>>
>>>>> sub x0, x0, #S_FRAME_SIZE
>>>>> tb(nz) x0, #THREAD_SHIFT, overflow
>>>>> add x0, x0, #S_FRAME_SIZE
>>>>> sub x0, sp, x0
>>>
>>> You need a neg x0, x0 here I think
>>
>> Oh, whoops. I'd mis-simplified things.
>>
>> We can avoid that by storing orig_sp + orig_x0 in sp:
>>
>> add sp, sp, x0 // sp = orig_sp + orig_x0
>> sub x0, sp, x0 // x0 = orig_sp
>> < check >
>> sub x0, sp, x0 // x0 = orig_x0
>
> Haven't you now forcibly cleared the top bit of x0 thanks to overflow?

...or maybe not. I still can't quite see it, but I suppose it must
cancel out somewhere, since Mr. Helpful C Program[1] has apparently
proven me mistaken :(

I guess that means I approve!

Robin.

[1]:
#include <assert.h>
#include <stdint.h>

int main(void) {
	for (int i = 0; i < 256; i++) {
		for (int j = 0; j < 256; j++) {
			uint8_t x = i;		/* x0 */
			uint8_t y = j;		/* sp */
			y = y + x;		/* add sp, sp, x0 */
			x = y - x;		/* sub x0, sp, x0 */
			x = y - x;		/* sub x0, sp, x0 */
			y = y - x;		/* sub sp, sp, x0 */
			assert(x == i && y == j);
		}
	}
}
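
Thinking about it a bit more, I believe it cancels out simply because
add/sub wrap modulo 2^64: whatever carry the add throws away, the
matching sub borrows back. Here is a rough sketch of the same sequence at
full register width, split at the point where the check would go. The
struct and the helper names are made up for illustration and are not
from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two registers involved. */
struct regs {
	uint64_t sp;
	uint64_t x0;
};

/* Before the check: stash x0 in sp, leaving orig_sp in x0. */
static void stash(struct regs *r)
{
	r->sp = r->sp + r->x0;	/* add sp, sp, x0: orig_sp + orig_x0 (mod 2^64) */
	r->x0 = r->sp - r->x0;	/* sub x0, sp, x0: orig_sp */
}

/* After the check: recover orig_x0, then orig_sp. */
static void unstash(struct regs *r)
{
	r->x0 = r->sp - r->x0;	/* sub x0, sp, x0: orig_x0 */
	r->sp = r->sp - r->x0;	/* sub sp, sp, x0: orig_sp */
}

/* Returns 1 if both registers survive the round trip unchanged. */
int roundtrip_ok(uint64_t sp, uint64_t x0)
{
	struct regs r = { sp, x0 };

	stash(&r);
	assert(r.x0 == sp);	/* the check would see the original sp here */
	unstash(&r);
	return r.sp == sp && r.x0 == x0;
}
```

Unsigned arithmetic in C is defined to wrap, so this models the
registers faithfully even when orig_sp + orig_x0 overflows.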

>> sub sp, sp, x0 // sp = orig_sp
>>
>> ... which works in a locally-built kernel where I've aligned all the
>> stacks.
>>
>>> ... only, this requires a dedicated stack region, and so we'd need to
>>> check whether sp is inside that window as well.
>>>
>>> The easiest way would be to use a window whose start address is base2
>>> aligned, but that means the beginning of the kernel VA range (where
>>> KASAN currently lives, and cannot be moved afaik), or a window at the
>>> top of the linear region. Neither looks very appealing.
>>>
>>> So that means arbitrary low and high limits to compare against in this
>>> entry path. That means more GPRs I'm afraid.
>>
>> Could you elaborate on that? I'm not sure that I follow.
>>
>> My understanding was that the compromise with this approach is that we
>> only catch overflow/underflow within THREAD_SIZE of the stack, and can
>> get false-negatives elsewhere. Otherwise, IIUC this is sufficient.
>>
>> Are you after a more stringent check (like those from the two existing
>> proposals that caught all out-of-bounds accesses)?
>>
>> Or am I missing something else?
>>
>> Thanks,
>> Mark.
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>