Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core)

From: Kees Cook
Date: Tue Jun 21 2016 - 14:12:38 EST


On Tue, Jun 21, 2016 at 10:27 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Tue, Jun 21, 2016 at 10:16 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> On Tue, Jun 21, 2016 at 9:45 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>>
>>> So I'm leaning toward fewer cache entries per cpu, maybe just one.
>>> I'm all for making it a bit faster, but I think we should weigh that
>>> against increasing memory usage too much and thus scaring away the
>>> embedded folks.
>>
>> I don't think the embedded folks will be scared by a per-cpu cache, if
>> it's just one or two entries. And I really do think that even just
>> one or two entries will indeed catch a lot of the cases.
>>
>> And yes, fork+execve() is too damn expensive in page table build-up
>> and tear-down. I'm not sure why bash doesn't do vfork+exec for when it
>> has to wait for the process anyway, but it doesn't seem to do that.
>>
>
> I don't know about bash, but glibc very recently fixed a long-standing
> bug in posix_spawn and started using clone() in a sensible manner for
> this.
>
> FWIW, it may be a while before this can be enabled in distro kernels.
> There are some code paths (*cough* crypto users *cough*) that think
> that calling sg_init_one with a stack address is a reasonable thing to
> do, and it doesn't work with a vmalloced stack. grsecurity works

... O_o ...

Why does it not work on a vmalloced stack??

> around this by using a real lowmem higher-order stack, aliasing it
> into vmalloc space, and arranging for virt_to_phys to backtrack the
> alias, but eww. I think I'd rather find and fix the bugs, assuming
> they're straightforward.

Yeah. That's ugly.

-Kees

--
Kees Cook
Chrome OS & Brillo Security