Re: Making KASAN compatible with VMAP_STACK

From: Dmitry Vyukov
Date: Mon Jul 23 2018 - 07:56:13 EST


On Mon, Jul 23, 2018 at 1:18 PM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> On Mon, Jul 23, 2018 at 09:40:36AM +0200, Dmitry Vyukov wrote:
>> On Sun, Jul 22, 2018 at 7:52 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>> > Hi all-
>> >
>> > It would be really nice to make KASAN compatible with VMAP_STACK.
>> > Both are valuable memory debugging features, and the fact that you
>> > can't use both is disappointing.
>> >
>> > As far as I know, there are only two problems:
>> >
>> > 1. The KASAN shadow population code is a mess, and adding *anything*
>> > to the KASAN shadow requires magical, fragile incantations. It should
>> > be cleaned up so that ranges can be easily populated without needing
>> > to very carefully align things, call helpers in the right order, etc.
>> > The core KASAN code should figure it out by itself.
>> >
>> > 2. The vmalloc area is potentially extremely large. It might be
>> > necessary to have a way to *depopulate* shadow space when stacks get
>> > freed or, more generally, when vmap areas are freed. Ideally KASAN
>> > would integrate with the core vmalloc/vmap code and it would Just Work
>> > (tm). And, as a bonus, we'd get proper KASAN protection of vmalloced
>> > memory.
>> >
>> > Any volunteers to fix this?
>>
>> Hi Andy,
>>
>> I understand that having most configs as orthogonal settings that can
>> be enabled independently is generally good in itself, but I would
>> like to understand what VMAP_STACK adds on top of KASAN in terms
>> of debugging capabilities?
>
> VMAP_STACK makes it possible to detect stack overflows reliably at the
> point of overflow.
>
> KASAN can't handle this reliably, even if it detects that an access is
> out of the stack bounds, since handling this requires stack space.
> Depending on a number of factors, this may be reported, might result in
> recursive exceptions, etc.

Interesting. Does VMAP_STACK detect task_struct smashing today? As far
as I remember, the first version didn't.
As an orthogonal measure we could add a KASAN redzone between the stack
and task_struct, and make KASAN instrumentation detect when a new frame
hits this redzone. We bump the stack order significantly under KASAN, so
adding, say, a 128-byte redzone should not be a problem. Does that make
any sense?
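
To make the redzone idea concrete, here is a rough userspace sketch, not
kernel code. It models a downward-growing stack with one shadow byte per
8 bytes of memory (the usual KASAN shadow scale) and poisons the lowest
128 bytes. All toy_* names, the 4096-byte stack size and the 0xF5 poison
value are made up for illustration; a real implementation would hook
into the compiler-emitted stack instrumentation instead of an explicit
check like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADOW_SCALE_SHIFT 3                     /* 1 shadow byte per 8 bytes */
#define SHADOW_GRANULE     (1u << SHADOW_SCALE_SHIFT)
#define TOY_STACK_SIZE     4096u                 /* hypothetical stack size */
#define TOY_REDZONE_SIZE   128u                  /* redzone size from the mail */
#define TOY_REDZONE_MAGIC  0xF5                  /* hypothetical poison value */

/* Shadow for a toy stack; offset 0 is the lowest address of the stack. */
struct toy_stack {
	uint8_t shadow[TOY_STACK_SIZE / SHADOW_GRANULE];
};

/* Mark the whole stack accessible, then poison the lowest 128 bytes. */
void toy_stack_init(struct toy_stack *s)
{
	assert(TOY_REDZONE_SIZE % SHADOW_GRANULE == 0);
	memset(s->shadow, 0, sizeof(s->shadow));
	memset(s->shadow, TOY_REDZONE_MAGIC,
	       TOY_REDZONE_SIZE / SHADOW_GRANULE);
}

/*
 * sp_off is the current stack pointer as an offset from the lowest stack
 * address. Returns true if a new frame of frame_size bytes can be pushed
 * without reaching into the redzone or running off the bottom of the stack.
 */
bool toy_frame_ok(const struct toy_stack *s, size_t sp_off, size_t frame_size)
{
	size_t new_sp;

	if (sp_off > TOY_STACK_SIZE || frame_size == 0 || frame_size > sp_off)
		return false;

	new_sp = sp_off - frame_size;
	/* The lowest byte of the new frame decides: is its shadow poisoned? */
	return s->shadow[new_sp >> SHADOW_SCALE_SHIFT] == 0;
}
```

With sp_off = 256, a 128-byte frame ends exactly at the redzone boundary
and is accepted, while a 129-byte frame dips one byte into the redzone
and is rejected. A frame large enough to jump over the redzone entirely
is caught by the bounds check rather than by the shadow lookup.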