Re: [PATCH v1 00/11] mm/kasan: support per-page shadow memory to reduce memory consumption

From: Andrey Ryabinin
Date: Tue May 30 2017 - 10:15:10 EST


On 05/29/2017 06:29 PM, Dmitry Vyukov wrote:
> Joonsoo,
>
> I guess mine (and Andrey's) main concern is the amount of additional
> complexity (I am still struggling to understand how it all works) and
> more arch-dependent code in exchange for moderate memory win.
>
> Joonsoo, Andrey,
>
> I have an alternative proposal. It should be conceptually simpler and
> also less arch-dependent. But I don't know if I miss something
> important that will render it non working.
> Namely, we add a pointer to shadow to the page struct. Then, create a
> slab allocator for 512B shadow blocks. Then, attach/detach these
> shadow blocks to page structs as necessary. It should lead to even
> smaller memory consumption because we won't need a whole shadow page
> when only 1 out of 8 corresponding kernel pages are used (we will need
> just a single 512B block). I guess with some fragmentation we need
> lots of excessive shadow with the current proposed patch.
> This does not depend on TLB in any way and does not require hooking
> into buddy allocator.
> The main downside is that we will need to be careful to not assume
> that shadow is continuous. In particular this means that this mode
> will work only with outline instrumentation and will need some ifdefs.
> Also it will be slower due to the additional indirection when
> accessing shadow, but that's meant as "small but slow" mode as far as
> I understand.

You seem to be forgetting about stack instrumentation.
You would have to disable it completely, at least with its current implementation in gcc.
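
To make the difference concrete, here is a rough userspace sketch of the two
lookups. All names (page_sketch, shadow_direct, shadow_per_page) are made up
for illustration and are not taken from the patch set or from the kernel; the
flat pages[] array indexed by page frame number is also a simplifying
assumption. As far as I understand, the compiler-emitted stack poisoning uses
the direct formula, so it has no place to follow a per-page pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT          12
#define PAGE_SIZE           (1UL << PAGE_SHIFT)
#define SHADOW_SCALE_SHIFT  3   /* one shadow byte per 8 bytes of memory */
#define SHADOW_BLOCK_SIZE   (PAGE_SIZE >> SHADOW_SCALE_SHIFT)   /* 512 bytes */

/* Hypothetical stand-in for struct page carrying a shadow pointer. */
struct page_sketch {
	uint8_t *shadow;	/* NULL while no shadow block is attached */
};

/* Contiguous shadow: a shift and an add, no memory lookup needed. */
static inline uint8_t *shadow_direct(uintptr_t addr, uintptr_t shadow_offset)
{
	return (uint8_t *)((addr >> SHADOW_SCALE_SHIFT) + shadow_offset);
}

/* Per-page shadow: find the page first, then index into its 512B block. */
static inline uint8_t *shadow_per_page(struct page_sketch *pages, uintptr_t addr)
{
	struct page_sketch *page;

	assert(pages != NULL);
	page = &pages[addr >> PAGE_SHIFT];	/* assumed flat pfn-indexed array */
	if (!page->shadow)
		return NULL;			/* no shadow attached to this page */
	return page->shadow + ((addr & (PAGE_SIZE - 1)) >> SHADOW_SCALE_SHIFT);
}
```

The second function needs an extra dependent load and a NULL check on every
access, which is why it only fits outline instrumentation.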

> But the main win as I see it is that that's basically complete support
> for 32-bit arches. People do ask about arm32 support:
> https://groups.google.com/d/msg/kasan-dev/Sk6BsSPMRRc/Gqh4oD_wAAAJ
> https://groups.google.com/d/msg/kasan-dev/B22vOFp-QWg/EVJPbrsgAgAJ
> and probably mips32 is relevant as well.

I don't see how the above is relevant to 32-bit arches. The current design
is perfectly fine for them. I did a POC arm32 port a couple of years
ago - https://github.com/aryabinin/linux/commits/kasan/arm_v0_1
It has some ugly hacks and non-critical bugs. AFAIR it is also super-slow because I (mistakenly)
made shadow memory uncached. But otherwise it works.

> Such mode does not require a huge continuous address space range, has
> minimal memory consumption and requires minimal arch-dependent code.
> Works only with outline instrumentation, but I think that's a
> reasonable compromise.
>
> What do you think?

I don't understand why we are trying to invent hacky/complex schemes when we already have
a simple one - scaling shadow to 1/32. It's easy to implement and should perform better than
the suggested schemes.
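
For reference, the arithmetic behind the 1/32 scaling can be sketched as
below. This is a userspace illustration only; the macro and function names
are hypothetical, and it says nothing about how the shadow byte values would
be encoded at the coarser granularity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the kernel's definitions. */
#define SHADOW_SCALE_SHIFT_1_8   3	/* one shadow byte per 8 bytes */
#define SHADOW_SCALE_SHIFT_1_32  5	/* one shadow byte per 32 bytes */

/* Map an address into a single contiguous shadow region. */
static inline uintptr_t mem_to_shadow(uintptr_t addr, unsigned int shift,
				      uintptr_t shadow_offset)
{
	assert(shift < 8 * sizeof(uintptr_t));
	return (addr >> shift) + shadow_offset;
}

/* Shadow bytes needed to cover mem_bytes of memory at the given scale. */
static inline uint64_t shadow_size(uint64_t mem_bytes, unsigned int shift)
{
	return mem_bytes >> shift;
}
```

With these definitions, 1 GiB of memory needs 128 MiB of shadow at 1/8 and
32 MiB at 1/32, while the lookup stays a single shift and add.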