Re: [PATCH] x86/mm/mem_encrypt: fix a crash with kmemleak_scan

From: Catalin Marinas
Date: Tue Apr 23 2019 - 10:03:03 EST


Hi Boris,

On Tue, Apr 23, 2019 at 03:25:18PM +0200, Borislav Petkov wrote:
> On Thu, Apr 18, 2019 at 10:50:15AM +0100, Catalin Marinas wrote:
> > Kmemleak is basically a tri-colour marking tracing garbage collector [1]
>
> Thanks for that - interesting read.
>
> > but without automatic freeing. It relies on a graph of references
> > (pointers) between various objects and the root of such graph is the
> > .bss/.data.

Sorry for the misleading information here, the root of the graph was
changed recently (see below).

> > free_init_pages() is called on memory regions outside .bss/.data like
> > smp_locks, initrd, kernel init text which kmemleak does not track
> > anyway. That said, kmemleak_free_part() is tolerant to unknown pointers,
> > so we could call it directly from free_init_pages().
>
> Ok, lemme think out loud for a bit here: kmemleak_scan() goes over
> an object list and I guess in our particular case, the memory which
> got freed in mem_encrypt_free_decrypted_mem() *was* in that list too,
> leading to the crash.

Yes.

> Looking at the splat, it is in scan_gray_list() which would mean that
> the object we freed was reachable from the root(s) in .bss.

The .bss/.data used to be the root until recently, when commit
298a32b13208 ("kmemleak: powerpc: skip scanning holes in the .bss
section") changed this to accommodate a similar problem on powerpc.
With this commit, .bss/.data are traced objects but painted "grey" by
default so that they are always scanned, pretty much like the root
(and they can't "leak").
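
To illustrate the "grey by default" idea, here is a rough userspace
sketch of textbook tri-colour marking. It is a toy model, not
kmemleak's code: struct object, the always_grey flag, MAX_REFS and
mark() are all made up for this example, and kmemleak's real colour
bookkeeping is different.

```c
/* Toy userspace model of tri-colour marking; NOT kmemleak's code. */
#include <assert.h>
#include <stddef.h>

enum colour { WHITE, GREY, BLACK };

#define MAX_REFS 4	/* hypothetical limit, for the sketch only */

struct object {
	enum colour colour;
	int always_grey;		/* hypothetical: scanned unconditionally, like .bss/.data */
	struct object *refs[MAX_REFS];	/* pointers found when scanning this object */
};

/* Default-grey objects start grey, everything else starts white. */
static void reset_colours(struct object *objs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		objs[i].colour = objs[i].always_grey ? GREY : WHITE;
}

/*
 * Scan grey objects until none are left. Anything a grey object
 * points to becomes grey; a scanned object becomes black.
 * Returns the number of objects still white (unreferenced).
 */
static size_t mark(struct object *objs, size_t n)
{
	size_t white = 0;
	int progress = 1;

	assert(objs != NULL);
	reset_colours(objs, n);

	while (progress) {
		progress = 0;
		for (size_t i = 0; i < n; i++) {
			if (objs[i].colour != GREY)
				continue;
			for (size_t r = 0; r < MAX_REFS; r++) {
				struct object *t = objs[i].refs[r];

				if (t && t->colour == WHITE)
					t->colour = GREY;
			}
			objs[i].colour = BLACK;
			progress = 1;
		}
	}

	for (size_t i = 0; i < n; i++)
		if (objs[i].colour == WHITE)
			white++;
	return white;
}
```

In this model an always_grey object can never be counted as white,
which is the sense in which .bss/.data "can't leak" even though they
are ordinary traced objects rather than roots.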

In Qian's splat, the unmapped area was actually in the .bss which is now
a traced object (no longer a root one). In his previous report on
powerpc [1], the splat was in scan_large_block().

> Now, the docs say:
>
> "The memory allocations via :c:func:`kmalloc`, :c:func:`vmalloc`,
> :c:func:`kmem_cache_alloc` and
> friends are traced and the pointers, together with additional
> information like size and stack trace, are stored in a rbtree."
>
> So I guess free_init_pages() should be somehow telling kmemleak, "hey,
> just freed that object, pls adjust your tracking lists" no?
>
> Because, otherwise, if we start sprinkling those kmemleak_free_part()
> calls everywhere, that'll quickly turn into a game of whack-a-mole. And
> we don't need that especially if kmemleak can easily be taught to handle
> such cases.

Object freeing is tracked in general via the corresponding kfree(),
vfree() etc. and doesn't need special handling. The .bss doesn't have
this alloc/free symmetry, and we are not freeing all of it either,
hence this workaround to register it as a traced object and allow
partial freeing.

Anyway, I agree with you. As I mentioned in the previous email,
kmemleak_free_part() is tolerant to unknown objects (not tracked by
kmemleak), so I'm fine with calling it from free_init_pages() even if
not all address ranges passed to this function are known to kmemleak.
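
As a rough illustration of what "tolerant" means here, below is a toy
userspace model of partial freeing. It is only a sketch under my own
assumptions: track(), lookup(), model_free_part(), is_tracked() and
the fixed-size table are invented for the example and are not the
kernel's implementation (which keeps objects in an rbtree).

```c
/* Toy userspace model of partial freeing; NOT kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJS 16	/* hypothetical table size, for the sketch only */

struct range {
	uintptr_t start, end;	/* half-open interval [start, end) */
	int used;
};

static struct range tracked[MAX_OBJS];

/* Start tracking [start, start + size). Returns -1 if the table is full. */
static int track(uintptr_t start, size_t size)
{
	assert(size > 0);
	for (int i = 0; i < MAX_OBJS; i++) {
		if (!tracked[i].used) {
			tracked[i] = (struct range){ start, start + size, 1 };
			return 0;
		}
	}
	return -1;
}

/* Return the tracked range containing addr, or NULL if there is none. */
static struct range *lookup(uintptr_t addr)
{
	for (int i = 0; i < MAX_OBJS; i++)
		if (tracked[i].used && addr >= tracked[i].start &&
		    addr < tracked[i].end)
			return &tracked[i];
	return NULL;
}

static int is_tracked(uintptr_t addr)
{
	return lookup(addr) != NULL;
}

/*
 * Stop tracking [ptr, ptr + size). A pointer that is not inside any
 * tracked range is silently ignored; that is the "tolerant" part.
 * Whatever remains before and after the hole stays tracked.
 */
static void model_free_part(uintptr_t ptr, size_t size)
{
	struct range *r = lookup(ptr);
	uintptr_t old_end, end;

	if (!r)
		return;		/* unknown pointer: nothing to do */

	old_end = r->end;
	end = ptr + size;
	if (end > old_end)
		end = old_end;	/* clamp to the tracked object */

	if (ptr > r->start)
		r->end = ptr;	/* keep the head [start, ptr) */
	else
		r->used = 0;	/* hole starts at the beginning */

	if (end < old_end)
		track(end, old_end - end);	/* keep the tail; dropped if the toy table is full */
}
```

With this shape, a caller like free_init_pages() can pass every range
it frees without first checking whether the range was ever
registered: ranges outside any tracked object fall through the early
return.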

> > There is Documentation/dev-tools/kmemleak.rst briefly mentioning the
> > tracing garbage collector (although the Wikipedia link is no longer
> > valid, it should be replaced with [1]). It doesn't, however, state why
> > .bss/.data are special.
>
> The fact that they're special is important info, I'd say.

I've made a note to improve this when I get some time.

> > [1] https://en.wikipedia.org/wiki/Tracing_garbage_collection#Tri-color_marking
>
> is nice. While reading, it made me think how our discussion would go if
> we didn't have wikipedia. You'd probably say, go to the library and read
> this and that section in this and that book on tri-color marking. :-)

There are probably some academic papers published somewhere ;). But
Wikipedia makes things much easier (and free).

--
Catalin

[1] http://lkml.kernel.org/r/20190312191412.28656-1-cai@xxxxxx