Re: [git pull] vfs pile 1 (splice)

From: Christoph Lameter
Date: Mon Oct 10 2016 - 10:03:33 EST


On Sun, 9 Oct 2016, Linus Torvalds wrote:

> Hmm. When I enabled SLUB debugging, I also enabled DEBUG_PAGEALLOC,
> because "why not". But it turns out that that may have been a mistake,
> because it changes the very path that failed to no longer do that
> failing access (or rather, it does it as a "probe_kernel_read()",
> which traps and ignores the failure).

DEBUG_PAGEALLOC significantly changes the layout of objects, and thus the
failure may no longer trigger.

> I'll continue with *just* SLUB debugging on, but I thought it was
> interesting how enabling more memory access debugging actually ends up
> changing some really subtle code.

Debugging options for the memory allocators can change the memory
layout, which may cause the corruption to no longer happen or to no longer
happen the same way. I sure wish there were another way.
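
To make the layout point concrete, here is a rough user-space sketch. It
is a toy model, not the SLUB code: every toy_* name is made up for this
illustration, and it assumes the free pointer sits at offset 0 of a free
object and that a debug option adds padding (a red zone) after each
object. The same overrun that clobbers the neighbour's free pointer in
the plain layout lands harmlessly in the padding once it is enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy slab; NOT the kernel's struct kmem_cache. */
struct toy_slab {
	unsigned char mem[256];
	size_t object_size;	/* payload bytes usable by the caller */
	size_t stride;		/* distance between consecutive objects */
};

/* Lay out objects; 'redzone' models debug padding after each payload. */
static void toy_slab_init(struct toy_slab *s, size_t object_size,
			  size_t redzone)
{
	assert(object_size >= sizeof(void *));
	s->object_size = object_size;
	s->stride = object_size + redzone;
	assert(2 * s->stride <= sizeof(s->mem));
	memset(s->mem, 0, sizeof(s->mem));
}

static unsigned char *toy_object(struct toy_slab *s, size_t index)
{
	return s->mem + index * s->stride;
}

/* Assumption: a free object keeps its next-free pointer at offset 0. */
static void toy_set_freepointer(unsigned char *object, void *next)
{
	memcpy(object, &next, sizeof(next));
}

static void *toy_get_freepointer(const unsigned char *object)
{
	void *next;

	memcpy(&next, object, sizeof(next));
	return next;
}

/*
 * Simulate a buggy writer that overruns object 0 by 'overrun' bytes
 * while object 1 is free and holds a free pointer. Returns 1 if the
 * free pointer stored in object 1 was clobbered, 0 otherwise.
 */
static int toy_overrun_corrupts_freelist(size_t object_size,
					 size_t redzone, size_t overrun)
{
	static int sentinel;	/* stands in for the next free object */
	struct toy_slab s;
	unsigned char *victim;

	toy_slab_init(&s, object_size, redzone);
	assert(object_size + overrun <= sizeof(s.mem));

	victim = toy_object(&s, 1);
	toy_set_freepointer(victim, &sentinel);

	/* The bug: write past the end of object 0. */
	memset(toy_object(&s, 0), 0x5a, object_size + overrun);

	return toy_get_freepointer(victim) != (void *)&sentinel;
}
```

Calling toy_overrun_corrupts_freelist(32, 0, 8) models the plain layout,
where the overrun reaches the next object's free pointer; calling it with
a 16-byte red zone, toy_overrun_corrupts_freelist(32, 16, 8), models the
debug layout, where the same overrun stops inside the padding.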

> Christoph, the problem is that something is triggering an oops or page
> fault (depending on how bogus the address is) in __kmalloc() when it
> does that get_freepointer_safe() thing without DEBUG_PAGEALLOC. I've
> seen two different cases on two different boots, but they both were on
> that one instruction that did that

Hmm.. Then get_freepointer_safe() may not be ok. It should not trigger any
faults.

> Could be elsewhere too. I saw it twice in one day which would *tend*
> to mean that it's recent, but maybe I was just lucky the previous days
> and didn't hit it. I haven't been able to repro it now, but maybe I
> figured out one reason why my reproductions have been failing ;)

Ok, reading the rest of the thread, it seems that we found the issue, but
this get_freepointer_safe() failure is still not good. Do you have any more
debugging output that can shed more light on the failure of
get_freepointer_safe()?