Re: bio linked list corruption.

From: Andy Lutomirski
Date: Thu Oct 20 2016 - 19:25:36 EST


On Thu, Oct 20, 2016 at 4:03 PM, Dave Jones <davej@xxxxxxxxxxxxxxxxx> wrote:
> On Thu, Oct 20, 2016 at 04:01:12PM -0700, Andy Lutomirski wrote:
> > On Thu, Oct 20, 2016 at 3:50 PM, Dave Jones <davej@xxxxxxxxxxxxxxxxx> wrote:
> > > On Tue, Oct 18, 2016 at 06:05:57PM -0700, Andy Lutomirski wrote:
> > >
> > > > One possible debugging approach would be to change:
> > > >
> > > > #define NR_CACHED_STACKS 2
> > > >
> > > > to
> > > >
> > > > #define NR_CACHED_STACKS 0
> > > >
> > > > in kernel/fork.c and to set CONFIG_DEBUG_PAGEALLOC=y. The latter will
> > > > force an immediate TLB flush after vfree.
> > >
> > > I can give that idea some runtime, but it sounds like this is a case where
> > > we're trying to prove a negative, and that'll just run and run? In which case I
> > > might do this when I'm travelling on Sunday.
> >
> > The idea is that the stack will be freed and unmapped immediately upon
> > process exit if configured like this so that bogus stack accesses (by
> > the CPU, not DMA) would OOPS immediately.
>
> oh, misparsed. ok, I can definitely get behind that idea then.
> I'll do that next.
>

It could be worth trying this, too:

https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=174531fef4e8

It occurred to me that the current code is a little bit fragile.

--Andy