Re: bio linked list corruption.

From: Chris Mason
Date: Fri Oct 21 2016 - 16:18:35 EST

On 10/21/2016 04:02 PM, Dave Jones wrote:
On Thu, Oct 20, 2016 at 04:23:32PM -0700, Andy Lutomirski wrote:
> On Thu, Oct 20, 2016 at 4:03 PM, Dave Jones <davej@xxxxxxxxxxxxxxxxx> wrote:
> > On Thu, Oct 20, 2016 at 04:01:12PM -0700, Andy Lutomirski wrote:
> > > On Thu, Oct 20, 2016 at 3:50 PM, Dave Jones <davej@xxxxxxxxxxxxxxxxx> wrote:
> > > > On Tue, Oct 18, 2016 at 06:05:57PM -0700, Andy Lutomirski wrote:
> > > >
> > > > > One possible debugging approach would be to change:
> > > > >
> > > > > #define NR_CACHED_STACKS 2
> > > > >
> > > > > to
> > > > >
> > > > > #define NR_CACHED_STACKS 0
> > > > >
> > > > > in kernel/fork.c and to set CONFIG_DEBUG_PAGEALLOC=y. The latter will
> > > > > force an immediate TLB flush after vfree.
> > > >
> > > > I can give that idea some runtime, but it sounds like this is a case where
> > > > we're trying to prove a negative, and that'll just run and run ? In which case I
> > > > might do this when I'm travelling on Sunday.
> > >
> > > The idea is that the stack will be free and unmapped immediately upon
> > > process exit if configured like this so that bogus stack accesses (by
> > > the CPU, not DMA) would OOPS immediately.
> >
> > oh, misparsed. ok, I can definitely get behind that idea then.
> > I'll do that next.
> >
>
> It could be worth trying this, too:
>
> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=174531fef4e8
>
> It occurred to me that the current code is a little bit fragile.

It's been nearly 24hrs with the above changes, and it's been pretty much
silent the whole time.

The only things of note over that time period have been a btrfs lockdep
warning that's been around for a while, and occasional btrfs checksum
failures, which I've also been seeing for a while but which seem to
have gotten worse since 4.8.

Meaning you hit them with v4.8 or not?


I'm pretty confident the disk in this machine is ok, so I think
the checksum warnings are bogus. Chris suggested they may be the result
of memory corruption, but there's little else going on.


BTRFS warning (device sda3): csum failed ino 130654 off 0 csum 2566472073 expected csum 3008371513
BTRFS warning (device sda3): csum failed ino 131057 off 4096 csum 3563910319 expected csum 738595262
BTRFS warning (device sda3): csum failed ino 131176 off 4096 csum 1344477721 expected csum 441864825
BTRFS warning (device sda3): csum failed ino 131241 off 245760 csum 3576232181 expected csum 2566472073
BTRFS warning (device sda3): csum failed ino 131429 off 0 csum 1494450239 expected csum 2646577722
BTRFS warning (device sda3): csum failed ino 131471 off 0 csum 3949539320 expected csum 3828807800
BTRFS warning (device sda3): csum failed ino 131471 off 4096 csum 3475108475 expected csum 2566472073
BTRFS warning (device sda3): csum failed ino 131471 off 958464 csum 142982740 expected csum 2566472073
BTRFS warning (device sda3): csum failed ino 131471 off 0 csum 3949539320 expected csum 3828807800
BTRFS warning (device sda3): csum failed ino 131532 off 270336 csum 3138898528 expected csum 2566472073
BTRFS warning (device sda3): csum failed ino 131532 off 1249280 csum 2169165042 expected csum 2566472073
BTRFS warning (device sda3): csum failed ino 131649 off 16384 csum 2914965650 expected csum 1425742005


A curious thing: the expected csum 2566472073 turns up a number of times for different inodes, and gets
differing actual csums each time. I suppose this could be something like a block of all zeros in multiple files,
but it struck me as surprising.
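
One quick way to sanity-check the all-zeros theory is to compute the
CRC-32C of a 4KiB block of zeros in userspace and compare it against
that repeated expected csum. A minimal sketch, assuming btrfs data
csums are plain CRC-32C over each 4KiB block (crc32c seeded with ~0
and inverted at the end), which should match what the kernel does for
btrfs data csums:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82f63b78. */
static uint32_t crc32c(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ ((crc & 1) ? 0x82f63b78 : 0);
    }
    return ~crc;
}

int main(void)
{
    unsigned char block[4096];

    /* a data block that is entirely zeros */
    memset(block, 0, sizeof(block));

    /* btrfs prints csums as unsigned decimal in these warnings */
    printf("crc32c of 4KiB of zeros: %u\n",
           crc32c(0, block, sizeof(block)));
    return 0;
}

If that prints 2566472073, the repeated expected csum is just what the
csum tree records for zero-filled blocks, and it's the data coming back
from those reads that differs.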

btrfs people: is there an easy way to map those inodes to a filename ? I'm betting those are the
test files that trinity generates. If so, it might point to a race somewhere.

btrfs inspect inode 130654 mntpoint
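
(Assuming a reasonably current btrfs-progs, the full spelling is
"btrfs inspect-internal inode-resolve 130654 mntpoint", run against the
mounted filesystem; a slower "find mntpoint -xdev -inum 130654" should
turn up the same file.)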

-chris