Re: [4.1-rc7] btrfs related VM_BUG_ON in filemap.c

From: Dave Jones
Date: Wed Jun 17 2015 - 09:36:01 EST


On Tue, Jun 16, 2015 at 01:19:20PM -0400, Chris Mason wrote:
> On 06/16/2015 01:14 PM, David Sterba wrote:
> > On Wed, Jun 10, 2015 at 01:43:31PM -0400, Chris Mason wrote:
> >> On 06/10/2015 09:40 AM, Dave Jones wrote:
> >>> Found this on serial console this morning. The machine had rebooted itself shortly
> >>> afterwards (surprising, given I don't have panic-on-oops or similar set).
> >>
> >> We had one other report of this a few months ago. Josef and I read
> >> through all of this and decided it was impossible, so someone else must
> >> be holding on to that page and unlocking it.
> >>
> >> (that someone else could easily be btrfs, just not in this code path)
> >
> > https://patchwork.kernel.org/patch/6478941/ looks like the fix; the bug
> > symptoms match the "keywords", though I haven't inspected it closely.
> >
>
> That one is in my integration-4.2 branch if you want to give it a shot.

I was sceptical about this being the same bug, and it looks like I was right...

page:ffffea00027cc640 count:4 mapcount:0 mapping:ffff8800af11d8a0 index:0x0
flags: 0x4000000000000846(error|referenced|active|private)
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
------------[ cut here ]------------
kernel BUG at mm/filemap.c:745!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 1 PID: 5931 Comm: trinity-c5 Not tainted 4.1.0-rc8-gelk-debug+ #2
task: ffff8800b9ec0000 ti: ffff8800843ec000 task.ti: ffff8800843ec000
RIP: 0010:[<ffffffffb216ee5c>] [<ffffffffb216ee5c>] unlock_page+0x7c/0x80
RSP: 0018:ffff8800843efa58 EFLAGS: 00010292
RAX: 0000000000000036 RBX: 0000000000001000 RCX: 0000000000000000
RDX: 0000000080000000 RSI: ffffffffb20c80c9 RDI: ffffffffb20c7ce4
RBP: ffff8800843efa58 R08: 0000000000000001 R09: 0000000000000d1d
R10: 000000000000037c R11: 0000000000000001 R12: ffffea00027cc640
R13: 0000000000000000 R14: 0000000000000fff R15: 0000000000000000
FS: 00007fc9c42b5700(0000) GS:ffff8800bf700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000000050978000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
ffff8800843efb68 ffffffffc02d06ec 0000000000000fff 0000100800000008
ffff8800af11d548 0000000000000000 ffff8800843efab8 0000000000000fff
0000000000000000 ffff88009f319000 ffff8800843efc08 ffff8800af11d728
Call Trace:
[<ffffffffc02d06ec>] __do_readpage+0x61c/0x7c0 [btrfs]
[<ffffffffc02cd973>] ? lock_extent_bits+0x83/0x2e0 [btrfs]
[<ffffffffb20a5001>] ? get_parent_ip+0x11/0x50
[<ffffffffc02b3ca0>] ? btrfs_real_readdir+0x5e0/0x5e0 [btrfs]
[<ffffffffc02ca41a>] ? btrfs_lookup_ordered_extent+0x9a/0xd0 [btrfs]
[<ffffffffc02d0955>] __extent_read_full_page+0xc5/0xe0 [btrfs]
[<ffffffffc02b3ca0>] ? btrfs_real_readdir+0x5e0/0x5e0 [btrfs]
[<ffffffffc02d18b7>] extent_read_full_page+0x37/0x60 [btrfs]
[<ffffffffc02b0c25>] btrfs_readpage+0x25/0x30 [btrfs]
[<ffffffffc02c0e7a>] prepare_uptodate_page+0x4a/0x90 [btrfs]
[<ffffffffc02c0fc1>] prepare_pages+0x101/0x190 [btrfs]
[<ffffffffc02c1b03>] __btrfs_buffered_write+0x1d3/0x650 [btrfs]
[<ffffffffc02c5713>] btrfs_file_write_iter+0x463/0x570 [btrfs]
[<ffffffffb2045eea>] ? bad_area+0x4a/0x60
[<ffffffffb21d05d1>] __vfs_write+0xb1/0xf0
[<ffffffffb21d0c59>] vfs_write+0xa9/0x1b0
[<ffffffffb21d1bd2>] SyS_pwrite64+0x72/0xb0
[<ffffffffb20125d0>] ? syscall_trace_enter_phase2+0x220/0x260
[<ffffffffb2012715>] ? syscall_trace_leave+0x95/0x140
[<ffffffffb26d5b77>] tracesys_phase2+0x84/0x89
Code: 10 48 d3 ee 48 8d 0c b6 48 89 c6 48 8d 3c ca 31 d2 e8 29 ca f4 ff 5d c3 0f 1f 80 00 00 00 00 48 c7 c6 c0 ed a2 b2 e8 f4 84 02 00 <0f> 0b 66 90 66 66 66 66 90 55 85 f6 48 89 e5 75 13 85 d2 74 3f
RIP [<ffffffffb216ee5c>] unlock_page+0x7c/0x80
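
For reference, the check that fires is the PageLocked() assertion at the top of
unlock_page() in mm/filemap.c. A rough sketch of the 4.1-era function (paraphrased
from memory, not a verbatim copy of the tree):

	void unlock_page(struct page *page)
	{
		/*
		 * This is the assertion that trips: the caller is unlocking
		 * a page that is not (or no longer) locked.
		 */
		VM_BUG_ON_PAGE(!PageLocked(page), page);
		clear_bit_unlock(PG_locked, &page->flags);
		smp_mb__after_atomic();
		wake_up_page(page, PG_locked);
	}

So PG_locked is already clear by the time __do_readpage() unlocks the page in the
btrfs_readpage() path above, which still fits the earlier theory that someone else
is holding on to the page and unlocking it behind this code path's back.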



Still haven't managed to narrow down a reproducer, but it shows up
consistently within 6 hrs or so of fuzzing.

Dave