Re: [PATCH 0/3] fix missing rb_subtree_gap updates on vma insert/erase
From: Sasha Levin
Date: Mon Nov 26 2012 - 20:16:36 EST
On 11/12/2012 03:54 PM, Sasha Levin wrote:
> On 11/12/2012 06:51 AM, Michel Lespinasse wrote:
>> Using the trinity fuzzer, Sasha Levin uncovered a case where
>> rb_subtree_gap wasn't correctly updated.
>>
>> Digging into this, the root cause was that vma insertions and removals
>> require both an rbtree insert or erase operation (which may trigger
>> tree rotations), and an update of the next vma's gap (which does not
>> change the tree topology, but may require iterating on the node's
>> ancestors to propagate the update). The rbtree rotations caused the
>> rb_subtree_gap values to be updated in some of the internal nodes, but
>> without upstream propagation. Then the subsequent update on the next
>> vma didn't iterate as high up the tree as it should have, as it
>> stopped as soon as it hit one of the internal nodes that had been
>> updated as part of a tree rotation.
>>
>> The fix is to impose that all rb_subtree_gap values must be up to date
>> before any rbtree insertion or erase, with the possible exception that
>> the node being erased doesn't need to have an up to date rb_subtree_gap.
>>
>> These 3 patches apply on top of the stack I previously sent (or equally,
>> on top of the last published mmotm).
>>
>> Michel Lespinasse (3):
>> mm: ensure safe rb_subtree_gap update when inserting new VMA
>> mm: ensure safe rb_subtree_gap update when removing VMA
>> mm: debug code to verify rb_subtree_gap updates are safe
>>
>> mm/mmap.c | 121 +++++++++++++++++++++++++++++++++++++------------------------
>> 1 files changed, 73 insertions(+), 48 deletions(-)
>>
>
> Looking good: old warnings gone, no new warnings.
I've built today's -next, and got the following BUG pretty quickly (2-3 hours):
[ 1556.479284] BUG: unable to handle kernel paging request at 0000000000412000
[ 1556.480036] IP: [<ffffffff81238184>] validate_mm+0x34/0x130
[ 1556.480036] PGD 31739067 PUD 4fbc4067 PMD 1c936067 PTE 0
[ 1556.480036] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 1556.480036] Dumping ftrace buffer:
[ 1556.480036] (ftrace buffer empty)
[ 1556.480036] CPU 0
[ 1556.480036] Pid: 10274, comm: trinity-child29 Tainted: G W 3.7.0-rc6-next-20121126-sasha-00015-gb04382b-dirty #201
[ 1556.480036] RIP: 0010:[<ffffffff81238184>] [<ffffffff81238184>] validate_mm+0x34/0x130
[ 1556.480036] RSP: 0018:ffff88004fbc7d08 EFLAGS: 00010206
[ 1556.480036] RAX: 0000000000412000 RBX: 0000000000000000 RCX: 0000000000000000
[ 1556.512120] RDX: 0000000000000000 RSI: ffff88001c1a6008 RDI: ffff88001c1a6000
[ 1556.512120] RBP: ffff88004fbc7d38 R08: ffff8800371e7808 R09: ffff88004fb56cf0
[ 1556.512120] R10: 0000000000000001 R11: 0000000000001000 R12: ffff88001c1a6000
[ 1556.512120] R13: ffff8800371e7b00 R14: 0000000000000000 R15: ffff88001c1a6000
[ 1556.512120] FS: 00007f4e0f8e3700(0000) GS:ffff8800bfc00000(0000) knlGS:0000000000000000
[ 1556.512120] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1556.512120] CR2: 0000000000412000 CR3: 000000002faec000 CR4: 00000000000406f0
[ 1556.512120] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1556.512120] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1556.512120] Process trinity-child29 (pid: 10274, threadinfo ffff88004fbc6000, task ffff88004fbb0000)
[ 1556.512120] Stack:
[ 1556.512120] ffff8800bf80aa80 ffff88001c1a6000 ffff88004fb56cf0 ffff8800371e7818
[ 1556.512120] ffff8800371e7808 ffff88001c1a6000 ffff88004fbc7d88 ffffffff8123843c
[ 1556.512120] 0000000000000001 ffff88004fb56da8 ffff880000000000 ffff8800371e7818
[ 1556.512120] Call Trace:
[ 1556.512120] [<ffffffff8123843c>] vma_link+0xcc/0xf0
[ 1556.512120] [<ffffffff8123a8ac>] mmap_region+0x40c/0x5a0
[ 1556.512120] [<ffffffff8123aceb>] do_mmap_pgoff+0x2ab/0x310
[ 1556.512120] [<ffffffff8122477c>] ? vm_mmap_pgoff+0x6c/0xb0
[ 1556.512120] [<ffffffff81224794>] vm_mmap_pgoff+0x84/0xb0
[ 1556.512120] [<ffffffff81239483>] sys_mmap_pgoff+0x193/0x1a0
[ 1556.512120] [<ffffffff81182b08>] ? trace_hardirqs_on_caller+0x118/0x140
[ 1556.512120] [<ffffffff810729ad>] sys_mmap+0x1d/0x20
[ 1556.512120] [<ffffffff83c88418>] tracesys+0xe1/0xe6
[ 1556.512120] Code: 31 f6 41 55 41 54 49 89 fc 53 31 db 48 83 ec 08 4c 8b 2f 4d 85 ed 74 75 0f 1f 80 00 00 00 00 49 8b 85 88 00 00
00 48 85 c0 74 0e <48> 8b 38 31 f6 48 83 c7 08 e8 0e bc a4 02 49 8b 45 78 4d 8d 7d
[ 1556.512120] RIP [<ffffffff81238184>] validate_mm+0x34/0x130
[ 1556.512120] RSP <ffff88004fbc7d08>
[ 1556.512120] CR2: 0000000000412000
[ 1557.729958] ---[ end trace d2a29e98cc9e2568 ]---
The bit that's failing is:
struct vm_area_struct *vma = mm->mmap; // mm->mmap = 0x412000
while (vma) {
struct anon_vma_chain *avc;
vma_lock_anon_vma(vma); // BOOM!
Thanks,
Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/