Re: mremap use-after-free [was Re: 2.5.52-mm2]

From: Andrew Morton (akpm@digeo.com)
Date: Thu Dec 19 2002 - 13:18:23 EST


Hugh Dickins wrote:
>
> ...
> I doubt that (but may be wrong, I haven't time right now to think as
> twistedly as mremap demands).

Maybe, maybe not. Think about it - because VM_LOCKED is cleared
by slab poisoning, the chances of anyone noticing it are very small.
 
> The code (patently!) expects new_vma
> to be good at the end, it certainly wasn't intending to unmap it;
> but 2.5 split_vma has been through a couple of convulsions, either
> of which might have resulted in the potential for new_vma to be
> freed where before it was guaranteed to remain.

I see no "guarantees" around do_munmap() at all. The whole thing
is foul; no wonder it keeps blowing up.

It died partway into kde startup. It was in fact the first mremap
call.

> Do you know the vmas before and after, and the mremap which did this?

Breakpoint 1, move_vma (vma=0xcf1ed734, addr=1077805056, old_len=36864, new_len=204800, new_addr=1078050816) at mm/mremap.c:176
176 {
(gdb) n
183 next = find_vma_prev(mm, new_addr, &prev);
(gdb)
177 struct mm_struct * mm = vma->vm_mm;
(gdb)
180 int split = 0;
(gdb)
182 new_vma = NULL;
(gdb)
183 next = find_vma_prev(mm, new_addr, &prev);
(gdb)
184 if (next) {
(gdb) p/x next
$1 = 0xce3bee84
(gdb) p/x *next
$2 = {vm_mm = 0xcf7cc624, vm_start = 0xbffcd000, vm_end = 0xc0000000, vm_next = 0x0, vm_page_prot = {pgprot = 0x25},
  vm_flags = 0x100177, vm_rb = {rb_parent = 0xcf1ed74c, rb_color = 0x1, rb_right = 0x0, rb_left = 0x0}, shared = {next = 0xce3beeac,
    prev = 0xce3beeac}, vm_ops = 0x0, vm_pgoff = 0xffffffce, vm_file = 0x0, vm_private_data = 0x0}
(gdb) p prev
$3 = (struct vm_area_struct *) 0xcf1ed734
(gdb) p/x *prev
$4 = {vm_mm = 0xcf7cc624, vm_start = 0x403e0000, vm_end = 0x4041c000, vm_next = 0xce3bee84, vm_page_prot = {pgprot = 0x25},
  vm_flags = 0x100073, vm_rb = {rb_parent = 0xce3be4c4, rb_color = 0x0, rb_right = 0xce3bee9c, rb_left = 0xce3be1ac}, shared = {
    next = 0xcf1ed75c, prev = 0xcf1ed75c}, vm_ops = 0x0, vm_pgoff = 0x0, vm_file = 0x0, vm_private_data = 0x0}

> ...
>
> Hmmm. Am I right to suppose that all the changes above are "cleanup"
> which you couldn't resist making while you looked through this code,
> but entirely irrelevant to the bug in question?

to make the code readable so I could work on the bug, it then "accidentally"
got left in.

> If those mods above
> were right, they should be the subject of a separate patch.

well yes it would be nice to clean that code up. Like, documenting
it and working out what the locking rules are.

What does this do?
                        spin_lock(&mm->page_table_lock);
                        prev->vm_end = new_addr + new_len;
                        spin_unlock(&mm->page_table_lock);
and this?
                        spin_lock(&mm->page_table_lock);
                        next->vm_start = new_addr;
                        spin_unlock(&mm->page_table_lock);

clearly nothing. OK, but is this effectively-unlocked code safe?

> There's certainly room for cleanup there, but my preference would be
> to remove "can_vma_merge" completely, or at least its use in mremap.c,
> using its explicit tests instead. It looks like it was originally
> quite appropriate for a use or two in mmap.c, but obscurely unhelpful
> here - because in itself it is testing a bizarre asymmetric subset of
> what's needed (that subset which remained to be tested in its original
> use in mmap.c).

yes
 
> The problem with your changes above is, you've removed the !vma->vm_file
> tests, presumably because you noticed that can_vma_merge already tests
> !vma->vm_file. But "vma" within can_vma_merge is "prev" or "next" here:
> they are distinct tests, and you're now liable to merge an anonymous
> mapping with a private file mapping - nice if it's from /dev/zero,
> but otherwise not. Please just cut those hunks out.

ok
 
> ...
> > - do_munmap(current->mm, addr, old_len);
> > -
>
> Anguished cry! There was careful manipulation of VM_ACCOUNT before and
> after do_munmap, now you've for no reason moved do_munmap down outside.

How do we know that *vma was not altered by the do_munmap() call?

With this change, nobody will touch any locally-cached vma*'s from
the do_munmap() right through to syscall exit.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Dec 23 2002 - 22:00:24 EST