Re: [PATCH] sys_remap_file_pages: fix ->vm_file accounting
From: Matt Helsley
Date: Wed Feb 06 2008 - 19:41:22 EST
On Wed, 2008-02-06 at 20:33 +0000, Hugh Dickins wrote:
> On Sun, 3 Feb 2008, Oleg Nesterov wrote:
> >
> > So I have to try to find another bug ;) Suppose that ->load_binary() does
> > a series of do_mmap(MAP_EXECUTABLE). It is possible that mmap_region() can
> > merge 2 vmas. In that case we "leak" ->num_exe_file_vmas. Unless I missed
> > something, mmap_region() should do removed_exe_file_vma() when vma_merge()
> > succeds (near fput(file)).
>
> Or there's the complementary case of a VM_EXECUTABLE vma being
> split in two, for example by an mprotect of a part of it.
>
> Sorry, Matt, I don't like your patch at all. It seems to add a fair
> amount of ugliness and unmaintainablity, all for a peculiar MVFS case
I thought that getting rid of the separate versions of proc_exe_link()
improved maintainability. Do you have any specific details on what you
think makes the code introduced by the patch unmaintainable?
> (you've tried to argue other advantages, but not always convinced!).
Yup -- looking at how the VM_EXECUTABLE flag affects the vma walk it's
clear one of my arguments was wrong. So I can't blame you for being
unconvinced by that. :)
I still think it would help any stacking filesystems that can't use the
solution adopted by unionfs.
> And I found it quite hard to see where the crucial difference comes.
> I guess it's that MVFS changes vma->vm_file in its ->mmap? Well, if
Yup.
> MVFS does that, maybe something else does that too, but precisely to
> rely on the present behaviour of /proc/pid/exe - so in fixing for
> MVFS, we'd be breaking that hypothetical other?
I'm not completely certain that I understand your point. Are you
suggesting that some hypothetical code would want to use this "quirk"
of /proc/pid/exe for a legitimate purpose?
Assuming that is your point, I thought my non-hypothetical java example
clearly demonstrated that at least one non-hypothetical program doesn't
expect the "quirk" and breaks because of it. Frankly,
given /proc/pid/exe's output in the non-stacking case, I can't see how
its output in the stacking case we're discussing could be considered
anything but buggy.
Cheers,
-Matt Helsley
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/