Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE

From: Konstantin Khlebnikov
Date: Thu Apr 05 2012 - 17:04:46 EST


Matt Helsley wrote:
On Tue, Apr 03, 2012 at 11:32:04PM +0400, Cyrill Gorcunov wrote:
On Tue, Apr 03, 2012 at 11:16:31AM -0700, Matt Helsley wrote:
On Tue, Apr 03, 2012 at 09:10:20AM +0400, Konstantin Khlebnikov wrote:
Matt Helsley wrote:
On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
On 03/31, Konstantin Khlebnikov wrote:

comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
where all this stuff was introduced:

...
This avoids pinning the mounted filesystem.

So, this logic is hooked into every file mmap/munmap and vma split/merge just to
handle a hypothetical case: an mm that has already unmapped all of its executable
files but is still alive, pinning the fs and blocking umount. Does anyone know of
a real-world example?

This is the question to Matt.

This is where I got the scenario:

https://lkml.org/lkml/2007/7/12/398

Cyrill Gorcunov's patch "c/r: prctl: add ability to set new mm_struct::exe_file"
gives userspace the ability to unpin the vfsmount explicitly.

Doesn't that break the semantics of the kernel ABI?

Which one? exe_file can be changed iff there is no MAP_EXECUTABLE mapping left.
Still, once assigned (via this prctl), mm_struct::exe_file can't be changed
again until the program exits.
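
A minimal sketch of how userspace might invoke the prctl() described above,
assuming the PR_SET_MM / PR_SET_MM_EXE_FILE option from Cyrill's patch. The
helper name set_exe_file() is hypothetical, and the fallback constant values
are assumed to match <linux/prctl.h> in kernels that carry the patch. The call
is expected to fail while executable mappings remain, and it may also fail
without sufficient privileges or kernel support.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Fallbacks for older headers; values assumed to match <linux/prctl.h>. */
#ifndef PR_SET_MM
#define PR_SET_MM 35
#endif
#ifndef PR_SET_MM_EXE_FILE
#define PR_SET_MM_EXE_FILE 13
#endif

/*
 * Hypothetical helper: ask the kernel to make `path` the new
 * mm_struct::exe_file of the calling process.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_exe_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = prctl(PR_SET_MM, PR_SET_MM_EXE_FILE, (unsigned long)fd, 0UL, 0UL);

    /* Preserve prctl()'s errno across close(). */
    int saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

A caller would check the return value and inspect errno to tell a refused
change apart from a missing file or an unsupported kernel.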

The prctl() interface itself is fine as it stands now.

As far as I can tell Konstantin is proposing that we remove the unusual
counter that tracks the number of mappings of the exe_file and require
userspace to use the prctl() to drop the last reference. That's what I think
will break the ABI because after that change you *must* change userspace
code to use the prctl(). It's an ABI change because the same sequence of
system calls with the same input bits produces different behavior.

But common software does not require this at all. I did not find any real examples,
only a hypothesis by Al Viro: https://lkml.org/lkml/2007/7/12/398
libhugetlbfs isn't a good example either: man proc says /proc/[pid]/exe stays alive
as long as the main thread is alive, but with libhugetlbfs /proc/[pid]/exe disappears
too early. Also, I would not call it ABI: this corner case isn't documented, and I'm
afraid only a few people in the world know about it =)
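
A minimal sketch of how a process could observe the symlink in question,
assuming a Linux system with /proc mounted. The helper name read_exe_link()
is hypothetical. A program that has dropped its executable mappings could use
something like this to see whether /proc/self/exe still resolves.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the target of /proc/self/exe into buf and
 * NUL-terminate it. Returns the target length, or -1 if the link cannot
 * be read (for example, when exe_file is no longer set) or size is 0.
 */
static ssize_t read_exe_link(char *buf, size_t size)
{
    if (size == 0)
        return -1;

    /* readlink() does not NUL-terminate, so reserve one byte for that. */
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -1;

    buf[n] = '\0';
    return n;
}
```

A result of -1 from this helper would indicate that the link is no longer
readable for the calling process.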