Re: [PATCH] mm/mincore: allow for making sys_mincore() privileged

From: Kevin Easton
Date: Tue Jan 08 2019 - 03:51:06 EST


On Sat, Jan 05, 2019 at 01:54:03PM -0800, Linus Torvalds wrote:
> On Sat, Jan 5, 2019 at 12:43 PM Jiri Kosina <jikos@xxxxxxxxxx> wrote:
> >
> > > Who actually _uses_ mincore()? That's probably the best guide to what
> > > we should do. Maybe they open the file read-only even if they are the
> > > owner, and we really should look at file ownership instead.
> >
> > Yeah, well
> >
> > https://codesearch.debian.net/search?q=mincore
> >
> > is a bit too much mess to get some idea quickly I am afraid.
>
> Yeah, heh.
>
> And the first hit is 'fincore', which probably nobody cares about
> anyway, but it does
>
> fd = open (name, O_RDONLY)
> ..
> mmap(window, len, PROT_NONE, MAP_PRIVATE, ..
>
> so if we want to keep that working, we'd really need to actually check
> file ownership rather than just looking at f_mode.
>
> But I don't know if anybody *uses* and cares about fincore, and it's
> particularly questionable for non-root users.
>
...
> I didn't find anything that seems to really care, but I gave up after
> a few pages of really boring stuff.

I've gone through everything in the Debian code search, and this is
everything that looks like it would be affected by the current patch:

util-linux
    Contains 'fincore', as already noted above.

e2fsprogs
    e4defrag tries to drop pages that it caused to be loaded into the
    page cache, but it's not clear that this ever worked as designed
    anyway (it calls mincore() before ioctl(fd, EXT4_IOC_MOVE_EXT ..)
    but then, after the sync_file_range(), it drops the pages that
    *were* in the page cache at the time of mincore()).

pgfincore
    PostgreSQL extension that tries to dump/restore the page cache
    status of database backing files across reboots. It uses a fresh
    mapping with mincore() to try to determine the current page cache
    status of a file.

nocache
    LD_PRELOAD library that tries to drop any pages that the victim
    program has caused to be loaded into the page cache. It uses
    mincore() on a fresh mapping to see what was resident beforehand.
    Also includes a 'cachestats' command that's basically another
    'fincore'.

xfsprogs
    xfs_io has a 'mincore' sub-command that is roughly equivalent to
    'fincore'.

vmtouch
    vmtouch is "Portable file system cache diagnostics and control".
    Among other things, it implements 'fincore'-type functionality,
    and one of its touted use-cases is "Preserving virtual memory
    profile when failing over servers".

qemu
    qemu uses mincore() with a fresh PROT_NONE, MAP_PRIVATE mapping to
    implement the "x-check-cache-dropped" option.
    ( https://patchwork.kernel.org/patch/10395865/ )
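
For reference, the 'fincore'-style pattern that most of the entries above
share (open read-only, map fresh with PROT_NONE and MAP_PRIVATE, then ask
mincore() about the mapping) looks roughly like the sketch below. This is
not taken from any of the listed projects; count_resident_pages() is a
made-up name, and error handling is kept minimal.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any listed project): report how many
 * pages of `path` mincore() says are resident. Returns the resident
 * page count, or -1 on error. The total page count is stored in
 * *total_pages on success.
 */
long count_resident_pages(const char *path, size_t *total_pages)
{
    long resident = -1;
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0)
        goto out_close;

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = (size_t)st.st_size;
    size_t npages = (len + page - 1) / page;

    if (len == 0) {
        /* mmap() rejects zero-length mappings; an empty file has no pages. */
        *total_pages = 0;
        resident = 0;
        goto out_close;
    }

    /* Fresh mapping that is never touched, so it does no IO of its own. */
    void *map = mmap(NULL, len, PROT_NONE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    unsigned char *vec = malloc(npages);
    if (vec != NULL && mincore(map, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* bit 0: page reported resident */
        *total_pages = npages;
    }

    free(vec);
    munmap(map, len);
out_close:
    close(fd);
    return resident;
}
```

What this returns for a file the caller does not own or cannot write is
exactly the behaviour under discussion, so callers of this shape should not
assume the count reflects the real page cache state on every kernel.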

(Everything else I could see was either looking at anonymous VMAs, looking
at its own existing mapping that it had been using for actual IO, or just
using mincore() to see whether an address was part of any mapping at all.)
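
The last of those uses relies on mincore() failing with ENOMEM when the
range contains unmapped memory, and it ignores the residency vector
entirely. A minimal sketch, assuming Linux semantics; address_is_mapped()
is a made-up name, not something from the packages searched:

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the page containing `addr` is part
 * of some mapping, 0 if it is not, and -1 on any other mincore() error.
 */
int address_is_mapped(const void *addr)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    /* mincore() requires a page-aligned start address. */
    void *base = (void *)((uintptr_t)addr & ~(page - 1));
    unsigned char vec;   /* one page queried, so one byte is enough */

    if (mincore(base, 1, &vec) == 0)
        return 1;        /* mapped; residency in vec is deliberately ignored */
    return errno == ENOMEM ? 0 : -1;
}
```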

- Kevin