Re: [PATCH] mm/mincore: allow for making sys_mincore() privileged

From: Dominique Martinet
Date: Wed Jan 16 2019 - 01:34:55 EST


Linus Torvalds wrote on Wed, Jan 16, 2019:
> Anybody willing to test the above patch instead? And replace the
>
> || capable(CAP_SYS_ADMIN)
>
> check with something like
>
> || inode_permission(inode, MAY_WRITE) == 0
>
> instead?
>
> (This is obviously after you've reverted the "only check mmap
> residency" patch..)

That seems to work on an x86_64 VM.

I've tested with the attached patch:
- root can look up pages on any file I tried;
- a user can look up pages on a file it owns, provided it can write to it
  (e.g. it won't work on a 0400 file you own);
- a user cannot look up pages on e.g. /lib64/libc-2.28.so.
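
For anyone wanting to repeat this kind of check from userspace, here is a
minimal sketch. It assumes a read-only shared mapping of the whole file and
simply counts the pages mincore() reports as resident. The helper name
resident_pages() is made up for illustration and this is not necessarily
the program used for the tests above.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Count the pages of `path` that mincore() reports as resident.
 * Returns the count, or -1 on error (including an empty file).
 * If `total` is non-NULL it receives the number of pages mapped.
 * resident_pages() is a hypothetical helper for this sketch.
 */
static long resident_pages(const char *path, long *total)
{
	long page = sysconf(_SC_PAGESIZE);
	long n, i, resident = 0;
	unsigned char *vec;
	struct stat st;
	void *map;
	int fd;

	/* Opened read-only on purpose: no FMODE_WRITE on the file. */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}
	n = (st.st_size + page - 1) / page;

	map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping keeps its own reference */
	if (map == MAP_FAILED)
		return -1;

	/* One byte per page; bit 0 set means "in core". */
	vec = malloc(n);
	if (!vec || mincore(map, st.st_size, vec) < 0) {
		free(vec);
		munmap(map, st.st_size);
		return -1;
	}
	for (i = 0; i < n; i++)
		resident += vec[i] & 1;

	free(vec);
	munmap(map, st.st_size);
	if (total)
		*total = n;
	return resident;
}
```

Calling resident_pages("/lib64/libc-2.28.so", &total) as an unprivileged
user and comparing the result with total is one way to see whether the
kernel answered or just reported everything as in core.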

One difference from your previous patch, though: that one reported no
pages in core when it didn't know, whereas this patch reports every page
as in core when it refuses to tell. I don't think that's very important.

If anything, the read-only owner case (a 0400 file you own) might be a
problem in some edge cases (e.g. if you're preloading git directories,
many objects are 0444); should we *also* check ownership?
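
To make the question concrete, here is a rough userspace model of an
"owner or writable" rule. It is only an illustration: the function name
may_query_residency() is invented, and a kernel version would work on the
inode directly (presumably with a helper along the lines of
inode_owner_or_capable()) rather than on a path.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical userspace model of the rule under discussion:
 * allow residency queries if the caller owns the file OR could
 * open it for writing. This is not kernel code.
 */
static bool may_query_residency(const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return false;

	/* Ownership covers the read-only-but-owned (0400/0444) case. */
	if (st.st_uid == geteuid())
		return true;

	/* Otherwise require write permission for the effective IDs. */
	return faccessat(AT_FDCWD, path, W_OK, AT_EACCESS) == 0;
}
```

With a rule like this a 0400 or 0444 file you own would still be
queryable, while a root-owned library that you cannot write to would not.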

--
Dominique
mm/mincore.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/mm/mincore.c b/mm/mincore.c
index 218099b5ed31..11ed7064f4eb 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -169,6 +169,13 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	return 0;
 }

+static inline bool can_do_mincore(struct vm_area_struct *vma)
+{
+	return vma_is_anonymous(vma)
+		|| (vma->vm_file && ((vma->vm_file->f_mode & FMODE_WRITE)
+		|| inode_permission(file_inode(vma->vm_file), MAY_WRITE) == 0));
+}
+
 /*
  * Do a chunk of "sys_mincore()". We've already checked
  * all the arguments, we hold the mmap semaphore: we should
@@ -189,8 +196,13 @@ static long do_mincore(unsigned long addr, unsigned long pages, unsigned char *v
 	vma = find_vma(current->mm, addr);
 	if (!vma || addr < vma->vm_start)
 		return -ENOMEM;
-	mincore_walk.mm = vma->vm_mm;
 	end = min(vma->vm_end, addr + (pages << PAGE_SHIFT));
+	if (!can_do_mincore(vma)) {
+		unsigned long pages = (end - addr) >> PAGE_SHIFT;
+		memset(vec, 1, pages);
+		return pages;
+	}
+	mincore_walk.mm = vma->vm_mm;
 	err = walk_page_range(addr, end, &mincore_walk);
 	if (err < 0)
 		return err;