Re: [PATCH v4 0/5] userfaultfd: add /dev/userfaultfd for fine grained access control

From: Axel Rasmussen
Date: Wed Jul 20 2022 - 19:05:28 EST


On Wed, Jul 20, 2022 at 3:16 PM Schaufler, Casey
<casey.schaufler@xxxxxxxxx> wrote:
>
> > -----Original Message-----
> > From: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>
> > Sent: Tuesday, July 19, 2022 12:56 PM
> > To: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>; Andrew Morton
> > <akpm@xxxxxxxxxxxxxxxxxxxx>; Dave Hansen
> > <dave.hansen@xxxxxxxxxxxxxxx>; Dmitry V . Levin <ldv@xxxxxxxxxxxx>; Gleb
> > Fotengauer-Malinovskiy <glebfm@xxxxxxxxxxxx>; Hugh Dickins
> > <hughd@xxxxxxxxxx>; Jan Kara <jack@xxxxxxx>; Jonathan Corbet
> > <corbet@xxxxxxx>; Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>; Mike
> > Kravetz <mike.kravetz@xxxxxxxxxx>; Mike Rapoport <rppt@xxxxxxxxxx>;
> > Amit, Nadav <namit@xxxxxxxxxx>; Peter Xu <peterx@xxxxxxxxxx>;
> > Shuah Khan <shuah@xxxxxxxxxx>; Suren Baghdasaryan
> > <surenb@xxxxxxxxxx>; Vlastimil Babka <vbabka@xxxxxxx>; zhangyi
> > <yi.zhang@xxxxxxxxxx>
> > Cc: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>;
> > linux-doc@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> > linux-kselftest@xxxxxxxxxxxxxxx
> > Subject: [PATCH v4 0/5] userfaultfd: add /dev/userfaultfd for fine grained
> > access control
>
> I assume that leaving the LSM mailing list off of the CC is purely
> accidental. Please, please include us in the next round.

Honestly it just hadn't occurred to me, but I'm more than happy to CC
it on future revisions.

>
> >
> > This series is based on torvalds/master.
> >
> > The series is split up like so:
> > - Patch 1 is a simple fixup which we should take in any case (even by itself).
> > - Patches 2-5 add the feature, configurable selftest support, and docs.
> >
> > Why not ...?
> > ============
> >
> > - Why not /proc/[pid]/userfaultfd? The proposed use case for this is for one
> > process to open a userfaultfd which can intercept another process' page
> > faults. This seems to me like exactly what CAP_SYS_PTRACE is for, though,
> > so I think this use case can simply use a syscall without the powers
> > CAP_SYS_PTRACE grants being "too much".
> >
> > - Why not use a syscall? Access to syscalls is generally controlled by
> > capabilities. We don't have a capability which is used for userfaultfd access
> > without also granting more / other permissions as well, and adding a new
> > capability was rejected [1].
> >
> > - It's possible an LSM could be used to control access instead. I suspect
> > adding a brand new one just for this would be rejected,
>
> You won't know if you don't ask.

Fair enough - I wonder if MM folks (Andrew, Peter, Nadav especially)
would find that approach more palatable than /proc/[pid]/userfaultfd?
Would it make sense from our perspective to propose a userfaultfd- or
MM-specific LSM for controlling access to certain features?

I remember +Andrea saying Red Hat was also interested in some kind of
access control mechanism like this. Would one or the other approach be
more convenient for you?

>
> > but I think some
> > existing ones like SELinux can be used to filter syscall access. Enabling
> > SELinux for large production deployments which don't already use it is
> > likely to be a huge undertaking though, and I don't think this use case by
> > itself is enough to motivate that kind of architectural change.
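
For anyone who wants the access-control difference above in concrete terms,
here is a minimal sketch (not code from the series) of how userspace might
obtain a userfaultfd through the device node, where ordinary file permissions
on /dev/userfaultfd decide who gets access, and fall back to the syscall
otherwise. The helper name uffd_open() is hypothetical, and the fallback
#defines assume the ioctl magic and number match what
include/uapi/linux/userfaultfd.h provides.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/ioctl.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Fallbacks for kernel headers that predate /dev/userfaultfd.
 * Assumption: these values match include/uapi/linux/userfaultfd.h.
 */
#ifndef USERFAULTFD_IOC
#define USERFAULTFD_IOC 0xAA
#endif
#ifndef USERFAULTFD_IOC_NEW
#define USERFAULTFD_IOC_NEW _IO(USERFAULTFD_IOC, 0x00)
#endif

/*
 * Hypothetical helper: returns a new userfaultfd, or -1 on failure.
 * 'flags' takes the same values as userfaultfd(2), e.g. O_CLOEXEC | O_NONBLOCK.
 */
static int uffd_open(int flags)
{
	/* Path 1: device node. Access is gated by the node's file permissions. */
	int dev = open("/dev/userfaultfd", O_RDWR | O_CLOEXEC);

	if (dev >= 0) {
		/* The flags are passed by value, not through a pointer. */
		int uffd = ioctl(dev, USERFAULTFD_IOC_NEW, flags);

		close(dev);
		if (uffd >= 0)
			return uffd;
	}

#ifdef SYS_userfaultfd
	/* Path 2: the syscall, subject to the existing capability/sysctl checks. */
	return (int)syscall(SYS_userfaultfd, flags);
#else
	return -1;
#endif
}
```

A caller would do something like "int uffd = uffd_open(O_CLOEXEC |
O_NONBLOCK);" and treat a negative return as "no userfaultfd available to this
process". Trying the device node first is just one possible policy; the
selftest in this series instead selects one path or the other explicitly.
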
> >
> > Changelog
> > =========
> >
> > v3->v4:
> > - Picked up an Acked-by on 5/5.
> > - Updated cover letter to cover "why not ...".
> > - Refactored userfaultfd_allowed() into userfaultfd_syscall_allowed(). [Peter]
> > - Removed obsolete comment from a previous version. [Peter]
> > - Refactored userfaultfd_open() in selftest. [Peter]
> > - Reworded admin-guide documentation. [Mike, Peter]
> > - Squashed 2 commits adding /dev/userfaultfd to selftest and making
> > selftest configurable. [Peter]
> > - Added "syscall" test modifier (the default behavior) to selftest. [Peter]
> >
> > v2->v3:
> > - Rebased onto linux-next/akpm-base, in order to be based on top of the
> > run_vmtests.sh refactor which was merged previously.
> > - Picked up some Reviewed-by's.
> > - Fixed ioctl definition (_IO instead of _IOWR), and stopped using
> > compat_ptr_ioctl since it is unneeded for ioctls which don't take a pointer.
> > - Removed the "handle_kernel_faults" bool, simplifying the code. The result
> > is logically equivalent, but simpler.
> > - Fixed userfaultfd selftest so it returns KSFT_SKIP appropriately.
> > - Reworded documentation per Shuah's feedback on v2.
> > - Improved example usage for userfaultfd selftest.
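
As a small illustration of the _IO vs. _IOWR item above (a sketch only; the
EXAMPLE_* names and struct example_args are made up for this purpose): _IO
encodes no transfer direction and no argument size, whereas _IOWR encodes both
directions plus the size of a pointed-to type. An ioctl whose argument is a
plain integer rather than a pointer therefore fits _IO.

```c
#include <linux/ioctl.h>
#include <stdio.h>

/* Hypothetical argument struct, only used to show what _IOWR encodes. */
struct example_args {
	unsigned long long flags;
};

#define EXAMPLE_IOC          0xAA /* illustrative magic number */
#define EXAMPLE_IOC_NEW_INT  _IO(EXAMPLE_IOC, 0x00)
#define EXAMPLE_IOC_NEW_PTR  _IOWR(EXAMPLE_IOC, 0x00, struct example_args)

/* Print the direction bits and argument size packed into an ioctl number. */
static void describe_ioctl(const char *name, unsigned int cmd)
{
	printf("%s: dir=%u size=%u nr=%u\n", name,
	       (unsigned int)_IOC_DIR(cmd),
	       (unsigned int)_IOC_SIZE(cmd),
	       (unsigned int)_IOC_NR(cmd));
}
```

Calling describe_ioctl() on both macros shows the difference in the encoded
direction and size fields, even though the magic and command number are the
same.
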
> >
> > v1->v2:
> > - Add documentation update.
> > - Test *both* userfaultfd(2) and /dev/userfaultfd via the selftest.
> >
> > [1]: https://lore.kernel.org/lkml/686276b9-4530-2045-6bd8-170e5943abe4@xxxxxxxxxxxxxxxx/T/
> >
> > Axel Rasmussen (5):
> >   selftests: vm: add hugetlb_shared userfaultfd test to run_vmtests.sh
> >   userfaultfd: add /dev/userfaultfd for fine grained access control
> >   userfaultfd: selftests: modify selftest to use /dev/userfaultfd
> >   userfaultfd: update documentation to describe /dev/userfaultfd
> >   selftests: vm: add /dev/userfaultfd test cases to run_vmtests.sh
> >
> >  Documentation/admin-guide/mm/userfaultfd.rst | 41 +++++++++++-
> >  Documentation/admin-guide/sysctl/vm.rst      |  3 +
> >  fs/userfaultfd.c                             | 69 ++++++++++++++++----
> >  include/uapi/linux/userfaultfd.h             |  4 ++
> >  tools/testing/selftests/vm/run_vmtests.sh    | 11 +++-
> >  tools/testing/selftests/vm/userfaultfd.c     | 69 +++++++++++++++++---
> >  6 files changed, 169 insertions(+), 28 deletions(-)
> >
> > --
> > 2.37.0.170.g444d1eabd0-goog
>