Re: [PATCH 2/2] time/namespace: Forbid timens page faults under kthread_use_mm()

From: Jann Horn
Date: Thu Dec 01 2022 - 04:32:29 EST


On Wed, Nov 30, 2022 at 11:48 PM David Laight <David.Laight@xxxxxxxxxx> wrote:
> From: Thomas Gleixner
> > Sent: 30 November 2022 00:08
> ....
> > >> None of those VDSO (user space) addresses are subject to be faulted in
> > >> by anything else than the associated user space task(s).
> > >
> > > Are you saying that it's not possible or that it doesn't happen when
> > > userspace is well-behaved?
> >
> > My subconscious self told me that a kthread won't do that unless it's
> > buggered which makes the vdso fault path the least of our problems, but
> > thinking more about it: You are right, that there are ways that the
> > kthread ends up with a vdso page address.... Bah!
> >
> > Still my point stands that this is not a timens VDSO issue, but an issue
> > of: kthread tries to fault in a VDSO page of whatever nature.
>
> Isn't there also the kernel code path where one user thread
> reads data from another process's address space?
> (It does some unusual calls to the iov_import() functions.)
> I can't remember whether it is used by strace or gdb.
> But there is certainly the option of getting to access
> an 'invalid' address in the other process and then faulting.

That's a different mechanism. /proc/$pid/mem and process_vm_readv()
and PTRACE_PEEKDATA and so on go through get_user_pages_remote() or
pin_user_pages_remote(), which bail out on VMAs with VM_IO or
VM_PFNMAP. The ptrace-based access can also fall back to using
vma->vm_ops->access(), but the special_mapping_vmops used by the vvar
VMA explicitly don't have such a handler:

static const struct vm_operations_struct special_mapping_vmops = {
	.close = special_mapping_close,
	.fault = special_mapping_fault,
	.mremap = special_mapping_mremap,
	.name = special_mapping_name,
	/* vDSO code relies that VVAR can't be accessed remotely */
	.access = NULL,
	.may_split = special_mapping_split,
};
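
To make the two-stage decision above concrete, here is a rough userspace
model of it. This is only a sketch, not the kernel code: the toy_* names
are hypothetical, the flag values are copied from include/linux/mm.h, and
it assumes the vvar VMA carries VM_IO and VM_PFNMAP (as it does on x86).

```c
#include <errno.h>
#include <stddef.h>

/* Flag values as defined in include/linux/mm.h. */
#define VM_PFNMAP 0x00000400UL
#define VM_IO     0x00004000UL

struct toy_vma;

/* Hypothetical stand-in for vm_operations_struct, reduced to ->access. */
struct toy_vm_ops {
	int (*access)(struct toy_vma *vma, unsigned long addr,
		      void *buf, int len, int write);
};

/* Hypothetical stand-in for vm_area_struct. */
struct toy_vma {
	unsigned long vm_flags;
	const struct toy_vm_ops *vm_ops;
};

/* Models the GUP-side check: special mappings are refused outright. */
static int toy_gup_check(const struct toy_vma *vma)
{
	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
		return -EFAULT;
	return 0;
}

/*
 * Models a ptrace-style remote read: try GUP first, then fall back to
 * ->access() if the VMA provides one. Returns the number of bytes that
 * would be transferred (0 if the access is refused).
 */
static int toy_remote_read(struct toy_vma *vma, unsigned long addr,
			   void *buf, int len)
{
	if (toy_gup_check(vma) == 0)
		return len;	/* pretend the pinned pages were copied */
	if (vma->vm_ops && vma->vm_ops->access)
		return vma->vm_ops->access(vma, addr, buf, len, 0);
	return 0;		/* no handler: nothing is transferred */
}
```

With .access = NULL and VM_IO | VM_PFNMAP set, both stages refuse, so a
remote read of the modelled vvar VMA transfers zero bytes without ever
reaching a fault handler.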

One path that I'm not sure about is the Intel i915 GPU virtualization
codepath ppgtt_populate_shadow_entry -> intel_gvt_dma_map_guest_page
-> gvt_dma_map_page -> gvt_pin_guest_page -> vfio_pin_pages ->
vfio_iommu_type1_pin_pages -> vfio_pin_page_external -> vaddr_get_pfns
-> follow_fault_pfn -> fixup_user_fault -> handle_mm_fault. That looks
like it might actually be able to trigger pagefault handling on the
vvar mapping from another process.

> ISTR not being convinced that there was a correct check
> for user/kernel addresses in it either.

The get_user_pages_remote() machinery only works on areas that are
mapped by VMAs (__get_user_pages() bails out if find_extend_vma()
fails and the address is not located in the gate area). There are no
VMAs for kernel memory.
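
As a rough illustration of that lookup, here is a simplified userspace
model. It is a sketch under assumptions: the toy_* names and the example
addresses are made up, the VMA list is a plain sorted array, and the
stack-expansion part of find_extend_vma() is left out entirely.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VMA: just the [vm_start, vm_end) range. */
struct toy_range_vma {
	uint64_t vm_start;
	uint64_t vm_end;
};

/*
 * Models find_vma(): return the first VMA that ends above addr, or NULL.
 * The array is assumed to be sorted by address.
 */
static const struct toy_range_vma *
toy_find_vma(const struct toy_range_vma *vmas, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < vmas[i].vm_end)
			return &vmas[i];
	return NULL;
}

/*
 * Models the start of the GUP lookup: the address must be covered by a
 * VMA or lie in the gate area, otherwise the walk bails out.
 */
static int toy_gup_lookup(const struct toy_range_vma *vmas, size_t n,
			  uint64_t addr, int in_gate_area)
{
	const struct toy_range_vma *vma = toy_find_vma(vmas, n, addr);

	if (vma && vma->vm_start <= addr)
		return 0;
	if (in_gate_area)
		return 0;
	return -EFAULT;
}
```

A kernel address lies above every userspace VMA in this model, so the
lookup returns NULL and the walk fails with -EFAULT before any page
table is touched.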