RE: [PATCH] vfio iommu type1: Bypass the vma permission check in vfio_pin_pages_remote()
From: Justin He
Date: Tue Nov 24 2020 - 20:06:00 EST
Hi Peter
> -----Original Message-----
> From: Peter Xu <peterx@xxxxxxxxxx>
> Sent: Wednesday, November 25, 2020 2:12 AM
> To: Justin He <Justin.He@xxxxxxx>
> Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>; Cornelia Huck
> <cohuck@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] vfio iommu type1: Bypass the vma permission check in
> vfio_pin_pages_remote()
>
> Hi, Jia,
>
> On Thu, Nov 19, 2020 at 10:27:37PM +0800, Jia He wrote:
> > The permission of vfio iommu is different and incompatible with vma
> > permission. If the iotlb->perm is IOMMU_NONE (e.g. qemu side), qemu will
> > simply call unmap ioctl() instead of mapping. Hence vfio_dma_map() can't
> > map a dma region with NONE permission.
> >
> > This corner case will be exposed in coming virtio_fs cache_size
> > commit [1]
> > - mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> > memory_region_init_ram_ptr()
> > - re-mmap the above area with read/write authority.
>
> If iiuc here we'll remap the above PROT_NONE into PROT_READ|PROT_WRITE,
> then...
>
> > - vfio_dma_map() will be invoked when vfio device is hotplug added.
>
> ... here I'm slightly confused on why VFIO_IOMMU_MAP_DMA would encounter
> vma check fail - aren't they already get rw permissions?
No, the vma does not have rw permission yet, but the iommu permission in
this case defaults to rw.
When the qemu side invokes vfio_dma_map(), the iommu rw permission is added
automatically [1] [2] (mapping a NONE region is currently not supported in
qemu vfio).
[1] https://git.qemu.org/?p=qemu.git;a=blob;f=hw/vfio/common.c;h=6ff1daa763f87a1ed5351bcc19aeb027c43b8a8f;hb=HEAD#l479
[2] https://git.qemu.org/?p=qemu.git;a=blob;f=hw/vfio/common.c;h=6ff1daa763f87a1ed5351bcc19aeb027c43b8a8f;hb=HEAD#l486
But on the kernel side, the vma was created with PROT_NONE, so the check in
check_vma_flags() at [3] fails.
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/gup.c#n929
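To illustrate the mismatch, here is a simplified user-space model of the kind of test check_vma_flags() performs. It is only a sketch: the function name sketch_check_vma() and the SK_* flag values are made up for this example and are not the kernel's definitions, and the real function has additional conditions (for instance around FOLL_FORCE) that are omitted here.

```c
#include <errno.h>

/* Illustrative flag values for this sketch only; NOT the kernel's definitions. */
enum { SK_VM_READ = 1 << 0, SK_VM_WRITE = 1 << 1 };
enum { SK_FOLL_WRITE = 1 << 0, SK_FOLL_FORCE = 1 << 1 };

/*
 * Hypothetical, reduced model of the vma permission test done during
 * page pinning: a write request needs a writable vma, a read request
 * needs a readable vma, unless the caller asked to force the access.
 */
int sketch_check_vma(unsigned vm_flags, unsigned gup_flags)
{
    if (gup_flags & SK_FOLL_WRITE) {
        if (!(vm_flags & SK_VM_WRITE) && !(gup_flags & SK_FOLL_FORCE))
            return -EFAULT;
    } else if (!(vm_flags & SK_VM_READ) && !(gup_flags & SK_FOLL_FORCE)) {
        return -EFAULT;
    }
    return 0;
}
```

In this model a PROT_NONE vma corresponds to vm_flags == 0, so a pin request carrying SK_FOLL_WRITE (the iommu rw case above) returns -EFAULT, while the same request against a vma with SK_VM_READ | SK_VM_WRITE succeeds.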
>
> I'd appreciate if you could explain why vfio needs to dma map some PROT_NONE
Virtiofs first maps a PROT_NONE cache window region, then remaps sub-regions
of that cache window with read or write permission. I guess this is done out
of a security concern. I have CCed the virtiofs expert Stefan, who can answer
this more accurately.
--
Cheers,
Justin (Jia He)
> pages after all, and whether QEMU would be able to postpone the vfio map of
> those PROT_NONE pages until they got to become with RW permissions.
>
> Thanks,
>
> --
> Peter Xu