Re: [PATCH] iommu: iommufd: Explicitly check for VM_PFNMAP in iommufd_ioas_map

From: Shuai Xue

Date: Wed Oct 29 2025 - 10:44:40 EST




On 2025/10/29 21:34, Jason Gunthorpe wrote:
On Wed, Oct 29, 2025 at 08:52:26PM +0800, Shuai Xue wrote:
The iommufd_ioas_map function currently returns -EFAULT when attempting
to map VM_PFNMAP VMAs because pin_user_pages_fast() cannot handle such
mappings. This error code is misleading and does not accurately reflect
the nature of the failure.

Hi, Jason,


Sure, but why do you care? Userspace should know not to do this based
on how it created the mmaps, not rely on errnos to figure it out after
the fact.

We run different VMMs (QEMU, Kata Containers) to meet diverse business
requirements, while our production environment deploys various evolving
kernel versions. Additionally, we are migrating from VFIO Type 1 to
IOMMUFD. Although IOMMUFD claims to provide compatible
iommufd_vfio_ioctl APIs, these APIs are not fully compatible in
practice. For example, with VFIO_IOMMU_MAP_DMA, iommufd_vfio_map_dma
doesn't support MMIO mapping, and we can only rely on the implicit
EFAULT error from pin_user_pages_fast(). (I initially considered adding
explicit checks in iommufd_vfio_map_dma, but I noticed you plan to add
dma_buf support there.)

While we certainly aim for a seamless migration from VFIO Type 1 to
IOMMUFD, as you know, this isn't always feasible.

For GPU-related issues encountered in production, the debugging path is
quite long - from business teams to virtualization teams, and finally to
our kernel team.

Therefore, having explicit checks with deterministic error codes
returned to userspace would be greatly appreciated.


+static bool iommufd_check_vm_pfnmap(unsigned long vaddr)
+{
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
+	bool ret = false;
+
+	mmap_read_lock(mm);
+	vaddr = untagged_addr_remote(mm, vaddr);
+	vma = vma_lookup(mm, vaddr);
+	if (vma && vma->vm_flags & VM_PFNMAP)
+		ret = true;
+	mmap_read_unlock(mm);
+	return ret;
+}

This isn't really sufficient: the range can span multiple VMAs and you
can hit special PTEs in PFNMAPs, or you can hit P2P struct pages in
fully normal VMAs.

I think if you really want this errno distinction it should come from
pin_user_pages() directly as only it knows the reason it didn't work.


Aha, I see. Thank you for pointing out this issue. The check indeed
needs to be more comprehensive. Would you mind if we used
pin_user_pages() as a precheck?
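
For reference, a range-wide version of the check would have to walk
every VMA covering the request, roughly like the following. This is an
untested sketch: the helper name and the -EOPNOTSUPP choice are
placeholders, and it still cannot detect the special-PTE and P2P
struct-page cases you mention, which is why deriving the errno from
pin_user_pages() itself may be the only complete answer.

```c
#include <linux/errno.h>
#include <linux/mm.h>
#include <linux/sched.h>

/*
 * Untested sketch: decide which errno to report for a user range that
 * pin_user_pages_fast() refused.  Walks every VMA overlapping
 * [vaddr, vaddr + length) and reports -EOPNOTSUPP if any of them is a
 * VM_PFNMAP mapping, otherwise the generic -EFAULT.  Note this still
 * misses special PTEs and P2P pages inside otherwise-normal VMAs.
 */
static int iommufd_map_errno(unsigned long vaddr, unsigned long length)
{
	struct mm_struct *mm = current->mm;
	struct vm_area_struct *vma;
	unsigned long end;
	int ret = -EFAULT;

	mmap_read_lock(mm);
	vaddr = untagged_addr_remote(mm, vaddr);
	end = vaddr + length;

	/* vma_lookup() finds the VMA containing vaddr (or NULL);
	 * find_vma() then steps to the next VMA above vm_end. */
	for (vma = vma_lookup(mm, vaddr); vma && vma->vm_start < end;
	     vma = find_vma(mm, vma->vm_end)) {
		if (vma->vm_flags & VM_PFNMAP) {
			ret = -EOPNOTSUPP;
			break;
		}
	}
	mmap_read_unlock(mm);
	return ret;
}
```
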

Thanks for the quick reply.

Best Regards,
Shuai