RE: [RFC] /dev/ioasid uAPI proposal
From: Parav Pandit
Date: Wed Jun 02 2021 - 08:42:02 EST
> From: Enrico Weigelt, metux IT consult <lkml@xxxxxxxxx>
> Sent: Wednesday, June 2, 2021 2:09 PM
>
> On 31.05.21 19:37, Parav Pandit wrote:
>
> > It appears that this is only to make map ioctl faster apart from accounting.
> > It doesn't have any ioasid handle input either.
> >
> > In that case, can it be a new system call? Why does it have to be under
> > /dev/ioasid?
> > For example few years back such system call mpin() thought was proposed
> > in [1].
>
> I'm very reluctant to more syscall inflation. We already have lots of syscalls
> that could have been easily done via devices or filesystems (yes, some of
> them are just old Unix relics).
>
> Syscalls don't play well w/ modules, containers, distributed systems, etc, and
> need extra low-level code for most non-C languages (eg.
> scripting languages).
Likely, but as I understand it, this ioctl() is a wrapper around device-agnostic code such as:
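{
	/* account the pages as pinned against the mm (pinned_vm is an atomic64_t) */
	atomic64_add(npages, &mm->pinned_vm);
	/* take the pin on the user pages; arguments elided */
	pin_user_pages();
}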
The mm must also hold a reference to them, so that these pages cannot be munmap()ed or freed.
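
To make the accounting half of that wrapper concrete, here is a rough user-space sketch of the "account, then pin, undo on failure" pattern. It is not kernel code: struct toy_mm, toy_pin_pages() and the other toy_* names are hypothetical stand-ins for mm->pinned_vm and pin_user_pages(), and the real ioctl() may well do more than this.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the accounting field of struct mm_struct. */
struct toy_mm {
	atomic_long pinned_vm;	/* pages currently accounted as pinned */
};

/*
 * Hypothetical stand-in for pin_user_pages(): returns the number of
 * pages "pinned", or -1 when asked to pin zero pages.
 */
static long toy_pin_pages(size_t npages)
{
	return npages ? (long)npages : -1;
}

/* Account first, then pin; roll the accounting back if pinning fails. */
static bool toy_pin_and_account(struct toy_mm *mm, size_t npages)
{
	atomic_fetch_add(&mm->pinned_vm, (long)npages);
	if (toy_pin_pages(npages) != (long)npages) {
		atomic_fetch_sub(&mm->pinned_vm, (long)npages);
		return false;
	}
	return true;
}

/* Unpin path: drop the accounting taken by toy_pin_and_account(). */
static void toy_unpin_and_unaccount(struct toy_mm *mm, size_t npages)
{
	atomic_fetch_sub(&mm->pinned_vm, (long)npages);
}
```

Nothing in this sketch depends on a particular device or ioasid, which is the point of the question above: the counter belongs to the mm and the pin belongs to the pages.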
The second reason, I think (I could be wrong), is that the second-level page table for a PASID should be the same one that the process's CR3 uses.
Essentially, the IOMMU page table and the MMU page table should point to the same page table entries.
If they are different, then even if the guest CPU has already accessed the pages, device access via the IOMMU will result in expensive page faults.
So, assuming both CR3 and the PASID table entry point to the same page table, I fail to understand the need for an extra refcount and hence for a driver-specific ioctl().
That said, I do not have a strong objection to the ioctl(); I just want to know what it will and will not do.
io_uring has a similar operation: buffer registration via io_uring_register() ends up in io_sqe_buffer_register(), which pins the memory.