Re: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory pin

From: Zhou Wang
Date: Tue Feb 09 2021 - 07:21:15 EST


On 2021/2/9 20:01, Greg KH wrote:
> On Tue, Feb 09, 2021 at 07:58:15PM +0800, Zhou Wang wrote:
>> On 2021/2/9 17:37, Greg KH wrote:
>>> On Tue, Feb 09, 2021 at 05:17:46PM +0800, Zhou Wang wrote:
>>>> On 2021/2/8 6:02, Andy Lutomirski wrote:
>>>>>
>>>>>
>>>>>> On Feb 7, 2021, at 12:31 AM, Zhou Wang <wangzhou1@xxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> SVA (shared virtual addressing) offers a way for a device to share a
>>>>>> process's virtual address space safely, which makes user space device
>>>>>> driver coding more convenient. However, IO page faults may happen during
>>>>>> DMA operations. As the latency of an IO page fault is relatively high,
>>>>>> DMA performance is severely affected when IO page faults occur. In the
>>>>>> long run, DMA performance will not be stable.
>>>>>>
>>>>>> In high-performance I/O cases, accelerators might want to perform
>>>>>> I/O on memory without IO page faults, which can dramatically increase
>>>>>> latency. Current memory-related APIs cannot meet this requirement:
>>>>>> e.g. mlock only prevents memory from being swapped out to a backing
>>>>>> device, and page migration can still trigger IO page faults.
>>>>>>
>>>>>> Various drivers working in traditional non-SVA mode use their own
>>>>>> specific ioctls to pin memory. Such ioctls can be seen in v4l2,
>>>>>> gpu, infiniband, media, vfio, etc. Drivers usually do DMA mapping
>>>>>> while pinning.
>>>>>>
>>>>>> But in SVA mode, pinning could be a common need that isn't necessarily
>>>>>> bound to any driver, and drivers don't need DMA mapping either, since
>>>>>> devices use the CPU's virtual addresses. Thus, it is better to
>>>>>> introduce a new common syscall for it.
>>>>>>
>>>>>> This patch leverages the design of userfaultfd and adds mempinfd for
>>>>>> pinning, to avoid messing up mm_struct. mempinfd returns an fd; user
>>>>>> space can then pin/unpin pages through ioctls on this fd, and all pages
>>>>>> pinned under one file are unpinned when the file is released. As in
>>>>>> other page-pinning code, can_do_mlock is used to check permission and
>>>>>> input parameters.
>>>>>
>>>>>
>>>>> Can you document what the syscall does?
>>>>
>>>> Will add a related document in Documentation/vm.
>>>
>>> A manpage is always good, and will be required eventually :)
>>
>> The manpages are maintained in another repo. Do you mean adding a manpage
>> patch to this series?
>
> It's good to show how it will be used, don't you think?

Agreed, will add it in the next version.

Thanks,
Zhou

>
> thanks,
>
> greg k-h