Re: [PATCH v3 1/2] mm/process_vm_access: pidfd and nowait support for process_vm_readv/writev
From: Alban Crequy
Date: Thu May 14 2026 - 10:43:47 EST
Hi David,
(Apologies for the earlier HTML email â?? resending in plain text.)
Thanks for the review. I'm addressing all your comments in v4.
> What is the expected return value to user space if we run into this case?
If the first page is not resident and PROCESS_VM_NOWAIT is set, it returns -1
with errno EFAULT.
Note that Sashiko suggested to return errno EAGAIN to distinguish
invalid addresses from unpaged memory. I haven't done that to avoid
adding more code tweaking errnos; let me know if you would like me to do
that.
https://sashiko.dev/#/patchset/20260428122826.339550-1-alban.crequy%40gmail.com
It can also return a short read, meaning it returns the number of bytes
successfully transferred from pages prior to the fault.
This is the same partial-read behavior as a regular
process_vm_readv when hitting an unmapped page â?? NOWAIT just changes
when the fault fails (immediately instead of blocking).
I've documented this in the updated commit message in v4 (I will send it
shortly). I also added two selftests that verify partial reads across a
resident and a non-resident page (with both single iovec and two iovecs).
Interestingly, while writing these tests, I found that the current
process_vm_readv(2) man page incorrectly claims that partial transfers
apply at iovec-element granularity. In practice, the kernel returns
partial reads at page granularity within a single remote iovec element.
I've sent a separate fix for that to linux-man@:
https://lore.kernel.org/linux-man/20260514083659.139971-1-alban.crequy@xxxxxxxxx/T/#u
> And in the same context: Will you send a man page update? :)
I've included a suggested man page update in the commit message of v4 patch
1/2, ready for the man-pages maintainer to pick up. I'll also send a separate
patch to linux-man@ once the kernel patches are merged and available in a new
Linux release.
> We try to sort this alphabetically.
Fixed in v4 â?? moved the include/uapi line to the correct position.
> Thinking out loud: the c file is called "process_vm_access.c", should
> we name the header like that as well?
Good idea â?? renamed to include/uapi/linux/process_vm_access.h in v4.
> Should we use BIT here?
As noted in the v2/v3 changelog: BIT() is defined in vdso/bits.h which is not
exported to userspace. UAPI headers using BIT() (like tcp.h) work in-kernel but
break for userspace programs that include these headers directly. In practice,
BIT() in tcp.h is only used by TCP_ACCECN_* flags, so merely including tcp.h in
userspace programs won't break them unless they use those flags. Those flags
were moved from kernel-only headers to UAPI headers in commit 4fa4ac5e5848
("tcp: accecn: add tcpi_ecn_mode and tcpi_option2 in tcp_info"), which appeared
in Linux v7.0-rc1 (not yet in a release). So userspace programs don't yet use
those macros.
So I think (1UL << N) is the correct choice for UAPI.
> This could return -EBADF or -ESRCH. We should document both in the
> man page. (or decide to always return -ESRCH, dunno)
Agreed with Christian's reply â?? keeping the actual errno from
pidfd_get_task() is useful information for userspace. Both EBADF
and ESRCH are documented in the suggested man page update text in v4.
Thanks,
Alban