Re: [RFC PATCH 3/3] mm/migrate: Create move_phys_pages syscall

From: Arnd Bergmann
Date: Sat Sep 09 2023 - 04:08:02 EST


On Thu, Sep 7, 2023, at 09:54, Gregory Price wrote:
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl
> b/arch/x86/entry/syscalls/syscall_32.tbl
> index 2d0b1bd866ea..25db6d71af0c 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -457,3 +457,4 @@
> 450 i386 set_mempolicy_home_node sys_set_mempolicy_home_node
> 451 i386 cachestat sys_cachestat
> 452 i386 fchmodat2 sys_fchmodat2
> +454 i386 move_phys_pages sys_move_phys_pages
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl
> b/arch/x86/entry/syscalls/syscall_64.tbl
> index 1d6eee30eceb..9676f2e7698c 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -375,6 +375,7 @@
> 451 common cachestat sys_cachestat
> 452 common fchmodat2 sys_fchmodat2
> 453 64 map_shadow_stack sys_map_shadow_stack
> +454 common move_phys_pages sys_move_phys_pages

Doing only x86 is fine for review purposes, but note that
once there is consensus on actually merging it and the syscall
number, you should do the same for all architectures. I think
most commonly we do one patch to add the code and another
patch to hook it up to all the syscall.tbl files and the
include/uapi/asm-generic/unistd.h file.

> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 22bc6bc147f8..6860675a942f 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -821,6 +821,11 @@ asmlinkage long sys_move_pages(pid_t pid, unsigned
> long nr_pages,
> const int __user *nodes,
> int __user *status,
> int flags);
> +asmlinkage long sys_move_phys_pages(unsigned long nr_pages,
> + const void __user * __user *pages,
> + const int __user *nodes,
> + int __user *status,
> + int flags);

The prototype looks good from a portability point of view,
i.e. I made sure this is portable across CONFIG_COMPAT
architectures etc.

What I'm not sure about is whether the representation of physical
memory pages as a 'void *' is a good choice, I can see potential
problems with this:

- it's not really a pointer, but instead a shifted PFN number
in the implementation

- physical addresses may be wider than pointers on 32-bit
architectures with CONFIG_PHYS_ADDR_T_64BIT

I'm also not sure where the user space caller gets the
physical addresses from, are those not intentionally kept
hidden from userspace?

Arnd