Re: [RFC] mm: add new syscall pidfd_set_mempolicy()

From: Frank van der Linden
Date: Tue Oct 11 2022 - 13:22:42 EST


On Tue, Oct 11, 2022 at 8:00 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Mon 10-10-22 09:22:13, Frank van der Linden wrote:
> > For consistency with process_madvise(), I would suggest calling it
> > process_set_mempolicy.
>
> This operation has per-thread rather than per-process semantic so I do
> not think your proposed naming is better.

True. I suppose you could argue that it should have been
pidfd_madvise() then for consistency, but that ship has sailed.

>
> > Other than that, this makes sense. To complete
> > the set, perhaps a process_mbind() should be added as well. What do
> > you think?
>
> Is there any real usecase for this interface? How is the caller supposed
> to make per-range decisions without a very involved coordination with
> the target process?

The use case for a potential pidfd_mbind() is basically a combination
of what is described in the process_madvise() proposal (
https://lore.kernel.org/lkml/20200901000633.1920247-1-minchan@xxxxxxxxxx/
), and what this proposal describes: system management software acting
as an orchestrator that has a better overview of the system as a whole
(NUMA nodes, memory tiering), and has knowledge of the layout of the
processes involved.

pidfd_mbind() makes sense to me, since the notion of an external
agent with knowledge of the VM layout is already there with
process_madvise(). And since set_mempolicy() and mbind() are closely
related, it would seem logical to add an mbind variant as well as
pidfd_set_mempolicy().

Having said that, I'm fine with leaving that discussion for another time.

- Frank