Re: [RFC PATCH v2 0/5] Reduce NUMA balance caused TLB-shootdowns in a VM

From: John Hubbard
Date: Thu Aug 17 2023 - 22:30:08 EST


On 8/17/23 17:13, Yan Zhao wrote:
> ...
> But consider the GPU case that John mentioned: since the memory is
> not even pinned, maybe it still needs a VM_NO_NUMA_BALANCING flag?
> For VMs, we hint VM_NO_NUMA_BALANCING for passthrough devices supporting
> IO page fault (so no need to pin), and VM_MAYLONGTERMDMA to avoid
> misplacement and migration.
>
> Is that good?
> Or do you think just a per-mm flag like MMF_NO_NUMA is good enough for
> now?


So far, a per-mm setting seems like it would suffice. However, it is
also true that new hardware is getting really creative and large, to
the point that it's not inconceivable that a process might actually
want to let NUMA balancing run in part of its mm, but turn it off
to allow fault-able device access to another part of the mm.

We aren't seeing that yet, but on the other hand, that may be
simply because there is no practical way to set that up and see
how well it works.
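
To make the two granularities concrete, here is a minimal userspace
sketch of how a per-mm opt-out and a per-range opt-out could combine.
It is not kernel code: the struct names and the helper
numa_scan_allowed() are hypothetical, and the flag bit values are
placeholders chosen for illustration. Only the flag names
MMF_NO_NUMA and VM_NO_NUMA_BALANCING come from this thread.

```c
#include <stdbool.h>

/* Placeholder bit values; the real definitions (if any) would differ. */
#define MMF_NO_NUMA           (1UL << 0) /* per-mm: whole address space opts out */
#define VM_NO_NUMA_BALANCING  (1UL << 0) /* per-VMA: only this range opts out */

/* Hypothetical stand-in for the per-process address space. */
struct mm_model {
	unsigned long flags;
};

/* Hypothetical stand-in for one mapped range within that address space. */
struct vma_model {
	unsigned long start;
	unsigned long end;
	unsigned long vm_flags;
	const struct mm_model *mm;
};

/*
 * Return true if NUMA balancing may scan this range.
 * The per-mm flag disables every range; the per-VMA flag disables
 * only the range that carries it.
 */
bool numa_scan_allowed(const struct vma_model *vma)
{
	if (vma->mm->flags & MMF_NO_NUMA)
		return false;
	if (vma->vm_flags & VM_NO_NUMA_BALANCING)
		return false;
	return true;
}
```

With only the per-mm flag, the second test would not exist, and a
process could not keep balancing enabled for its ordinary memory while
disabling it for a device-faultable range.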


thanks,
--
John Hubbard
NVIDIA