Re: [v2 RFC PATCH 0/9] Another Approach to Use PMEM as NUMA Node

From: Dave Hansen
Date: Tue Apr 16 2019 - 10:30:24 EST


On 4/16/19 12:47 AM, Michal Hocko wrote:
> You definitely have to follow policy. You cannot demote to a node which
> is outside of the cpuset/mempolicy because you are breaking contract
> expected by the userspace. That implies doing a rmap walk.

What *is* the contract with userspace, anyway? :)

Obviously, the preferred policy doesn't have any strict contract.

The strict binding has a bit more of a contract, but it doesn't prevent
swapping. Strict binding also doesn't keep another app from moving the
memory.

We have a reasonable argument that demotion is better than swapping.
So, we could say that even if a VMA has a strict NUMA policy, demoting
the pages mapped there still beats swapping them or tossing the page
cache. It's doing them a favor to demote them.

Or, maybe we just need a swap hybrid where demotion moves the page but
keeps it unmapped and in the swap cache. That way an access gets a
fault and we can promote the page back to where it should be. That
would be faster than I/O-based swap for sure.
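
To make the hybrid idea a bit more concrete, here is a rough userspace toy
model of the state transitions. It is only a sketch, not kernel code: every
name in it (toy_page, toy_demote, toy_access, the two-node enum) is made up
for illustration, and it assumes a single DRAM node and a single PMEM node.
Demotion moves the page, drops the mapping and leaves the page in the swap
cache; the next access faults, finds the page there and promotes it back to
the node the policy asked for, all without any I/O.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the "swap hybrid"; none of this is kernel API. */
enum toy_node { TOY_NODE_DRAM = 0, TOY_NODE_PMEM = 1 };

struct toy_page {
	enum toy_node node;	/* node currently backing the page */
	enum toy_node policy;	/* node the strict policy asks for */
	bool mapped;		/* is the page mapped into the page tables? */
	bool in_swap_cache;	/* tracked in the swap cache while unmapped */
};

/* Demote: move to PMEM, unmap, keep the page in the swap cache. No I/O. */
static void toy_demote(struct toy_page *p)
{
	assert(p->mapped);
	p->node = TOY_NODE_PMEM;
	p->mapped = false;
	p->in_swap_cache = true;
}

/*
 * Access: returns true if the access faulted. On a fault the page is
 * found in the swap cache, promoted back to its policy node and remapped.
 */
static bool toy_access(struct toy_page *p)
{
	if (p->mapped)
		return false;

	assert(p->in_swap_cache);
	p->node = p->policy;
	p->in_swap_cache = false;
	p->mapped = true;
	return true;
}
```

In this model the policy is only violated while the page is unmapped, which
is the same window in which a swapped-out page would not be on the policy
node either.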

Anyway, I agree that the kernel probably shouldn't be moving pages
around willy-nilly with no consideration for memory policies, but users
might give us some wiggle room too.