Re: [RFC PATCH 0/10] Another Approach to Use PMEM as NUMA Node

From: Michal Hocko
Date: Thu Mar 28 2019 - 16:40:26 EST


On Thu 28-03-19 12:40:14, Yang Shi wrote:
>
>
> On 3/28/19 12:12 PM, Michal Hocko wrote:
> > On Thu 28-03-19 11:58:57, Yang Shi wrote:
> > >
> > > On 3/27/19 11:58 PM, Michal Hocko wrote:
> > > > On Wed 27-03-19 19:09:10, Yang Shi wrote:
> > > > > > One question, when doing demote and promote we need to define a path, for
> > > > > example, DRAM <-> PMEM (assume two tier memory). When determining what nodes
> > > > > are "DRAM" nodes, does it make sense to assume the nodes with both cpu and
> > > > > memory are DRAM nodes since PMEM nodes are typically cpuless nodes?
> > > > Do we really have to special case this for PMEM? Why cannot we simply go
> > > > in the zonelist order? In other words why cannot we use the same logic
> > > > > for a larger NUMA machine and instead of swapping simply fall back to a
> > > > less contended NUMA node? It can be a regular DRAM, PMEM or whatever
> > > > other type of memory node.
> > > > Thanks for the suggestion. It makes sense. However, if we don't special-case a
> > > > PMEM node, its fallback node may be a DRAM node, and memory reclaim may then
> > > > move inactive pages to that DRAM node. That does not make much sense,
> > > > since memory reclaim would prefer to move downwards (DRAM -> PMEM -> Disk).
> > There are certainly many details to sort out. One thing is how to handle
> > cpuless nodes (e.g. PMEM). Those shouldn't get any direct allocations
> > without an explicit binding, right? My first naive idea would be to only
>
> Wait a minute. I thought we were arguing about the default allocation node
> mask yesterday. And the conclusion was that PMEM nodes should not be excluded
> from the node mask. PMEM nodes are cpuless nodes. I think I should replace all
> "PMEM node" with "cpuless node" in the cover letter and commit logs to make it
> explicit.

No, this is not about the default allocation mask at all. Your
allocations start from a local/mempolicy node. CPUless nodes thus cannot be
primary nodes, so without an explicit binding they will only ever appear in
a fallback zonelist.
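
The point above can be illustrated with a toy model. This is only a sketch,
not kernel code: the node ids, the distance table and the helper names
(fallback_order, primary_nodes, allocate) are all made up for illustration,
and it assumes a purely distance-sorted fallback order, which simplifies
what the kernel actually does when it builds zonelists.

```python
# Toy model of NUMA allocation fallback. Everything here is hypothetical:
# node ids, distances and function names are illustrative assumptions.

NODES = {
    0: {"has_cpu": True},   # DRAM node with CPUs
    1: {"has_cpu": True},   # DRAM node with CPUs
    2: {"has_cpu": False},  # cpuless node (e.g. PMEM)
    3: {"has_cpu": False},  # cpuless node (e.g. PMEM)
}

# Assumed SLIT-like distance table: DISTANCE[a][b] is the cost from a to b.
DISTANCE = {
    0: {0: 10, 1: 21, 2: 17, 3: 28},
    1: {0: 21, 1: 10, 2: 28, 3: 17},
    2: {0: 17, 1: 28, 2: 10, 3: 28},
    3: {0: 28, 1: 17, 2: 28, 3: 10},
}


def fallback_order(start):
    """Return all nodes sorted by distance from `start` (start comes first)."""
    return sorted(NODES, key=lambda n: (DISTANCE[start][n], n))


def primary_nodes():
    """Nodes an unbound allocation can start from: the local node of the
    CPU the task runs on, so only nodes that have CPUs."""
    return [n for n, info in NODES.items() if info["has_cpu"]]


def allocate(start, free_pages):
    """Walk the fallback order from `start` and take one page from the
    first node that still has a free page. Returns the node id or None."""
    for node in fallback_order(start):
        if free_pages.get(node, 0) > 0:
            free_pages[node] -= 1
            return node
    return None


if __name__ == "__main__":
    free = {0: 0, 1: 1, 2: 1, 3: 1}   # local node 0 is exhausted
    print("primary nodes:", primary_nodes())
    print("fallback from node 0:", fallback_order(0))
    print("allocation from node 0 lands on node", allocate(0, free))
```

In this model a cpuless node is reached only by falling back from a node
with CPUs. Passing a cpuless node as `start` stands in for an explicit
binding, which is the only way it becomes the primary node.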

--
Michal Hocko
SUSE Labs