On Wed 07-09-22 21:50:24, Zhongkun He wrote:
[...]
Do you really need to change the policy itself or only the effective
nodemask? Do you need any other policy than bind and preferred?
Yes, we need to change the policy itself, not only its nodemask. We really
want the policy to be interleave, and to extend it to weight-interleave.
Say something like the following:
                    node   weight
interleave:         0-3    1:1:1:1   default, one by one
weight-interleave:  0-3    1:2:4:6   alloc pages by weight
                                     (weights set by the user)
In our actual use case, the remaining resources on each node differ, so
plain interleave cannot make full use of them.
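
To make the intended semantics concrete, here is a minimal userspace sketch
of how a weighted interleave could pick nodes, assuming each node receives
"weight" consecutive pages per round. The struct and function names are
hypothetical and are not existing kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct winterleave {
	const unsigned int *weights; /* per-node weight, e.g. {1, 2, 4, 6} */
	size_t nr_nodes;             /* number of entries in weights[] */
	size_t cur_node;             /* node currently being filled */
	unsigned int used;           /* pages placed on cur_node this round */
};

/*
 * Return the node for the next page. Each node gets weights[node]
 * consecutive pages before the cursor moves on, so weights of
 * 1:1:1:1 degenerate to plain round-robin interleave.
 */
static size_t winterleave_next(struct winterleave *w)
{
	assert(w->nr_nodes > 0);

	/* At most one full lap is needed to find a node with budget left. */
	for (size_t tries = 0; tries <= w->nr_nodes; tries++) {
		if (w->used < w->weights[w->cur_node]) {
			w->used++;
			return w->cur_node;
		}
		/* Budget exhausted (or weight is 0): advance to the next node. */
		w->cur_node = (w->cur_node + 1) % w->nr_nodes;
		w->used = 0;
	}

	/* Every weight is 0; fall back to node 0 in this model. */
	return 0;
}
```

With weights 1:2:4:6 on nodes 0-3, one round is 13 pages, and node 3
receives six of them.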
OK, this seems a separate topic. It would be good to start by proposing
that new policy in isolation with the semantic description.
Back to the previous question.
The question is how to implement that with a sensible semantic.
Thanks for your analysis and suggestions. It is really difficult to add a
policy directly to a cgroup, given the hierarchical enforcement. Adding
pidfd_set_mempolicy would be a good idea.
Are you going to pursue that path?
Also, there is a new idea.
We can try to separate the elements of mempolicy and use them independently.
Mempolicy has two components:
nodes: which nodes to use (e.g. nodes 0-3). We can use cpuset's
effective_mems directly.
mode: how to use them (bind, prefer, etc.). We can turn the mode into a
cpuset->flags bit, such as CS_INTERLEAVE.
task_struct->mems_allowed is equal to cpuset->effective_mems, which is
hierarchically enforced. CS_INTERLEAVE can also be propagated to tasks,
just like other flags (CS_SPREAD_PAGE).
When a process needs to allocate memory, it can find the appropriate node to
allocate pages according to the flag and mems_allowed.
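
A rough userspace model of that lookup might look like the following. It
assumes a 64-node bitmask for mems_allowed and a single flag bit standing
in for the proposed CS_INTERLEAVE. All names are hypothetical; this is not
how the kernel allocator is actually structured.

```c
#include <assert.h>

#define MODEL_MAX_NODES     64
#define MODEL_CS_INTERLEAVE (1u << 0) /* stands in for the proposed flag */

/* Hypothetical per-task state, loosely modelled on the description above. */
struct model_task {
	unsigned long long mems_allowed; /* bit n set => node n may be used */
	unsigned int flags;              /* mode bits inherited from the cpuset */
	int il_prev;                     /* last interleaved node, -1 at start */
};

/* Next allowed node after prev, wrapping around; -1 if the mask is empty. */
static int model_next_allowed(unsigned long long mask, int prev)
{
	for (int i = 1; i <= MODEL_MAX_NODES; i++) {
		int n = (prev + i) % MODEL_MAX_NODES;

		if (mask & (1ULL << n))
			return n;
	}
	return -1;
}

/* Pick a node from the flag and mems_allowed alone, with no mempolicy. */
static int model_pick_node(struct model_task *t, int local_node)
{
	assert(local_node >= 0 && local_node < MODEL_MAX_NODES);

	if (t->flags & MODEL_CS_INTERLEAVE) {
		/* Interleave: rotate through the allowed nodes. */
		int n = model_next_allowed(t->mems_allowed, t->il_prev);

		if (n >= 0)
			t->il_prev = n;
		return n;
	}

	/* Default: prefer the local node, else the next allowed one. */
	if (t->mems_allowed & (1ULL << local_node))
		return local_node;
	return model_next_allowed(t->mems_allowed, local_node);
}
```

In this model the nodes come only from mems_allowed and the mode only from
the flag, which is the separation being proposed.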
I am not sure I see the advantage as the mode and nodes are always
closely coupled. You cannot really have one without the other.