Re: [HELP] How to get task_struct from mm

From: Yang Shi
Date: Fri May 31 2019 - 08:54:55 EST

On 5/30/19 11:41 PM, Michal Hocko wrote:
> On Thu 30-05-19 14:57:46, Yang Shi wrote:
> > Hi folks,
> >
> > As we discussed about page demotion for PMEM at LSF/MM, the demotion
> > should respect the mempolicy and allowed mems of the process which the
> > page (anonymous page only for now) belongs to.
> The cpusets memory mask (aka mems_allowed) is indeed tricky and somewhat
> awkward. It is inherently an address space property and I have never
> understood why we have it per _thread_. That just doesn't make any
> sense to me and leads to weird corner cases. What should happen
> if different threads disagree about the allocation affinity while
> working on a shared address space?

I suppose (just my guess) such a restriction should apply only to the first allocation, just like a memcg charge: whoever does it first has their policy applied.
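
For illustration, a minimal sketch of that "first toucher" semantic, assuming the check simply consults the faulting thread's per-thread nodemask (first_toucher_node_allowed() is a hypothetical name; the real allocator path goes through cpuset_node_allowed()):

#include <linux/nodemask.h>
#include <linux/sched.h>

/*
 * Hypothetical sketch: whichever thread faults the page in first is
 * the one whose mems_allowed is consulted, analogous to a memcg
 * charge where the first toucher gets charged.
 */
static bool first_toucher_node_allowed(int node)
{
	/* mems_allowed is per-thread state in task_struct */
	return node_isset(node, current->mems_allowed);
}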

> > The vma that the page is mapped to can be retrieved easily via an rmap
> > walk, but we need to know the task_struct that the vma belongs to. It
> > looks like there is no such API, and container_of does not work with a
> > pointer member.
> I do not think this is a good idea. As you point out in your reply, we
> have that for memcgs, but we really hope to get rid of mm->owner there
> as well. It is just more tricky there. Moreover, such a reverse mapping
> would be incorrect. Just think of disagreeing yet overlapping cpusets
> for different threads mapping the same page.

> Is it such a big deal to document that node migration is not
> compatible with cpusets?

It is not only cpusets: get_vma_policy() also needs to find the task_struct from the vma. Currently, get_vma_policy() just uses "current", so it returns the current process's mempolicy if the vma does not have its own. For the node migration case, "current" is definitely not correct.
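
For reference, the fallback looks roughly like this (simplified from my reading of mm/mempolicy.c; the shmem special case and refcounting are omitted):

/*
 * Simplified sketch of get_vma_policy(): when neither the vma nor
 * its ->vm_ops provides a policy, it falls back to the policy of
 * the *calling* task. That is fine for a fault in the owning
 * process, but not for a demotion path run from, say, kswapd.
 */
struct mempolicy *get_vma_policy(struct vm_area_struct *vma,
				 unsigned long addr)
{
	struct mempolicy *pol = __get_vma_policy(vma, addr);

	if (!pol)
		pol = get_task_policy(current);	/* the problematic fallback */

	return pol;
}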

It looks like there is no easy way to work around this unless we declare that node migration is compatible with neither cpusets nor mempolicy.
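
Just to show why container_of cannot help: vma->vm_mm is a pointer, not a member embedded in task_struct, so there is no containing object to recover. The closest thing is the mm->owner back-pointer Michal mentioned; a hypothetical helper modeled on get_mem_cgroup_from_mm() would look like this:

#include <linux/mm_types.h>
#include <linux/rcupdate.h>

/*
 * Hypothetical helper: relies on the CONFIG_MEMCG-only mm->owner
 * back-pointer, which is exactly what Michal hopes to remove.
 * Lifetime handling is glossed over; the result is only stable
 * while the caller holds rcu_read_lock().
 */
static struct task_struct *vma_to_owner(struct vm_area_struct *vma)
{
	return rcu_dereference(vma->vm_mm->owner);
}

And even with such a helper, the owner's mempolicy and mems_allowed may disagree with those of the thread that actually touched the page, which is exactly Michal's objection above.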