Re: [PATCH 0/4] exit: Make unlikely case in mm_update_next_owner() more scalable
From: Kirill Tkhai
Date: Thu May 03 2018 - 06:53:07 EST
On 27.04.2018 21:05, Eric W. Biederman wrote:
> Michal Hocko <mhocko@xxxxxxxxxx> writes:
>
>> On Thu 26-04-18 21:28:18, Michal Hocko wrote:
>>> On Thu 26-04-18 11:19:33, Eric W. Biederman wrote:
>>>> Michal Hocko <mhocko@xxxxxxxxxx> writes:
>>>>
>>>>> I've had a patch to remove owner few years back. It needed some work
>>>>> to finish but maybe that would be a better than try to make
>>>>> non-scalable thing suck less.
>>>>
>>>> I have a question. Would it be reasonable to just have a mm->memcg?
>>>> That would appear to be the simplest solution to the problem.
>>>
>>> I do not remember details. Have to re-read the whole thing again. Hope
>>> to get to this soon but with the current jet lag and backlog from the
>>> LSFMM I rather not promis anything. Going with mm->memcg would be the
>>> most simple of course but I have a very vague recollection that it was
>>> not possible. Maybe I misremember...
>>
>> Just for the record, the last version where I've tried to remove owner
>> was posted here: http://lkml.kernel.org/r/1436358472-29137-1-git-send-email-mhocko@xxxxxxxxxx
>>
>> I didn't get to remember details yet, but the primary problem was the
>> task migration between cgroups and the nasty case when different thread
>> grounds share the mm. At some point I just suggested to not care
>> about semantic of these weird threads all that much. We can either
>> migrate all tasks sharing the mm struct or just keep the inconsistency.
>>
>> Anyway, removing this ugliness would be so cool!
>
> I suspect the only common user of CLONE_VM today is vfork. And I do
> think it is crazy to migrate a process that has called vfork before
> calling exec. Other useses of CLONE_VM seem even crazier.
>
> I think the easiest change to make in mem_cgroup_can_attach would
> be just to change the test for when charges are migrated. AKA
>
> from:
>
> if (mm->owner == p) {
> ....
> }
>
> to
> if (mem_cgroup_from_task(p) == mm->memcg) {
> ...
> }
>
> That allows using mm->memcg with no new limitations on when migration
> can be called. In crazy cases that has the potential to change which
> memcgroup the charges are accounted to, but the choice is already
> somewhat arbitrary so I don't think that will be a problem. Especially
> given that mm_update_next_owner does not migrate charges if the next
> owner is in a different memory cgroup. A mm with tasks using it in
> two different cgroups is already questionable if not outright
> problematic.
>
>
> Kirill Tkhai do you think you would be able adapt Michal Hoko's old
> patch at https://marc.info/?l=linux-kernel&m=143635857131756&w=2
> that replaces mm->owner with mm->memcg?
I was at vacation. Sorry for the late reply.
> We probably want to outlaw migrating an mm where we are not migrating
> all of the mm->users eventually. Just because that case is crazy.
> But it doesn't look like we need to do that to fix the memory control
> group data structures.
Kirill