Re: [RFC 0/3] Implementation of cgroup isolation

From: Zhu Yanhai
Date: Tue Mar 29 2011 - 10:08:24 EST


2011/3/29 Zhu Yanhai <zhu.yanhai@xxxxxxxxx>:
> Hi,
>
> 2011/3/29 Michal Hocko <mhocko@xxxxxxx>:
>> Isn't this an overhead that would slow the whole thing down? Consider
>> that you would need to look up page_cgroup for every page and touch
>> mem_cgroup to get the limit.
>
> The current code already does much the same thing. Take the direct reclaim path:
> shrink_inactive_list()
>   ->isolate_pages_global()
>     ->isolate_lru_pages()
>       ->mem_cgroup_del_lru(for each page it wants to isolate)
>         and in mem_cgroup_del_lru() we have:
Oops, the code below is from mem_cgroup_rotate_lru_list(), not
mem_cgroup_del_lru(). The correct one should be:
[code]
        pc = lookup_page_cgroup(page);
        /* can happen while we handle swapcache. */
        if (!TestClearPageCgroupAcctLRU(pc))
                return;
        VM_BUG_ON(!pc->mem_cgroup);
        /*
         * We don't check PCG_USED bit. It's cleared when the "page" is finally
         * removed from global LRU.
         */
        mz = page_cgroup_zoneinfo(pc);
        MEM_CGROUP_ZSTAT(mz, lru) -= 1;
        if (mem_cgroup_is_root(pc->mem_cgroup))
                return;
[/code]
Anyway, the point still stands.
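
To make the argument concrete, here is a rough user-space sketch, not kernel
code. The struct layouts, the is_root, usage and reserve_limit fields, and the
helper page_is_reserved() are all hypothetical stand-ins; only the
pc->mem_cgroup dereference and the root check mirror the snippet above. It
assumes a group at or below its reserve_limit should be skipped by global
reclaim. The point is that once the root check has loaded the mem_cgroup, the
extra limit test is just one more comparison on data that was already touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all fields are hypothetical. */
struct mem_cgroup {
        bool is_root;                /* stands in for mem_cgroup_is_root() */
        unsigned long usage;         /* bytes currently charged (assumed) */
        unsigned long reserve_limit; /* bytes protected from reclaim (assumed) */
};

struct page_cgroup {
        struct mem_cgroup *mem_cgroup;
};

/*
 * Hypothetical check: should global reclaim leave this page alone?
 * Assumes a non-root group at or below its reserve_limit is protected.
 */
static bool page_is_reserved(const struct page_cgroup *pc)
{
        const struct mem_cgroup *memcg = pc->mem_cgroup;

        assert(memcg != NULL);  /* mirrors VM_BUG_ON(!pc->mem_cgroup) */

        /* The existing root check already dereferences the mem_cgroup. */
        if (memcg->is_root)
                return false;

        /* Extra work for isolation: one comparison on the same structure. */
        return memcg->usage <= memcg->reserve_limit;
}
```

In this toy model a page from a group with usage 50 and reserve_limit 100
would be skipped, while a page from a group with usage 200 would still be
eligible for reclaim.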

-zyh
> [code]
>         pc = lookup_page_cgroup(page);
>         /*
>          * Used bit is set without atomic ops but after smp_wmb().
>          * For making pc->mem_cgroup visible, insert smp_rmb() here.
>          */
>         smp_rmb();
>         /* unused or root page is not rotated. */
>         if (!PageCgroupUsed(pc) || mem_cgroup_is_root(pc->mem_cgroup))
>                 return;
> [/code]
> By calling mem_cgroup_is_root(pc->mem_cgroup) we have already brought the
> struct mem_cgroup into the cache.
> So things probably won't get any worse, at least.
>
> Thanks,
> Zhu Yanhai
>
>> The point of the isolation is to not touch the global reclaim path at
>> all.
>>
>>> 3) Shrink the cgroups that have set a reserve_limit, and leave them with only
>>> the reserve_limit bytes they need. If nr_reclaimed is met, goto finish.
>>> 4) OOM
>>>
>>> Does it make sense?
>>
>> It sounds like a good thing - in that regard it is more generic than
>> a simple flag - but I am afraid that it wouldn't be easy to implement
>> while preserving performance and keeping the balance between
>> groups. But maybe it can be done without too much cost.
>>
>> Thanks
>> --
>> Michal Hocko
>> SUSE Labs
>> SUSE LINUX s.r.o.
>> Lihovarska 1060/12
>> 190 00 Praha 9
>> Czech Republic
>>
>