Re: [RFC 0/3] Implementation of cgroup isolation

From: Ying Han
Date: Wed Mar 30 2011 - 13:59:30 EST

Next message: Geert Uytterhoeven: "Re: [PATCH] m68k: fix find_next bitops"
Previous message: Yinghai Lu: "Re: another pagetable initialization crash on xen"
In reply to: Michal Hocko: "Re: [RFC 0/3] Implementation of cgroup isolation"
Next in thread: Michal Hocko: "Re: [RFC 0/3] Implementation of cgroup isolation"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, Mar 30, 2011 at 1:18 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
> On Tue 29-03-11 21:23:10, Balbir Singh wrote:
>> On 03/28/11 16:33, KAMEZAWA Hiroyuki wrote:
>> > On Mon, 28 Mar 2011 11:39:57 +0200
>> > Michal Hocko <mhocko@xxxxxxx> wrote:
> [...]
>> > Isn't it the same result as the case where no cgroup is used?
>> > What is the problem?
>> > Why isn't it a problem of configuration?
>> > IIUC, you can put all logins into some cgroup by using cgroupd/libgcgroup.
>> >
>>
>> I agree with Kame. I am still at a loss in terms of understanding the
>> use case; I should probably see the rest of the patches.
>
> OK, it looks like I am really bad at explaining the use case. Let's try
> it again then (hopefully in a better way).
>
> Consider a service which serves requests based on in-memory
> precomputed or preprocessed data.
> Let's assume that getting the data into memory is a rather costly
> operation which considerably increases the latency of request
> processing. Memory access can be considered random from the system POV
> because we never know which requests will come from outside.
> This workflow will benefit from having the memory resident as long as
> and as much as possible, because resident data has a higher chance of
> being used again and so the initial cost pays off.
> Why is mlock not the right thing to do here? Well, if the memory were
> locked and the working set grew (again, this depends on the
> incoming requests), then the application would have to unlock some
> portions of the memory or risk OOM, because it basically cannot
> overcommit.
> On the other hand, if the memory is not mlocked and there is global
> memory pressure, some part of the costly memory can get swapped or
> paged out, which will increase request latencies. If the application is
> placed into an isolated cgroup, though, global (or other cgroups')
> activity doesn't influence its cgroup, and thus not the working set of
> the application.
>
> If we compare that to mlock, we will benefit from per-group reclaim when
> we get over the limit (or soft limit). So we do not start evicting the
> memory unless somebody really puts pressure on the _application_.
> Cgroup limits would, of course, need to be selected carefully.
>
> There might be other examples where the kernel simply cannot know which
> memory is important for the process, and long-unused memory is not
> the ideal choice to evict.

Michal,

Reading through your example, it sounds to me like you can accomplish
the "guarantee" for the high-priority service using existing
memcg mechanisms.

Assume you have a service in a cgroup named cgroup-A which needs a
memory "guarantee". Meanwhile, we want to launch cgroup-B with no memory
"guarantee". What you want is for cgroup-B to use the slack memory
(not allocated by cgroup-A), but also to give it up voluntarily under
system memory pressure.

So, continuing from my previous post, consider the following
configuration on a 32G machine. The resident size of cgroup-A can only
be as large as the machine capacity.

cgroup-A : limit_in_bytes = 32G  soft_limit_in_bytes = 32G
cgroup-B : limit_in_bytes = 20G  soft_limit_in_bytes = 0G
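
A minimal sketch of how the two settings above could be written out,
assuming the cgroup v1 memory controller and its control files
memory.limit_in_bytes and memory.soft_limit_in_bytes. The helper name
apply_config and the directory layout are hypothetical, and the mount
point differs between systems, so the demo only writes into a scratch
directory rather than a real memcg mount.

```python
import os
import tempfile

GIB = 1024 ** 3

# Values from the example configuration above. Only the two control-file
# names come from the memcg interface; the dict layout is an assumption.
CONFIG = {
    "cgroup-A": {
        "memory.limit_in_bytes": 32 * GIB,
        "memory.soft_limit_in_bytes": 32 * GIB,
    },
    "cgroup-B": {
        "memory.limit_in_bytes": 20 * GIB,
        "memory.soft_limit_in_bytes": 0,
    },
}


def apply_config(root, config):
    """Write each limit to root/<group>/<control file>; return the paths."""
    written = []
    for group, limits in config.items():
        group_dir = os.path.join(root, group)
        # On a real memcg mount, creating the directory creates the cgroup.
        os.makedirs(group_dir, exist_ok=True)
        for name, value in limits.items():
            path = os.path.join(group_dir, name)
            with open(path, "w") as f:
                f.write(str(value))
            written.append(path)
    return written


if __name__ == "__main__":
    # Dry run: a temporary directory stands in for the memcg mount point.
    with tempfile.TemporaryDirectory() as scratch:
        for path in apply_config(scratch, CONFIG):
            print(os.path.relpath(path, scratch))
```

To try this on a real machine, root would be wherever the memory
controller is mounted, and the script would need sufficient privileges.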

To be a little bit extreme, there shouldn't be memory pressure on
cgroup-A unless it grows above the machine capacity. If the global
memory contention is triggered by cgroup-B, we should always steal
pages from it.
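
The intended effect can be illustrated with a toy model that picks the
reclaim victim by how far each group is above its soft limit. This is
only a sketch of the policy described here, not the kernel's reclaim
code; the function name pick_reclaim_victim and the usage figures are
made up for illustration.

```python
GIB = 1024 ** 3


def pick_reclaim_victim(groups):
    """Return the group furthest above its soft limit, or None if none is.

    groups maps a name to a (usage_in_bytes, soft_limit_in_bytes) pair.
    """
    best_name, best_excess = None, 0
    for name, (usage, soft_limit) in groups.items():
        excess = usage - soft_limit
        if excess > best_excess:
            best_name, best_excess = name, excess
    return best_name


if __name__ == "__main__":
    # Hypothetical usage on the 32G machine: A holds 20G, B holds 12G.
    groups = {
        "cgroup-A": (20 * GIB, 32 * GIB),  # below its 32G soft limit
        "cgroup-B": (12 * GIB, 0),         # soft limit 0: all usage is excess
    }
    print(pick_reclaim_victim(groups))  # prints "cgroup-B"
```

With soft_limit_in_bytes = 0, any memory cgroup-B holds counts as
excess, so in this model it is always chosen before cgroup-A as long as
cgroup-A stays at or below 32G.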

However, the current implementation of soft_limit needs to be improved
for the example above to work. Especially once we have lots of cgroups
running w/ different limit settings, we need soft_limit reclaim to be
efficient so that we can eliminate the global LRU scanning. The latter
breaks the isolation.

--Ying

> Makes sense?
> --
> Michal Hocko
> SUSE Labs
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9
> Czech Republic
>
