Re: [patch 2/2] mm: memcg: do not allow task about to OOM kill to bypass the limit

From: David Rientjes
Date: Wed Dec 04 2013 - 18:52:25 EST


On Wed, 4 Dec 2013, Johannes Weiner wrote:

> 4942642080ea ("mm: memcg: handle non-error OOM situations more
> gracefully") allowed tasks that already entered a memcg OOM condition
> to bypass the memcg limit on subsequent allocation attempts hoping
> this would expedite finishing the page fault and executing the kill.
>
> David Rientjes is worried that this breaks memcg isolation guarantees
> and since there is no evidence that the bypass actually speeds up
> fault processing just change it so that these subsequent charge
> attempts fail outright. The notable exception being __GFP_NOFAIL
> charges which are required to bypass the limit regardless.
>
> Reported-by: David Rientjes <rientjes@xxxxxxxxxx>
> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>

Acked-by: David Rientjes <rientjes@xxxxxxxxxx>

Thanks!

I think we should consider marking this for stable@xxxxxxxxxxxxxxx for
3.12 since the original patch went into 3.12-rc6. Depending on the number
of allocators in the oom memcg, the amount of memory bypassed can become
quite large.

For example, in a memcg with a limit of 128MB, if you start 10 concurrent
processes that simply allocate a lot of memory you can get quite a bit of
memory bypassed. If I start 10 membench processes, which would cause a
128MB memcg to oom even if only one such process were running, we get:

# grep RSS /proc/1092[0-9]/status
VmRSS: 15724 kB
VmRSS: 15064 kB
VmRSS: 13224 kB
VmRSS: 14520 kB
VmRSS: 14472 kB
VmRSS: 13016 kB
VmRSS: 13024 kB
VmRSS: 14560 kB
VmRSS: 14864 kB
VmRSS: 14772 kB

Those total ~140MB of memory while bound to a memcg with a 128MB limit,
so about 10% of memory is bypassed.
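
As a rough check of that arithmetic, here is a minimal sketch that sums the
VmRSS values listed above and compares them to the limit. It assumes 1MB =
1024 kB and that the summed RSS approximates what is charged to the memcg
(shared pages and page cache are ignored); the variable names are only
illustrative.

```python
# VmRSS values in kB, copied from the grep output above.
rss_kb = [15724, 15064, 13224, 14520, 14472,
          13016, 13024, 14560, 14864, 14772]

# 128MB memcg limit, assuming 1MB = 1024 kB.
LIMIT_KB = 128 * 1024

total_kb = sum(rss_kb)                    # combined RSS of the 10 processes
over_kb = total_kb - LIMIT_KB             # amount above the limit
over_pct = 100.0 * over_kb / LIMIT_KB     # overshoot relative to the limit

print(f"total RSS:  {total_kb} kB ({total_kb / 1024:.1f} MB)")
print(f"over limit: {over_kb} kB ({over_pct:.1f}% of the limit)")
```

This gives a total of 143240 kB, roughly 12MB above the limit, or a little
over 9%, which matches the "about 10%" estimate.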