Re: [patch 1/2] mm, memcg: avoid oom notification when current needs access to memory reserves

From: David Rientjes
Date: Thu Jan 09 2014 - 19:24:09 EST


On Thu, 9 Jan 2014, Andrew Morton wrote:

> > > It was dropped because the other memcg developers disagreed with it.
> > >
> >
> > It was acked-by Michal.
>
> And Johannes?
>

Johannes is arguing for the same semantics that VMPRESSURE_CRITICAL and/or
memory thresholds provide, which disagrees with the list of solutions
that Documentation/cgroups/memory.txt gives for userspace oom handler
wakeups, a list that is required for any sane implementation.

> > We REQUIRE this behavior for a sane userspace oom handler implementation.
> > You've snipped my email quite extensively, but I'd like to know
> > specifically how you would implement a userspace oom handler described by
> > Section 10 of Documentation/cgroups/memory.txt without this patch?
>
> From long experience I know that if I suggest an alternative
> implementation, advocates of the initial implementation will invest
> great effort in demonstrating why my suggestion won't work while
> investing zero effort in thinking up alternatives themselves.
>

Easy thing to say when you don't suggest an alternative implementation,
right?

I'm fully aware that I'm the only one in this thread who is charged with
writing and maintaining userspace oom handlers, so I'm not asking for an
actual implementation, but rather an answer to the very simple question:
how does userspace know whether it needs to actually do anything or not
without this patch?

> So the interface is wrong. We have two semantically different kernel
> states which are being communicated to userspace in the same way, so
> userspace cannot disambiguate.
>

We want to notify on one state, which is what is described in
Documentation/cgroups/memory.txt and works with my patch, and not notify
on another state which was broken by ME in f9434ad15524 ("memcg: give
current access to memory reserves if it's trying to die"). Am I allowed
to fix my own breakage?

Userspace expects to get notified for the reasons listed in the
documentation, not when the kernel is going to allow memory to be freed
itself. You can get notification of oom through vmpressure or memory
thresholds; memory.oom_control needs to be reserved for situations when
"something" needs to be done by userspace, as defined by the
documentation.
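
For reference, the memory.oom_control registration that Section 10 of
Documentation/cgroups/memory.txt describes can be sketched roughly as
below: create an eventfd, open memory.oom_control, and write
"<event_fd> <fd of memory.oom_control>" to cgroup.event_control. This is
only a minimal Python sketch of a cgroup v1 listener, not a real handler;
the function names and the cgroup_dir argument are hypothetical, and
os.eventfd() assumes Python 3.10 or later on Linux.

```python
import os


def event_control_line(event_fd: int, target_fd: int) -> str:
    """Build the line written to cgroup.event_control (cgroup v1)."""
    return f"{event_fd} {target_fd}"


def parse_oom_control(text: str) -> dict:
    """Parse the 'key value' rows of memory.oom_control into a dict of ints."""
    fields = {}
    for row in text.splitlines():
        parts = row.split()
        if len(parts) == 2:
            fields[parts[0]] = int(parts[1])
    return fields


def wait_for_oom(cgroup_dir: str) -> dict:
    """Block until the memcg sends an oom notification.

    cgroup_dir is an assumption: the directory of the memcg under
    wherever the cgroup v1 memory controller happens to be mounted.
    """
    efd = os.eventfd(0)  # Python 3.10+, Linux only
    ofd = os.open(os.path.join(cgroup_dir, "memory.oom_control"), os.O_RDONLY)
    try:
        with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
            ctl.write(event_control_line(efd, ofd))
        os.eventfd_read(efd)  # blocks until the kernel signals the eventfd
        # Re-read the control file to see the current under_oom state.
        return parse_oom_control(os.pread(ofd, 4096, 0).decode())
    finally:
        os.close(ofd)
        os.close(efd)
```

The wakeup itself carries no payload, so a handler built this way has to
treat every wakeup as a request to act. That is why the notification
should only fire when userspace actually has something to do.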

> Solution: invent a better communication scheme with a richer payload.
> Use that, deprecate the old interface if poss.
>

There are better communication schemes for oom conditions that are not
actionable: memcg memory threshold notifications and vmpressure.
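
As a rough illustration of those two schemes, both are registered through
the same cgroup.event_control file in cgroup v1: a threshold uses
"<event_fd> <fd of memory.usage_in_bytes> <threshold>" and vmpressure
uses "<event_fd> <fd of memory.pressure_level> <level>", where the level
is low, medium or critical. The Python below is a sketch under those
assumptions; the helper names and the cgroup_dir argument are
hypothetical, and os.eventfd() assumes Python 3.10 or later on Linux.

```python
import os

PRESSURE_LEVELS = ("low", "medium", "critical")


def threshold_line(event_fd: int, usage_fd: int, threshold_bytes: int) -> str:
    """Registration line for a memory usage threshold notification."""
    if threshold_bytes < 0:
        raise ValueError("threshold must be non-negative")
    return f"{event_fd} {usage_fd} {threshold_bytes}"


def pressure_line(event_fd: int, pressure_fd: int, level: str) -> str:
    """Registration line for a vmpressure notification."""
    if level not in PRESSURE_LEVELS:
        raise ValueError(f"unknown pressure level: {level!r}")
    return f"{event_fd} {pressure_fd} {level}"


def register(cgroup_dir: str, control_file: str, make_line):
    """Register an eventfd against control_file.

    Returns (event_fd, control_fd); the caller owns and closes both.
    make_line is called as make_line(event_fd, control_fd).
    """
    efd = os.eventfd(0)  # Python 3.10+, Linux only
    cfd = os.open(os.path.join(cgroup_dir, control_file), os.O_RDONLY)
    with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
        ctl.write(make_line(efd, cfd))
    return efd, cfd


# Hypothetical usage, with memcg_dir pointing at a cgroup v1 memcg:
#   efd, cfd = register(memcg_dir, "memory.pressure_level",
#                       lambda e, c: pressure_line(e, c, "critical"))
#   os.eventfd_read(efd)  # blocks until critical pressure is reported
```

A listener on either of these learns that memory is tight without being
expected to resolve anything, which leaves memory.oom_control free for
the actionable case.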

> Johannes' final email in this thread has yet to be replied to, btw.
>

Will do.