Re: [PATCH RESEND] mm: don't raise MEMCG_OOM event due to failed high-order allocation

From: Michal Hocko
Date: Tue Sep 25 2018 - 14:58:52 EST


On Mon 17-09-18 23:10:59, Roman Gushchin wrote:
> The memcg OOM killer is never invoked due to a failed high-order
> allocation, however the MEMCG_OOM event can be raised.
>
> As shown below, it can happen under conditions that are very
> far from a real OOM: e.g. there is plenty of clean pagecache
> and low memory pressure.
>
> There is no sense in raising an OOM event in such a case,
> as it might confuse a user and lead to wrong and excessive actions.
>
> Let's look at the charging path in try_charge(). If the memory usage
> is close to memory.max, which is absolutely natural for most memory
> cgroups, we try to reclaim some pages. Even if we were able to reclaim
> enough memory for the allocation, the following check can fail due to
> a race with another concurrent allocation:
>
>	if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
>		goto retry;
>
> For regular pages the following condition will save us from triggering
> the OOM:
>
>	if (nr_reclaimed && nr_pages <= (1 << PAGE_ALLOC_COSTLY_ORDER))
>		goto retry;
>
> But for high-order allocations this condition will intentionally fail.
> The reason is that we'll likely fall back to regular pages anyway,
> so it's ok and even preferred to return ENOMEM.
>
> In this case the idea of raising MEMCG_OOM looks dubious.
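
For reference, the two retry checks quoted above can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions, not the kernel's try_charge(): charge_should_retry() and
its parameters are hypothetical names made up for illustration, margin
stands in for the result of mem_cgroup_margin(mem_over_limit),
PAGE_ALLOC_COSTLY_ORDER is assumed to be 3, and every other check in
the real function is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value: requests of more than 8 pages count as costly. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Hypothetical userspace model of the two quoted retry checks.
 *
 * margin:       pages still chargeable below the limit after reclaim
 * nr_reclaimed: pages the reclaim pass managed to free
 * nr_pages:     size of the charge being attempted
 *
 * Returns true if one of the two checks asks for a retry, false if
 * neither does and the charge moves on towards the failure path.
 */
static bool charge_should_retry(unsigned long margin,
				unsigned long nr_reclaimed,
				unsigned int nr_pages)
{
	assert(nr_pages > 0);

	/* Reclaim left enough room and nobody raced us for it. */
	if (margin >= nr_pages)
		return true;

	/* Some progress on a non-costly request: keep trying. */
	if (nr_reclaimed && nr_pages <= (1U << PAGE_ALLOC_COSTLY_ORDER))
		return true;

	/*
	 * Either reclaim made no progress at all, or this is a costly
	 * request whose margin was taken by a concurrent charge.
	 */
	return false;
}
```

With margin 0 and 32 reclaimed pages, a single-page charge is retried
in this model, while a 512-page charge (a hypothetical huge-page-sized
request) is not, even though reclaim made progress in both cases.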

I would really appreciate an example of an application that would get
confused by consuming this event, and an explanation of why. I do agree
that the event itself is kinda weird because it doesn't give you any
context about what kind of request the memcg is OOM for. Costly orders
are a slightly different story than others, and users shouldn't care
about this because it is a mere implementation detail.

In other words, do we have any users who actually care about this
half-baked event at all? Shouldn't we simply stop emitting it (or make
it an alias of OOM_KILL) rather than making it slightly better but
still kinda incomplete?

Jeez, we really suck at defining proper interfaces. Things seem so cool
when they are proposed, then those users come and ruin our lives...
--
Michal Hocko
SUSE Labs