Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.
From: Michal Hocko
Date: Mon Jun 25 2018 - 10:12:52 EST
On Mon 25-06-18 16:04:04, peter enderborg wrote:
> On 06/25/2018 03:07 PM, Michal Hocko wrote:
>
> > On Mon 25-06-18 15:03:40, peter enderborg wrote:
> >> On 06/20/2018 01:55 PM, Michal Hocko wrote:
> >>> On Wed 20-06-18 20:20:38, Tetsuo Handa wrote:
> >>>> Sleeping with oom_lock held can cause AB-BA lockup bug because
> >>>> __alloc_pages_may_oom() does not wait for oom_lock. Since
> >>>> blocking_notifier_call_chain() in out_of_memory() might sleep, sleeping
> >>>> with oom_lock held is currently an unavoidable problem.
> >>> Could you be more specific about the potential deadlock? Sleeping while
> >>> holding oom lock is certainly not nice but I do not see how that would
> >>> result in a deadlock assuming that the sleeping context doesn't sleep on
> >>> the memory allocation obviously.
> >> It is a mutex you are supposed to be able to sleep. It's even exported.
> > What do you mean? oom_lock is certainly not exported for general use. It
> > is not local to oom_killer.c just because it is needed in other _mm_
> > code.
> >
>
> It is in the oom.h file include/linux/oom.h, if it that sensitive it should
> be in mm/ and a documented note about the special rules. It is only used
> in drivers/tty/sysrq.c and that be replaced by a help function in mm that
> do the oom stuff.
Well, there are many things defined in kernel header files that are not
meant for wider use. Using random locks is generally discouraged, I would
say, unless you are sure you know what you are doing. We could certainly
do some more work to hide the internals, though.
> >>>> As a preparation for not to sleep with oom_lock held, this patch brings
> >>>> OOM notifier callbacks to outside of OOM killer, with two small behavior
> >>>> changes explained below.
> >>> Can we just eliminate this ugliness and remove it altogether? We do not
> >>> have that many notifiers. Is there anything fundamental that would
> >>> prevent us from moving them to shrinkers instead?
> >> @Hocko Do you remember the lowmemorykiller from android? Some things
> >> might not be the right thing for shrinkers.
> > Just that lmk did it wrong doesn't mean others have to follow.
> >
> If all you have is a hammer, everything looks like a nail. (I don't argument that it was right)
> But if you don't have a way to interact with the memory system we will get attempts like lmk.
> Oom notifiers and vmpressure is for this task better than shrinkers.
A missing feature should be a trigger for a discussion rather than for a
quick hack that seems to work for a particular use case, lives out of
tree, and then gets into staging in the hope that it will fix itself.
Seriously, kernel development is not nail hammering.
--
Michal Hocko
SUSE Labs