Re: System freezes after OOM

From: Mikulas Patocka
Date: Fri Jul 15 2016 - 17:40:08 EST




On Fri, 15 Jul 2016, David Rientjes wrote:

> On Fri, 15 Jul 2016, Mikulas Patocka wrote:
>
> > > There is no guarantee that _anything_ can return memory to the mempool,
> >
> > You misunderstand mempools if you make such claims.
> >
> > There is in fact a guarantee that objects will be returned to the
> > mempool. In the past I reviewed device mapper thoroughly to make sure
> > that it can make forward progress even if there is no available memory.
> >
> > I don't know what I should tell you if you keep on repeating the same
> > false claim over and over again. Should I explain mempool operations to
> > you in detail? Or will you find it on your own?
> >
>
> If you are talking about patches you're proposing for 4.8 or any guarantee
> of memory freeing that the oom killer/reaper will provide in 4.8, that's
> fine. However, the state of the 4.7 kernel is the same as it was when I
> fixed this issue that timed out hundreds of our machines and is
> contradicted by that evidence. Our machines time out after two hours with
> the oom victim looping forever in mempool_alloc(), so if there was a

And what about the oom reaper? It should have freed all of the victim's
pages even if the victim is looping in mempool_alloc(). Why didn't the oom
reaper free up memory?

> guarantee that elements would be returned in a completely livelocked
> kernel in 4.7 or earlier kernels, that would not have been the case. I

And what kind of targets do you use in device mapper in the configuration
that livelocked? Do you use some custom Google-developed drivers?

Please describe the whole stack of block I/O devices that was in use when
this livelock happened.

Most device mapper drivers really can make forward progress when they are
out of memory, so I'm interested in what kind of configuration you have.

> frankly don't care about your patch reviewing of dm mempool usage when
> dm_request() livelocked our kernel.

If it livelocked, it is a bug in some underlying block driver, not a bug
in mempool_alloc.

> Feel free to formally propose patches either for 4.7 or 4.8.

Mikulas