Re: Still OOM problems with 4.9er/4.10er kernels

From: Michal Hocko
Date: Thu Mar 16 2017 - 05:39:38 EST


On Thu 16-03-17 02:23:18, lkml@xxxxxxxxxxx wrote:
> On Thu, Mar 16, 2017 at 10:08:44AM +0100, Michal Hocko wrote:
> > On Thu 16-03-17 01:47:33, lkml@xxxxxxxxxxx wrote:
> > [...]
> > > While on the topic of understanding allocation stalls, Philip Freeman recently
> > > mailed linux-kernel with a similar report, and in his case there are plenty of
> > > page cache pages. It was also a GFP_HIGHUSER_MOVABLE 0-order allocation.
> >
> > Care to point me to the report?
>
> http://lkml.iu.edu/hypermail/linux/kernel/1703.1/06360.html

Thanks. It is gone from my lkml mailbox. Could you CC me (and linux-mm) please?

> >
> > > I'm no MM expert, but it appears a bit broken for such a low-order allocation
> > > to stall on the order of 10 seconds when there's plenty of reclaimable pages,
> > > in addition to mostly unused and abundant swap space on SSD.
> >
> > Yes, this might indeed signal a problem.
>
> Well, maybe I missed something obvious that a better-informed eye will catch.

Nothing really obvious. There is indeed a lot of anonymous memory to
swap out. There are almost no pages on the file LRU lists (active_file:759
inactive_file:749) but 158783 total pagecache pages, so a lot of those
pages have to be in the swap cache. I would probably have to see more
data to get the full picture.
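
For reference, the arithmetic behind that estimate can be sketched as
below. This is only an illustration using the three counters quoted
above; the helper name is made up, the 4 KiB page size is an assumption,
and the remainder is an upper bound on swap cache because shmem pages
are also counted as pagecache without sitting on the file LRU lists.

```python
# Rough estimate of pagecache pages that are NOT on the file LRU lists,
# using the counters quoted from the report.

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def non_lru_pagecache(total_pagecache, active_file, inactive_file):
    """Hypothetical helper: pagecache pages not accounted to the file LRUs.

    The result is an upper bound on swap cache pages, since shmem pages
    are also counted in the pagecache total.
    """
    return total_pagecache - (active_file + inactive_file)


pages = non_lru_pagecache(total_pagecache=158783,
                          active_file=759,
                          inactive_file=749)
mib = pages * PAGE_SIZE / 2**20

print(f"{pages} pages (~{mib:.0f} MiB) of pagecache are off the file LRUs")
```

With the numbers from the report this leaves 157275 pages, i.e. nearly
all of the pagecache, which is why swap cache is the likely home for
them.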

--
Michal Hocko
SUSE Labs