Re: Still OOM problems with 4.9er/4.10er kernels
From: Michal Hocko
Date: Tue Feb 28 2017 - 03:27:04 EST
On Tue 28-02-17 14:17:23, Minchan Kim wrote:
> On Mon, Feb 27, 2017 at 10:44:49AM +0100, Michal Hocko wrote:
> > On Mon 27-02-17 18:02:36, Minchan Kim wrote:
> > [...]
> > > > From 9779a1c5d32e2edb64da5cdfcd6f9737b94a247a Mon Sep 17 00:00:00 2001
> > > From: Minchan Kim <minchan@xxxxxxxxxx>
> > > Date: Mon, 27 Feb 2017 17:39:06 +0900
> > > Subject: [PATCH] mm: use up highatomic before OOM kill
> > >
> > > Not-Yet-Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
> > > ---
> > > mm/page_alloc.c | 14 ++++----------
> > > 1 file changed, 4 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 614cd0397ce3..e073cca4969e 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -3549,16 +3549,6 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
> > > >  		*no_progress_loops = 0;
> > > >  	else
> > > >  		(*no_progress_loops)++;
> > > > -
> > > > -	/*
> > > > -	 * Make sure we converge to OOM if we cannot make any progress
> > > > -	 * several times in the row.
> > > > -	 */
> > > > -	if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
> > > > -		/* Before OOM, exhaust highatomic_reserve */
> > > > -		return unreserve_highatomic_pageblock(ac, true);
> > > > -	}
> > > > -
> > > >  	/*
> > > >  	 * Keep reclaiming pages while there is a chance this will lead
> > > >  	 * somewhere. If none of the target zones can satisfy our allocation
> > > @@ -3821,6 +3811,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > > >  	if (read_mems_allowed_retry(cpuset_mems_cookie))
> > > >  		goto retry_cpuset;
> > > >
> > > > +	/* Before OOM, exhaust highatomic_reserve */
> > > > +	if (unreserve_highatomic_pageblock(ac, true))
> > > > +		goto retry;
> > > > +
> >
> > OK, this can help for higher-order requests when we do not exhaust all
> > the retries and fail on compaction, but I fail to see how this can help
> > for order-0 requests, which is what happened in this case. I am not
> > saying this is wrong, though.
>
> should_reclaim_retry() can return false even though no_progress_loops is
> still below MAX_RECLAIM_RETRIES, unless the eligible zones have enough
> reclaimable pages once the estimate is scaled down by no_progress_loops.
Yes, sorry, I should have been clearer. I was talking about this
particular case, where we had a lot of reclaimable pages (a lot of
anonymous memory with swap available).
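
To make the two cases concrete, here is a rough user-space model of the
retry decision. It is only a sketch of the idea, not the code in
mm/page_alloc.c: struct zone_model and should_retry_model() are made-up
names, a single zone stands in for the whole zonelist, and the watermark
check is reduced to a plain comparison. MAX_RECLAIM_RETRIES is taken to be
16, as in the kernels discussed here.

```c
#include <stdbool.h>

#define MAX_RECLAIM_RETRIES 16

/* Hypothetical, simplified stand-in for the state of one zone. */
struct zone_model {
	unsigned long free_pages;
	unsigned long reclaimable_pages;
	unsigned long min_watermark;
};

/* Round-up division, the same idea as the kernel's DIV_ROUND_UP(). */
static unsigned long div_round_up(unsigned long n, unsigned long d)
{
	return (n + d - 1) / d;
}

/*
 * Model of the retry decision: trust the reclaimable estimate less and
 * less as reclaim keeps failing to make progress, then ask whether the
 * zone could still satisfy the min watermark.
 * Assumes no_progress_loops is not negative.
 */
static bool should_retry_model(const struct zone_model *z,
			       int no_progress_loops)
{
	unsigned long available;

	/* Converge to OOM after too many loops without progress. */
	if (no_progress_loops > MAX_RECLAIM_RETRIES)
		return false;

	/* Discount the reclaimable pages by the share of retries used. */
	available = z->reclaimable_pages;
	available -= div_round_up((unsigned long)no_progress_loops * available,
				  MAX_RECLAIM_RETRIES);
	available += z->free_pages;

	return available > z->min_watermark;
}
```

With free_pages = 100, reclaimable_pages = 50 and min_watermark = 200, the
model gives up already at no_progress_loops = 0, which is the situation
Minchan describes. With reclaimable_pages = 100000 instead, it keeps
retrying through no_progress_loops = 15 and only gives up once the
discount has consumed the whole estimate, which is the case I meant.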
--
Michal Hocko
SUSE Labs