Re: [lkp] [mm, page_alloc] d0164adc89: -100.0% fsmark.app_overhead

From: Mel Gorman
Date: Thu Dec 03 2015 - 05:17:37 EST

Next message: Ulf Hansson: "Re: [PATCH] PM / Domains: Fix bad of_node_put() in failure paths of genpd_dev_pm_attach()"
Previous message: Pradeep Goswami (Pradeep Kumar Goswami): "[PATCH]mm:Correctly update number of rotated pages on active list."
In reply to: Huang, Ying: "Re: [lkp] [mm, page_alloc] d0164adc89: -100.0% fsmark.app_overhead"
Next in thread: Huang, Ying: "Re: [lkp] [mm, page_alloc] d0164adc89: -100.0% fsmark.app_overhead"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, Dec 03, 2015 at 04:46:53PM +0800, Huang, Ying wrote:
> Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> writes:
>
> > On Wed, Dec 02, 2015 at 03:15:29PM +0100, Michal Hocko wrote:
> >> > > I didn't mention this allocation failure because I am not sure it is
> >> > > really related.
> >> > >
> >> >
> >> > I'm fairly sure it is. The failure is an allocation site that cannot
> >> > sleep but did not specify __GFP_HIGH.
> >>
> >> yeah but this was the case even before your patch. As the caller used
> >> GFP_ATOMIC then it got __GFP_ATOMIC after your patch so it still
> >> managed to do ALLOC_HARDER. I would agree if this was an explicit
> >> GFP_NOWAIT. Unless I am missing something your patch hasn't changed the
> >> behavior for this particular allocation.
> >>
> >
> > You're right. I think it's this hunk that is the problem.
> >
> > @@ -1186,7 +1186,7 @@ static struct request *blk_mq_map_request(struct request_queue *q,
> > ctx = blk_mq_get_ctx(q);
> > hctx = q->mq_ops->map_queue(q, ctx->cpu);
> > blk_mq_set_alloc_data(&alloc_data, q,
> > - __GFP_WAIT|GFP_ATOMIC, false, ctx, hctx);
> > + __GFP_WAIT|__GFP_HIGH, false, ctx, hctx);
> > rq = __blk_mq_alloc_request(&alloc_data, rw);
> > ctx = alloc_data.ctx;
> > hctx = alloc_data.hctx;
> >
> > As of this patch, this specific path no longer wakes kswapd when it
> > should. A series of allocations there could hit the watermarks without ever
> > waking kswapd, and then be followed by an atomic allocation failure that
> > woke kswapd.
> >
> > This bug gets fixed later by the commit 71baba4b92dc ("mm, page_alloc:
> > rename __GFP_WAIT to __GFP_RECLAIM") so it's not a bug in the current
> > kernel. However, it happens to break bisection and would be caught if each
> > individual commit was tested.
> >
> > Your __GFP_HIGH patch is still fine although not the direct fix for this
> > specific problem. Commit 71baba4b92dc is.
> >
> > Ying, do the page allocation failure messages happen when the whole
> > series is applied? i.e. is 4.4-rc3 ok?
>
> There are allocation errors for 4.4-rc3 too. dmesg is attached.
>

What is the result of the __GFP_HIGH patch to give it access to
reserves?
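
For anyone following the flag arithmetic in the hunk quoted above, here
is a small user-space sketch of it. It is only a model: the MODEL_* names
and bit values are hypothetical, and the composition of each mask is my
assumption about the definitions at that point in the series (the old wait
flag covering direct reclaim only, GFP_ATOMIC carrying the kswapd bit, and
the renamed reclaim flag covering both). Check include/linux/gfp.h at the
commit you are testing before relying on any of it.

```c
/* User-space model only; names and bit values are hypothetical. */
#define MODEL_GFP_HIGH            0x1u  /* may dip into reserves */
#define MODEL_GFP_ATOMIC_BIT      0x2u  /* caller cannot sleep */
#define MODEL_GFP_DIRECT_RECLAIM  0x4u  /* caller may enter direct reclaim */
#define MODEL_GFP_KSWAPD_RECLAIM  0x8u  /* wake kswapd at the low watermark */

/* Assumed: at the bisected commit the wait flag means direct reclaim only. */
#define MODEL_GFP_WAIT     (MODEL_GFP_DIRECT_RECLAIM)

/* Assumed: after the rename the reclaim flag covers both kinds of reclaim. */
#define MODEL_GFP_RECLAIM  (MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM)

/* Assumed: the composite atomic mask includes the kswapd wakeup bit. */
#define MODEL_GFP_ATOMIC   (MODEL_GFP_HIGH | MODEL_GFP_ATOMIC_BIT | \
                            MODEL_GFP_KSWAPD_RECLAIM)

/* Returns nonzero if an allocation with this mask would ask for kswapd. */
static int wakes_kswapd(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_KSWAPD_RECLAIM) != 0;
}

/* Returns nonzero if an allocation with this mask may use reserves. */
static int may_use_reserves(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_HIGH) != 0;
}
```

Under those assumptions, wakes_kswapd(MODEL_GFP_WAIT | MODEL_GFP_HIGH) is
zero, which matches the "+" line of the hunk not waking kswapd, while
wakes_kswapd(MODEL_GFP_RECLAIM | MODEL_GFP_HIGH) is nonzero, which matches
the behaviour after the rename. Both masks still pass may_use_reserves().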

--
Mel Gorman
SUSE Labs
