Re: [lkp] [mm, page_alloc] d0164adc89: -100.0% fsmark.app_overhead

From: Huang, Ying
Date: Thu Dec 03 2015 - 20:53:44 EST

Next message: Alexandre Belloni: "Re: [PATCH] rtc: fix overflow and incorrect calculation in rtc_time64_to_tm"
Previous message: Andrew Lunn: "Re: SoCFPGA ethernet broken"
In reply to: Mel Gorman: "Re: [lkp] [mm, page_alloc] d0164adc89: -100.0% fsmark.app_overhead"
Next in thread: Michal Hocko: "Re: [lkp] [mm, page_alloc] d0164adc89: -100.0% fsmark.app_overhead"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> writes:

> On Thu, Dec 03, 2015 at 04:46:53PM +0800, Huang, Ying wrote:
>> Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> writes:
>>
>> > On Wed, Dec 02, 2015 at 03:15:29PM +0100, Michal Hocko wrote:
>> >> > > I didn't mention this allocation failure because I am not sure it is
>> >> > > really related.
>> >> > >
>> >> >
>> >> > I'm fairly sure it is. The failure is an allocation site that cannot
>> >> > sleep but did not specify __GFP_HIGH.
>> >>
>> >> yeah but this was the case even before your patch. As the caller used
>> >> GFP_ATOMIC then it got __GFP_ATOMIC after your patch so it still
>> >> managed to do ALLOC_HARDER. I would agree if this was an explicit
>> >> GFP_NOWAIT. Unless I am missing something your patch hasn't changed the
>> >> behavior for this particular allocation.
>> >>
>> >
>> > You're right. I think it's this hunk that is the problem.
>> >
>> > @@ -1186,7 +1186,7 @@ static struct request *blk_mq_map_request(struct request_queue *q,
>> >  	ctx = blk_mq_get_ctx(q);
>> >  	hctx = q->mq_ops->map_queue(q, ctx->cpu);
>> >  	blk_mq_set_alloc_data(&alloc_data, q,
>> > -			__GFP_WAIT|GFP_ATOMIC, false, ctx, hctx);
>> > +			__GFP_WAIT|__GFP_HIGH, false, ctx, hctx);
>> >  	rq = __blk_mq_alloc_request(&alloc_data, rw);
>> >  	ctx = alloc_data.ctx;
>> >  	hctx = alloc_data.hctx;
>> >
>> > This specific path at this patch is not waking kswapd any more when it
>> > should. A series of allocations there could hit the watermarks and never wake
>> > kswapd and then be followed by an atomic allocation failure that woke kswapd.
>> >
>> > This bug gets fixed later by the commit 71baba4b92dc ("mm, page_alloc:
>> > rename __GFP_WAIT to __GFP_RECLAIM") so it's not a bug in the current
>> > kernel. However, it happens to break bisection and would be caught if each
>> > individual commit was tested.
>> >
>> > Your __GFP_HIGH patch is still fine although not the direct fix for this
>> > specific problem. Commit 71baba4b92dc is.
>> >
>> > Ying, do the page allocation failure messages still appear when the whole
>> > series is applied? i.e. is 4.4-rc3 ok?
>>
>> There are allocation errors for 4.4-rc3 too. dmesg is attached.
>>
>
> What is the result of the __GFP_HIGH patch to give it access to
> reserves?

I applied Michal's patch on top of v4.4-rc3 and tested again. With the
patch applied, there are no more page allocation failures.

Best Regards,
Huang, Ying
