Re: Unending loop in __alloc_pages_slowpath following OOM-kill; rfc:patch.

From: Mel Gorman
Date: Tue May 24 2011 - 06:57:55 EST


On Tue, May 24, 2011 at 06:40:31PM +0900, KOSAKI Motohiro wrote:
> (2011/05/24 18:16), Mel Gorman wrote:
> > On Tue, May 24, 2011 at 06:05:59PM +0900, KOSAKI Motohiro wrote:
> >>>>> Why?
> >>>>
> >>>> Otherwise, we don't have good PCP dropping trigger. Big machine might have
> >>>> big pcp cache.
> >>>>
> >>>
> >>> Big machines also have a large cost for sending IPIs.
> >>
> >> Yes. But it's only matter if IPIs are frequently happen.
> >> But, drain_all_pages() is NOT only IPI source. some vmscan function (e.g.
> >> try_to_umap) makes a lot of IPIs.
> >>
> >> Then, it's _relatively_ not costly. I have a question. Do you compare which
> >> operation and drain_all_pages()? IOW, your "costly" mean which scenario suspect?
> >>
> >
> > I am concerned that if the machine gets into trouble and we are failing
> > to reclaim that sending more IPIs is not going to help any. There is no
> > evidence at the moment that sending extra IPIs here will help anything.
>
> In old days, we always call drain_all_pages() if did_some_progress!=0. But
> current kernel only call it when get_page_from_freelist() fail. So,
> wait_iff_congested() may help but no guarantee to help us.
>
> If you still strongly worry about IPI cost, I'm concern to move drain_all_pages()
> to more unfrequently point. but to ignore pcp makes less sense, IMHO.
>

Yes, I'm worried about it because excessive time
spent in drain_all_pages() has come up on the past
http://lkml.org/lkml/2010/8/23/81 . The PCP lists are not being
ignored at the moment. They are drained when direct reclaim makes
forward progress but still fails to allocate a page.

--
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/