Re: [PATCH 1/3] page-allocator: Under memory pressure, wait on pressure to relieve instead of congestion
From: Nick Piggin
Date: Tue Mar 09 2010 - 21:35:43 EST
On Tue, Mar 09, 2010 at 05:35:36PM +0000, Mel Gorman wrote:
> On Wed, Mar 10, 2010 at 02:03:32AM +1100, Nick Piggin wrote:
> > I mean the other way around. If that zone's watermarks are not met, then
> > why shouldn't it be woken up by other zones reaching their watermarks?
> >
>
> Doing it requires moving to a per-node structure or a global queue. I'd rather
> not add hot lines to the node structure (and the associated lookup cost in
> the free path) if I can help it. A global queue would work on smaller machines
> but I'd be worried about thundering herd problems on larger machines. I know
> congestion_wait is already a global queue but IO is a relatively slow event.
> Potentially the wakeups from this queue are a lot faster.
>
> Should I just move to a global queue as a starting point and see what
> problems are caused later?
Yes. This should change allocation behaviour less than your patch
currently does when multiple allocators with different preferred zones
are stuck in the wait.
I would treat thundering herds as a separate problem, one we already
have. And if wakeups are less frequent, then each one is more likely to
cause a thundering herd anyway.
> > Yep. And it doesn't really solve that race either because the zone
> > might subsequently go below the watermark.
> >
>
> True. In theory, the same sort of races currently apply with
> congestion_wait() but that's just an excuse. There is a strong
> possibility we could behave better with respect to watermarks.
We can probably avoid all races where the process sleeps too long
(i.e. misses wakeups). Waking up too early and finding pages already
allocated is harder and probably can't really be solved without all
allocators checking the waitqueue before taking pages.