On Tue, Mar 09, 2010 at 02:17:13PM +0000, Mel Gorman wrote:
[...]
On Wed, Mar 10, 2010 at 12:35:13AM +1100, Nick Piggin wrote:
On Mon, Mar 08, 2010 at 11:48:21AM +0000, Mel Gorman wrote:
Under heavy memory pressure, the page allocator may call congestion_wait()
to wait for IO congestion to clear or a timeout. This is not as sensible
a choice as it first appears. There is no guarantee that BLK_RW_ASYNC is
even congested as the pressure could have been due to a large number of
SYNC reads and the allocator waits for the entire timeout, possibly uselessly.
At the point of congestion_wait(), the allocator is struggling to get the
pages it needs and it should back off. This patch puts the allocator to sleep
on a zone->pressure_wq for either a timeout or until a direct reclaimer or
kswapd brings the zone over the low watermark, whichever happens first.
Signed-off-by: Mel Gorman <mel@xxxxxxxxx>
---
include/linux/mmzone.h | 3 ++
mm/internal.h | 4 +++
mm/mmzone.c | 47 +++++++++++++++++++++++++++++++++++++++++++++
mm/page_alloc.c | 50 +++++++++++++++++++++++++++++++++++++++++++----
mm/vmscan.c | 2 +
5 files changed, 101 insertions(+), 5 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 30fe668..72465c1 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
+{
+	/* If no process is waiting, nothing to do */
+	if (!waitqueue_active(zone->pressure_wq))
+		return;
+
+	/* Check if the high watermark is ok for order 0 */
+	if (zone_watermark_ok(zone, 0, low_wmark_pages(zone), 0, 0))
+		wake_up_interruptible(zone->pressure_wq);
+}

If you were to do this under the zone lock (in your subsequent patch),
then it could avoid races. I would suggest doing it all as a single
patch and not doing the pressure checks in reclaim at all.

That is reasonable. I've already dropped the checks in reclaim because as you
say, if the free path check is cheap enough, it's also sufficient. Checking
in the reclaim paths as well is redundant.
I'll move the call to check_zone_pressure() within the zone lock to avoid
races.
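
For illustration only, here is a user-space sketch of the scheme discussed
above: a waiter sleeps until the zone is back above its low watermark or a
timeout expires, and the free path does the wake-up check while holding the
same lock that protects the page count. It uses POSIX threads, not the kernel
wait-queue API, and every name in it (toy_zone, toy_free_pages,
toy_wait_for_pressure, toy_freer) is hypothetical and not taken from the
patch. The mutex stands in for the zone lock and the condition variable for
zone->pressure_wq.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for a zone; not the kernel's struct zone. */
struct toy_zone {
	pthread_mutex_t lock;        /* plays the role of the zone lock */
	pthread_cond_t  pressure_wq; /* plays the role of zone->pressure_wq */
	long free_pages;
	long low_wmark;
	int  waiters;                /* rough analogue of waitqueue_active() */
};

static void toy_zone_init(struct toy_zone *z, long free_pages, long low_wmark)
{
	pthread_mutex_init(&z->lock, NULL);
	pthread_cond_init(&z->pressure_wq, NULL);
	z->free_pages = free_pages;
	z->low_wmark = low_wmark;
	z->waiters = 0;
}

/* Free path: the count update and the waiter check share one critical section. */
static void toy_free_pages(struct toy_zone *z, long nr)
{
	pthread_mutex_lock(&z->lock);
	z->free_pages += nr;
	/* Cheap early-out when nobody waits, then the watermark test. */
	if (z->waiters > 0 && z->free_pages > z->low_wmark)
		pthread_cond_broadcast(&z->pressure_wq);
	pthread_mutex_unlock(&z->lock);
}

/*
 * Allocator side: sleep until the zone is above its low watermark or until
 * timeout_ms has elapsed, whichever happens first.
 * Returns true if the watermark was met, false if the wait timed out.
 */
static bool toy_wait_for_pressure(struct toy_zone *z, long timeout_ms)
{
	struct timespec deadline;
	bool ok;

	assert(timeout_ms >= 0);

	/* Condition variables use CLOCK_REALTIME deadlines by default. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec  += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&z->lock);
	z->waiters++;
	/* The wait releases the lock atomically, so no wake-up can slip by. */
	while (z->free_pages <= z->low_wmark) {
		if (pthread_cond_timedwait(&z->pressure_wq, &z->lock,
					   &deadline) == ETIMEDOUT)
			break;
	}
	z->waiters--;
	ok = z->free_pages > z->low_wmark;
	pthread_mutex_unlock(&z->lock);
	return ok;
}

/* Helper for trying the sketch from a second thread. */
struct toy_free_req {
	struct toy_zone *zone;
	long nr;
};

static void *toy_freer(void *arg)
{
	struct toy_free_req *req = arg;

	toy_free_pages(req->zone, req->nr);
	return NULL;
}
```

Because the waiter tests the watermark and goes to sleep under the lock, and
the free path updates the count and checks for waiters under the same lock,
a free cannot land between the waiter's test and its sleep. That is the race
the unlocked check leaves open. A caller would start toy_freer() with
pthread_create() and call toy_wait_for_pressure() from the allocating thread.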