[PATCH] Wait regardless of congestion if too many pages are isolated

From: Minchan Kim
Date: Thu Aug 26 2010 - 13:06:45 EST


Many processes can suddenly enter the direct reclaim path regardless of
congestion; backing device congestion is just one of the reasons they end
up there. However, the current implementation calls congestion_wait()
whenever too many pages are isolated.

If congestion_wait() returns without calling io_schedule_timeout(), the
too_many_isolated() loops can call schedule_timeout() themselves to wait
for the system to calm down and so prevent OOM killing.

Signed-off-by: Minchan Kim <minchan.kim@xxxxxxxxx>
---
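For reviewers, the intended control flow can be sketched in plain userspace
C. This is only a model under stated assumptions, not kernel code: every
model_* name and MODEL_HZ is hypothetical, and the stub assumes that a
congested wait always sleeps for the full timeout and returns 0, which the
real io_schedule_timeout() does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
#define MODEL_HZ 100L

static int model_nr_congested;     /* stands in for nr_bdi_congested[sync] */
static int model_isolated_rounds;  /* rounds for which "too many isolated" holds */
static long model_slept_jiffies;   /* total jiffies the caller has slept */

/*
 * Model of the patched congestion_wait(): when nothing is congested it
 * returns the timeout untouched instead of sleeping. The congested branch
 * is an assumption: it pretends the full timeout was slept.
 */
static long model_congestion_wait(long timeout)
{
	if (model_nr_congested == 0)
		return timeout;
	model_slept_jiffies += timeout;
	return 0;
}

/* Model of schedule_timeout(): account for a full sleep. */
static void model_schedule_timeout(long timeout)
{
	model_slept_jiffies += timeout;
}

/* Reports "too many isolated" for a fixed number of rounds. */
static bool model_too_many_isolated(void)
{
	if (model_isolated_rounds > 0) {
		model_isolated_rounds--;
		return true;
	}
	return false;
}

/* The caller-side loop from the patch; returns total jiffies slept. */
static long model_throttle(int nr_congested, int rounds)
{
	assert(rounds >= 0);
	model_nr_congested = nr_congested;
	model_isolated_rounds = rounds;
	model_slept_jiffies = 0;

	while (model_too_many_isolated()) {
		long timeout = MODEL_HZ / 10;

		/* An unchanged timeout means congestion_wait() did not sleep. */
		if (timeout == model_congestion_wait(timeout))
			model_schedule_timeout(timeout);
	}
	return model_slept_jiffies;
}
```

With these stubs, three rounds of too_many_isolated sleep 10 jiffies per
round whether or not a backing device is congested: in the uncongested case
the sleep comes from the schedule_timeout() fallback, in the congested case
from congestion_wait() itself.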
mm/backing-dev.c | 5 ++---
mm/compaction.c | 6 +++++-
mm/vmscan.c | 6 +++++-
3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 6abe860..9431bca 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -756,8 +756,7 @@ EXPORT_SYMBOL(set_bdi_congested);
  * @timeout: timeout in jiffies
  *
  * Waits for up to @timeout jiffies for a backing_dev (any backing_dev) to exit
- * write congestion. If no backing_devs are congested then just wait for the
- * next write to be completed.
+ * write congestion. If no backing_devs are congested then just returns.
  */
 long congestion_wait(int sync, long timeout)
 {
@@ -776,7 +775,7 @@ long congestion_wait(int sync, long timeout)
 	if (atomic_read(&nr_bdi_congested[sync]) == 0) {
 		unnecessary = true;
 		cond_resched();
-		ret = 0;
+		ret = timeout;
 	} else {
 		prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
 		ret = io_schedule_timeout(timeout);
diff --git a/mm/compaction.c b/mm/compaction.c
index 94cce51..7370683 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -253,7 +253,11 @@ static unsigned long isolate_migratepages(struct zone *zone,
 	 * delay for some time until fewer pages are isolated
 	 */
 	while (unlikely(too_many_isolated(zone))) {
-		congestion_wait(BLK_RW_ASYNC, HZ/10);
+		long timeout = HZ/10;
+		if (timeout == congestion_wait(BLK_RW_ASYNC, timeout)) {
+			set_current_state(TASK_INTERRUPTIBLE);
+			schedule_timeout(timeout);
+		}

 		if (fatal_signal_pending(current))
 			return 0;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3109ff7..f5e3e28 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1337,7 +1337,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
 	unsigned long nr_dirty;
 	while (unlikely(too_many_isolated(zone, file, sc))) {
-		congestion_wait(BLK_RW_ASYNC, HZ/10);
+		long timeout = HZ/10;
+		if (timeout == congestion_wait(BLK_RW_ASYNC, timeout)) {
+			set_current_state(TASK_INTERRUPTIBLE);
+			schedule_timeout(timeout);
+		}

 		/* We are about to die and free our memory. Return now. */
 		if (fatal_signal_pending(current))
--
1.7.0.5


--
Kind regards,
Minchan Kim