Re: Bug 12309 - Large I/O operations result in poor interactive performance and high iowait times

From: Minchan Kim
Date: Mon Aug 02 2010 - 19:33:05 EST

Hi Wu,

On Mon, Aug 2, 2010 at 5:12 PM, Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote:
>> I've pointed to your two patches in the bug report, so hopefully someone
>> who is seeing the issues can try them out.
> Thanks.
>> I noticed your comment about the no swap situation
>> "#26: Per von Zweigbergk
>> Disabling swap makes the terminal launch much faster while copying;
>> However Firefox and vim hang much more aggressively and frequently
>> during copying.
>> It's interesting to see processes behave differently. Is this
>> reproducible at all?"
>> Recently there have been some other people who have noticed this.
>> Comment #460 From  Søren Holm   2010-07-22 20:33:00   (-) [reply] -------
>> I've tried stress also.
>> I have 2 GB of memory and 1.5 GB of swap.
>> With swap activated stress -d 1 hangs my machine
>> So does stress -d with swappiness set to 0.
>> With swap deactivated things run pretty fine. Of course apps utilizing
>> synchronous disk I/O fight stress for priority.
>> Comment #461 From  Nels Nielson   2010-07-23 16:23:06   (-) [reply] -------
>> I can also confirm this. Disabling swap with swapoff -a solves the problem.
>> I have 8gb of ram and 8gb of swap with a fake raid mirror.
>> Before this I couldn't do backups without the whole system grinding to a halt.
>> Right now I am doing a backup from the drives, watching a movie from the same
>> drives and more. No more iowait times and programs freezing as they are starved
>> from being able to access the drives.
> So swapping is another major cause of responsiveness lags.
> I just tested the heavy swapping case with the patches to remove
> the congestion_wait() and wait_on_page_writeback() stalls on high
> order allocations. The patches work as expected. Not a single stall shows
> up with the debug patch posted in
> However there are still stalls on get_request_wait():
> - kswapd trying to pageout anonymous pages
> - _any_ process in direct reclaim doing pageout()
> Since 90% of pages are dirty anonymous pages, the chance of stalling is high.
> kswapd can hardly make smooth progress. The applications end up doing
> direct reclaim by themselves, which also ends up stuck in pageout().
> They are not explicitly stalled in vmscan code, but implicitly in
> get_request_wait() when trying to swap out the dirty pages.
> It sure hurts responsiveness with so many applications stalled on
> get_request_wait(). But the question is, what can we do otherwise? The
> system is running short of memory and cannot keep up freeing enough
> memory anyway. So page allocations have to be throttled somewhere..
> But wait.. What if there are only 50% anonymous pages? In this case
> applications don't necessarily need to sleep in get_request_wait().
> The memory pressure is not really high. The poor man's solution is to
> disable swapping totally, as the bug reporters find to be helpful..

The problem you mentioned is as follows:

1. The VM pages out many anon pages to the swap device.
2. The swap device starts to congest.
3. When some application swaps in its pages, it is stalled by 2.


1. So many applications start to swap in.
2. The swap device starts to congest.
3. When the VM pages out some anon page to the swap device, it can be stalled by 2.

Is that right?

> One easy fix is to skip swap-out when bdi is congested and priority is
> close to DEF_PRIORITY. However it would be unfair to selectively
> (largely at random) keep some pages and reclaim the others that
> actually have the same age.
> A more complete fix may be to introduce some swap_out LRU list(s).
> Pages in it will be swapped out as fast as possible by a dedicated
> kernel thread. And pageout() can freely add pages to it until it
> grows larger than some threshold, eg. 30% reclaimable memory, at which
> point pageout() will stall on the list. The basic idea is to switch
> the random get_request_wait() stalls to some more globally visible stalls.
> Does this sound feasible?
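For reference, the "easy fix" described above (skip swap-out while the bdi is congested and reclaim priority is still near DEF_PRIORITY) could be sketched roughly like this. This is a user-space illustration, not actual kernel code; the helper name and the priority threshold are made up for the example:

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12	/* same value as in mm/vmscan.c */

/*
 * Hypothetical policy helper: under light reclaim pressure (priority
 * still close to DEF_PRIORITY), defer swapping out a dirty anon page
 * rather than queueing behind a congested swap device and stalling in
 * get_request_wait().  Under real pressure, pay the stall and reclaim.
 */
static bool should_skip_swapout(bool bdi_congested, int priority)
{
	if (bdi_congested && priority >= DEF_PRIORITY - 2)
		return true;
	return false;
}
```

The obvious downside is the unfairness Wu notes: which pages get kept depends on when the device happened to be congested, not on page age.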
I tend to agree that we should prevent the random sleeps.
But is a swap_out LRU list meaningful?
If the VM decides to swap out a page, it is a cold page.
If we want to batch I/O of swap pages, IMHO it would be better to group
swap pages not in LRU order but in physical block order.
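The idea of submitting a swap-out batch in physical block order rather than LRU order can be sketched as below. This is a user-space illustration; struct swap_req and sort_swap_batch() are made-up names, with swap_offset standing in for the page's slot on the swap device:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical descriptor for a page queued for swap-out. */
struct swap_req {
	unsigned long page_id;		/* which page (illustrative) */
	unsigned long swap_offset;	/* slot on the swap device */
};

static int cmp_offset(const void *a, const void *b)
{
	const struct swap_req *x = a, *y = b;

	if (x->swap_offset < y->swap_offset)
		return -1;
	if (x->swap_offset > y->swap_offset)
		return 1;
	return 0;
}

/*
 * Sort a pending swap-out batch into physical block order so the
 * resulting I/O is as sequential as possible, regardless of the LRU
 * order the pages were reclaimed in.
 */
static void sort_swap_batch(struct swap_req *reqs, size_t n)
{
	qsort(reqs, n, sizeof(*reqs), cmp_offset);
}
```

Pages that are adjacent on the swap device then go to the request queue back-to-back, which gives the elevator contiguous I/O instead of scattered single-page writes.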

Kind regards,
Minchan Kim