Re: [PATCH][5/?] count writeback pages in nr_scanned

From: Nick Piggin
Date: Wed Jan 05 2005 - 22:51:45 EST


Rik van Riel wrote:
On Wed, 5 Jan 2005, Andrew Morton wrote:

Rik van Riel <riel@xxxxxxxxxx> wrote:


The recent OOM kill problem has been happening:
1) with cache pressure on lowmem only, due to a block device write
2) with no block congestion at all
3) with pretty much all pageable lowmem pages in writeback state


You must have a wild number of requests configured in the queue. Is this CFQ?


Yes, it is with CFQ. Around 650MB of lowmem is in writeback
stage, which is over 99% of the active and inactive lowmem
pages...

I've done testing with "all of memory under writeback" before and it went OK. It's certainly a design objective to handle this well. But that testing was before we broke it.


I suspect something might still be broken. It may take a few
days of continuous testing to trigger the bug, though ...


It could be those blk_congestion_wait paths, because
the queue simply won't be congested. So doing io_schedule_timeout
might help.

I wonder if reducing the size of the write queue in CFQ would help
too? IIRC, it only really wants a huge read queue.
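
As a rough illustration of the first suggestion, here is a toy userspace model of the choice, not kernel code. The enum and the function name pick_throttle are made up for this sketch, and it assumes the only input is whether the queue reports congestion; the real reclaim path has more state than that.

```c
#include <stdbool.h>

/* Hypothetical model only: these names do not exist in the kernel. */
enum throttle {
    WAIT_ON_CONGESTION,   /* models a blk_congestion_wait-style wait */
    SLEEP_FULL_TIMEOUT    /* models a plain io_schedule_timeout-style sleep */
};

/*
 * Pick how a reclaiming task throttles itself.
 * If the queue reports congestion, waiting for it to clear is meaningful.
 * If the queue never reports congestion (for example because it is very
 * large), fall back to an unconditional timed sleep instead.
 */
static enum throttle pick_throttle(bool queue_congested)
{
    return queue_congested ? WAIT_ON_CONGESTION : SLEEP_FULL_TIMEOUT;
}
```

With the case described in this thread (a queue that is never congested), pick_throttle(false) selects the unconditional sleep, so the task backs off while writeback completes.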